Auto-collect-index

Auto-collect

FarmKnowledge uses a variety of auto-collection techniques: SPARQL, RSS and REST APIs where they are offered, and scraping where needed.
The harvest is a raw product list, which is then cleaned of, among other things, academic material, conference papers and project reports. The next step is to index the cleaned harvest with the authorized terms from Cabi, Agrovoc and Eppo.

Auto-index

FarmKnowledge.info is built on the Cabi Thesaurus, Agrovoc and the Eppo Database. These widely used and broadly recognised thesauri have been combined with language functionality from Wikidata, resulting in very powerful search functionality.

The combined thesauri allow our automatic indexing pipeline to recognise synonyms, language versions, singular and plural spellings, and misspellings, and to tag content with the preferred terms from Cabi, Agrovoc and Eppo.

For instance, a document titled "Colorado Potato Beetle: Organic Control Options" is automatically tagged with Solanum tuberosum (the preferred scientific name for potatoes), Leptinotarsa decemlineata (commonly known as the Colorado potato beetle or Colorado beetle) and Organic farming. See the image on the right.
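
The tagging step in this example can be sketched in a few lines of Python. This is only an illustration, not the actual FarmKnowledge pipeline: the LABELS table is a tiny hand-written, hypothetical excerpt standing in for the combined thesauri, and the function name tag_title is an assumption. The sketch looks up every word sequence of the title in the label table and collects the preferred terms it finds.

```python
import re

# Hypothetical excerpt of a combined label table: each known surface form
# (synonym, plural, misspelling) maps to one preferred term.
LABELS = {
    "colorado potato beetle": "Leptinotarsa decemlineata",
    "colorado beetle": "Leptinotarsa decemlineata",
    "potato": "Solanum tuberosum",
    "potatoes": "Solanum tuberosum",
    "potatos": "Solanum tuberosum",  # misspelling, added for illustration
    "organic": "Organic farming",
    "organic farming": "Organic farming",
}

# Longest label, counted in words, so we know how long a phrase to try.
MAX_WORDS = max(len(label.split()) for label in LABELS)


def tag_title(title):
    """Return the set of preferred terms whose labels occur in the title."""
    words = re.findall(r"\w+", title.lower())
    tags = set()
    for size in range(1, MAX_WORDS + 1):
        for start in range(len(words) - size + 1):
            phrase = " ".join(words[start:start + size])
            if phrase in LABELS:
                tags.add(LABELS[phrase])
    return tags


print(sorted(tag_title("Colorado Potato Beetle: Organic Control Options")))
```

With this toy table the title receives the same three tags as in the example above. A real pipeline would load its labels from the thesauri rather than from a hard-coded dictionary, and would probably need fuzzier matching for misspellings.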

Next, if a user searches for კოლორადოს კარტოფილის ხოჭო, they are directed to the page Leptinotarsa decemlineata. A search for پەتاتە leads to the page on Potatoes (Solanum tuberosum). Both pages list the document download "Colorado Potato Beetle: Organic Control Options".
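
The search side can be sketched in the same spirit. Again this is a hedged illustration under stated assumptions: SEARCH_LABELS and DOCUMENTS_BY_TERM are hypothetical, hand-written tables (in practice the language labels would come from Wikidata and the thesauri), and the function name search is invented for this example. A query in any language is normalised, resolved to its preferred term, and the documents tagged with that term are returned.

```python
import unicodedata

# Hypothetical multilingual label table: search label -> preferred term.
SEARCH_LABELS = {
    "კოლორადოს კარტოფილის ხოჭო": "Leptinotarsa decemlineata",
    "colorado potato beetle": "Leptinotarsa decemlineata",
    "پەتاتە": "Solanum tuberosum",
    "potatoes": "Solanum tuberosum",
}

# Hypothetical index built by the tagging step: preferred term -> documents.
DOCUMENTS_BY_TERM = {
    "Leptinotarsa decemlineata": [
        "Colorado Potato Beetle: Organic Control Options",
    ],
    "Solanum tuberosum": [
        "Colorado Potato Beetle: Organic Control Options",
    ],
}


def normalize(text):
    """Normalise Unicode form, case and surrounding whitespace."""
    return unicodedata.normalize("NFC", text).casefold().strip()


# Normalise the keys once so that queries and labels compare reliably.
LOOKUP = {normalize(label): term for label, term in SEARCH_LABELS.items()}


def search(query):
    """Return (preferred term, documents), or (None, []) if nothing matches."""
    term = LOOKUP.get(normalize(query))
    if term is None:
        return None, []
    return term, DOCUMENTS_BY_TERM.get(term, [])


print(search("پەتاتە"))
```

Because every label points at a single preferred term, each language version of a query lands on the same page and sees the same document list.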