Find relevant knowledge and discover new patterns
UNSILO Core extracts the most important semantic concepts from a document. Built on machine learning and natural language processing technologies, UNSILO Core captures phrases and their precise meaning, and understands semantic and syntactic variations within a document. It does not require any document metadata to analyze a content collection and establish connections within it. Because it automatically extracts the semantic fingerprint of each document, our technology relies purely on unsupervised learning from the content itself, not on document popularity or other external sources.
Our technology extracts complex key concepts and term relationships from natural language text rather than individual words, which makes it more precise than technologies based on individual words. Furthermore, the technology can work with or without guidance from an existing taxonomy or ontology.
UNSILO outperforms Google, IBM and Microsoft
Compared with similar technologies from Google, IBM and Microsoft, UNSILO returns more precise concepts and much less noise.
More information can be found in our white paper “Comparing UNSILO concept extraction to leading NLP cloud solutions”.
UNSILO's world-leading concept-extraction technology has widespread application across multiple industries, in use cases ranging from finding related content in academic papers to finding precedents in case law archives.
UNSILO technology gives content owners a consistent way to tag natural language text files, saving valuable time compared with curating content by hand.
A: UNSILO requires no external list of terms to extract concepts. This is a major advance compared with many earlier machine-learning tools, which necessitated the creation of a list of subject terms before the system could index documents. However, if you have a taxonomy or ontology in a domain, UNSILO can use the terms in that list and identify them where they appear.
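To make the optional taxonomy step more concrete, here is a minimal sketch of how terms from an existing list could be located in a text. It is an illustration only and not UNSILO's implementation: the function name `find_taxonomy_terms`, the sample text and the sample taxonomy are all hypothetical, and the matching is a simple case-insensitive whole-phrase search.

```python
import re


def find_taxonomy_terms(text, taxonomy_terms):
    """Return (term, start, end) for every taxonomy term found in text.

    Hypothetical helper: case-insensitive, whole-phrase matching only.
    """
    matches = []
    for term in taxonomy_terms:
        # Word boundaries stop "gene" from matching inside "general".
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            matches.append((term, m.start(), m.end()))
    # Report matches in the order they appear in the text.
    return sorted(matches, key=lambda item: item[1])


# Illustrative data, not taken from any real taxonomy.
sample_text = "Gene therapy trials increasingly rely on viral vectors."
sample_taxonomy = ["gene therapy", "viral vectors", "stem cells"]
found = find_taxonomy_terms(sample_text, sample_taxonomy)
```

A production system would also need to handle inflected forms and synonyms, which this sketch deliberately leaves out.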
A: A common way of “understanding” a document is to identify the entities it contains, for example names of places, people and things. However, this is considerably less precise than UNSILO’s multiple-word significant phrases. “Paris”, the city in France, is not the same thing as the “Treaty of Paris” or the “Paris Accord”, but most simple entity extraction tools cannot tell such multi-word terms apart. In contrast, UNSILO’s emphasis on phrases rather than on individual words provides a much more accurate indicator of meaning.
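The difference between matching single words and matching whole phrases can be sketched with a longest-match-first tagger. This is a simplified illustration under stated assumptions, not UNSILO's algorithm: the function `tag_longest_phrases` and the phrase list are hypothetical, and the text is split on whitespace only.

```python
def tag_longest_phrases(text, phrases):
    """Tag known phrases in text, always preferring the longest match.

    Hypothetical helper: whitespace tokenization, exact token comparison.
    """
    tokens = text.split()
    # Try longer phrases first so "Treaty of Paris" wins over "Paris".
    candidates = sorted((p.split() for p in phrases), key=len, reverse=True)
    tagged = []
    i = 0
    while i < len(tokens):
        for candidate in candidates:
            n = len(candidate)
            if tokens[i:i + n] == candidate:
                tagged.append(" ".join(candidate))
                i += n  # skip past the whole phrase
                break
        else:
            i += 1  # no phrase starts at this token
    return tagged


known_phrases = ["Paris", "Treaty of Paris", "Paris Accord"]
sentence = "The Treaty of Paris and the Paris Accord are not the city of Paris"
tags = tag_longest_phrases(sentence, known_phrases)
```

With this input, the word "Paris" is tagged on its own only where it is not part of a longer known phrase, which is the distinction a single-word approach misses.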
A: We usually guarantee to process new documents for existing clients within 24 hours. However, the processing time (rather than the elapsed time) is considerably less than this, and we can provide a faster turn-round time where speed to publication is essential.
A: UNSILO provides tools for combining automatic and human processing. For example, the Package Manager uses machine learning when classifying documents by subject to identify those documents that would benefit from manual curation. In this way we estimate we can reduce the time taken to manually create a subject collection by more than two-thirds. This means the publisher has both lower costs and improved quality.
Very occasionally an error is identified in the core concept extraction process. We encourage all users to let us know when this happens, so we can update the engine.
A: Yes, we can do this without difficulty. If a publisher has, say, two subscription-based collections, A and B, we can provide related content links to every article in either collection or to both, as the publisher wishes.