Cambridge trials UNSILO related content

Cambridge University Press is currently running a trial of UNSILO related content on Cambridge Core, their main book and journal platform.

Building an entirely new platform for hosting scholarly content is not for the faint-hearted. Cambridge Core is one of the great success stories of academic software development. It was built by the extended in-house Cambridge team over a two-year period, replacing a journals platform that had been running for 18 years, and a separate books platform that had been running for more than six years.

Cambridge Core was designed and built from the ground up around customer needs, following a global research effort in which over 10,000 customers participated, and using a wholly new architecture that integrates books and journals alongside other content types. The platform was also designed to allow further enhancements, such as services identifying related content, to be added later.

However, adding links to related content would be challenging to carry out manually. The new Cambridge Core platform stores all books as self-contained chapters, which is great for providing rapid and precise access to relevant content. But academic book chapters do not carry as much meaningful metadata as a journal article does (an abstract, for example), so there is less summary information on which to base links.

To get around this problem, Cambridge is investigating the use of machine-learning tools such as UNSILO, which can identify relevant concepts from any document automatically, and then enable links to other relevant content in the Cambridge collection. The advantages of such a methodology are:

  • UNSILO requires no pre-existing taxonomy, which is especially important for a publisher such as Cambridge that has a very wide-ranging content collection, from Anglo-Saxon studies to international law and neurology.
  • UNSILO is genuinely interdisciplinary, which means a history article can be linked to a relevant science article about, say, climate change in the 15th century.
  • UNSILO connections within Cambridge Core content are continuously updated. As new content is published, the key concepts in the text are automatically extracted, and the new content is immediately connected to the most relevant existing research in the same field.
  • UNSILO connections are based on the actual content and do not require publishers to track their users. And while traditional, usage-based recommendation technologies have cold-start problems that make it especially difficult to recommend the newest content, UNSILO can immediately recommend a new article to users looking at similar content.
  • UNSILO connections are not limited by granularity. While human-created domain taxonomies are often limited to a few thousand concepts, UNSILO automatically extracts hundreds of concepts for every chapter or article, allowing more precise and detailed connections between documents to be identified than any human effort ever could.
  • UNSILO connections can be explained and verified by humans. Some A.I. algorithms are effectively black boxes, but the connections identified by UNSILO are based on verifiable overlap between the concepts and relationships discussed in several documents, even when authors use slightly different terminology to describe the same phenomena.
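
To make the idea of content-based, explainable links concrete, here is a minimal sketch in Python. It is not UNSILO's actual method or API: it assumes the concepts for each document have already been extracted, and it uses a simple Jaccard overlap score as a stand-in for whatever ranking the real service applies. The document names, the concept sets, and the `jaccard` and `related` helpers are all hypothetical.

```python
def jaccard(a, b):
    """Share of concepts two documents have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related(doc_id, concepts_by_doc, top_n=3):
    """Rank other documents by concept overlap with doc_id.

    Returns (score, other_id, shared_concepts) tuples, so every link
    comes with the concepts that justify it.
    """
    target = concepts_by_doc[doc_id]
    scored = []
    for other_id, concepts in concepts_by_doc.items():
        if other_id == doc_id:
            continue
        shared = target & concepts
        if shared:  # only link documents that share at least one concept
            scored.append((jaccard(target, concepts), other_id, sorted(shared)))
    # Highest score first; ties broken by document id for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_n]


# Hypothetical, hand-written concept sets (a real system would extract these).
concepts_by_doc = {
    "history-article": {"little ice age", "crop failure", "15th century", "famine"},
    "climate-article": {"little ice age", "tree rings", "15th century",
                        "temperature reconstruction"},
    "law-chapter": {"treaty law", "state sovereignty"},
}

for score, other_id, shared in related("history-article", concepts_by_doc):
    print(f"{other_id}: score={score:.2f}, shared concepts={shared}")
```

In this toy example the history article is linked to the climate article because both discuss the Little Ice Age and the 15th century, and the law chapter is left out. No usage data is involved, so a newly published document can be linked as soon as its concepts are known, and the shared concepts can be shown to a human to check the link.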

The UNSILO related content feature is being tested on a number of journals in academic domains that traditionally aren’t well covered by ontologies and tightly curated taxonomies, including American Antiquity, the premier journal of North American archaeology. Check out UNSILO recommendations for some of their most cited papers.

Of course, the proof will come from the users themselves. During the next few months, some Cambridge Core users will see links to related content on the site, and Cambridge will monitor the usage and assess the usefulness of these links. Whatever the result of the trial, one thing is certain: A.I. can provide a way of identifying content relatedness at scale, in a way that simply is not feasible using human indexers. Cambridge University Press is using leading-edge technology to solve age-old problems.

 

