A Community-based Framework for Ontology Evaluation
Marzieh Talebpour, Thomas Jackson and Martin Sykora (Loughborough)
There are many systems supporting ontology discovery and selection. Talebpour reviewed 40 such systems from the literature and came up with a generic framework to describe them: they all have a collection of ontologies, gained by various means, which then receives added curation. She wanted to evaluate the quality of ontologies and aid the selection process through metrics, which fall into three groups – internal, metadata and social metrics. Although the metrics can be grouped in this way, do knowledge and ontology engineers actually consider social metrics when evaluating ontologies?
She interviewed ontologists to discover what they saw as important. After deriving an initial list of metrics from the interviews, she surveyed a larger group to rank them.
Towards a harmonised subject and domain annotation of FAIRsharing standards, databases and policies
Allyson Lister, Peter Mcquilton, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, Milo Thurston, Massimiliano Izzo and Susanna-Assunta Sansone (Oxford)
(This was my talk so I didn’t take any notes, so here’s a summary)
FAIRsharing (https://www.fairsharing.org) is a manually-curated, cross-discipline, searchable portal of three linked registries covering standards, databases and data policies. Every record is designed to be interlinked, providing a detailed description not only of the resource itself, but also its relationship to other resources.
As FAIRsharing has grown, over 1000 domain tags across all areas of research have been added by users and curators. This tagging system, essentially a flat list, has become unwieldy and limited. To provide a hierarchical structure and richer semantics, two application ontologies drawn from multiple community ontologies were created to supplement these user tags. FAIRsharing domain tags are now divided into three separate fields:
- Subject Resource Application Ontology (SRAO) – a hierarchy of academic disciplines that formalises the re3data subject list (https://www.re3data.org/browse/by-subject/). Combined with subsets of six additional ontologies, SRAO provides over 350 classes.
- Domain Resource Application Ontology (DRAO) – a hierarchy of specific research domains and descriptors. Fifty external ontologies are used to provide over 1000 classes.
- Free-text user tags. A small number of FAIRsharing domain tags were not mappable to external ontologies and are retained as user tags. Existing and new user tags may be promoted to either application ontology as required.
From the initial user tags to the development of the new application ontologies, our work has been led by the FAIRsharing community and has drawn on publicly-available resources. The FAIRsharing application ontologies are
- Community driven – our users have created the majority of the terms, providing the initial scope for DRAO and SRAO.
- Community derived – to describe the wide range of resources available in FAIRsharing, we imported subsets of over fifty publicly-available ontologies, many of which have been developed as part of the OBO Foundry.
- Community accessible – with over 1400 classes described, these cross-domain application ontologies are available from our Github repositories (https://github.com/FAIRsharing/subject-ontology, https://github.com/FAIRsharing/domain-ontology) and are covered by a CC BY-SA 4.0 licence.
Guidelines for the Minimum Information for the Reporting of an Ontology (MIRO)
Nicolas Matentzoglu (EMBL-EBI), James Malone (SciBite), Christopher Mungall (The Lawrence Berkeley National Laboratory) and Robert Stevens (Manchester)
Ontologies need metadata, and we need a minimal list of required metadata for ontologies. The authors started with a self-made list, and then created a survey that was widely distributed. The statistics from that survey were then used to discover what was most important to the community. Reporting items include: basics, motivation, scope, knowledge acquisition, ontology content, managing change, and quality assurance.
What was surprising was the number of items that were considered very important and ended up as a MUST in MIRO. The ones with the highest scores were URL, name, owner and license (clearly). The bottom three were less obvious: content selection, source knowledge location and development environment.
They then tested retrospective compliance by looking through publications – ended up with 15 papers. The scope and coverage, need, KR language, target audience, and axiom patterns were very well represented. Badly represented were ontology license, change management, testing, sustainability, and entity deprecation policy.
Testing was both not reported and not considered important. Allyson's note: I think that this is self-fulfilling – there is no really good way to test other than running a reasoner, so something like Tawny-OWL, which makes testing possible, could therefore create an interest in actually doing it.
Tawny-OWL: A Richer Ontology Development Environment
Phillip Lord (Newcastle)
Tawny-OWL is a mature environment for ontology development. It provides a very different model from other existing methods, and allows for literate ontology development. Most people use Protege; others use the OWL API. The driving use case was the development of an ontology of the human chromosomes – complex to describe, but regular: 23 chromosomes and around 1,000 bands, and the Protege UI can't really handle the number of classes required.
Tawny-OWL is an interactive environment built on Clojure, and you can use any IDE or editor that knows about Clojure / Leiningen. You can then replace a lot of the ontology-specific tools with more generic ones – versioning with Git, unit testing with Clojure, dependency management with Maven, and continuous integration with Travis-CI.
It allows for literate development because it supports fully descriptive documentation and implementation comments (things you'd put in code that aren't meant to be user-facing), which wasn't really possible in the past. Version 2.0 has regularization and reimplementation of the core, patternization support (gems and tiers), a 70-page manual, project templates with an integrated web-based IDE, and is internationalizable.
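To give a flavour of the programmatic style described above, here is a minimal, purely illustrative Tawny-OWL sketch (the ontology name, IRI and class names are all invented for this example, not taken from the talk); regular structures like the chromosome classes become ordinary Clojure loops rather than repeated GUI clicks:

```clojure
;; Hypothetical sketch of ontology-as-code in Tawny-OWL.
;; All names and the IRI below are invented for illustration.
(ns example.chromosome
  (:use [tawny.owl]))

(defontology chromosome
  :iri "http://example.org/chromosome")

(defclass Chromosome
  :comment "A human chromosome.")

;; Patternised development: generate one class per autosome
;; instead of creating 22 classes by hand in a GUI.
(doseq [n (range 1 23)]
  (owl-class (str "Chromosome" n) :super Chromosome))
```

Because this is plain Clojure, the same loop idea scales to the ~1,000 band classes that motivated the tool.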
Automating ontology releases with ROBOT
Simon Jupp (EBI), James Overton (Knocean), Helen Parkinson (EBI) and Christopher Mungall (The Lawrence Berkeley National Laboratory)
Why automate ontology releases? Because a regular release cycle often triggers releases of other services, and each release involves the creation of various versions of the ontology. What happens as part of the release? Desired sections of various ontologies are pulled in, with the desired terms kept in a TSV file.
ROBOT is an ontology release toolkit. It is both a library and a command-line tool. Commands can be chained together to create production workflows. Within EFO, the ROBOT commands are added to the EFO makefile, where the ontology release is treated as a compile step. This allows testing to happen prior to release.
ROBOT commands include merging, annotation, querying, reasoning, template (TSV -> OWL), and verification.
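As a sketch of what treating the release as a compile step might look like, here is a hypothetical Makefile rule chaining ROBOT commands (the file names, version IRI and term file are invented; it assumes the `robot` executable is on the PATH):

```makefile
# Hypothetical release rule: pull the desired terms from an external ontology,
# merge with the edit file, reason, stamp a version IRI, and save the release.
example.owl: example-edit.owl terms.txt
	robot extract --input external.owl --term-file terms.txt --method BOT \
	      merge --input example-edit.owl \
	      reason --reasoner ELK \
	      annotate --version-iri "http://example.org/releases/2018-12-03/example.owl" \
	      --output example.owl
```

Because the chained commands pass the intermediate ontology along the pipeline, only the final `--output` is written to disk, and tests (e.g. `robot verify`) can be added as further targets that run before release.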
Bioschemas Community: Developing profiles over Schema.org to make life sciences resources more findable
Alasdair Gray (Heriot-Watt) and The Bioschemas Community (Bioschemas)
They are asking developers to add six minimum properties. The Bioschemas specifications are layered on top of the schema.org specification. Over 200 people have been involved across a number of workshops. To create a Bioschemas profile, they identify use cases and then map them to existing ontologies; a specification is then created, tested and applied.
They’ve had to create a few new types which schema.org didn’t have (e.g. Lab Protocol, biological entity). 16 sites have deployed this, including FAIRsharing. Will other search engines respect this? The seven major search engines are using schema markup.
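My notes don't record which six properties make up the minimum, but schema.org markup of this kind is typically embedded in a page as JSON-LD. Here is a purely illustrative fragment using standard schema.org Dataset properties (all values are invented, and the property selection is my assumption, not the Bioschemas minimum):

```json
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Example expression dataset",
  "description": "A hypothetical dataset illustrating schema.org markup.",
  "url": "https://example.org/datasets/1",
  "identifier": "https://doi.org/10.0000/example",
  "keywords": "transcriptomics, example",
  "license": "https://creativecommons.org/licenses/by/4.0/"
}
```

A crawler that understands schema.org can index such a block without scraping the page's HTML, which is what makes the resources more findable.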
Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!