UKON 2016: Identifying Basic Level Entities in a Data Graph

These are my notes for the Marwan Al-Tawil, Vania Dimitrova, Dhaval Thakker, Brandon Bennett talk at the UK Ontology Network Meeting on 14 April, 2016.


What makes some entities more important than others, and which ones provide better paths through a data graph? He observed in his study that central entities with many subclasses are good potential anchors, that recognition is a key enabler for knowledge expansion, and that connections from recognised entities encourage the discovery of new ones linked to them. How can we develop automatic ways to identify such knowledge anchors? Basic level objects (commonly-used objects from daily life) carry the most information, possess the highest category cue validity and are, therefore, the most differentiated from one another.

They take two approaches: distinctiveness (identifying the most differentiated entities, whose cues link to their members and not to other entities) and homogeneity. The distinctiveness metrics were adapted from formal concept analysis and applied to the ontology; the homogeneity metrics were built on set-based similarity measures.
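To make the homogeneity idea concrete for myself, here is a rough sketch (my own, not the authors' code or metrics): a set-based similarity such as Jaccard, averaged over the subclasses of a candidate anchor. All entity names and feature sets below are made up.

```python
from itertools import combinations

def jaccard(a, b):
    """Set-based similarity between two feature sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def homogeneity(subclass_features):
    """Average pairwise Jaccard similarity over the subclasses of a candidate anchor."""
    pairs = list(combinations(subclass_features.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical example: features (e.g. linked cues) of the subclasses of a "Bird" entity
features = {
    "Robin":   {"hasBeak", "hasWings", "laysEggs"},
    "Penguin": {"hasBeak", "hasWings", "swims"},
    "Eagle":   {"hasBeak", "hasWings", "hunts"},
}
print(homogeneity(features))  # higher values suggest a more homogeneous category
```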

Experiment and evaluation: images of all taxonomic entities linked via subClassOf were presented in 10 different surveys, and benchmarking sets were used to determine accuracy and frequency. Questions were of two types: naming the category entity (parent) when shown a leaf entity, and naming an entity with its exact name, or with the name of its child or parent.

When analysing the data, they found that precision values were poor. Inspecting the false positives, they identified two causes: the metrics picked entities with a low number of subclasses, and returned false-positive entities with long label names.

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!

UKON 2016 Short Talks IX

These are my notes for the ninth session of talks at the UK Ontology Network Meeting on 14 April, 2016.


Aligning and Merging Ontology in Al-Quran Domain
Mohammad Alqahtani, Eric Atwell

Speaker wasn’t available.

A formal ontological model of creativity for supporting and sharing creative insights of ‘digital living artifacts’
Patricia Charlton

Part of MC Squared, a project combining mathematics and creativity. They defined creativity via 5 concepts: elaboration, fluency, flexibility, originality, usefulness. The first three are what you find in the process, but the last two are what you find in the product/result.

They worked with the people who were making the resources for the students, and found out what terminology and processes they use (the learning pathway). Then ontology concepts were created and fed back to the original group until the ontology was finished. This work was then added back into the MC Squared project.

The BioSharing Registry: connecting data standards, policies and databases in the life sciences
Allyson Lister, Alejandra Gonzalez-Beltran, Eamonn Maguire, Peter McQuilton, Philippe Rocca-Serra, Milo Thurston, Susanna-Assunta Sansone

This is my talk, and therefore I couldn’t take any notes! But you can find us on https://www.biosharing.org.

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!

UKON 2016 Short Talks VIII

These are my notes for the eighth session of talks at the UK Ontology Network Meeting on 14 April, 2016.


A workflow of developing biological ontologies using a document-centric approach
Aisha Blfgeh, Catharien Hilkens, Phillip Lord

Ontologists know how to use domain-specific tools and applications to develop ontologies, while biologists use a completely different set of tools. Can we break down this wall and enable both groups to work together, each still using the tools they prefer?

So, this brings us to the ontologist and the Excel spreadsheet. Existing tools transform an Excel sheet into an ontology once, but don’t allow further updates from that sheet. With this method (which uses Tawny-OWL), the spreadsheet remains in place and becomes part of the ontology. It works in much the same way with a Word document, except that there the ontology remains the master.
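To sketch the general idea in a language I can test (this is not the Tawny-OWL workflow itself; it assumes a CSV export of the sheet, rdflib, and a made-up namespace and column layout):

```python
import csv
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, RDFS, OWL

EX = Namespace("http://example.org/onto#")  # hypothetical namespace

def classes_from_sheet(csv_path):
    """Regenerate ontology classes from a spreadsheet (exported as CSV), so the sheet
    stays the source of truth and can be re-imported whenever it changes."""
    g = Graph()
    g.bind("ex", EX)
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):          # assumed columns: Name, Parent, Definition
            cls = EX[row["Name"].replace(" ", "_")]
            g.add((cls, RDF.type, OWL.Class))
            if row.get("Parent"):
                g.add((cls, RDFS.subClassOf, EX[row["Parent"].replace(" ", "_")]))
            if row.get("Definition"):
                g.add((cls, RDFS.comment, Literal(row["Definition"])))
    return g

# classes_from_sheet("cells.csv").serialize("cells.owl", format="xml")
```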

The PROHOW ontology: a semantic web of human tasks, instructions and activities
Paolo Pareti

A common-sense approach to describing human activities such as How to Make a Pancake. They transformed existing instructions from wikiHow and Snapguide, extracting 200,000 procedures into a KB. What to do with this knowledge? Activity recognition is one use (inferring the ultimate goal from a few observed intermediate steps). They’ve been creating links between the different graphs, which often allows the user of one activity to access extra information (a subgraph, for example) from another graph. Once this works well, you could implement a method to have a machine perform an activity for you (if it is a computer-based activity, e.g. Send an Email).

http://w3id.org/prohow
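As a toy illustration of the activity-recognition idea above (my own sketch, using a tiny hand-written stand-in for the real task graphs, not PROHOW’s vocabulary or data):

```python
def recognise_activity(observed_steps, activities):
    """Rank candidate goals by how much of each goal's known steps the observations cover.
    `activities` maps a goal to its set of steps."""
    def coverage(goal):
        steps = activities[goal]
        return len(observed_steps & steps) / len(steps)
    return sorted(activities, key=coverage, reverse=True)

activities = {  # hypothetical knowledge base
    "Make a pancake":   {"mix flour and eggs", "heat pan", "pour batter", "flip"},
    "Make an omelette": {"beat eggs", "heat pan", "pour eggs", "fold"},
}
print(recognise_activity({"heat pan", "pour batter"}, activities))
```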

Combining Ontologies and Machine Learning to Capture Tacit Knowledge in Complex Decision Making
Yiannis Gatsoulis, Owais Mehmood, Vania Dimitrova, Anthony Cohn

This project has been created to help diagnose tunnels, as maintenance operations and the impact of a rail tunnel malfunction can be costly and catastrophic. Factors include tunnel age and external influences (traffic, weather). Tunnel diagnosis is a complex process; there are few experts, and they hold a large amount of tacit knowledge.

PADTUN’s goal is a decision support system for the engineers. The knowledge involved is highly complex and hard to describe. The system diagnoses possible tunnel ‘illnesses’ based on a set of data, but the current ontological model may not be enough. There are two key challenges: validation (rules may be inaccurate or missing, it takes a long time to identify these problematic rules, and some rules are more reliable than others) and extension (identifying rules that cannot be articulated by the experts, and adding crucial aspects to the model).

They have survey data and contextual data to support maintenance planning and identify risk levels. Decision trees can be derived from this data to determine when to close and repair a tunnel.
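A minimal sketch of the decision-tree idea, assuming scikit-learn and entirely invented features and labels (not the project’s data, thresholds or code):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical features: [tunnel age (years), water ingress score (0-3), lining crack score (0-3)]
X = [[20, 0, 0], [45, 1, 1], [80, 3, 2], [95, 2, 3], [60, 0, 1], [110, 3, 3]]
y = ["monitor", "monitor", "repair", "close", "monitor", "close"]   # past expert decisions

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["age", "water_ingress", "crack_score"]))
print(tree.predict([[85, 3, 2]]))   # what would the derived tree suggest for this tunnel?
```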

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!

UKON 2016: Assessing the Underworld Ontology for Integrated Inter-asset Management

These are my notes for the Saisakul Chernbumroong, Heshan Du, Derek Magee, Vania Dimitrova, Anthony Cohn talk at the UK Ontology Network Meeting on 14 April, 2016.


There is a need to manage different types of assets together, e.g. assets buried under the ground and assets on the surface of the ground. The ATU ontology aims to define main concepts such as buried assets, soil, land cover, the environment, human activities and how data can be collected. ATU is intended to be used as part of a decision support system, in which a deterioration model and a cost model are connected to a KB (which includes datasets and the ATU ontology). The UI then interacts with the KB.

The ontology of soil properties and processes is an important part of this project as soil is a medium in many interactions between assets. Within this ontology, there are two main classes, SoilProperty and SoilProcess. The former has 176 subclasses, and the latter has 113 subclasses. The main relations are hasImpactOn <-> influencedBy (and are defined as transitive properties).
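As a hedged sketch of those relations (not the ATU ontology’s actual URIs or axioms; it assumes rdflib and invented entity names):

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, OWL

ATU = Namespace("http://example.org/atu#")   # hypothetical namespace
g = Graph()
g.bind("atu", ATU)

# hasImpactOn is declared transitive, with influencedBy as its inverse
g.add((ATU.hasImpactOn, RDF.type, OWL.TransitiveProperty))
g.add((ATU.influencedBy, RDF.type, OWL.ObjectProperty))
g.add((ATU.influencedBy, OWL.inverseOf, ATU.hasImpactOn))

# With a reasoner, "SoilShrinkage hasImpactOn PipeJoint" and "PipeJoint hasImpactOn RoadSurface"
# would entail "SoilShrinkage hasImpactOn RoadSurface" (plus the influencedBy inverses).
g.add((ATU.SoilShrinkage, ATU.hasImpactOn, ATU.PipeJoint))
g.add((ATU.PipeJoint, ATU.hasImpactOn, ATU.RoadSurface))
print(g.serialize(format="turtle"))
```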

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!

UKON 2016 Short Talks VII

These are my notes for the seventh session of talks at the UK Ontology Network Meeting on 14 April, 2016.


DReNIn_O: An application ontology for drug repositioning
Joseph Mullen, Anil Wipat, Simon Cockell

Drug repositioning is identifying new uses for existing drugs. Most marketed examples were found by luck, so there is a need for more systematic approaches that enable a more holistic view of drug interactions. To do this automatically, you would want to infer the link from drug to disease by looking at the targets for the drug.
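A toy version of that inference, with invented drugs, targets and diseases (not DReNIn_O’s data or schema): a drug becomes a repositioning candidate for a disease when it hits a target linked to that disease.

```python
drug_targets = {             # hypothetical drug -> target links
    "drugA": {"EGFR", "HER2"},
    "drugB": {"DPP4"},
}
disease_targets = {          # hypothetical disease -> target links
    "lung cancer": {"EGFR", "ALK"},
    "type 2 diabetes": {"DPP4"},
}

def repositioning_candidates(drug_targets, disease_targets):
    """Yield (drug, disease, shared targets) wherever the target sets overlap."""
    for drug, targets in drug_targets.items():
        for disease, dtargets in disease_targets.items():
            shared = targets & dtargets
            if shared:
                yield drug, disease, shared

for drug, disease, shared in repositioning_candidates(drug_targets, disease_targets):
    print(f"{drug} -> {disease} via {', '.join(sorted(shared))}")
```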

A very high-level ontology with 25 classes was used in the project (http://drenin.ncl.ac.uk). The database has over 8 million triples, and there is a SPARQL endpoint.

Pharmacological Data Integration for Drug-Drug Interactions: Recent Developments and Future Challenges
Maria Herrero-Zazo, Isabel Segura-Bedmar, Paloma Martinez

DDI (drug-drug interaction) information resources are numerous, but it is hard to keep everything updated and integrated. There is a need for a tool that can predict DDIs and that can filter and find the desired information, so they created DINTO (The DDI Ontology), following the NeOn methodology. They have integrated ontologies such as ChEBI, PKO, BRO and OAE. They also used SWRL (over 100 rules) to help them infer new interactions. DINTO is the largest and most comprehensive ontology in the DDI domain.

Exploring modelling choices: using non-functional characteristics to test an ontology add-on
Jennifer Warrender, Phillip Lord

They developed the Karyotype Ontology to address problems of karyotype representation: karyotypes are complicated, not computationally amenable (they exist as images), and there are lots of them.

The Karyotype Ontology was built with Tawny-OWL, using a pattern-driven approach that allows them to rapidly change the ontology even though it is very large. One complex question was how to model the “affects” relationship, as it is difficult to determine a priori which representation would work best. They investigated three “affects” models. For each, you implement the model in Tawny-OWL, generate multiple versions of the Karyotype Ontology (1600 of various sizes), then reason over them and examine how the ontologies scale.

In this way you can determine which model for “affects” is best for the ontology wrt reasoning and scalability. Obvious result – increase in karyotypes == increase in reasoning time. Each model for “affects” is better for different purposes. Therefore, with Tawny-OWL, you can allow your users to choose which model is best suited to their needs.
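To give the flavour of that benchmarking loop (my own stand-in in Python with owlready2 and its bundled HermiT reasoner, rather than the authors’ Tawny-OWL/OWL API pipeline; the file names are invented):

```python
import time
from owlready2 import World, sync_reasoner

def reasoning_time(path):
    """Load one generated version of the ontology into a fresh world and time classification."""
    world = World()                               # isolate each version from the others
    world.get_ontology(f"file://{path}").load()
    start = time.perf_counter()
    sync_reasoner(world)                          # run the bundled HermiT reasoner
    return time.perf_counter() - start

# Hypothetical file names: one generated ontology per "affects" model and karyotype count
for path in ["affects_model1_100.owl", "affects_model2_100.owl", "affects_model3_100.owl"]:
    print(path, reasoning_time(path))
```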

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!

UKON 2016: The Straightened Mouse: translating spatial relations between ontologies and geometric models

These are my notes for the talk at the UK Ontology Network Meeting on 14 April, 2016 by Albert Burger, Kenneth McLeod, Chris Armit, Bill Hill, Richard Baldock.

Source http://www.emouseatlas.org/dj/media/images/modelSelector/large/EMA49.png 14 April 2016

EMAP is used to connect information to visual atlases (biomedical atlases) which provide virtual cross-sections of embryos, gene expression etc. There should be a good link between the visual atlas and the ontology (an anatomy ontology). EMAP is not as formal as the FMA. BSPO, the Biological Spatial Ontology, was created to define spatial directions for describing an organism. WHS (Waxholm Space) is a project dealing with atlas-based data integration. Other atlases include EMA and the ABA Brain Atlas.

Integrating across atlases can be tricky. The Straightened Mouse is a project which examines cartesian vs. natural coordinates. People take the 3D model and straighten it, then add axes, and from there you can create an EMA 3D model with axes. This means you can now have an up-down direction that actually makes sense: even though a curled-up point may sit below another point in space, they can record that, biologically, it is actually above it on the organism.
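A tiny illustration of that distinction (entirely my own; invented coordinates, not the project’s model or method): compare ordering points by their cartesian height with ordering them by position along the straightened body axis.

```python
points = {
    # name: (cartesian_z in the curled model, position along the straightened body axis)
    "point_near_head": (2.0, 0.10),
    "point_near_tail": (5.0, 0.95),   # higher in space because the tail curls over the head
}

by_space   = sorted(points, key=lambda p: points[p][0])   # ordering in the curled 3D model
by_anatomy = sorted(points, key=lambda p: points[p][1])   # ordering on the straightened model
print("spatially lowest first:     ", by_space)
print("anatomically rostral first: ", by_anatomy)
```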

Once you have this information, you can apply these spatial relations in all other sorts of ways (you can say “lateral to the heart” and have it clearly mean something on the model).

So, they are taking information out of the text and pulling it into the atlas. Equally, you can ask the atlas whether any papers talk about a particular region.

Ontological challenges include standardization of the ontological-to-geometric space mapping. They also need to discover the most effective spatial descriptions. They wish to see if they can learn from human-to-human communication (versus a computer intermediary). What are the best KR languages to use (e.g. OWL, Prolog)? What are the best spatial reasoning solutions (e.g. RCC)?

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!

UKON 2016 Short Talks VI

These are my notes for the sixth session of talks at the UK Ontology Network Meeting on 14 April, 2016.


What’s in a Name: identifiers in ontologies
Phillip Lord

Ontologies consist of lots of terms, and we need to be able to refer to them in an unambiguous way. There are lots of identifier schemes, so what are the characteristics of a good one? Identifiers should be semantics-free, and they are often numeric and incremented. Is an incremented numeric identifier a good scheme? No…

…because there will be a guaranteed collision if two authors work on the ontology at once. You could use URIgen or similar, but a better solution is to use a random number generator.

Numeric identifiers are also difficult to remember, and therefore very easy to get wrong. And your wrong identifier might also be valid, but for a different class. So you could fix this by using a check-digit.

Numeric identifiers are hard to remember and hard to pronounce. Though the number can be hidden with tools, that doesn’t solve the problem. You could use Proquint, a bidirectional transformation to letters that alternate consonants and vowels, giving vaguely pronounceable words that are easier to remember.

So, the solutions: random, checked and pronounceable: https://github.com/phillord/identitas
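identitas combines these ideas; the sketch below is my own rough approximation of the same ingredients (a random number, a naive check digit and a proquint-style pronounceable rendering), not identitas’s actual algorithm or output format.

```python
import random

CONSONANTS = "bdfghjklmnprstvz"   # 16 consonants, 4 bits each
VOWELS     = "aiou"               # 4 vowels, 2 bits each

def proquint16(word):
    """Encode a 16-bit integer as a pronounceable 5-letter proquint (con-vo-con-vo-con)."""
    return (CONSONANTS[(word >> 12) & 0xF] + VOWELS[(word >> 10) & 0x3] +
            CONSONANTS[(word >> 6) & 0xF]  + VOWELS[(word >> 4) & 0x3] +
            CONSONANTS[word & 0xF])

def random_identifier():
    """A random 32-bit identifier rendered as two proquints plus a simple mod-10 check digit."""
    n = random.getrandbits(32)
    label = proquint16(n >> 16) + "-" + proquint16(n & 0xFFFF)
    check = sum(int(d) for d in str(n)) % 10          # naive check digit (not what identitas uses)
    return f"{label}-{check}"

print(random_identifier())   # prints something like "gutuk-bisop-3"
```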

A New Ontology Lookup Service at EMBL-EBI
Simon Jupp, Tony Burdett, Catherine Leroy, Thomas Liener, Olga Vrousgou, Helen Parkinson

OLS Bug Hunt: http://goo.gl/BKVCIE and http://www.ebi.ac.uk/ols/beta

The original OLS has an old codebase (nearly 10 years old in places) and was built around the OBO format (hence an outdated parser). It also assumes that ontologies are available in a public VCS and that a term exists in only one ontology. It uses an Oracle RDBMS and SQL for querying, which is suboptimal, and its API is SOAP where users want REST.

It has been rebuilt from scratch. Ontologies are polled by URL, not just from VCSs. It has a RESTful API, uses the Java OWL API behind the scenes, and has multiple indexes for scalable querying. There are 147 ontologies and 4.5 million terms.

Can load any OWL or SKOS file. Open source project at http://github.com/EBISPOT/OLS
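For example, a term search against the new REST API might look roughly like this (hedged: the endpoint and response fields are as I recall them, and may have changed since the beta):

```python
import requests

resp = requests.get("https://www.ebi.ac.uk/ols/api/search",
                    params={"q": "diabetes mellitus", "rows": 5})
resp.raise_for_status()
for doc in resp.json()["response"]["docs"]:
    print(doc.get("ontology_name"), doc.get("label"), doc.get("iri"))
```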

An ontology-supported approach to predict automatically the proteases involved in the generation of peptides
Mercedes Arguello Casterleiro, Julie Klein, Robert Stevens

Peptides are useful biomarkers. The PxO (Proteasix Ontology) reuses other ontologies, e.g. GO, the NCBI Taxonomy and PRO. UniProtKB proteins are organised by taxon and annotated with GO terms. They are trying to model cleavage site patterns. To use peptides as biomarkers, you need lots of data and data linkages, and they are using SPARQL queries to query their data.
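To show the shape of such a query (a toy sketch with invented predicates, not the PxO vocabulary or their data), asking which proteases generate peptides from proteins carrying a given GO annotation:

```python
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/pxo#> .        # hypothetical vocabulary, not the real PxO
ex:peptide1 ex:derivedFrom ex:proteinA ; ex:cleavedBy ex:proteaseX .
ex:proteinA ex:annotatedWith ex:GO_0006508 .
""", format="turtle")

q = """
PREFIX ex: <http://example.org/pxo#>
SELECT ?protease ?peptide WHERE {
  ?peptide ex:derivedFrom ?protein ; ex:cleavedBy ?protease .
  ?protein ex:annotatedWith ex:GO_0006508 .
}
"""
for row in g.query(q):
    print(row.protease, row.peptide)
```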

TopFIND2 and Proteasix can help to automatically predict modification of protease activity.

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!