UKON 2016: The Workflows of Ontology Authoring: Controlled vs. Naturalistic Settings

These are my notes for a talk at the UK Ontology Network Meeting on 14 April, 2016 by Markel Vigo, Nicolas Matentzoglu, Caroline Jay, Robert Stevens.

Source 14 April 2016

Four years ago we knew very little about how ontologists actually perform their authoring. The authors wanted to know about typical authoring workflows and how effective the current tool support is. Interviews with ontologists at UKON 2014 and a lab user study at UKON 2015 helped identify the workflows for exploration, editing and reasoning. This work has implications for tool design.

In a lab study, external validity is at risk: tasks are predefined, the ontology is provided, and participants may be in an unfamiliar environment. The lab study had 16 users, sessions lasted between 30 and 75 minutes, and it used a modified version of Protege4US and eye tracking. To address the validity risk, they also ran a remote study where people worked in their own environment and on their own ontologies. They got 7 users for this part.

They collected the raw event data from the users as log files. The data was then cleaned and put into a CSV file, and consecutive identical events were merged. They then performed workflow mining through N-gram analysis. There were 9K events in the lab and 30K events remotely (not including mouse hovering events). The lab study was dominated by entity selection, while in the remote study the vast majority were hierarchy extending events (people’s own ontologies in the remote study were larger). There was more variety in the remote setting (more heavy editing, more uncertainty about how to model things, more searching, more individuals and annotations). They also looked at how workflows linked together, and whether one commonly preceded or followed another.
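
To make the merge and N-gram steps concrete, here is a minimal sketch in Python. This is my own illustration, not the speakers' code: the event names and the helper functions merge_consecutive and ngram_counts are made up, and the real logs and tooling were not shown in the talk.

```python
from collections import Counter
from itertools import groupby


def merge_consecutive(events):
    """Collapse each run of identical consecutive events into a single event."""
    return [name for name, _run in groupby(events)]


def ngram_counts(events, n):
    """Count every contiguous sequence of n events in the log."""
    return Counter(
        tuple(events[i:i + n]) for i in range(len(events) - n + 1)
    )


# Hypothetical event names; the real logs use the tool's own event vocabulary.
log = [
    "select_entity", "select_entity",
    "expand_hierarchy", "expand_hierarchy", "expand_hierarchy",
    "select_entity", "edit_axiom", "run_reasoner",
    "select_entity", "edit_axiom", "run_reasoner",
]

merged = merge_consecutive(log)
trigrams = ngram_counts(merged, 3)

# The most frequent trigram is a candidate recurring workflow.
top_pattern, top_count = trigrams.most_common(1)[0]
print(top_pattern, top_count)
```

In this toy log, the sequence of selecting an entity, editing an axiom and running the reasoner occurs twice, so it surfaces as the most frequent trigram. Counting bigrams over the same merged log would give a rough view of which events commonly precede or follow one another.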

The remote study corroborates the lab study, but also extends it. The next steps are to evaluate the inference inspector and to explore other avenues, e.g. task difficulty estimation using pupillometry. They would also like to cross-compare data from more than 6 independent studies.

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!


