Distinct effects of ontology visualizations in entailment and consistency checking
Yuri Sato (Brighton), Gem Stapleton (Brighton), Mateja Jamnik (Cambridge) and Zohreh Shams (Cambridge)
When describing world knowledge, a choice must be made about its representation. We explore how ontological knowledge can be expressed in ways accessible to humans in general, and compare novice users’ performance on logical task solving using two distinct notations.
SOVA is a full ontology visualization tool for OWL: from its syntactic units you can build a SOVA graph for any ontology. Other graph tools (OWLViz, VOWL, Jambalaya) cannot express all OWL constructs, and existing systems of higraphs, compound digraphs, and constraint diagrams are not expressive enough to deal with ontologies. Stapleton et al. (VL/HCC 2017) describe concept diagrams. We therefore have two methods to explore: topo-spatial and topological. In consistency checking tasks, topological representations were better, while in entailment judgement topo-spatial representations performed better. In summary, topological representations are suitable for most existing ontologies, but there is a need to design new ontology visualizations.
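To make the two reasoning tasks from the study concrete: consistency checking asks whether a class can possibly have instances given the axioms, while entailment asks whether a statement follows from them. A minimal sketch over subclass and disjointness axioms (the class names are invented for illustration, and this is not tied to SOVA or concept diagrams):

```python
# Toy reasoner over subclass (A is-a B) and disjointness axioms,
# illustrating consistency checking vs. entailment judgement.

def superclasses(cls, subclass_of):
    """All (transitive) superclasses of cls, including cls itself."""
    seen, stack = {cls}, [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def entails(subclass_of, sub, sup):
    """Entailment: does 'sub is-a sup' follow from the axioms?"""
    return sup in superclasses(sub, subclass_of)

def consistent(subclass_of, disjoint, cls):
    """Consistency: can cls have an instance, given the disjointness axioms?"""
    supers = superclasses(cls, subclass_of)
    return not any(a in supers and b in supers for a, b in disjoint)

axioms = {"Dog": ["Mammal"], "Mammal": ["Animal"], "Robot": ["Machine"]}
disjoint = [("Animal", "Machine")]

print(entails(axioms, "Dog", "Animal"))      # True: Dog is-a Mammal is-a Animal
print(consistent(axioms, disjoint, "Dog"))   # True
# An inconsistent class: CyborgDog is declared under both disjoint hierarchies.
axioms["CyborgDog"] = ["Dog", "Robot"]
print(consistent(axioms, disjoint, "CyborgDog"))  # False
```

A DL reasoner does far more than this transitive-closure walk, but the two questions it is asked are exactly these.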
Users and Document-centric workflow for ontology development
Aisha Blfgeh and Phillip Lord (Newcastle)
Ontology development is a collaborative process. Tawny OWL allows you to develop ontologies in the same way as you write programs, and you can use it to build a document-centric workflow for ontology development. You start with users editing an Excel spreadsheet, which is then used as input to Tawny OWL to ultimately generate OWL. This also generates a Word document in which users can see the changes.
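The spreadsheet-to-ontology step of that pipeline can be sketched roughly as below. Tawny OWL itself is a Clojure library, so this Python stand-in only illustrates the shape of the workflow, assuming a simple two-column sheet of (class, superclass) rows:

```python
# Hypothetical sketch: turn spreadsheet rows into OWL Manchester-syntax
# class axioms, the kind of step a document-centric pipeline automates.
import csv
import io

# Stand-in for an exported sheet; a real workflow would read the .xlsx file.
sheet = io.StringIO("class,superclass\nMargherita,Pizza\nHawaiian,Pizza\n")

def rows_to_axioms(fh):
    """One Manchester-syntax class frame per spreadsheet row."""
    return [f"Class: {r['class']}\n    SubClassOf: {r['superclass']}"
            for r in csv.DictReader(fh)]

for axiom in rows_to_axioms(sheet):
    print(axiom)
```

The same row data can then be rendered a second time as narrative text for the Word document, which is what makes the workflow document-centric.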
But how successful would ontological information in the form of a Word document actually be? That depends on the users, of whom there are two types: users of the ontology and developers of the ontology. They started by classifying their participants into newbies, students, experts and ontologists, and they worked on the pizza ontology.
The participants saw the ontology both as a Word document and in Protégé. Errors were introduced and the participants were asked to find them. Reading text in Word helps explain the structure of the ontology, especially for newbies. However, the hierarchy view is very useful in Protégé. The ability to edit the text in the Word document is quite important for non-experts.
The generation of the Word document is not yet fully automated, so completing that automation is one of their plans. They also want to develop a Jupyter notebook for the work. Finally, they’d like to repeat this study with ontologists rather than just newbies.
DUO: the Data Use Ontology for compliant secondary use of human genomics data
Melanie Courtot (EBI) – on behalf of The Global Alliance For Genomics And Health Data Use Workstream
Codifying controlled data access consent: data use restrictions originate from consent forms, and as a researcher you must go via a data access committee to get the data. In the current protocol there are data depositors and data requestors; the data access committee sits between the data and the requestors and tries to align the requestors’ needs with the data use limitations. All of this is done manually and is quite time-consuming, and often there isn’t the human capacity to go through all requests. If we can encode consent codes into an ontology, the data access process could become more automated.
The use cases for this system would include data discovery, automation of data access, and standardization of data use restrictions and research purposes forms. DUO lives in a GitHub repo where they tag each release. They aim to keep DUO small and to provide clear textual definitions augmented with examples of usage. In addition, DUO provides automated machine-readable coding.
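The automation use case can be sketched as a screening function that compares a request against a dataset’s consent codes. The abbreviations below (GRU for general research use, HMB for health/medical/biomedical research, NCU for no commercial use) are illustrative stand-ins rather than the actual DUO term IDs, and the dataset accessions and matching rules are invented:

```python
# Hypothetical sketch of automated data-access screening with consent codes.
# Codes, accessions and rules are illustrative, not the actual DUO terms.

datasets = {
    "EGAD001": {"GRU"},           # general research use
    "EGAD002": {"HMB", "NCU"},    # biomedical research only, no commercial use
}

def screen(request_purpose, commercial, dataset_codes):
    """True if a request is compatible with a dataset's consent codes."""
    if commercial and "NCU" in dataset_codes:
        return False              # commercial use explicitly excluded
    if "GRU" in dataset_codes:
        return True               # any research purpose is acceptable
    return request_purpose == "biomedical" and "HMB" in dataset_codes

print(screen("biomedical", commercial=False, dataset_codes=datasets["EGAD001"]))  # True
print(screen("biomedical", commercial=True,  dataset_codes=datasets["EGAD002"]))  # False
```

With the codes defined in an ontology, the subsumption hierarchy (e.g. a specific disease term falling under a broader disease-specific consent) would do the matching that these hard-coded rules only hint at.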
W3C Data Exchange Working Group (DXWG) – an update
Peter Winstanley (The Scottish Government) and Alejandra Gonzalez-Beltran (Oxford)
Peter co-chairs this working group and is one of their Invited Experts. He shares the burden of chairing and ensures that the processes are adhered to. These processes involve making sure there is openness, adequate minutes, and sensible behaviour. The working group is a worldwide organization, which makes it difficult to organize the weekly meetings (time zones etc). There are also subgroups, which means two awkwardly-timed meetings. This is the context in which the work is done.
The DCAT (Data Catalog Vocabulary) has been around since 2014 as a W3C recommendation. Once people really started using it, issues became apparent. There were difficulties with describing versioning, API access, relationships between catalogs, relations between datasets, temporal aspects of datasets, etc. Therefore the way that people have used it is by mixing it with other things as part of an “application profile”. Examples include DCAT-AP, GeoDCAT-AP, HCLS Dataset description, and DATS. Different countries have also already started creating their own application profiles as part of a wider programme of AP development (e.g. the Core Public Service Vocabulary (CPSV-AP)).
The mission of the DXWG is to revise the DCAT and then to define and publish guidance on the use of APs, and content negotiation when requesting and serving data. There have been a few areas where reduced axiomatisation is being proposed in the re-working of DCAT to increase the flexibility of the model.
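Content negotiation by profile means a client can ask for a dataset description conforming to a particular application profile, and the server picks the best match it can serve. A minimal sketch of the server-side selection, where the profile URIs and record bodies are placeholders of my own invention:

```python
# Hypothetical sketch of content negotiation by profile: the server chooses
# which application-profile serialisation of a catalogue record to return.
# Profile URIs and record bodies are illustrative placeholders.

RECORDS = {
    "https://example.org/profile/dcat-ap": "<record serialised per DCAT-AP>",
    "https://example.org/profile/base":    "<record serialised per plain DCAT>",
}
DEFAULT = "https://example.org/profile/base"

def negotiate(accepted_profiles):
    """Return (profile, body) for the first acceptable profile, else the default."""
    for uri in accepted_profiles:
        if uri in RECORDS:
            return uri, RECORDS[uri]
    return DEFAULT, RECORDS[DEFAULT]

profile, body = negotiate(["https://example.org/profile/dcat-ap"])
print(profile)
```

In the DXWG’s work this negotiation is specified over HTTP (and other mechanisms), but the core decision is the preference-ordered lookup shown here.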
You can engage with DXWG via github, the w3c meetings and minutes, the mailing lists, and provide feedback.
Robert Stevens introduced the panel. He stated that one of the reasons he likes this network is its diversity. Panellists: Helen Lippell, Allison Gardener, Melanie Courtot, and Peter Andras. The general area for discussion was: in the era of Big Data and AI, what type of knowledge representation do we need?
Melanie Courtot: It depends on what you’re calling KR. Ontologies are time-consuming to build, and that effort is typically not funded. If we’re talking about KR other than ontologies, then you want to ensure that you keep any KR solution lightweight. She liked that a lot of the talks were very practically oriented.
Helen Lippell: She doesn’t work on funded projects at the moment, but instead goes into private-sector companies. They have lots of projects on personalization and content filtering. You can’t really do these things without ontologies, domain models and terminologies, and without ensuring these are all referring to the same thing. She’d like to see more people in the private sector working with ontologies; it shouldn’t be just academics. Go out and spread your knowledge!
Allison Gardener: From the POV of a biologist coming into Computer Science, she’s primarily concerned with high-quality data rather than just lots of data. What features she chose and how she defined those features was really important. Further, how you define a person (and their personal data) determines how they are treated in a medical environment. Ontologies are really important in the context of Big Data.
Peter Andras: If you look at how KR works in the context of image analysis, where images are transformed and fed into a neural network, you get statistical irregularities in the data space. Your KR should look at these irregularities and structure them in a sensible way that you can use for reasoning. This works for images, but is much less clear when you’re looking at text instead. However, if you can add semantics to the text data, perhaps you can more meaningfully derive what transformations make sense to get those high-quality irregularities from your analysis. Sociologists have several million documents of transcribed interview text; how you analyse this and extract a meaningful representation of the information contained therein is difficult, and ontologies could be helpful. How can you structure theories and sociological methodologies so as to add more semantics?
Q: Have ontologies over-promised? Did we think they could do more than it has turned out they can? Melanie: What are we trying to do here? Trying to make sense of a big bunch of data. As long as the tools work, it doesn’t really matter if we don’t use ontologies. Phil: “Perfection is the enemy of the good.” Peter: There hasn’t really been an over-hype problem. Perhaps you’ll see the development of fewer handcrafted ontologies and more automated ontologies built via statistical patterns. But what kind of logic should we use? Alternative measures of logic might apply more; the weighting of logic changes.
Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!