Categories
Meetings & Conferences Semantics and Ontologies

UKON 2016: Identifying Basic Level Entities in a Data Graph

These are my notes for the Marwan Al-Tawil, Vania Dimitrova, Dhaval Thakker, Brandon Bennett talk at the UK Ontology Network Meeting on 14 April, 2016.

Source https://lh3.googleusercontent.com/-Ql7RFDSgvxQ/AAAAAAAAAAI/AAAAAAAAAFA/pnoDTCze85Q/s120-c/photo.jpg 14 April 2016

What makes some entities more important, and better able to provide navigation paths? In his study he observed that central entities with many subclasses are good potential anchors, that recognition is a key enabler for knowledge expansion, and that connections from recognised entities encourage the discovery of new ones. How can we develop automatic ways to identify knowledge anchors? Basic level objects (commonly-used objects from daily life) carry the most information, possess the highest category cue validity and are, therefore, the most differentiated from one another.

They have two approaches: distinctiveness (identifying the most differentiated entities, whose cues link to their members and not to other entities) and homogeneity. The distinctiveness metrics were adopted from formal concept analysis and applied to the ontology; the homogeneity metrics were created with set-based similarity metrics.
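
The talk didn't spell out the exact set-based similarity used, so here is only a minimal sketch of one plausible choice: Jaccard similarity averaged over pairs of a category's members, with made-up feature sets (the `birds` data and the averaging scheme are my illustration, not the authors' metric).

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def homogeneity(member_features):
    """Average pairwise Jaccard similarity across a category's members."""
    pairs = list(combinations(member_features, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# made-up members of a hypothetical 'Bird' category
birds = [{"wings", "beak", "flies"}, {"wings", "beak", "swims"}]
print(homogeneity(birds))  # shares {wings, beak} out of 4 features -> 0.5
```

A highly homogeneous category (members sharing most features) would score near 1, which fits the intuition that good knowledge anchors have internally similar members.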

Experiment and evaluation: images of all taxonomical entities linked via subClassOf were presented in 10 different surveys, and benchmarking sets were used to determine accuracy and frequency. Questions were of two types: accurately naming the category entity (parent) when a leaf entity is shown, and accurately naming an entity with its exact name or the name of its child or parent.

When analysing the data, they found that precision values were poor. Inspecting the false positives, they noticed two reasons: the approach picking entities with a low number of subclasses, and it returning false-positive entities which had long label names.

Please note that this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!


UKON 2016 Short Talks IX

These are my notes for the ninth session of talks at the UK Ontology Network Meeting on 14 April, 2016.


Aligning and Merging Ontology in Al-Quran Domain
Mohammad Alqahtani, Eric Atwell

Speaker wasn’t available.

A formal ontological model of creativity for supporting and sharing creative insights of ‘digital living artifacts’
Patricia Charlton

Part of MC Squared, a project combining mathematics and creativity. They defined creativity via 5 concepts: elaboration, fluency, flexibility, originality, usefulness. The first three are what you find in the process, but the last two are what you find in the product/result.

They worked with the people who were making the resources for the students, and found out what terminology and processes they use (the learning pathway). Then ontology concepts were created and fed back to the original group until the ontology was finished. This work was then added back into the MC Squared project.

The BioSharing Registry: connecting data standards, policies and databases in the life sciences
Allyson Lister, Alejandra Gonzalez-Beltran, Eamonn Maguire, Peter McQuilton, Philippe Rocca-Serra, Milo Thurston, Susanna-Assunta Sansone

This is my talk, and therefore I couldn’t take any notes! But you can find us on https://www.biosharing.org.


UKON 2016 Short Talks VIII

These are my notes for the eighth session of talks at the UK Ontology Network Meeting on 14 April, 2016.


A workflow of developing biological ontologies using a document-centric approach
Aisha Blfgeh, Catharien Hilkens, Phillip Lord

Ontologists know how to use domain-specific tools and applications to develop ontologies, while biologists use a completely different set of tools. Can we break through this wall and enable both groups to work together while still using the tools they like to use?

So, this brings us to the ontologist and the Excel spreadsheet. There are existing tools which transform an Excel sheet once, but don't allow further updates from that sheet. With this method (which uses Tawny-OWL), the spreadsheet remains and becomes part of the ontology. It works almost the same way with a Word document, but there the ontology remains the master.

The PROHOW ontology: a semantic web of human tasks, instructions and activities
Paolo Pareti

A common-sense approach to describing human activities such as How to Make a Pancake. They transformed existing instructions from wikiHow and Snapguide, extracting 200,000 procedures into a knowledge base. What to do with this knowledge? Activity recognition is one thing (inferring the ultimate goal from a few observed intermediate steps). They have been creating links between the different graphs; often this allows the user of one activity to access extra information (a subgraph, for example) from another graph. Once this works well, you could implement a method to have a machine perform an activity for you (if it is a computer-based activity, e.g. Send an Email).
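
Activity recognition over such a KB can be sketched as matching observed steps against stored procedures. A toy illustration only: the procedure data and overlap scoring below are my own invention, not PROHOW's actual model.

```python
# hypothetical procedure KB: goal -> list of step descriptions
procedures = {
    "make a pancake": ["mix flour and milk", "add an egg", "fry the batter"],
    "send an email": ["open mail client", "write message", "click send"],
}

def recognise_activity(observed_steps):
    """Guess the goal whose procedure overlaps most with the observed steps."""
    return max(procedures,
               key=lambda goal: len(set(procedures[goal]) & set(observed_steps)))
```

For example, `recognise_activity(["mix flour and milk", "add an egg"])` would return `"make a pancake"`; a real system would match steps semantically via the linked graphs rather than by exact strings.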

http://w3id.org/prohow

Combining Ontologies and Machine Learning to Capture Tacit Knowledge in Complex Decision Making
Yiannis Gatsoulis, Owais Mehmood, Vania Dimitrova, Anthony Cohn

This project has been created to help diagnose tunnels, as maintenance operations and the impact of a rail tunnel malfunction can be costly and catastrophic. Factors include tunnel age and external influences (traffic, weather). Tunnel diagnosis is a complex process for which there are few experts with a large amount of tacit knowledge.

PADTUN's goal is a decision support system for the engineers. The knowledge set is highly complex and hard to describe. The system diagnoses possible tunnel illnesses based on a set of data, but the current ontological model may not be enough. There are two key challenges: validation (rules may be inaccurate or missing, it takes a long time to identify these problematic rules, and some rules are more reliable than others) and extension (identifying rules that cannot be articulated by the experts, and adding crucial aspects to the model).

They have survey data and contextual data to support maintenance planning and identify risk levels. Decision trees can be derived from this data to determine when they decide to close and repair the tunnel.


UKON 2016: Assessing the Underworld Ontology for Integrated Inter-asset Management

These are my notes for the Saisakul Chernbumroong, Heshan Du, Derek Magee, Vania Dimitrova, Anthony Cohn talk at the UK Ontology Network Meeting on 14 April, 2016.


There is a need to manage different types of assets together, e.g. assets buried under the ground and assets on the surface of the ground. The ATU ontology aims to define main concepts such as buried assets, soil, land cover, the environment, human activities and how data can be collected. ATU is intended to be used as part of a decision support system. Here, there are a deterioration model and a cost model which are connected to a KB (which includes datasets and the ATU). The UI then interacts with the KB.

The ontology of soil properties and processes is an important part of this project, as soil is a medium in many interactions between assets. Within this ontology, there are two main classes, SoilProperty and SoilProcess. The former has 176 subclasses, and the latter has 113 subclasses. The main relations are hasImpactOn and its inverse influencedBy (both defined as transitive properties).
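
Declaring hasImpactOn transitive lets a reasoner chain impacts automatically: if A impacts B and B impacts C, then A impacts C. A rough illustration of what that entailment amounts to, using hypothetical assertions rather than the real ATU axioms:

```python
def transitive_closure(pairs):
    """Materialise the transitive closure of a relation given as (subject, object) pairs."""
    closure = set(pairs)
    while True:
        # chain a->b and b->c into a->c until nothing new appears
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

# hypothetical soil-impact assertions, not actual ATU content
has_impact_on = {("SoilMoisture", "Corrosion"), ("Corrosion", "PipeIntegrity")}
```

Here `("SoilMoisture", "PipeIntegrity")` ends up in the closure even though it was never asserted, which is exactly the kind of inference the transitive property buys you in OWL.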


UKON 2016 Short Talks VII

These are my notes for the seventh session of talks at the UK Ontology Network Meeting on 14 April, 2016.


DReNIn_O: An application ontology for drug repositioning
Joseph Mullen, Anil Wipat, Simon Cockell

Drug repositioning is identifying new uses for existing drugs. Most marketed examples are found due to luck, and therefore there is a need for more systematic approaches to enable a more holistic view of drug interactions. To do this automatically, you would want to infer the link from drug to disease by looking at the targets for the drug.
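
The drug-to-disease inference via shared targets is essentially a two-hop graph traversal. A minimal sketch with made-up example data (a real system would draw these mappings from the triple store, not hard-coded dictionaries):

```python
# hypothetical mappings from drugs and diseases to protein targets
drug_targets = {"drugA": {"EGFR"}, "drugB": {"EGFR", "VEGFR"}}
disease_targets = {"diseaseX": {"EGFR"}, "diseaseY": {"VEGFR"}}

def candidate_indications(drug):
    """Infer candidate diseases for a drug via shared protein targets."""
    targets = drug_targets.get(drug, set())
    return {disease for disease, dts in disease_targets.items() if targets & dts}
```

So `candidate_indications("drugB")` yields both diseases, suggesting repositioning hypotheses to be checked by a domain expert.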

A very high-level ontology with 25 classes was used in the project (http://drenin.ncl.ac.uk). The database has over 8 million triples, and there is a SPARQL endpoint.

Pharmacological Data Integration for Drug-Drug Interactions: Recent Developments and Future Challenges
Maria Herrero-Zazo, Isabel Segura-Bedmar, Paloma Martinez

DDI (drug-drug interaction) information resources are numerous, but it is hard to keep everything updated and integrated. There is a need for a tool that can predict DDIs and that can filter and find the desired information; therefore they created DINTO (the DDI Ontology). They followed the NeOn methodology to create the new ontology, and have integrated ontologies such as ChEBI, PKO, BRO and OAE. They also used SWRL (over 100 rules) to help them infer new interactions. DINTO is the largest and most comprehensive ontology in the DDI domain.

Exploring modelling choices: using non-functional characteristics to test an ontology add-on
Jennifer Warrender, Phillip Lord

They developed the Karyotype Ontology to address problems of karyotype representation: karyotypes are complicated, they are not computationally amenable (often only images), and there are lots of them.

The Karyotype Ontology was built using Tawny-OWL. It was built with a pattern-driven approach, allowing them to rapidly change the ontology even if it's very large. One complex question was how to model the "affects" relationship, as it is difficult to determine a priori which representation would work best. They investigated 3 "affects" models. With each, you implement the model in Tawny-OWL, generate multiple versions of the Karyotype Ontology (1600 of various sizes), then reason over them and examine how the ontologies scale.

In this way you can determine which model for “affects” is best for the ontology wrt reasoning and scalability. Obvious result – increase in karyotypes == increase in reasoning time. Each model for “affects” is better for different purposes. Therefore, with Tawny-OWL, you can allow your users to choose which model is best suited to their needs.


UKON 2016: The Straightened Mouse: translating spatial relations between ontologies and geometric models

These are my notes for the talk at the UK Ontology Network Meeting on 14 April, 2016 by Albert Burger, Kenneth McLeod, Chris Armit, Bill Hill, Richard Baldock.

Source http://www.emouseatlas.org/dj/media/images/modelSelector/large/EMA49.png 14 April 2016

EMAP is used to connect information to visual atlases (biomedical atlases) which provide virtual cross-sections of embryos, gene expression etc. There should be a good link between the visual atlas and the ontology (the anatomy ontology); EMAP is not as formal as the FMA. BSPO (the Biological Spatial Ontology) was created to define spatial directions for describing an organism. WHS (Waxholm Space) is a project dealing with atlas-based data integration. Other atlases include the EMA and the ABA Brain Atlas.

Integrating across atlases can be tricky. The Straightened Mouse is a project which examines Cartesian vs. natural coordinates. So, people take the 3D model and straighten it, then add axes. From there you can create an EMA 3D model with axes. This means you can now have an up-down direction that actually makes sense: even though a curled-up point may be below another point in the image, biologically they can store the information that it is actually above it on the organism.

Once you have this information, you can apply these spatial relations in all other sorts of ways (you can say “lateral to the heart” and have it clearly mean something on the model).

So, they are taking information out of the text and pulling it into the atlas. Equally, you can ask the atlas whether any papers talk about a particular region.

Ontological challenges include standardization of the ontological-to-geometric space mapping. They also need to discover the most effective spatial descriptions. They wish to see if they can learn from human-to-human communication (versus a computer intermediary). What are the best KR languages to use (e.g. OWL, Prolog)? What are the best spatial reasoning solutions (e.g. RCC)?


UKON 2016 Short Talks VI

These are my notes for the sixth session of talks at the UK Ontology Network Meeting on 14 April, 2016.


What’s in a Name: identifiers in ontologies
Phillip Lord

Ontologies consist of lots of terms, and we need to be able to refer to them in an unambiguous way. There are lots of identifier schemes, so what are the characteristics of a good one? Identifiers should be semantics-free, and they are often numeric and incremented. Is an incrementing numeric scheme a good one? No…

…because there will be a guaranteed collision if two authors work on the ontology at once. You could use URIgen or similar, but a better solution is to use a random number generator.

Numeric identifiers are also difficult to remember, and therefore very easy to get wrong. And your wrong identifier might also be valid, but for a different class. So you could fix this by using a check-digit.
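
As an illustration of the check-digit idea (using the well-known Luhn algorithm as an example, not whatever scheme the speaker had in mind):

```python
def luhn_check_digit(digits):
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # double every second digit from the right, reducing two-digit results
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10

print(luhn_check_digit("7992739871"))  # 3, the classic Luhn example
```

A single mistyped digit in an identifier carrying such a check digit is always detected, so the "valid but wrong class" failure mode becomes far less likely.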

Numeric identifiers are hard to remember and hard to pronounce. Though the number can be hidden with tools, that doesn't solve the problem. You could use Proquint, a bidirectional transformation to strings of alternating consonants and vowels, giving vaguely pronounceable words that help people remember.
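
A minimal encoder for a single 16-bit word, following my reading of the published Proquint scheme (worth checking against the spec before relying on it): four-bit fields map to consonants, two-bit fields to vowels, in a consonant-vowel-consonant-vowel-consonant pattern.

```python
CONSONANTS = "bdfghjklmnprstvz"  # 16 consonants, one per 4-bit value
VOWELS = "aiou"                  # 4 vowels, one per 2-bit value

def proquint16(word):
    """Encode a 16-bit integer as a 5-letter pronounceable 'proquint'."""
    assert 0 <= word < 1 << 16
    out = []
    # consume the bits most-significant first: 4, 2, 4, 2, 4
    for width, alphabet in [(4, CONSONANTS), (2, VOWELS), (4, CONSONANTS),
                            (2, VOWELS), (4, CONSONANTS)]:
        out.append(alphabet[word >> (16 - width)])
        word = (word << width) & 0xFFFF
    return "".join(out)

print(proquint16(0x7F00), proquint16(0x0001))  # lusab babad
```

The spec's own example encodes the IP address 127.0.0.1 (0x7F000001) as "lusab-babad", which matches the two words above.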

So, solutions: random, checked and pronounceable: https://github.com/phillord/identitas

A New Ontology Lookup Service at EMBL-EBI
Simon Jupp, Tony Burdett, Catherine Leroy, Thomas Liener, Olga Vrousgou, Helen Parkinson

OLS Bug Hunt: http://goo.gl/BKVCIE and http://www.ebi.ac.uk/ols/beta

The original OLS has an old codebase (nearly 10 years old in places), and was built around the OBO format (hence an outdated parser). It was also built around the assumptions that ontologies are available in a public VCS and that a term only exists in one ontology. It uses an Oracle RDBMS and SQL for querying, which is suboptimal, and the API was SOAP while users want REST.

It has been rebuilt from scratch. Ontologies are polled by URL and not just VCSs. RESTful API, makes use of the Java OWL API behind the scenes, and has multiple indexes for scalable querying. There are 147 ontologies and 4.5 million terms.

Can load any OWL or SKOS file. Open source project at http://github.com/EBISPOT/OLS

An ontology-supported approach to predict automatically the proteases involved in the generation of peptides
Mercedes Arguello Casterleiro, Julie Klein, Robert Stevens

Peptides are useful biomarkers. The PxO (Proteasix Ontology) reuses other ontologies e.g. GO, NCBI Taxonomy, PRO. UniProtKB proteins are organized by Taxons and annotated with GO. They are trying to model the cleavage site patterns. To use peptides as biomarkers, you need lots of data and data linkages. They are using SPARQL queries to query their data.

TopFIND2 and Proteasix can help to automatically predict modification of protease activity.


UKON 2016: The EPSRC ICT Theme Update

These are my notes for Miriam Dowie’s talk at the UK Ontology Network Meeting on 14 April, 2016.

 


In real terms, the funding situation for the EPSRC is flat, which is good. There are a number of factors in how the budget is built, including research council baselines, prior commitments, and other factors. There will be an RCUK communication soon regarding the budget, but the bottom line is that the funding will not be stopping.

EPSRC ICT covers research into computer science, user-interface tech, communications, electronics and photonics around the common thread of new ways to transmit, present, manage, analyse, and process data. The main cross-ICT priorities are TI3 (Towards an Intelligent Information Infrastructure), MACDES (Many-core Architectures and Concurrency in Distributed and Embedded Systems), and others.

The EPSRC ICT Theme is in the middle of their work on refreshing their positions on individual research areas and cross-research area priorities. Why the “refresh”? There are finite resources, and the need to allow new areas to emerge and to achieve balance between priorities, flavors of resources, themes, mechanisms etc.

There aren’t any final conclusions yet, so any and all useful input is welcome. The conclusions will be announced in December 2016, and there will be sessions and workshops to assist with communication.

There is a call for evidence for universities, businesses and recognized professional bodies right now against the following headings: quality, national importance, capacity and further information.

There are Strategic Advisory Team nominations coming up soon. There are about 3000 college members for the EPSRC Peer Review College. Expressions of interest are now being invited from candidates who wish to join the Associate College (there is also a Full College). The deadline is 10 May 2016; more details at https://www.epsrc.ac.uk/funding/calls/associatepeerreviewcollege/ .


UKON 2016 Short Talks V

These are my notes for the fifth session of talks at the UK Ontology Network Meeting on 14 April, 2016.


The OntoEnrich Platform: using workflows for quality assurance and axiomatic enrichment of  ontologies
Manuel Quesada-Martinez, Jesualdo Tomas Fernandez-Breis, Robert Stevens, Nathalie Aussenac-Gilles, Daniel Karlsson

A lexical regularity (LR) is a group of consecutive ordered words that appear in more than one class of an ontology. An OntoEnrich workflow combines different types of filters, metrics and steps to support the user in the inspection of LRs, and in deciding how interesting they are.

A workflow might start with the calculation of the lexical analysis. Then you filter to select LRs that contain adjectives. Then you manually inspect the LRs, calculate two metrics, and sort the set of LRs by them. Then you explore the LRs guided by the metrics. Try it at http://sele.inf.um.es/ontoenrich .
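
As a toy sketch of the lexical-regularity idea itself (my own simplification: word n-grams shared across class labels, not the actual OntoEnrich algorithm):

```python
from collections import Counter

def lexical_regularities(labels, n=2):
    """Find word n-grams that occur in more than one class label."""
    counts = Counter()
    for label in labels:
        words = label.lower().split()
        # count each distinct n-gram at most once per label
        for gram in {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}:
            counts[gram] += 1
    return {" ".join(gram) for gram, c in counts.items() if c > 1}

labels = ["acute viral infection", "chronic viral infection", "viral genome"]
print(lexical_regularities(labels))  # {'viral infection'}
```

A regularity like "viral infection" is then a candidate for axiomatic enrichment, e.g. making both classes subclasses of a shared ViralInfection concept.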

Probabilistic Annotation Framework: Knowledge Assembly at Scale with Semantic and Probabilistic Techniques
Szymon Klarman, Larisa Soldatova, Robert Stevens, Ross King

The methodology for knowledge assembly in this research is Reading-Assembly-Explanation. When inferences are extracted from papers, it is not clear whether they have been extracted correctly. They applied probabilistic reasoning over this, recording evidence to improve the inferences. The PAF (Probabilistic Annotation Framework) has an ontology covering event-related concepts, metadata concepts and probability types.

The growing scope of the environmental ontology
Pier Luigi Buttigieg

EnvO is a community ontology for the environment; it works with OBO and includes some classes mined from EOL. Environmental systems (biomes, ecoregions, ecozones, habitats), features (coral reefs, hospitals, guts, kimchi), and materials and processes (carbon fixation, volcanic eruptions) are all modelled in EnvO. They remain orthogonal by sharing with PCO, and by using BFO, ChEBI, etc.

Because humans think they're different from everything else, they are building a human ecosystem section and are working with SDGI in association with UNEP. SDGI is a semantic interface for the sustainable development goals.


UKON 2016: The Workflows of Ontology Authoring: Controlled vs. Naturalistic Settings

These are my notes for a talk at the UK Ontology Network Meeting on 14 April, 2016 by Markel Vigo, Nicolas Matentzoglu, Caroline Jay, Robert Stevens.


Four years ago we knew very little about how ontologists actually author ontologies. They wanted to know about typical authoring workflows and the effectiveness of current tool support. Interviews with ontologists (presented at UKON 2014) and a lab-based user study (UKON 2015) helped identify the workflows for exploration, editing and reasoning. This work has implications for tool design.

In a lab study, external validity is at risk: tasks were predefined, the ontology was provided, and participants could be in an unfamiliar environment. The lab study had 16 users, with sessions ranging between 30 and 75 minutes, and used a modified version of Protege4US plus eye tracking. Therefore they also did a remote study where people worked in their own environment and on their own ontologies; they got 7 users for this part.

They collect the raw event data from the users as log files. The data is cleaned and put into a CSV file, identical consecutive events are merged, and then they perform workflow mining through N-gram analysis. There were 9K events in the lab and 30K events remotely (not counting mouse-hover events). The lab study was dominated by entity selection, while in the remote study the vast majority were hierarchy-extending events (people's own ontologies are larger). There was also more variety in the remote setting: more heavy editing, more uncertainty about how to model things, more searching, and more individuals and annotations. They also looked at how workflows linked together, and whether one commonly preceded or followed another.
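
The merge-then-count pipeline can be sketched in a few lines (the event log below is a toy example, not their actual data or event vocabulary):

```python
from collections import Counter

def merge_repeats(events):
    """Merge identical consecutive events, as in the cleaning step."""
    out = []
    for event in events:
        if not out or out[-1] != event:
            out.append(event)
    return out

def ngram_counts(events, n=3):
    """Count n-grams of consecutive events to surface common workflows."""
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))

log = ["select", "select", "edit", "reason", "select", "edit", "reason"]
merged = merge_repeats(log)
print(ngram_counts(merged).most_common(1))  # [(('select', 'edit', 'reason'), 2)]
```

Frequent n-grams like select-edit-reason are then candidate workflow fragments to compare between the lab and remote settings.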

The remote study corroborates the lab study, but also extends it. The next step is to evaluate the inference inspector, and to explore other avenues, e.g. task difficulty estimation using pupillometry. They would also like to cross-compare data from more than 6 independent studies.
