Semantics and Ontologies Software and Tools

Summer Ontology Work: The Software Ontology

In the summer of 2012, I worked with Robert Stevens and Duncan Hull (among others) on additions to the Software Ontology. As it was a short-term appointment, I kept detailed notes on the Software Ontology blog. A summary of those notes is available as its own post.

I had loads of fun doing the work, and would love to head back over there and do some more work on it. We had gotten most of the way through a merge of EDAM and SWO, and it would have been nice to finish that off had time allowed. Thanks very much to Robert Stevens for giving me the chance to do such interesting work!

Meetings & Conferences

Panel Session on the emergence, use and success of wikis for collaborative knowledge capture for biology (ISMB Bio-Ont SIG)

Panelists: Dawn Field, Andrew Su, Robert Stevens, Barend Mons

Question 1: Wikis work in general for widely accepted knowledge where there are many readers. Can crowd-sourcing work?

  • RS: Yes, already present. Crowd-sourcing is not new – a publication has a crowd (albeit a small one) of readers
  • AS: Yes. In small pockets, wikis in science do work
  • BM: Yes, especially where data are controversial, it will work (e.g. Jesus and Adolf Hitler were the most heavily-edited entries in Wikipedia). We need to federate all the wikis.
  • DF: Yes, as long as you’ve got a critical mass. Pick your targets well and you’ll succeed admirably.
  • Judy: Yes, it works in a general sense, and possibly can work in the specific science sense. I would like a common agreement as to what should be promoted in wikis/gene wikis, and what the source of the information is: evidence.
    • RS: The requirement Judy mentioned, that we need to comply with scientific best practice even within wikis, is not a new idea (Judy knows this).
  • Wikis are not machine readable, though the semantic wiki is (almost). To make them machine readable, they have to be annotated with controlled vocabularies.

Question 2: Wikis are great for reading, but are generally unstructured. Does the growth of the bio-ontology community not suggest this is the wrong way forward?

You need structured data, but wikis do not necessarily give you that

  • BM: It’s OK if people don’t structure it – you can use something like Peregrine and pull everything out and convert automatically into triples.
  • RS: All of this is predicated on people actually doing the work. We could cut all funding to human annotators 😉
  • How would you get this structured information at the top of the wiki page?
    • AS: You don’t ask your domain experts to structure your data: you value them for what they know, and not for their knowledge of structuring and ontologies.
  • Would each page then be a class or an instance? Wouldn’t that imply that they all have the same attributes (which they don’t)?
    • AS: This is why I believe we shouldn’t do away with human curators. Gene Wiki is not a replacement for traditional curation.
    • BM: I also agree we will always need human curation. This is why you need hypothetical/observational/curated triples.
  • Force people to submit grants in RDF 🙂
  • No-one expects people to submit structured data in RDF – biologists don’t submit information in raw HTML/XML. It’s all about the interface.
  • DF: What really caught her ear was the idea of nano-publication. Start getting people thinking about other ways to contribute – even the smallest addition to this comprehensive catalogue would be useful – and credited.

Question 3: The data is mine! Scientists will fiddle with Wikipedia because they don’t care. But they won’t provide valuable knowledge without coercion. So much for wikis!

If I can’t list my 1000 edits in this wiki as publications or for impact assessments, then I won’t do any edits in wikis.

  • RS: We could all give it up and live in a commune 🙂
  • Helen Parkinson: Sanger are deleting some data because they don’t have enough room. Knowledge is a different thing, and treated differently. People have to share 🙂
  • BM: We have to change the way grants are written. Just like the way they did it with OA: if you want to promote open access, why not give grant money to help them out? It is irresponsible to give money to research and not make people put the data out in an open, structured way (Allyson: I’m paraphrasing here, because I can’t remember exactly what was said).
  • AS: Wiki markup is not a requirement for contribution. One solution could be to enable semantic markup, but not require it in the same way as wiki markup.
  • There must be a strong ego incentive.


Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else’s. I’m happy to correct any errors you may spot – just let me know!