Categories
Meetings & Conferences Semantics and Ontologies Standards

Morning Use-Case Talks (CBO 2009)

Nick Monk, Sheffield/Nottingham – wants to develop a formalism for multicellular models of plant roots. There are many model types out there, and they’re all encoding the same thing: the way cells interact with each other and with the environment. He’s familiar with this type of problem from the history of reaction kinetics, where information about the kinetics had to be written down in a simulation-independent manner. In the same way, they need to write down the multicell models in a way that does not depend on the simulation environment. For reaction kinetics, this was fairly straightforward, as there was already a good list of terms describing reaction kinetics.

When we talk about cell behaviours, we tend to do so at a subjective/qualitative level. Humans use their pattern recognition skills to identify the behaviours – there are no real quantitative metrics for determining behaviour. What would be most useful is a way to abstract out information from images of cells that would allow us to determine the behaviours they’re exhibiting.

If we generate time-course image data, what are we going to do with it? We need a way to annotate these images: this is the annotation case study. They want to have a session on multicellular modelling standards at the next international systems biology conference (ICSB, Edinburgh summer 2010).

Then Rusty Lansford (CalTech) described a set of images he had put up on the screens. They’ve generated some modified quail (FP-expressing Tg quail) that they’re using – the eggs are easy to work with. They put different fluorescent proteins into different quail, and then breed them together. He had a very nice video of quail development with endothelial cells marked. Brighter cells are those about to enter M phase. There are also some great “4D” videos that track the movements of the cells to form the aorta. They’re pretty confident that they can follow cell division and cell orientation, and shortly cell polarity. They’re happy to hear what data other people would be interested in, and they’ll collect it for your models. Some words he used were: rolling, flow, differentiating, kiss-and-fuse, formation of organs and more. Many of the terms were subcellular and others were higher up (e.g. organs or tissue-level).

Nadine Peyriéras (CNRS) then discussed the Embryomics EC project, which reconstructs the cell lineage tree as the core of the “embryome”. They looked at 4 organisms, including the zebrafish. When computationally determining the cell shape, you (virtually) cut the embryo into bits to figure out the size. They have a number of algorithmic strategies for determining the position of each cell along the x, y and z axes. They can convert from total cell number to cell density if they have volume information. They have a video of the zebrafish virtual embryo, where the color shows the direction of migration. Very nice.
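The conversion from cell counts to cell density can be sketched as below. This is only an illustration under stated assumptions, not the project's actual pipeline: the function name `cell_density`, the cubic voxel grid and the sample coordinates are all hypothetical.

```python
from collections import Counter


def cell_density(positions, voxel_size):
    """Return {voxel index: cells per unit volume} for a cubic grid.

    positions  -- iterable of (x, y, z) cell centres (hypothetical data)
    voxel_size -- edge length of each cubic voxel, in the same units
    """
    voxel_volume = voxel_size ** 3
    # Count how many cell centres fall into each voxel.
    counts = Counter(
        (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        for x, y, z in positions
    )
    # Density is simply count divided by the known volume.
    return {voxel: n / voxel_volume for voxel, n in counts.items()}


# Three made-up cell centres: two in the first voxel, one in its neighbour.
positions = [(1.0, 1.0, 1.0), (2.0, 3.0, 1.5), (12.0, 1.0, 1.0)]
density = cell_density(positions, voxel_size=10.0)
```

The same division works for a whole embryo if a single total volume is known; the voxel grid just makes the density local.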

There were two other presentations about what their use-cases would be, but I was working on the list of terms from CBO.

Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else’s. I’m happy to correct any errors you may spot – just let me know!


Use Cases and top level: Afternoon Discussion (CBO 2009)

After the presentations finished, the discussion of which use cases to use started. What is the scope of the cell behaviour ontology? Specifically, it describes how cells behave when seen as agents (deliberately neglecting subcellular and tissue/organ details). Examples include cell adhesion, cell-cell adhesion and others.

I had a lot of fragmented notes during this discussion, but discovered that I was contributing so much that I didn’t have good notes. Luckily, Benjamin has been taking excellent notes which I think should shortly appear on the CBO wiki. I’ll link them from this post as soon as I have it.

There was a really interesting discussion within today’s session about whether or not cell shapes should be included in CBO. It wasn’t so much the cell shape example that was interesting to me (as it is my opinion that shapes are not behaviours – it is the changing from one shape to the next that is a behaviour), but the way that it exposed the differences in thinking among the members of the workshop about what constituted a behaviour, and hence what the scope of the ontology should be.

This is my interpretation of the top level and a very rough binning of the other terms with respect to that top level

  • division
  • cell fragmentation
  • cell movement (linear rate, persistence)
    • movement of single cell
    • movement of clusters of cells
    • movement of sheets of cells
    • follow field (chemotaxis, haptotaxis)
    • polarize – movement within oneself?
    • cell traction
      • between other cells
      • ecm (rearrange ecm); basement membrane
  • shape change
    • cell contraction (apical, area change, in epithelia)
    • shape changes that result in a reduction in volume (defined class?)
    • shape changes that result in an increase in volume (defined class?)
    • shape changes that do not result in a volume change
    • assembly of ecm
    • cell protrusion (lifetime; orientation; duration; lamellipodia (directed; random); filopodia; retraction fibers)
    • length, width, anterior, posterior changes
    • cilia direction
    • flagella, microvilli
    • ruffle membrane
    • restructure cytoskeleton?
  • exert force / pull
  • delaminate
  • interact with other cells
    • the process of contacting with another cell
    • the process of contacting with something that isn’t a cell
    • cell-cell communication
  • secrete (export)
    • vesicle secretion
    • molecule secretion
  • excrete (export)
  • absorb (import)
    • digestion (e.g. osteoclast)
  • adsorb (import)
  • cell rearrangement – is this always with >1 cell?
    • change neighbours
    • directed rearrangement
    • random rearrangement
  • disappear
    • cell death
    • extinction
  • fusion
  • give off heat
  • change electrical field
  • interact with ecm
    • pull on ecm (also a child of force / pull)
  • alter subcellular distribution
  • alter extracellular distribution
  • alter mechanical properties
  • remodel EC environment

Personal Note: There needs to be a delineation between the behaviour of a single cell and the behaviours that are only relevant in the context of other cells. Many of the above should probably become defined classes to prevent multiple asserted hierarchies. This is just a representation of what was discussed this afternoon, and is not how it is meant to be in a final form. In particular, some things that are presented as top-level terms (e.g. the two types of interactions) are actually children of a not-yet-extant parent term.
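To make the point about multiple asserted hierarchies concrete, here is a minimal sketch using a few labels from the list above. The parent assignments, the `exerts_force` property and the function names are illustrative assumptions, not an agreed CBO structure. In OWL, a reasoner would compute the defined-class membership; here a plain function stands in for it.

```python
# Asserted parents for a few terms from the binning above (labels only).
# "pull on ecm" has two asserted parents, as noted in the list.
asserted_parents = {
    "exert force / pull": [],
    "interact with ecm": [],
    "pull on ecm": ["interact with ecm", "exert force / pull"],
    "cell movement": [],
    "movement of single cell": ["cell movement"],
}


def multiply_asserted(parents):
    """Return the terms that have more than one asserted parent."""
    return sorted(term for term, ps in parents.items() if len(ps) > 1)


# Alternative: assert one parent only and record properties on the term.
# The property name "exerts_force" is hypothetical.
properties = {
    "pull on ecm": {"exerts_force": True},
    "movement of single cell": {"exerts_force": False},
}


def force_exerting(props):
    """A 'defined class': membership is computed from properties, not asserted."""
    return sorted(term for term, p in props.items() if p.get("exerts_force"))


flagged = multiply_asserted(asserted_parents)
inferred = force_exerting(properties)
```

With this split, "pull on ecm" keeps a single asserted parent and still turns up under the force-related grouping by inference.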

Some terms that didn’t fit in these lists but which were suggested: live cell, cell activation, response to external stimuli, cell metabolism. They may not belong, or they may belong but haven’t been binned.


Dan Cook (U Washington): Ontology design: added value of organizing principles (CBO 2009)

Theory-based ontologies (FMA, OPB, SemSim semantic biosimulation models, GO, Cell Type ontology) for multiscale structure. The OPB is the Ontology of Physics for Biology, where domains include fluids, solids, chemical kinetics, electrochemistry, diffusion, heat transfer.

They have created OPB:physical_property as a child of continuant. These include terms like force, resistance, flow, etc.

Personal Note: I was really glad to see someone else using BFO for their terms in a practical sense like OBI does, rather than in a theoretical sense like most of the other OBO Foundry ontologies.

They have developed SemSim, which is a lightweight mapping schema in OWL from the physics biosimulation code to the semantic knowledge (Gennari JH et al 2008 PSB: Integration of multiscale biosimulation models via lightweight semantics; and another PSB article from 2009 about merging/recombining models – Neal et al.). Very interesting.

You could say that any change in property values is a consequence of thermodynamically driven state property changes. A change in property can have structural and existential consequences.


Alexander Diehl (GO/MGI): Biological Processes subtree in GO (CBO 2009)

Gene Ontology (GO) has developed over time to become more of a true ontology. Its purpose is to serve as a common language for sharing knowledge and to support cross-referencing.

Terms of interest for CBO in GO include cellular processes and their regulation, cell differentiation, cellular extravasation, among other things. Cellular process, multicellular organismal process and multi-organism process are disjoint from each other. Some of these terms can be problematic: for instance, localization is a subtype of multi-organism process, which could also be one of the other types, depending on the definition: it all comes down to the definition…

Terms in GO may have multiple parents, some of which are from other ontologies such as the Cell Ontology. These links to external ontologies will not be present in the standard download, but you can download a version of GO that has the links (you’ll probably have to download the additional external ontologies separately).

There are 16419 terms in the biological process ontology. GO terms aren’t only developed as annotators need them or users request them: there are also domain workshops that focus on getting a particular domain covered (e.g. lung development and muscle development). GO developers use OBO-Edit 2.0, which isn’t as fully functional as Protege and OWL, but which is useful for people only developing in OBO.

Annotations of gene products to GO are genome specific. With regards to the CBO and GO, we shouldn’t reinvent the wheel. We also need to think very carefully about the definition of behaviour, which in GO means “the specific actions or reactions of an organism in response to external or internal stimuli…”

Basically, they are just cellular processes, which might be a little more restrictive than we want. It would be really useful to make as much use of GO as possible because you get a lot of benefits: you get automatic linking to all the rest of GO, and to all the analyses etc. that people do and then annotate with GO. You might also want to look into the extracellular matrix organisation terms.

Question: why did you decide to put cell type outside of GO? Originally, GO was created to describe aspects of particular gene products, and cell type didn’t seem to be within scope. In the longer term, they want to bring the Cell Ontology under the auspices of the GO Consortium.

MGI website.

In other news, during lunch James Glazier has said that CBO should only be for behaviour of cells, not behaviour in cells, and that we will not be attempting to hang CBO underneath the biological process section of GO.


Herbert Sauro/Michael Galdzicki (Washington): Building Ontologies and Standards in Systems Biology (CBO 2009)

Herbert Sauro and others in the systems biology communities started with the modelling language and then went into ontologies. SBML is used to represent homogeneous multi-compartmental biochemical systems. You can have discrete events that either come from the outside or are generated internally. SBML started in ’99/’00, and now over 160 tools support SBML, and SBML files are accepted at a number of journals including Nature, Science and PLoS. CellML is philosophically different from SBML, as the former is math-centric and the latter is biology-centric.

In systems biology, SBML and related tools have allowed useful collaborations that were not available before. However, SBML is a common syntax, and what was also needed was a common semantics. The SemGen Annotator software is used to attach meaning to mathematical models, which can be loaded into a database such as BioModels.

Galdzicki had a reference to SED-ML, which would allow semantically-enriched publications to aid the interpretation of results. For instance, you could click on a figure of a model and be taken to a web application that can run the simulation for you. (Personal Note: there is an interesting paper about semantically marking-up publications: .)

In conclusion, remember that the use of an ontology must be an important criterion in its design.


Olivier Bodenreider (NLM): Best-Practices, pitfalls and positives (CBO 2009)

If ontologies are the solutions, what is the problem? Think use cases. Uses of biomedical ontologies include knowledge management (annotating data, accessing information, mapping across ontologies), data integration and exchange, semantic interoperability and decision support (Bodenreider YBMI 2008).

The ontology you’re going to build will be different depending on your use cases: different structure, different focus, etc. Finding an agreement and settling on what your use cases are is an important part of the meeting. Collection and prioritization is very important.

Showed an image of the “ontology spectrum”, available at http://www.mathiswebs.com/ontology.htm. The amount of semantics you want to put in an ontology varies along a spectrum. At the “weak semantics” end you have taxonomies and Thesauri, whereas at the “strong semantics” end you have Conceptual Models and Logical Theory (with Description Logics being the formalism du jour).

MeSH is a hierarchical controlled vocabulary – it is not an ontology. MeSH provides descriptors for indexing biomedical literature. Here, the “entry terms” may or may not be synonymous with the MeSH heading. What the entry terms mean is that anything talking about these terms will get classified according to those terms’ MeSH heading. This is enough for particular goals, such as annotation of literature. However, it may not be enough, depending on your use case. You need to figure out your level of granularity. The hierarchy in MeSH states that if you’re interested in term X (e.g. cell movement), you might also be interested in X’s child terms (e.g. ). It is NOT an “IS A” hierarchy, more of an “IS RELATED TO” hierarchy. In GO, synonyms are either exact or related. Cell movement in GO is a child of cellular process and also of localization of cell. GO is more precise.
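The way entry terms route text to a heading can be sketched roughly as follows. The entry-term table is made up for illustration (it is not copied from MeSH), and `index_text` is a hypothetical helper using naive substring matching rather than any real indexing tool.

```python
# Illustrative entry terms mapped to a heading; not taken from MeSH itself.
# Entry terms need not be exact synonyms of the heading they point to.
entry_terms = {
    "cell movement": "Cell Movement",
    "cell migration": "Cell Movement",
    "cell motility": "Cell Movement",
}


def index_text(text, terms):
    """Return the sorted headings whose entry terms occur in the text."""
    lowered = text.lower()
    return sorted({heading for term, heading in terms.items() if term in lowered})


headings = index_text("We imaged cell migration in quail embryos.", entry_terms)
no_match = index_text("A talk about modelling standards.", entry_terms)
```

This is enough for literature retrieval, but nothing here says that cell migration "is a" cell movement, which is the kind of statement an ontology such as GO commits to.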

When defining use cases, you need to think about typical situations in which the resource to be created is expected to contribute to the solution (resource annotation, resource classification, inference based on attributes of biological entities). You need to think about competency questions. The rule is usually to go with the minimal ontological commitment. The last thing you want to do is to put too much into your ontology.

“Ontologies are for ontologists.” What is the difference between an ontology and a car? You wouldn’t think of building a car, but you do think about building an ontology. Eventually, you’ll run into roadblocks, e.g. trying to deal with terms from upper-level ontologies (ULOs) such as the BFO dependent continuants and the differences between function, role and disposition. He then used SNOMED as an example knowledge representation.

From the OntoClean people, he mentions that you shouldn’t have a single class with more than one IS A relationship. E.g. if you use apple and place it under both food and fruit, then you run into problems when trying to describe that an apple is toxic to another animal. Another example is “lmo-2 interacts with Elf-2”. There are many possible understandings of this statement: “one individual lmo-2 molecule interacts with one individual Elf-2 molecule”, or any other number of instances or groupings.
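The ambiguity in “lmo-2 interacts with Elf-2” can be shown with two of the possible readings written out explicitly. This is a small sketch with invented instance names and a single invented interaction; the function names are hypothetical.

```python
# Made-up molecule instances and one observed interaction between them.
lmo2 = {"lmo2_a", "lmo2_b"}
elf2 = {"elf2_a", "elf2_b"}
interactions = {("lmo2_a", "elf2_a")}


def some_some(xs, ys, pairs):
    """Reading 1: at least one x interacts with at least one y."""
    return any((x, y) in pairs for x in xs for y in ys)


def all_some(xs, ys, pairs):
    """Reading 2: every x interacts with at least one y."""
    return all(any((x, y) in pairs for y in ys) for x in xs)


reading_1 = some_some(lmo2, elf2, interactions)
reading_2 = all_some(lmo2, elf2, interactions)
```

The same data makes the first reading true and the second false, so an ontology has to say which quantification a relation between classes is meant to carry.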

CBO is a domain ontology, a low-level ontology. ULOs can have lower-level ontologies hung off them, but you won’t be developing ULOs. There are lots of power tools for ontologies: Protege and OBO-Edit, but these tend to be more complex than biologists wish to use. Semantic wikis are more simplified, intermediate representations that allow collaborative development. They hide part of the complexity.

You can collect terms from experts, textual corpora, and from existing terminologies and ontologies. One good resource is NCBO’s bioportal http://bioportal.bioontology.org and the UMLS semantic navigator. You should try to link to and borrow from existing ontologies. On the other hand, by borrowing terms you are also borrowing the ontological commitment of these ontologies, which may or may not align with your goals/scope.

With the help of experienced ontologists, you should decide on: the knowledge representation (e.g. OWL-DL), what to use as an editor (e.g. Protege), and what the ontological commitment should be (e.g. top-level ontologies). You could consider the OBO Foundry.

BiomedGT is from the NCI and they use a semantic wiki. The IDO uses the OBO Foundry approach. The Int’l Classification of Diseases uses a semantic wiki approach combined with a Protege background. A final example is the Neuroscience Information Framework (NIF).

Conclusions. Start by defining use cases, not ontologies. You should also define how you would measure success. Also, let the biologists be biologists, and seek out ontologists where needed. Follow experience/guidelines, not gurus. Finally, think prospectively, such as maintenance and funding.

Olivier’s website: http://mor.nlm.nih.gov
IDO imports many terms from GO.


James Glazier (Indiana University): Goals of CBO Workshop (CBO 2009)

Dedicated code versus modelling environments: Microsoft Word is not a novel! You need to separate the model definitions from the coding, which has advantages in reproducibility, clarity, and collaboration. When creating modelling environments, the aim is to keep the emphasis on the biology. It follows that you should hide the complexity of the implementation from the users. These things increase the usability for the biologists.

Bob Murphy suggested then that biology is not a finite space. Additionally, you shouldn’t assume that the biologists understand the biology. The whole point of computational models is to create a level of understanding higher than the existing understanding. Finally, models predict observations and model building starts with observations. Putting a human between the observations and the model building isn’t necessarily the right thing to do, and describing things in words using an ontology isn’t necessarily the right thing to do either. He thinks a traditional ontology is at best a stopgap that captures a snapshot of the knowledge at that point in time, and that we should think about non-word-based ontologies. Personal note: biological ontologies are, almost by definition, not snapshots: the state of our knowledge is always changing. If you don’t want just words, make use of the more complex first-order logic statements such as those available to people developing in OWL. Les Loew also made the important point that you shouldn’t take biologists out of the equation: we must keep the biologists as the focus of the meeting, as they are one of the most important groups we’re doing this for.

Back to James Glazier now. An ontology is not a model (in the sense of a computational biological model). However, it is a model of the domain you’re interested in. Neither is an ontology a syntax. An ontology is a logical structure that facilitates the model development and analysis. Their use case is for model sharing standards.

The virtual tissues discussed by Imran Shah can also be seen as multiscale modelling. His vision is a specification language for multiscale models. They want to start with CBO, but may not be limited to ontologies. This should integrate easily with existing standards.

Glazier has a mock-up of a CBO-based ML, where each element name matches a CBO term. Personal Note: This could be problematic, for two main reasons: firstly, the example used labels rather than IDs (which was probably just for clarity), and secondly, mock-ups of XML aren’t required if the decision is instead to create instances of an ontology using RDF a la BioPAX, OBI or similar.
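As a rough illustration of the labels-versus-IDs point, the sketch below builds an XML element that refers to a term by ID and carries the label only as a human-readable hint. Everything specific here is an assumption: the IDs such as `CBO:0000002`, the `behaviour` element and its attribute names are invented, and this is not the mock-up that was shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical ID-to-label table; these IDs are invented for illustration.
CBO_TERMS = {
    "CBO:0000001": "cell division",
    "CBO:0000002": "cell movement",
}


def behaviour_element(term_id):
    """Build an element that points at a term by ID, with the label as a hint."""
    label = CBO_TERMS[term_id]  # raises KeyError for unknown IDs
    return ET.Element("behaviour", {"term": term_id, "label": label})


element = behaviour_element("CBO:0000002")
xml_text = ET.tostring(element, encoding="unicode")
```

Because the reference is the ID, a label can later be reworded in the ontology without breaking existing files.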

For CBO, Glazier is looking for something which provides agreed-upon hierarchical terms which we can then use to structure other applications/languages. What cell behaviour ontologies exist now? GO, and Cell Physiology/Histology Ontologies, most of which are fairly fragmentary. There is a huge amount of behaviour in GO. GO stops fairly high up. We can fill in more specific terms under our remit.

Personal Note: We should be asking ourselves: Do we want to hang CBO under GO? If so, what about pathological terms? What if CBO ends up more semantically rich than GO? Would it still be appropriate to align them even if the hierarchy looks different?


Imran Shah, EPA: Virtual tissues and the importance of standards (CBO 2009)

Tissue histopathology is the gold standard for disease diagnosis/prognosis. But it is difficult to relate molecular changes/pathways to tissue-level outcomes, or to relate tissue effects to molecular perturbations. We need to describe the relevant biology to simulate dynamic dose/time-dependent phenotypes. Their goal is really to understand the molecular dynamics.

The goal of their work is to develop better in vitro systems and predict pathological outcomes. However, molecular perturbations/pathways alone are insufficient to understand functional alterations in tissues/organs. As an example of these tissue-level changes, take cell alterations based on toxicity: swelling, macrovesicular or microvesicular steatosis, necrosis, hyperplasia, and carcinoma are just a few examples. If you look at the gene expression profile for these alterations, it’s not easy to parse out which change is the one that’s occurring.

In order to link molecular pathways to phenotypes, they’re working on using virtual tissues, which are multi-cell models. They’re not trying to build a holistic model of the entire cell: instead, they’re focusing on cell behaviours that define key molecular pathways and define the key cell-cell interactions.

He then mentioned the V-Tissues conference held in April 2009, which went well, and the website for the conference. At this conference, they had transitional goals (bridging gaps). Examples of tissue-level models (liver toxicity and cancer, immune response, cardiac modelling and more) were presented, along with experimental and computational needs. There were also breakout discussions which covered computational requirements (modelling and simulation frameworks and formalisms, and cell/tissue-level knowledge integration via ontologies to capture meaning and reduce ambiguity) and data requirements (organotypic cultures and histomorphometry).

They need to define use-cases. (Personal note: use cases and case studies are very important!)

The knowledge representation issues they’re dealing with include: key events in cell response, cell-cell and cell-ecm interactions, and microcirculation. They’re also interested in cell changes, such as in cases of chronic toxicity in cancer. They are very histopathologically focused.

They’re integrating this information using the V-Liver architecture, which goes from assays to the v-Liver Knowledgebase, the v-Liver simulator, and from there to outcomes. He mentioned that some of these steps would involve the use of the semantic web, but didn’t go into details. They have a key need for fundamental cell behaviours (cellular outcomes, and key intracellular processes and outcomes).


James Glazier, Indiana University: Towards a CBO (CBO Workshop 2009)

There is plenty of rain this week, but no-one seemed to have been washed away. After a round of introductions of the attendees, James Glazier gave a brief introduction to the goals of the workshop.

GO, OPB, SBML, CellML are examples of standards currently in the biological community. Why does the community he’s a part of feel that this is a necessary project? The understanding of tissue processes is still in its early stages, especially for tissue misbehaviour. The field is rapidly growing, and so by coming in early now, we can avoid later inconsistencies that can bedevil standards development. Further, the hope is that standards can help bootstrap development.

Personal note: Many people at this workshop seem to be talking about syntax / formats as standards, and ontologies as something else that is separate from these standards. Instead, it’s best to think of formats as just one type of standards, with ontologies being semantic standards and minimal checklists such as those listed in MIBBI as content standards. They all perform standardizing functions, and should be considered as standards. See a great post by Frank Gibson (and another post by Frank with slides) for more information.
