
UKON 2018: Morning Session

Session 1 Chair: Dr Jennifer Warrender

This session contains short 10-minute talks.

Organising Results of Deep Learning from PubMed using the Clinical Evidence-Based Ontology (CEBO)

M. Arguello Casteleiro (Manchester), D. Maseda-Fernandez (NHS), J. Des-Diz (Hospital do Salnes), C. Wroe (BMJ), M.J. Fernandez-Prieto (Salford), G. Demetriou (Manchester), J. Keane (Manchester), G. Nenadic (Manchester) and R. Stevens (Manchester)

They are combining three areas when studying semantic deep learning: natural language processing, deep learning and the semantic web. The purpose of CEBO is to filter the strings into the ones that are the most useful for clinicians. In other words, CEBO filters and organises Deep Learning outputs. It has 658 axioms (177 classes).

Using Card Sorting to Design Faceted Navigation Structures

Ed de Quincey (Keele)

Some of this work is a few years old, but the technique hasn’t been used much and therefore he’d like to present it to make it more visible again. Card sorting begins with topics of content + cards + people who sort them into categories = the creation of information architecture. It can be used to (re)design websites, for example. For the new CS website at Keele, they gave 150 students (in groups of 5) 100+ slides and asked them to categorize. Pictures as well as text can be used, e.g. products. You can also do card sorting with physical objects.

Repeated Single-Criterion Sorting: Rugg and McGeorge have discussed this in a paper. With this technique, because you’re asking them to sort multiple times, you instead use about 8-20 cards at a time. Also, you can get a huge amount of information just by doing this with about 6 people. An example is sorting mobile phones. You ask people to sort the objects into groups based on a particular criterion, e.g. color. Then after sorting, you ask them to sort again, and continue on with a large number of criteria. You ask them to keep sorting until they can’t think of any other ways to sort them. Then you pick a couple at random, and ask the people to describe the main difference between them, which usually gets you another few criteria to sort on.

Overall, this allows you to elicit facets from people and to create a user-centered version of faceted navigation. For his work, he looked at music genre, and investigated whether or not it is the best way to navigate music. He asked 51 people to sort based on their own criteria, and got 289 sorts/criteria during this work. An independent judge then grouped these into superordinate constructs, reducing them to 78. There was commonality for genre, speed and song, but beyond those the criteria become a lot more personal, e.g. “songs I like to listen to in lectures” 😉

Then you can create a co-occurrence matrix for things like gender. There was no agreement with respect to genre, which was interesting. Spotify now supports more personal facets, which weren’t available 8 years ago when this work was first done. As such, this technique could be very useful for developing ontologies.
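As a rough illustration of how repeated single-criterion sorts can be turned into a co-occurrence matrix, the sketch below counts how often two cards land in the same group across all sorts. The card names, criteria and groupings are invented for the example; they are not data from the talk, and the speaker may well have built his matrix differently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sorts: each sort maps a group label to the cards placed in it.
sorts = [
    {"fast": ["song_a", "song_b"], "slow": ["song_c", "song_d"]},        # criterion: speed
    {"rock": ["song_a", "song_c"], "pop": ["song_b", "song_d"]},         # criterion: genre
    {"lectures": ["song_a", "song_b", "song_c"], "other": ["song_d"]},   # a personal criterion
]

def co_occurrence(sorts):
    """Count, for every pair of cards, the number of sorts that grouped them together."""
    counts = Counter()
    for sort in sorts:
        for cards in sort.values():
            # Sort the card names so each pair is always stored in the same order.
            for pair in combinations(sorted(cards), 2):
                counts[pair] += 1
    return counts

matrix = co_occurrence(sorts)
for (card_1, card_2), n in sorted(matrix.items()):
    print(card_1, card_2, n)
```

Pairs that are grouped together under many different criteria are candidates for sharing a facet value; pairs that never co-occur simply do not appear in the counter.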


Peter Murray-Rust, Charles Matthews and Thomas Arrow (ContentMine)

Peter feels that there is a critical need for Liberation Ontology, to regain control from publishers. Wikidata has about 50 million entities and even more triples, and it’s democratic. He says it is our hope for digital freedom. WikiFactMine (his group) added 13 million new items (scientific articles) to it. There are loads of disparate categories, so if you want ontological content, Wikidata is the first (and only) place to go. A good example of a typical record is Douglas Adams (Q42 – look it up!). Scientific articles can be Wikidata items. They were funded by Wikimedia to set up WikiFactMine for mining anything, but particularly the scholarly literature.

You can create WikiFactMine dictionaries. These are constructed such that there is a hierarchy of types (e.g. the entire animal kingdom in the biology subset). They created a dictionary of drugs just by searching on “drug” and pulling out the information associated with it. There are issues with mining new publications, however. You can then combine dictionaries, e.g. gene, drug, country and virus. By looking at the co-occurrence of country + disease, you may be able to predict outbreaks.
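To make the dictionary idea a little more concrete, here is a minimal sketch of counting country + disease co-occurrence at sentence level. The two dictionaries and the sentences are made up for the illustration; real WikiFactMine dictionaries are derived from Wikidata and are far larger, and this is not ContentMine’s code.

```python
from collections import Counter
from itertools import product

# Hypothetical dictionaries (plain term sets); not real WikiFactMine output.
countries = {"brazil", "guinea", "india"}
diseases = {"zika", "ebola", "dengue"}

# Hypothetical sentences standing in for mined article text.
sentences = [
    "Zika cases were reported in Brazil during 2015.",
    "An Ebola outbreak in Guinea was confirmed.",
    "Dengue and Zika both circulate in Brazil.",
]

def tokens(sentence):
    """Lower-case the sentence and strip basic punctuation from each word."""
    return {word.strip(".,;:()").lower() for word in sentence.split()}

def cooccurrences(sentences, dict_a, dict_b):
    """Count sentences in which a term from each dictionary appears together."""
    counts = Counter()
    for sentence in sentences:
        words = tokens(sentence)
        for pair in product(words & dict_a, words & dict_b):
            counts[pair] += 1
    return counts

pairs = cooccurrences(sentences, countries, diseases)
print(pairs.most_common())
```

A rising count for one country/disease pair over time is the kind of signal the talk suggested might hint at an outbreak; single-word matching like this would of course miss multi-word terms and synonyms.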

The Right to Read is the Right to Mine.

Is there some kind of curation / moderation on Wikidata? There is curation on the properties (the community has to agree to this). With respect to data, if people think an item is too trivial, it can be marked as a candidate for deletion, and discussions can ensue.

A Malay Translated Qur’an Ontology using Indexing Approach for Information Retrieval

Nor Diana Ahmad, Eric Atwell and Brandon Bennett (Leeds)

The aim is to improve the query mechanism for retrieval from the Malay-translated Qur’an. Many Muslims, especially Malay readers, read the Qur’an but do not understand Arabic. Most of the Malay-translated applications only offer keyword search methods, which do not help with a deeper understanding. Further, morphological analysis is complicated in Malay, because it has a different structure. They are building a semantic search and an ontology. They wish to improve speed and performance for finding relevant documents in a search query. They also built a natural-language algorithm for the Malay language.

An ontology plus a relational database was used, covering ~150,000 words. With keyword search there was 50% precision, and with her new method there was ~80% precision.

Towards Models of Prospective Curation in the Drug Research Industry

Samiul Hasan (GlaxoSmithKline)

As we think about making precision medicine a reality, it is much more likely that we will fail because of the challenges of data sharing and data curation (Anthony Philippakis, the Broad Institute).

2 important attributes of scientific knowledge management: persistence and vigilance (without access to the right data and prior knowledge at the right time, we risk making very costly, avoidable business decisions). Persistence requires efficient organization, and vigilance requires effective organization. What’s getting in the way of these aspirations is the inconsistent use of language at the source, which creates serious downstream problems. What about implementing reward in data capture steps? How do we not miss vital data later on? Named entity recognition, document classification, reinforcement learning, trigger event detection. You need both vision-based and user-centric software development.

Posters and Demos: 1-minute intros

  • Bioschemas – exploiting schema markup to make biological sources more findable on the web.
  • Document-centric workflow for ontology development – read from an Excel spreadsheet using Tawny OWL and create an ontology which can be easily rebuilt
  • Tawny OWL – a highly programmatic environment for ontology development (use software engineering tools / IDEs to build up ontologies).
  • Hypernormalising the gene ontology – as ontologies get bigger, they get harder to maintain. Can you use hypernormalization to help this? It is an extension of the normalising methodology.
  • Bootstrapping Biomedical ontologies from literature – from PubMed to ontologies.
  • meta-ontology fault detection
  • Bioschemas – show the specification and how they’re reusing existing ontologies
  • Get the phenotype community to use logical definitions to increase cohesion within the community (Monarch Consortium)

Please note that this post is merely my notes on the presentations. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!


WISE-ing up: Encouraging girls (and kids generally) in STEM

Kids Love Science

Kids love science (you should see their hands up at a STEM event!), but somehow as they get older many of them learn (or are taught) that it’s boring, or not cool. I do a decent amount of STEM Ambassador volunteering to try to ensure this change in perception never happens: I’ve made Jelly baby DNA with Key Stage 1, talked about non-standard career trajectories with kids almost ready to start university, built birds’ nests with 4 year olds… I’ve even single-handedly done combination presentation-and-practicals for an entire Junior School over the course of one day! I usually get really good feedback from teachers about the events I run, and I also get lots of support from my local STEM Ambassador Hub (one lovely lady even dropping off supplies for an event at my house on her way home!), but it’s not often that I get a letter from a child.

So imagine my pleasure and surprise when I received a letter this week from a child in the Junior School where I did the day-long event. She wrote so eloquently and earnestly. Of course I felt great that she said some lovely things about me. But what was even better is that the event seemed to really spark an interest. Irrespective of her (and all the other children’s) ultimate careers, I’m hoping that the work I do with them encourages them to face the world with open eyes and a thoughtful mind. Words like this are what really keep us STEM Ambassadors going:

Thank you so much for teaching us about DNA. You have sparked my curiosity […] I loved learning all the interesting facts […] This amazed and confused me too! I would love to learn even more about DNA […] Science week would not have been the same without you.

I absolutely agree – Science can be amazing and confusing. And weird, and wonderful, and mind blowing.

Women and Girls in STEM

Encouraging an interest in STEM for all children is at the heart of the volunteering that I do. Recently, however, I have started to learn more about how to specifically encourage women and girls into STEM careers. There’s a lot of talk in the news about gender balance and pay equality, and even the big names in tech like Microsoft have been struggling both to retain women and provide an equal playing field.

It’s not all bad news, though. Every single group and department I’ve worked in (that’s right, every single job) has had lots of diversity, and I’ve never felt neglected, belittled or sidelined. For example, the Oxford e-Research Centre (where I am currently employed) published an article today about my STEM volunteering and the recent career profiles I’ve been a part of (more on that next). But there’s still a lot of work to be done.

There is a huge drop off in the number of girls studying core STEM subjects at the age of 16. Just 35% of girls choose maths, physics, computing or a technical vocational qualification compared to 94% of boys. This reduces the number going on to do a degree or level 4 qualification in maths, physics, computer science or engineering – 9% of girls compared to 29% of boys. Source: WISE Campaign

As such, I’ve jumped on the chances I’ve been given recently to make a positive difference. The North Yorkshire Business and Education Partnership’s ‘Pen Portraits’ have been designed to give female students a glimpse into the variety of STEM based careers available to them. Through my work as a STEM ambassador, I was asked to provide one of these portraits – if you follow the above link, you’ll find me in there along with a number of other great women in STEM.

People Like Me front cover

As a direct result of NYBEP’s work, I became more involved with (and became a member of) WISE and attended a workshop discussing women and girls in STEM. Part of what WISE does is the People Like Me campaign, which creates a series of packs that STEM Ambassadors and schools can use to help girls identify the parts of their personalities that align with STEM careers. If you take a look at the “Careers in North Yorkshire and East Riding” People Like Me pack, you’ll find me there too! The Science Museum are also doing a series of tweets about STEM Ambassadors, which I highly encourage you to peruse (FYI, you may find me amongst them).

It may seem like I’m tooting my own horn (which I am, to a certain extent, after all – this is my blog!), but the main thing that interests me is getting kids engaged in STEM, and I’m hoping that all the volunteering and STEM education skills I’m learning now together with the increased visibility of these issues will ultimately help kids get interested in STEM, stay interested in STEM, and have equal opportunities in STEM careers.



Sweetie DNA and Schoolkids: Genes and DNA for Year 3s

I volunteer with the STEM Ambassador programme in the north of England, and in preparation for a talk / hands-on session I was giving at a local primary school last week, I went in search of visual aids for DNA. The main focus of the event at the school was helping the kids of three Year 3 classes build models of DNA out of sweets (as described in this Guardian article). Before we got stuck into the gummy bears and liquorice, I wanted to give them a short introduction to DNA. I had discovered a lovely pattern for crocheting DNA, which I followed the night before the event, which worked out great (you can see the results in my other blog post about the crocheted DNA itself). After asking them to pass the “DNA” around and take a look, I got started on my talk.

I used the slides below to give them something to look at while we chatted about DNA. Getting them to try to pronounce “Deoxyribonucleic acid” was hilarious for all of us and got them engaged in what I was saying from the start.

After giving them an introduction, I stopped at slide 7 and showed them the sweetie DNA that I had made with my son over the weekend in my best “Here’s one I prepared earlier” style. They were very excited to be using sweets to build their models – I hope they were allowed to eat them at the end of the day!

They were already sitting about 5 to a table, so we handed out enough materials that each table could make one model. The sugar phosphate backbone was strawberry liquorice with sherbet inside, and the As, Ts, Cs, and Gs were gummy bears. They all worked together really well. The gummy bears were very colorful but quite firm, so it took quite a bit of effort for the kids to push them onto the cocktail sticks / toothpicks. However, we had only one poked palm (that I was aware of) – the kids were pretty dexterous. The kids made beautiful models, and it was loads of fun helping them. They were quite keen to show us adults their handiwork, too. They were rightly proud of their sweet-based masterpieces.

Once they finished building their models, we had just a few minutes left, so I showed them slides 8 and 9, which talked about putting genes from one organism in another. I told them to imagine me tearing out a recipe for fluorescence from the jellyfish recipe book and stuffing it into a bacterium’s recipe book. Then, you could create fluorescent bacteria! Slide 9 is a picture of an agar plate of fluorescent bacterial colonies with a difference: the researchers had made a beach scene with them! So, I asked the kids to draw “bacterial colony” pictures with chalk on black paper. They loved that as well: volcanoes, cheetahs, eagles, sharks, and more. One scientific soul even drew DNA and the bacteriophage from slide 2!

The kids were engaged throughout, providing loads of good answers to my questions and asking fantastic questions themselves. I visited 3 different classrooms, and they all showed such an interest in science. 8 is a fabulous age – all curiosity and interest. Thanks very much to the lovely teachers and staff, and of course to the schoolkids; it was fun hanging out with you all! …and thanks for letting me use your pictures of my visit. Thanks also to the STEM Ambassador programme for both organizing this visit and providing the sweets!


Adventures in Crocheted DNA

I made some crocheted DNA this week, and I was so impressed with both the free pattern I found, and the result, that I thought I should share my experiences here.


I volunteer with the STEM Ambassador programme in the north of England, and in preparation for a talk / hands-on session I was giving at a local primary school, I went in search of visual aids for talking about DNA. I was already planning to help the kids of three Year 3 classes build models of DNA out of sweets (as described in this Guardian article), but before we got stuck into the gummy bears and liquorice, I wanted to give them a short introduction to DNA. (I go into more detail about the actual presentation I gave, as well as how the sweetie DNA turned out, in my related post on Genes and DNA for Year 3s.)

So, how do you make crocheted DNA? Well, I had a vague recollection of a DNA scarf pattern that I had come across some time ago (and you can try your hand at too), but I knew that would take too long to make. Also, it didn’t really have the 3-dimensional look I was going for. The scarf is gorgeous and scientifically accurate, but it isn’t much better than a drawing or a video from the perspective of the kids; it doesn’t show them the shape of a double helix.

My Googling then took me to the Wunderkammer blog by Jessica Polka, where she had posted this free pattern for crocheted DNA. It was another happy convergence (as was true with the scarf) of science and wool. You should all visit Jessica’s blog post as it goes into detail about how, if you’re right handed, you end up with a left-handed helix for your crocheted DNA if you follow her pattern. While I appreciate chirality, I went with the simpler left-handed helix for my work this week.

Jessica, however, went the extra mile and crocheted both left handed and backwards with her right hand! I admire her dedication, but I didn’t have that kind of time. I thought it might be useful for others to see the fruits of my labor, and provide a few helpful details on the pattern for others.

Firstly, the original pattern allows you to choose your own length of DNA, which is helpful. However, I had no idea how long it would end up, so for other people looking to make this pattern, I made an initial chain of length 50. As you can see from the picture at the top of the page, the resulting length of DNA was about twice the length of a crochet hook, or about 30 cm, give or take. Your work will be longer than that if you pull it tight (as the natural double helix shape contracts the length somewhat), and shorter if you have an 8-year-old squashing it as small as she can in order to mimic how the DNA is stored in the nucleus 🙂

An important note at this stage is that the pattern is American, and if you’re used to reading UK patterns please replace any reference to “single crochet” with “double crochet”.

After you create the chain and start on the single crochets, then you start to see the single spiral / helix forming:


It’s really quite magical, and I don’t mind saying I felt weirdly happy watching the spiral slowly (but neatly) curl behind the active part of the work. However, I didn’t really believe the second row of single crochets would work as nicely as the first – I figured some fiddling would be required. Yet even the second (and final) row spiralled neatly behind the “active site” (I know, I should have been a comedian! Ha ha), as you can see from this picture, where the completed (left-hand) double helix is on the right of the image, and the incomplete single spiral on the left:


Finally, I didn’t tidy away the ends of the yarn on either side of the completed work, as they 1) provided a useful place to hold the DNA while twirling it, creating a pleasing spin to watch, and 2) were just about the right length to tie together and make a circle of DNA should you so desire!

It only took about 30 minutes (including interruptions). I gave the DNA to the school at the end of the STEM event, as the kids seemed enthralled by it. The major and minor “grooves” were clear – clear enough that I was able to point them out to 7 and 8 year olds, who were able to understand the difference. I was also able to flatten it and show its similarities with the “ladder” diagram that I had up on a slide to show them how they were going to build their sweetie DNA.

Kids playing with Crocheted DNA

I made it almost as an afterthought, yet it was so beautifully tactile when held and elegant when spun that the kids really enjoyed it. One girl in particular kept on spinning and spinning it, making the 30 minutes of my effort well worthwhile. I’ll definitely be making more whenever I run a similar event in future. Perhaps I should start making a full set of human “chromosomes”? But what colors should I choose for each one? Thoughts in the comments please!


The NormSys registry for modeling standards in systems and synthetic biology


Martin Golebiewski

NormSys covers the COMBINE standards, but they have plans to extend it to further modelling standards. Each standard has multiple versions/levels, and trying to figure out which standard you need to use can be tricky. NormSys provides a summary of each standard as well as a matrix summarizing each of the biological applications that are relevant in this community. Each standard has a detailed listing of what it supports and what it doesn’t with respect to annotation, supported math, unit support, multiscale models, and more.

There are also links to the specification and webpage for the standard as well as publications and model repository records. They also have information on how a given standard may be transformed into other standards. Information on related software is also available. Additional matrices describe what formats are available as input and output with respect to format transformation.

NormSys is an information resource for community standards in systems biology. It provides a comparison of their main characteristics and features, and classifies them by fields of application (with examples). Transformation between standards is available, as well as bundled links to corresponding web resources, direct links to validation tools, and faceted browsing for searching with different criteria. The initial focus is on commonly-used standards (e.g. COMBINE) and related efforts.



SED-ML support in JWS Online


Martin Peters

In an ideal world, you should be able to reproduce any simulation published in a paper. This does happen for some models. You have a website which links the paper from JWS and data from the FAIRDOM hub. Then you can tweak the parameters of a published model and see how the results change. This means that there is a SED-ML database as part of JWS Online. Once you’ve made your modifications, you can then re-submit the model back to JWS.

You can also export the COMBINE archive you’ve created in the course of this work and take it away to do more simulations locally. Currently, only time-course simulation is supported (to keep the computational time as low as possible). Excel spreadsheets are used instead of NuML. Further, there is support only for 2D plots. However, they have achieved their goal of being able to go from a paper to the actual simulation of a model from that paper in one click.
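For anyone wanting to poke at an exported archive locally: a COMBINE archive is a ZIP file with a manifest.xml listing its contents, so the standard library is enough for a first look. The sketch below is only an illustration. It builds a tiny in-memory archive with made-up file names rather than using a real JWS Online export, and the helper name list_archive_contents is my own.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the COMBINE archive (OMEX) manifest.
OMEX_NS = "{http://identifiers.org/combine.specifications/omex-manifest}"

def list_archive_contents(archive):
    """Return (location, format) pairs declared in the archive's manifest.xml."""
    with zipfile.ZipFile(archive) as omex:
        root = ET.fromstring(omex.read("manifest.xml"))
    return [(c.get("location"), c.get("format")) for c in root.iter(OMEX_NS + "content")]

# Build a tiny in-memory archive so the sketch runs without any downloaded file.
# The file names and (empty) file bodies are placeholders, not a real export.
manifest = """<?xml version="1.0" encoding="UTF-8"?>
<omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
  <content location="./model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
  <content location="./simulation.sedml" format="http://identifiers.org/combine.specifications/sed-ml"/>
</omexManifest>"""
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as omex:
    omex.writestr("manifest.xml", manifest)
    omex.writestr("model.xml", "<sbml/>")
    omex.writestr("simulation.sedml", "<sedML/>")

contents = list_archive_contents(buffer)
for location, file_format in contents:
    print(location, file_format)
```

With a real export you would pass the path of the downloaded archive to list_archive_contents instead of the in-memory buffer.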



The ZBIT Systems Biology Software and Web Service Collection


Andreas Draeger

In systems biology, people want to perform dynamic simulations, steady-state analyses and others. SBML is the format to use for the model, but you also need a data structure for use in the software, and as such they developed JSBML.

People build models from KEGG, textbooks and more. They try to rebuild KEGG diagrams in CellDesigner, which is very time consuming. Is there a better way to do this? And, indeed, there are even difficulties with this manual method, as some reaction participants present when you study the record aren’t visible in the associated diagram (e.g. the addition of ATP), which can cause issues for novices. Therefore they developed KEGGtranslator to convert KEGG pathways to various file formats. Another way to add a data source to your model is through BioPAX2SBML. Additionally, they’ve created ModelPolisher, which can augment models with information from the BiGG database and is available as a command-line tool and as a web service. For dynamic simulation, they have a tool called SBMLsqueezer, which generates kinetic equations automatically and also reads information from SABIO-RK.

This system was applied to all networks in KEGG. They use SBMLsimulator to run the simulations. They’ve developed a documentation system called SBML2LaTeX which helps people document their models.



COMBINE 2016 Day 3: SigNetSim, A web-based framework for designing kinetic models of molecular signaling networks



Vincent Noel

He was asked to develop a web tool which would be easy for biologists and students, but which could use a parallel simulated annealing algorithm and perform model reduction. He used Python to write the core library and the web interface, with some parts of the library in C. In this software, an SBML model is read in and a symbolic math model is built. It is compatible with SBML up to L3V1 (Level 3 Version 1). The integration is performed using C-generated code, which can be executed in parallel. To perform integration for systems of ODEs or DAEs, the software uses the Sundials library. To perform model fitting, the software uses simulated annealing. It also has some compatibility with Jupyter, mainly to allow the symbolic math model to be worked with directly.
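For readers who haven’t met simulated annealing as a fitting method, here is a minimal, serial sketch of the general idea. It is not SigNetSim’s implementation (which is parallel and integrates the model with Sundials); the one-parameter decay model, the cooling schedule, the step size and the synthetic data are all assumptions chosen to keep the example short.

```python
import math
import random

def simulate(k, times, x0=10.0):
    """Analytic solution of dx/dt = -k*x, standing in for a numerical ODE solve."""
    return [x0 * math.exp(-k * t) for t in times]

def sum_squared_error(k, times, observed):
    """Cost function: squared distance between model output and observations."""
    return sum((m - o) ** 2 for m, o in zip(simulate(k, times), observed))

def anneal(times, observed, k_start=1.0, steps=5000, t_start=1.0, seed=0):
    """Fit k by simulated annealing with a geometric cooling schedule."""
    rng = random.Random(seed)
    k = best_k = k_start
    cost = best_cost = sum_squared_error(k, times, observed)
    for step in range(steps):
        temperature = t_start * (0.999 ** step)
        candidate = abs(k + rng.gauss(0.0, 0.1))   # keep the rate non-negative
        candidate_cost = sum_squared_error(candidate, times, observed)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if candidate_cost < cost or rng.random() < math.exp((cost - candidate_cost) / temperature):
            k, cost = candidate, candidate_cost
            if cost < best_cost:
                best_k, best_cost = k, cost
    return best_k

times = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = simulate(0.5, times)       # synthetic "data" generated with k = 0.5
fitted_k = anneal(times, observed)
print(fitted_k)
```

Accepting some worse moves while the temperature is high is what lets the search escape local minima, which matters far more for real multi-parameter signalling models than for this one-parameter toy.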

SigNetSim’s web interface uses the Django framework with the Bootstrap front end. There is also a simple DB backend for storing experimental data for these models. The library and web interface will be on GitHub, and the paper should be submitted in the next few months.



COMBINE 2016 Day 3: Modelling ageing to enhance a healthy lifespan


Daryl Shanley

Age is a major risk factor for chronic disease, and chronic diseases are the major cause of death and disability in the world, estimated at around 70% (WHO 2005). Molecular damage is the underlying factor in all of these (DNA damage (cancer), dementia and more). Ageing results from the accumulation of molecular damage. This accumulation of macromolecular damage is irreversible: even though we have ameliorating systems such as the antioxidant systems, some damage escapes repair and builds up. Levels of oxidised protein, mutational frequency in nuclear DNA and mutational frequency in mtDNA all increase exponentially with age. This underlying damage gives rise to cellular senescence. Cells which go into a permanent state of cell cycle arrest are called senescent, and they secrete a number of chemicals into the surrounding environment. The number of these cells increases with age. If you remove these senescent cells (e.g. from mice) there is a definite survival enhancement, though we don’t really understand why. So, although overall there aren’t many of them, they do seem to have quite an impact.

The good news is that there is plasticity in ageing. For instance, caloric restriction in mice does allow them to live longer (almost double). In part, this is due to them overeating if they’re allowed to eat freely, but these undernourished mice aren’t healthy – they’re infertile, for example (it’s not a “natural state”). Mutations that bring longer life are in genes associated with nutrition – they’re signalling to the organism that there is less food available. This signal is somehow reducing molecular damage. However, it’s hard to test this in humans…

If we build models of known mechanisms, we can explore interventions, and with known interventions we can explore mechanisms. With a lot of background information, we can use the models to optimise synergy/antagonism, dose and timing. Ageing is caused by multiple mechanisms, and most damage increases exponentially, which implies positive feedback. Can the cycle be slowed or broken?
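The link between positive feedback and exponential increase can be seen in a toy model. In the sketch below, damage D feeds back on its own production and is partly removed by repair, giving dD/dt = (feedback - repair) * D. This equation, the rate values and the “intervention” are all my own illustrative assumptions and are not taken from the speaker’s models.

```python
import math

def simulate_damage(feedback, repair, d0=1.0, years=50, dt=0.01):
    """Euler integration of dD/dt = (feedback - repair) * D."""
    damage = d0
    for _ in range(round(years / dt)):
        damage += (feedback - repair) * damage * dt
    return damage

# Hypothetical rates per year; chosen only to make the curves easy to compare.
baseline = simulate_damage(feedback=0.10, repair=0.02)
intervention = simulate_damage(feedback=0.06, repair=0.02)   # weaker feedback loop
exact_baseline = math.exp((0.10 - 0.02) * 50)                # analytic solution for comparison

print(baseline, intervention, exact_baseline)
```

Whenever feedback outpaces repair the damage grows exponentially, and because the net rate sits in the exponent, even a modest weakening of the loop makes a large difference by the end of the simulated period.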

After existing knowledge and data have been used to create a calibrated model, they perform sensitivity analysis and validate the model. Once all that has been done, you can start using the model to make the predictions you’d like to see. It’s a long journey for a single model! They’ve created a set of Python modules for COPASI called PyCoTools, which allows you to compare models by generating other alternative models based on a starting model.
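As a reminder of what a (local) sensitivity analysis actually computes, here is a small finite-difference sketch. It does not use PyCoTools or COPASI; the toy production/degradation model, the parameter values and the function names are assumptions made purely for illustration.

```python
def steady_state(params):
    """Toy model: dx/dt = k_prod - k_deg * x has steady state k_prod / k_deg."""
    return params["k_prod"] / params["k_deg"]

def relative_sensitivities(model, params, perturbation=1e-3):
    """Central finite-difference estimate of d ln(output) / d ln(parameter)."""
    base = model(params)
    result = {}
    for name, value in params.items():
        # Perturb one parameter at a time, leaving the others at their nominal values.
        up = dict(params, **{name: value * (1 + perturbation)})
        down = dict(params, **{name: value * (1 - perturbation)})
        result[name] = (model(up) - model(down)) / (2 * perturbation * base)
    return result

sensitivities = relative_sensitivities(steady_state, {"k_prod": 2.0, "k_deg": 0.5})
print(sensitivities)
```

A coefficient near +1 means the output scales in proportion to that parameter and near -1 means it scales inversely; parameters with coefficients near zero are the ones a fit will struggle to pin down.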

They are using a systems approach to model the development of the senescent phenotype with a view to finding interventions to prevent progression and reverse the phenotype. They’d already been working on the processes involved in this with earlier models of insulin signalling, stress response, DNA damage, mitochondrial dynamics and ROS.

Bringing all of these models together into an integrative dynamic model for cellular senescence is just the first task; they also needed to create an independent in vitro data set for estimating the integrated model parameters. This data was then used to fit their model. They had to infer what was going on inside the mitochondria, by inferring the internal states for ‘new’ and ‘old’ mitochondria. Then the model was used to make interventions for improving mitochondrial function and its phenotype, especially via combinations that would be difficult to perform in the lab.

If you reduce ROS in the model, it has an impact on the entire network. The results can be used to inform later experimental designs. Then there was in vitro confirmation of increased mitochondrial membrane potential during ROS inhibition. The model matched initially, but at a later date it diverged from the lab. When you go back and look at the cells, you find that there was very little movement among the senescent cells, which hampers autophagy. This is why the autophagy/mitophagy was predicted in the model, but wasn’t being seen in the lab. It’s a quality of the senescent cell which is blocking their removal from the cell. Mitochondrial dynamics are reduced over time, driven by an inability to remove the network of dysfunctional mitochondria.


Meetings & Conferences

COMBINE 2016 Day 3: From Grassroots community standards to ISO Standards


Martin Golebiewski

You need standards at every stage of the systems biology life cycle. These standards need to work together and be interoperable. From modelling to simulation, to experimental data and back again – there are standards for each step. There are a large number of community standards for the life sciences, in many different subdomains (he references here.)

The presence of so many standards for different domains creates quite a lot of overlap, which can cause issues. Even within a single domain, it is normal to see different standards for different purposes, e.g. for the model description, the simulation of the model, and the results of the simulation. The way in which the synbio and sysbio standards interrelate is complex.

In COMBINE, there are the official standards, the associated standardization efforts, and the related standardization efforts. The tasks in COMBINE for the board and the whole community are to: organize concerted meetings (COMBINE and HARMONY) and training events for the application of the standards, coordinate standards development, develop common procedures and tools (such as the COMBINE archive), and provide a recognized voice.

A similar approach, but with a broader focus, is the European CHARME network, which has been created to harmonize standardization strategies to increase the efficiency and competitiveness of European life-science research. It funds networking actions for five years from March 2016. There are 5 working groups within CHARME. WG2 covers innovation transfer, to bring in more involvement with industry.

NormSys is intended to bring together standards developers, research initiatives, publishers, industry, journals, funders, and standardization bodies. How should standards be published and distributed? How do we convince communities to apply standards, and how do we certify the implementation of standards? There is a nice matrix of the standards they are dealing with at

NormSys is meant to be a bridge builder between research communities, industry and standardization bodies. There are actually a very large number of standardization bodies worldwide. ISO is the world’s largest developer of voluntary international standards. Anything that comes from ISO has to come out of a consensus of 164 national standards bodies, so finding such a consensus within ISO can be tricky. Most of the experts involved in the ISO standards are doing it voluntarily, or through dedicated non-ISO projects which fund it.

Within ISO, there are technical committees. These TCs might have further subgroups or working groups. There can also be national groups which have mirror committees, and then delegates from these committees are sent to the international committee meetings. The timeline for the full 6 stages of standard development with ISO can be around 36 months. However, this doesn’t include any of the preliminary work that needs to happen before the official stages begin.

There are three main ISO document types: IS (International Standard), TS (Technical Specification) and TR (Technical Report). Most relevant for us here is the ISO TC 276 for Biotechnology. Its scope is standardization in the field of biotechnology processes, covering the following: terms and definitions, biobanks and bioresources, analytical methods, bioprocessing, data processing (including annotation, analysis, validation, comparability and integration), and finally metrology.

There are 5 WGs for this TC: terminology, biobanks, analytical methods, bioprocessing, and finally data processing and integration (WG5). ISO/IEC JTC 1/SC 29 involves the coding of audio, picture, multimedia and hypermedia information (this includes genome compression). ISO TC 276 WG5 was established in April 2015, and there are 60 experts from 13 countries. He says the next meeting is in Dublin, and there is still scope for people to join and help in this effort.

They’ve been working on standards for data collection, structuring and handling during the deposition, preservation and distribution of microbes, and on a recommended MI data set for data publication. One of the most important tasks of WG5 is the standardization of genome compression. This was identified as a need by the MPEG consortium.

The biggest deal for COMBINE is the focus on developing an ISO standard for applying and connecting community modelling standards. “Downstream data processing and integration workflows – minimal requirements for downstream data processing and integration workflows for interfacing and linking heterogeneous data, models and corresponding metadata.”
