COMBINE 2016 Day 2: Data Integration and Mining for Synthetic Biology Design


Goksel Misirli

How can we use ontologies to facilitate synthetic biology? Engineering biological systems is challenging, and integrating the data about them is even more so. Information may be spread across different databases, in different formats and with different semantics. This information should be integrated to inform and constrain biological design. This leads to ontologies, and to Gruber's definition of an ontology as a “specification of a conceptualization”. Ontologies are useful for capturing the different relationships between biological parts and for facilitating data mining. They are already widely used in bioinformatics; examples include GO, SO, SBO and SBOL.

They have created the Synthetic Biology Ontology (SyBiOnt), which is available online. The SyBiOnt knowledgebase (KB) includes information about sequences, annotations, metabolic pathways, gene regulatory networks, protein-protein interactions, and gene expression. Once the KB was built, they examined it using a set of competency questions. For example, which parts can be used as inducible promoters? When an appropriate query was run, 51 promoters were classified as inducible within the KB.
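To make the idea of a competency question concrete, here is a minimal sketch of how "which parts can be used as inducible promoters?" might be answered over a tiny in-memory set of triples. It is not the SyBiOnt schema or its actual query: the class names, the part identifiers (`part_A` and so on) and the helper functions are all hypothetical, and a real knowledgebase would more likely be queried with a dedicated ontology query language and a reasoner.

```python
# Toy triple store of (subject, predicate, object) statements.
# All class names and part identifiers are hypothetical, for illustration only.
TRIPLES = {
    ("InduciblePromoter", "subClassOf", "Promoter"),
    ("ConstitutivePromoter", "subClassOf", "Promoter"),
    ("Promoter", "subClassOf", "Part"),
    ("part_A", "type", "InduciblePromoter"),
    ("part_B", "type", "ConstitutivePromoter"),
    ("part_C", "type", "InduciblePromoter"),
    ("part_D", "type", "Part"),
}


def subclasses_of(cls, triples):
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == "subClassOf" and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(cls, triples):
    """Answer 'which parts are a <cls>?', following the subclass hierarchy."""
    classes = subclasses_of(cls, triples)
    return sorted(s for s, p, o in triples if p == "type" and o in classes)


# Competency question: which parts can be used as inducible promoters?
inducible = instances_of("InduciblePromoter", TRIPLES)
# A broader question that relies on the class hierarchy.
promoters = instances_of("Promoter", TRIPLES)
print(inducible)
print(promoters)
```

The point of the sketch is that the broader question ("which parts are promoters?") is answered without any part being labelled `Promoter` directly; the hierarchy supplies that knowledge.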

They also performed an automatic identification of biological parts, and classified them into categories including activator sites, repressor sites, inducible promoters, repressible promoters, SigA promoters, SigB promoters, constitutive promoters, repressor-encoding CDSs, activator-encoding CDSs, response-regulator-encoding CDSs and more.
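As a rough illustration of what automatic classification of this kind could look like, the sketch below assigns category labels to promoter records using simple rules. The rules are assumptions made for the example (for instance, that a promoter with at least one activator site counts as inducible) and are not taken from the talk or from SyBiOnt's actual class definitions. The record fields and part names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PromoterRecord:
    # Hypothetical fields; a real knowledgebase would hold richer annotations.
    name: str
    activator_sites: int = 0
    repressor_sites: int = 0
    sigma_factor: str = ""  # e.g. "SigA" or "SigB"; empty if unknown


def classify(promoter):
    """Return the set of category labels implied by the assumed rules."""
    labels = set()
    if promoter.activator_sites > 0:
        labels.add("inducible promoter")      # assumed rule
    if promoter.repressor_sites > 0:
        labels.add("repressible promoter")    # assumed rule
    if promoter.activator_sites == 0 and promoter.repressor_sites == 0:
        labels.add("constitutive promoter")   # assumed rule
    if promoter.sigma_factor:
        labels.add(f"{promoter.sigma_factor} promoter")
    return labels


records = [
    PromoterRecord("p1", activator_sites=1, sigma_factor="SigA"),
    PromoterRecord("p2", sigma_factor="SigB"),
    PromoterRecord("p3", activator_sites=2, repressor_sites=1),
]
classified = {r.name: classify(r) for r in records}
for name, labels in classified.items():
    print(name, sorted(labels))
```

A part can land in several categories at once, which matches the list above, where a promoter may be, say, both inducible and a SigA promoter.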

There were many other competency questions that could be, and were, asked.

