BBSRC Systems Biology Grantholder Workshop, University of Nottingham, 16 December 2008.
SyMBA Demo. The lunch hour was also the demo hour. People came to visit me at the SyMBA demo desk for the whole hour, and we had some interesting conversations. There is one particular question I would like to relate from that hour: what should a bioinformatician choose as an output/export format for multi-omics data? This post relates my thoughts about this challenge. It's not meant to be comprehensive: just some ramblings.
I solve this challenge in SyMBA by storing everything as FuGE objects, which can be exported to FuGE-ML. FuGE-ML can be converted into ISA-TAB and into an html format that mimics ISA-TAB using an XSLT. Therefore, because of this interlink between FuGE and ISA-TAB, you can leverage two complementary formats.
But to a bioinformatician who has just been tasked with building an application (and generally on a short time-scale), how do they choose what export format to use, e.g. FuGE or ISA-TAB? There are considerations of:
- scale: lightweight or heavyweight implementation. A lightweight implementation might favor your own version of ISACreator and the use of ISA-TAB, or a FuGE-based archive (but not a full-blown LIMS) like SyMBA. A heavyweight solution might be a full LIMS such as PIMS, or another FuGE implementation in development called SysFusion.
- intent: what is the purpose of storing this data? Is it for later analysis? For later deposition to a public database, e.g. at the EBI? Is it archiving? Is it a combination of these things? Your intent will shape what type of application you build, and what formats you focus your effort on. If your intent is storage only, choose whatever is most convenient for your users. However, these days there is always some aspect of data sharing or publishing. If you need further analysis of the data, then you probably want to be able to produce a computationally-friendly format such as XML. If your intent is submission to public databases, you need to ensure you export in a format they import.
Unfortunately, what this means is that the decision depends on the circumstances. FuGE and ISA-TAB are linked, and so you really get two for the price of one with those. I see this sort of thing as a positive – you have a choice as to the representation, storage and export of your data – a choice of formats! And many, like FuGE and ISA-TAB, are going to be easily convertable. The choice depends on your needs, but there is one easy choice: use something that's already been developed – don't reinvent the wheel!
Anyone else have any further suggestions?