Other than where specified, these are my notes from the IB07 Conference, and are in no way expressions of opinion; any errors are probably just due to my own misunderstanding.
OXL is the ONDEX data format, and they are presenting it as a possible format for the exchange of integrated data. OXL is based upon an ontology (opinion/question: is it a true ontology, or just a CV?) of concepts and relations. ONDEX itself is an open-source, Java-based data warehouse that performs ontology-based data integration. OXL can be expressed in RDF, and there are two ways to use RDF here: firstly, model things as predicates (but then you cannot attach attributes to them); secondly, model them as classes. However, they also have OXL in a plain XML format, defined by an XSD.
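A toy sketch of the predicate-versus-class trade-off mentioned above (my own illustration, not ONDEX code; all names are invented), using plain Python tuples to stand in for RDF triples:

```python
# Option 1: model the relation as a predicate -- a single triple, but there
# is nowhere to hang attributes (evidence codes, scores) on the relation.
as_predicate = [("geneA", "interacts_with", "geneB")]

# Option 2: model the relation as a class instance (reification) -- the
# relation becomes a node of its own, so it can carry attributes.
as_class = [
    ("rel1", "rdf:type", "InteractsWith"),
    ("rel1", "from", "geneA"),
    ("rel1", "to", "geneB"),
    ("rel1", "evidence", "IMP"),  # an attribute on the relation itself
]

def attributes_of(triples, node):
    """Return the predicate/object pairs describing a given node."""
    return {(p, o) for s, p, o in triples if s == node}

print(attributes_of(as_class, "rel1"))
```

The predicate style is terser, but as soon as you want per-relation metadata you are forced into the class style.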
In their XML format they don't use any cross-references: everything is fully expanded. Yes, this generates large XML files, but with file compression that isn't a problem. It does make whole-document validation more difficult, though they're working on that. This approach also makes the format more human-readable.
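A small sketch of the "fully expanded, no cross-references" idea and why compression rescues it (my own illustration; the element names are invented, not the real OXL schema):

```python
import gzip
import xml.etree.ElementTree as ET

root = ET.Element("graph")
for i in range(500):
    rel = ET.SubElement(root, "relation")
    # Fully expanded: the same concept details are duplicated inline in
    # every relation, instead of being written once and referenced by id.
    for name in ("geneA", "geneB"):
        concept = ET.SubElement(rel, "concept")
        ET.SubElement(concept, "name").text = name
        ET.SubElement(concept, "taxid").text = "666"  # illustrative value

raw = ET.tostring(root)
compressed = gzip.compress(raw)
# The heavy redundancy compresses away, which is the presenters' argument.
print(len(raw), len(compressed))
```

The duplicated concept blocks are exactly the kind of repetition gzip handles well, so the on-disk cost of expansion is modest.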
They then presented some examples. The first was the identification of possible pathogenicity genes in Vibrio salmonicida (with the University of Tromsø): identify clusters of orthologues involving V. salmonicida, then colour the nodes according to pathogenicity phenotype.
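That workflow could be caricatured as follows (my own sketch of the idea; the gene identifiers, cluster groupings, and phenotype labels are all made up):

```python
# Group genes into orthologue clusters, then colour each node by its
# pathogenicity phenotype. All data here is invented for illustration.
clusters = {
    "cluster1": ["vsal_0001", "vibrio_x_0007"],
    "cluster2": ["vsal_0042", "vibrio_y_0099"],
}
phenotype = {  # hypothetical phenotype annotations
    "vsal_0001": "pathogenic",
    "vibrio_x_0007": "pathogenic",
    "vsal_0042": "non-pathogenic",
    "vibrio_y_0099": "non-pathogenic",
}
colour_for = {"pathogenic": "red", "non-pathogenic": "green"}

node_colours = {
    gene: colour_for[phenotype[gene]]
    for genes in clusters.values()
    for gene in genes
}
print(node_colours)
```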
Here are my opinions: a well-presented talk on the whole. I don't mean to harp on today about architecture slides, but they are important when describing software; they had some, but they were so small they were pretty hard to read. Also, I've never been convinced by the "human-readable" argument for changing an XSD: XML is simply not meant to be human-readable, and changes shouldn't be made to the XSD to make it so. However, ONDEX is a reasonably mature application, so it may be worth asking others to adopt their format. My main question is about probabilities: a lot of similar data-integration work puts weights on edges, so how can these be modelled in OXL?
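One possible answer to my own question, sketched in Python: if OXL relations can carry typed attributes (which I am assuming here; this is my speculation, not the presenters' answer, and all names are invented), then an edge weight is just one more attribute.

```python
from dataclasses import dataclass, field

@dataclass
class Relation:
    """A toy stand-in for an OXL-style relation between two concepts."""
    source: str
    target: str
    of_type: str
    attributes: dict = field(default_factory=dict)

# The weight sits alongside any other per-relation metadata.
rel = Relation("geneA", "geneB", "similar_to",
               attributes={"blast_evalue": 1e-30, "weight": 0.87})

# Downstream integration code could then threshold on the weight.
confident = rel.attributes.get("weight", 0.0) > 0.5
print(confident)
```

Whether ONDEX's generalised data structure actually permits this on relations, rather than only on concepts, is exactly what I would want to ask.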