Presented by Anna-Lise Veuthey, from SIB.
Other than where specified, these are my notes from the IB07 Conference, and not expressions of opinion. Any errors are probably just due to my own misunderstanding. 🙂
The aim is to increase interoperability between molecular biology and clinical resources by indexing UniProtKB with medical terminologies, including MeSH. Related work includes GenesTrace, PhenoGO, and MedGene. These systems use either text mining methods or knowledge- and semantics-based methods that exploit the ontological relationships between terms.
Why use MeSH? MeSH is a hierarchical controlled vocabulary (CV) developed by the NLM. It is part of the UMLS and is therefore linked to other medical terminologies. It is also used to index the biomedical literature.
As a benchmark, 200 disease names from 97 Swiss-Prot entries were manually mapped to MeSH terms. This set was used to evaluate the procedure in terms of recall and precision, and to set a score threshold.
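To make the evaluation step concrete, here is a minimal sketch of how precision and recall could be computed against a manually curated set. The function name and the toy mappings are my own illustrative assumptions, not data or code from the talk.

```python
def precision_recall(predicted, gold):
    """Compare automatic mappings with a manually curated gold standard.

    Both arguments are dicts of disease name -> MeSH term.
    Precision: fraction of automatic mappings that are correct.
    Recall: fraction of gold-standard mappings that were recovered.
    """
    correct = sum(1 for name, term in predicted.items() if gold.get(name) == term)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, for illustration only.
gold = {
    "cystic fibrosis": "Cystic Fibrosis",
    "breast cancer": "Breast Neoplasms",
    "Marfan syndrome": "Marfan Syndrome",
    "deafness": "Deafness",
}
predicted = {
    "cystic fibrosis": "Cystic Fibrosis",
    "breast cancer": "Breast Neoplasms",
    "deafness": "Hearing Loss",  # wrong mapping; Marfan syndrome was not mapped at all
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the three automatic mappings are correct, and two of the four curated mappings are recovered.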
The mapping system was tuned for high precision so that the procedure could be fully automated. However, recall still needs to be improved by: including NLP techniques in the disease extraction and matching procedures, refining the score with other parameters, trying to map to other terminologies such as SNOMED CT, and using information from the literature, which is indexed with MeSH terms.
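The trade-off between precision and recall when choosing a score threshold can be sketched as follows. This is only an illustration under my own assumptions: the scores, the `pick_threshold` helper, and the precision target are hypothetical, and the talk did not describe how the actual threshold was chosen.

```python
def pick_threshold(scored, min_precision):
    """Return (threshold, precision, recall) for the lowest score threshold
    whose retained mappings reach min_precision, or None if none does.

    scored: list of (score, is_correct) pairs, one per candidate mapping,
    where is_correct comes from comparison with a curated benchmark.
    Recall here is relative to the correct candidates only (an assumption).
    """
    total_correct = sum(1 for _, ok in scored if ok)
    best = None
    # Walk from the strictest threshold to the most permissive one.
    for threshold in sorted({score for score, _ in scored}, reverse=True):
        kept = [ok for score, ok in scored if score >= threshold]
        precision = sum(kept) / len(kept)
        recall = sum(kept) / total_correct if total_correct else 0.0
        if precision >= min_precision:
            best = (threshold, precision, recall)
    return best


# Hypothetical scored candidate mappings, for illustration only.
scored = [(0.95, True), (0.90, True), (0.80, True),
          (0.70, False), (0.60, True), (0.40, False)]

result = pick_threshold(scored, min_precision=0.9)
print(result)
```

Demanding higher precision pushes the threshold up and discards some correct mappings, which is why recall is the figure that then needs improving.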
They developed a generic terminology mapping procedure that can be used to link various biomedical resources. Further, indexing Swiss-Prot (SP) with medical terms opens new possibilities for searching and mining data relevant to clinical research.