…the Protein Subcellular Location Image Database: Subcellular location assignments, annotated image collections, image analysis tools, and generative models of protein distributions

Robert Murphy, Carnegie Mellon

Everything he describes is open source and available on their website.

These are tools that analyze images of proteins and their distributions within cells. SLIC = subcellular location image classification. The challenges include: cells have different shapes, sizes, and orientations, and structures are not found in fixed locations within the cell. So, instead of comparing images directly, they describe each image numerically and operate on the descriptors, known as SLF or subcellular location features. The tools within SLIC are: segmentation, feature classification, clustering and comparison. You can do the analysis at many different levels of granularity. The computational classification of subcellular location is of very high quality. SLIC is available in Matlab and in Python, and some of it has been ported to C++/ITK.
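To make the idea of numerical descriptors concrete, here is a minimal sketch of my own. It is not the actual SLF feature set, and the function name and the three features are made up for illustration. It shows how a few numbers can be computed so that they do not change when a cell is shifted or rotated in the image:

```python
import numpy as np

def location_features(image):
    """Hypothetical descriptors, NOT the real SLF definitions.

    Returns three numbers that ignore where the pattern sits in the
    image and how it is oriented.
    """
    img = np.asarray(image, dtype=float)
    total = img.sum()
    ys, xs = np.indices(img.shape)
    # Centre of fluorescence (intensity-weighted centroid).
    cy = (ys * img).sum() / total
    cx = (xs * img).sum() / total
    # Distance of every pixel from that centre.
    r = np.hypot(ys - cy, xs - cx)
    mean_r = (r * img).sum() / total
    std_r = np.sqrt((((r - mean_r) ** 2) * img).sum() / total)
    # Fraction of pixels brighter than the image mean.
    bright_fraction = (img > img.mean()).mean()
    return np.array([mean_r, std_r, bright_fraction])

# Synthetic "cell": a bright patch on a dark background.
rng = np.random.default_rng(0)
cell = np.zeros((64, 64))
cell[20:30, 25:40] = rng.random((10, 15))

rotated = np.rot90(cell)                        # same pattern, rotated 90 degrees
shifted = np.roll(cell, (5, -3), axis=(0, 1))   # same pattern, moved (no wrap-around here)

print(location_features(cell))
print(location_features(rotated))
print(location_features(shifted))
```

The three printed vectors should agree up to floating-point rounding, which is the property that lets a classifier work on the descriptors rather than on the raw pixels.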

Decomposing mixture patterns involves sorting out proteins that are in more than one place. PUnMix either learns to unmix given instances of the pure patterns, or will use a previous instance of a pattern. You learn the object types by clustering using object features. For instance, if they know what a lysosome pattern and a Golgi pattern look like, and the computer is given a mixture, the computer can tell you what fraction of the protein is in each. But how do you test something like that? Create real images that are mixtures of two different probes.
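As a rough illustration of the unmixing idea (my own sketch, not the PUnMix algorithm), suppose each pure pattern is summarised by how often each object type occurs in it. A mixed image can then be approximated as a weighted sum of those profiles, and the weights recovered by least squares. The function name and all the numbers below are invented, and clipping then renormalising is a simplification of a proper non-negative fit:

```python
import numpy as np

def estimate_fractions(pure_profiles, mixed_profile):
    """Estimate mixing fractions from object-type frequency profiles.

    pure_profiles: one row per pure pattern, one column per object type.
    mixed_profile: object-type frequencies observed in the mixed image.
    """
    A = np.asarray(pure_profiles, dtype=float).T   # (object types, patterns)
    b = np.asarray(mixed_profile, dtype=float)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Simplification: clip negatives and renormalise so fractions sum to 1.
    coeffs = np.clip(coeffs, 0.0, None)
    return coeffs / coeffs.sum()

# Made-up profiles over three hypothetical object types.
lysosome = np.array([0.7, 0.2, 0.1])
golgi = np.array([0.1, 0.3, 0.6])

# A synthetic mixture: 30% lysosomal, 70% Golgi.
mixed = 0.3 * lysosome + 0.7 * golgi

fractions = estimate_fractions([lysosome, golgi], mixed)
print(fractions)
```

Because the synthetic mixture here is built from the pure profiles with no noise, the estimate should come back as roughly 0.3 and 0.7. Real images would need the two-probe test described above to check this.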

To determine nuclear shape you can use the medial axis model. 11 parameters allow you to synthesize one nucleus: you learn those 11 parameters over 1000s of nuclei and you get a distribution. The model for the cell shape is based on a distance ratio, and captures variation as a principal components model, typically using 10 principal components. For models of protein-containing objects, you treat them as a mix of Gaussian objects and learn distributions. The SLML model toolbox is all about storing these models. If you want to do cell simulations, then you can combine models together, which is interesting for the virtual cell, as you can model the proteins inside the cell with this.
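Here is a small sketch of what a principal components shape model might look like in code. This is my own illustration, not the SLML toolbox: I assume each cell outline is stored as a vector of distance ratios sampled at fixed angles, and the training data are random numbers standing in for real cells.

```python
import numpy as np

def fit_shape_model(shapes, n_components=10):
    """Learn a mean shape plus the main modes of variation (PCA via SVD)."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    k = min(n_components, Vt.shape[0])
    components = Vt[:k]                          # orthonormal modes of variation
    stddev = s[:k] / np.sqrt(X.shape[0] - 1)     # spread along each mode
    return mean, components, stddev

def sample_shape(model, rng):
    """Synthesize a new shape by drawing a coefficient for each mode."""
    mean, components, stddev = model
    coeffs = rng.normal(0.0, 1.0, size=stddev.shape) * stddev
    return mean + coeffs @ components

rng = np.random.default_rng(1)
# Placeholder training set: 200 "cells", 36 distance ratios each (assumed layout).
training_shapes = rng.normal(1.5, 0.1, size=(200, 36))

model = fit_shape_model(training_shapes, n_components=10)
new_shape = sample_shape(model, rng)
print(new_shape.shape)
```

The same pattern of learning a distribution over parameters and then sampling from it is what makes it possible to generate synthetic cells from the stored models.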

They also distribute lots of annotated data sets in which they’ve collected images of different proteins, both in 2D and 3D.


Please note that this post is merely my notes on the presentation. They are not guaranteed to be correct, and unless explicitly stated are not my opinions. They do not reflect the opinions of my employers. Any errors you can happily assume to be mine and no-one else’s. I’m happy to correct any errors you may spot – just let me know!

