SOC Software and Tools

Visiting with Enigma

Last week I got to see, touch, and hear about an Enigma machine. It was a really amazing experience. I meant to write about it then, but several things came up (including having to decompile a number of .class files because IntelliJ, which up until now was a picture-perfect IDE, emptied the corresponding .java files). A very good post on the same experience can be found in Dan Swan’s blog, so I won’t duplicate that work here.

Suffice it to say it stuck in my head, especially since I had recently read Cryptonomicon (Neal Stephenson). Simon Singh was recommended to me as the author of a very good non-fiction history of codebreaking. Another recommended book in this area is Jack Copeland’s Colossus, on the Colossus machine, which was used not for breaking Enigma but for breaking a completely different cipher, the one produced by the Lorenz SZ 40/42 cipher machine.

p.s. Yes, I had a backup of my Java files, but it was a couple of days old, so it was quicker to decompile using jad.


Bomb Scare at Newcastle University

The entire university was evacuated this morning around 10:20am. What was originally thought to be another fire drill was announced (after around 20 minutes) to actually be a bomb scare. Everyone in the University was moved from the fire-safe positions to (I am guessing) bomb-safe positions at Exhibition Park nearby. Supposedly an anonymous phone call had been made to the University this morning, warning that a bomb would go off around 11:00am. Most of what I’m saying is rumour that went around the people waiting at the park, so it is unclear what was really going on.

It was a surreal situation: first, the migration en masse of a very large number of academics down the street and across a junction to the park. Everyone actually used the crosswalk! I would say that was a very British thing to do, except that there were many people who weren’t British among the group. There was certainly no panic. Then, when we got to the park, the only entertainment (initially at least) was the young men who were practicing their cycling skills in the skatepark. Later, of course, there were the police, TV crew and firemen to watch, as well as our fellow academics. People-watching at its finest! Believe me, seeing a group of about 20 cleaning ladies dressed in neat blue-and-white checked dresses sitting inside the skatepark, themselves watching the bike tricks, was definitely a memorable moment in time.

Once our half of the university was re-opened, walking back en masse was another interesting experience. There were so many of us that it seemed like we were a particularly oddly-dressed section of the Great North Run. We even went over the police tape that had earlier been barring the road, which made it seem like we were all crossing a finish line. I should have taken a picture with my phone, but I still forget such new-fangled technology is sitting in my pocket.

By 12:15, half of the university was back in their offices: the other buildings, including the medical school, still hadn’t been cleared. Conflicting rumours were passed around while we were at Exhibition Park: some said the caller identified the Medical School as the location of the bomb, others the new Devonshire building, while still others said no specific building was named, which was why the entire University needed to be cleared. I don’t know whether it was just a prank or something more.

In any case, as long as it remained a threat and nothing concrete, I could think of worse ways to spend more than two hours than in a park chatting with friends.

Update: There is now a news item on the Newcastle University website (Newcastle staff only). Basically, the link says that they received a warning for a bomb threat that they considered serious. I’ll post a link to a public site if one becomes available. By 15:00 all University buildings had been reoccupied.

Software and Tools


I couldn’t resist posting on the topic of C++ for-loops, as described by this wonderfully irreverent Reg Developer article. I shall leave you to read it, rather than summarizing it in detail here, but as a quick one-liner, it goes into the argument over incrementing counters versus incrementing iterators, and then, as a final touch, looks at using algorithms over iterators.
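For anyone who hasn't met the three styles side by side, here is a minimal sketch of what the argument is about. The function names are my own, purely for illustration, and summing a vector of ints is just an assumed toy task, not an example taken from the article.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Style 1: increment an integer counter and index into the container.
int sum_with_counter(const std::vector<int>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// Style 2: increment an iterator instead of a counter.
int sum_with_iterator(const std::vector<int>& values) {
    int total = 0;
    for (std::vector<int>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}

// Style 3: hand the iterator range to a standard algorithm and
// let it do the looping.
int sum_with_algorithm(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}
```

All three return the same total for the same input; the difference is only in who owns the loop. The counter version only works for containers that support indexing, while the iterator and algorithm versions would work unchanged in spirit for a std::list.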

As someone who came to C++ via the unintuitive path of Java -> C -> C++ (my coursework was more “advanced”, if I won’t get torn to pieces saying that, than the coding required in my first job), good ol’ i++ was one of my best friends. Admittedly, using other aspects of “pure” C++ was not problematic: I loved lording the use of templates in vectors, lists etc. over the Java folks. No casting in and out of Object for me! Shame Java had to go and sort out their language to allow that: no lording has happened recently.

Back to the subject at hand. I have to agree with the author of the Reg article, and say that about 90% of the time I try to “be good” and use an algorithm instead of for loops, I end up writing my own functors. (Ok ok – so the first few times I wrote a functor because I thought it was fun, not because it was necessary – I still can count that in the percentages, can’t I?)
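To show what "writing my own functor" tends to look like, here is a small sketch. CountAbove and count_above are hypothetical names invented for this example, and counting values above a threshold is an assumed task, not one from the article.

```cpp
#include <algorithm>
#include <vector>

// A hand-written functor: a class whose operator() is called once per
// element. It carries state (a threshold and a running count) between calls.
class CountAbove {
public:
    explicit CountAbove(int threshold) : threshold_(threshold), count_(0) {}

    void operator()(int value) {
        if (value > threshold_) {
            ++count_;
        }
    }

    int count() const { return count_; }

private:
    int threshold_;
    int count_;
};

// std::for_each applies the functor to every element in the range and
// returns the functor afterwards, so its accumulated state can be read.
int count_above(const std::vector<int>& values, int threshold) {
    CountAbove result =
        std::for_each(values.begin(), values.end(), CountAbove(threshold));
    return result.count();
}
```

That is around twenty lines to replace a three-line for loop with an if inside it, which is roughly the guilty-pleasure problem in a nutshell.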

In short: looping algorithms – more fun than they look, but a guilty pleasure as I still cannot quite justify them for the simpler cases. But then the answer is almost never an “all or nothing” programming choice. (I’m sure there’s a completely “all or nothing” choice out there, enjoying its role as the exception that proves the rule.) As a biologist and a programmer (read “bioinformatician”, which is just too tough to say), I find I like this result, in keeping with the biologist’s perspective: the messy answers are the best ones.

Meetings & Conferences Standards

FuGO Workshop Days 2-3

Ever since Monday (Day 1), there has been change afoot in the secret depths of the FuGO workshop. Not only were the discussions stimulating (as my previous post indicated), but there were ideas of redefinitions and term shuffling that grew, and then grew again during the evening of beer and revelry at the Red Lion Hinxton. Days 2 and 3 continued in this vein, and while I am being deliberately obscure in order to tantalize the reader with our goings-on, there was the smell of change in the air (and luckily, in this low-30s heat wave for Britain, that was all – our meeting room is one of the few in the entirety of the EBI where 15+ people can sit in air-conditioned comfort).

I think we are all starting to feel comfortable about where FuGO is headed, and while there was probably a little “analysis paralysis” (a term which Chris was the first, but not the last, to gently use at this meeting), the top-level decisions that need to be made at this stage do require serious discussion, and I believe the balance was about right. Everyone was contributing, and the daily (local to the workshop) update of the OWL file looked significantly different after yesterday’s changes. I shall wait to comment on any specifics until everything is up on the FuGO website, but I look forward with interest to the final day of discussion, and will probably have a sufficiently tired brain that the talks on upcoming FuGO tools on Friday will be a balm.

Meetings & Conferences Standards

FuGO Workshop Day 1

Today was day 1 of my first (but in reality the 2nd) FuGO Workshop. It was full of interesting ontology ideas for the realm of functional genomics and beyond. There were talks concerning new ontological communities (GCP by Martin Senger, BIRNLex by Bill Bug, Immunology Database and Analysis Portal by Richard Scheuerman, and IEDB by Bjoern Peters) first thing, followed by a very interesting discussion on OBO Foundry Principles and developments in a phenotype ontology for model organisms and a unit/measurement ontology.

The most interesting points made this morning (to me!) were:

  1. Richard Scheuerman: the Immunology DB and Analysis Portal group has been thinking closely about data relationships – how much do you capture in the ontology and how much in the corresponding data model? Their current answer is to use only an is_a in ontologies, and to capture more specific stuff in the data model. (As I understood it, they would like to change this to make the ontology more complex at a later date). With their ontology, they emphasized modeling the data based on how the data was going to be used (don’t go into too much detail of the robots, for example). In choosing what data fields to require, they realize that experimentalists would only be happy to fill out forms that had perhaps a dozen data fields, therefore it is important to choose fields which anticipate how users will want to query the database.
    I really like this idea of requiring only those fields which would be of most use to the biologist, rather than those that would make us bioinformaticians most happy. Hopefully with time, the biologist would be happy to fill out more of a FuGE data structure.
  2. The MIcheck Foundry will create a “suite of self-consistent, clearly bounded, orthogonal, integrable checklist modules”. This is coming out in a paper (currently in press in Nature Biotech by CF Taylor et al.). It will contain MICheck – a “common resource for minimum information checklists” analogous to obo/ncbo bioportal (in effect, a shop window for displaying these checklists). There are many minimum information checklists out there, and the number will only grow, so it makes a lot of sense.

The afternoon was characterized by good-natured “discussion”. Here’s a summary of the mild-mannered (but – seriously – quite interesting) discussions:

  1. Argument against multiple inheritance (MI) in application ontologies, by Barry Smith.
    The root of the problem is that one shouldn’t combine two different classifications, e.g. color and car type (if trying to have a red cadillac inherit from both red car and cadillac). Instead there should be a color ontology and car ontology. Many ontologies were originally built to support administrative classifications, but his opinion is that when you’re doing science, you’re interested in capturing the law-like structures in reality, not administrative information. Barry says every instance in reality can fall into many types of classes – the issue is how to build the ontologies: you can capture these instances by having either separate single inheritance (SI) ontologies, or one single “messy” MI ontology. Further, if you have MI, you might not have reusability (i.e. why have colors just in the car ontology, when they could be reused if they were separate?). In response, Robert Stevens suggested that you could have MI and let a machine deal with the differences – take the car and color ontologies and combine them mechanically through the relationships you put between them: people shouldn’t be scared of multiple inheritance, just use it carefully.
    The final conclusion was that normalization can unpack MI into SI as long as it is a “good” MI ontology, and therefore this can be a reasonable way of ensuring your MI ontology is still a strong one – most errors are associated with misuse or overuse of MI. Another way to think of it would be to have a set of SI reference ontologies, but in your application ontology you would use MI. SI is very useful, as it supports all sorts of reasoning algorithms / statistical tools that MI does not allow. So an MI ontology ONLY works if you can break it down into a set of SIs. Normalize an MI ontology in order to use the checking mechanism, and only use an MI ontology if it can be normalized.
  2. I learnt today about fiat types (no – not a car!), and how they relate to a putative measurement/unit ontology. A unit is a fiat universal/type in the dimension we use it to measure. In other words, measurements involve fiat units. A measurement ontology should be about storing units, and therefore we do NOT want numbers in it. Numbers are part of the data, not the ontology.
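Going back to the red cadillac from the first discussion point, here is a rough sketch of what "two single-inheritance ontologies, combined mechanically" could look like in code. Everything here is an assumption made for illustration: the class SingleInheritanceOntology, the struct CarInstance, the function is_red_cadillac and the term names are all hypothetical, and a real ontology would of course live in OWL or OBO rather than in a std::map.

```cpp
#include <map>
#include <string>

// A single-inheritance (SI) hierarchy: every term has at most one is_a
// parent. An empty parent string marks a root term. This sketch assumes
// the hierarchy contains no cycles.
class SingleInheritanceOntology {
public:
    void add_term(const std::string& term, const std::string& parent = "") {
        parent_of_[term] = parent;
    }

    // Walks up the is_a chain from term; true if ancestor is reached.
    // A term counts as its own ancestor.
    bool is_a(const std::string& term, const std::string& ancestor) const {
        std::string current = term;
        while (!current.empty()) {
            if (current == ancestor) {
                return true;
            }
            std::map<std::string, std::string>::const_iterator it =
                parent_of_.find(current);
            if (it == parent_of_.end()) {
                return false;
            }
            current = it->second;
        }
        return false;
    }

private:
    std::map<std::string, std::string> parent_of_;
};

// An instance is classified once in each SI ontology, instead of one
// "red cadillac" class inheriting from both "red car" and "cadillac".
struct CarInstance {
    std::string car_type;  // a term from the car ontology
    std::string color;     // a term from the color ontology
};

// The combination is computed from the two separate hierarchies.
bool is_red_cadillac(const CarInstance& car,
                     const SingleInheritanceOntology& car_types,
                     const SingleInheritanceOntology& colors) {
    return car_types.is_a(car.car_type, "cadillac") &&
           colors.is_a(car.color, "red");
}
```

The color hierarchy knows nothing about cars, so it could be reused for anything else that has a color, which is the reusability point made in the discussion.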

…And that was only the first day! Wowsers…