Thursday, February 27, 2014

How to maintain "clean experiments"?

During his presentation on software design and "clean code", Gautham made an excellent point: if we spent half as much effort making our code clear as we did on making our papers clear, our code would be pristine. And it's true: I know that I'm fairly relentless about making figures well and trying to write as clearly as possible. It's an exhausting, iterative process, but I feel like it's really important, and I think it's a good idea to apply the same sensibility to our code, which thankfully Gautham has done.

That got me thinking about what other aspects of our work could benefit from a focus on clarity and cleanliness, leading me to a natural candidate: our data. Currently, I am a stickler for "data trails". Every single figure in every paper lives in its own folder, and that folder contains a single script which, when run, will generate all the elements of the figure from raw data without any manual input required. This way, when you are wondering exactly where that one funny data point came from, you can (relatively) easily find that one threshold from that one cell from that one experiment that led to that data point. Some of this relies on software design, making sure that the code we use to analyze data keeps track of all the relevant metadata.
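To make the "data trail" idea a bit more concrete, here is a minimal sketch in Python of what carrying metadata along with each data point might look like. Everything in it is made up for illustration: the `DataPoint` class, the `RAW` and `THRESHOLDS` tables, and the function names are assumptions, not our actual analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPoint:
    """One plotted value plus the metadata needed to trace it back."""
    value: int
    experiment: str
    cell: str
    threshold: float


# Hypothetical raw data: spot intensities keyed by (experiment, cell).
RAW = {
    ("exp01", "cell_a"): [12.0, 48.5, 51.2, 9.3],
    ("exp01", "cell_b"): [55.1, 60.4, 7.7],
    ("exp02", "cell_a"): [3.2, 4.1],
}

# Hypothetical per-cell thresholds chosen during analysis.
THRESHOLDS = {
    ("exp01", "cell_a"): 40.0,
    ("exp01", "cell_b"): 50.0,
    ("exp02", "cell_a"): 4.0,
}


def count_above(intensities, threshold):
    """Count the intensities strictly above the threshold."""
    return sum(1 for x in intensities if x > threshold)


def build_points(raw, thresholds):
    """Turn raw data into figure-ready points that keep their metadata."""
    points = []
    for (experiment, cell), intensities in sorted(raw.items()):
        threshold = thresholds[(experiment, cell)]
        value = count_above(intensities, threshold)
        points.append(DataPoint(value, experiment, cell, threshold))
    return points


def trace(points, value):
    """Find every point with a given plotted value, metadata included."""
    return [p for p in points if p.value == value]


if __name__ == "__main__":
    for point in build_points(RAW, THRESHOLDS):
        print(point)
```

The point of the design is that the figure script never handles a bare number: asking `trace(points, 1)` hands back the experiment, cell, and threshold behind that funny data point.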

But what I haven't focused on or really thought about very much is how best to systematize data at the experiment level. That is, how can we ensure that we log our data appropriately and make sure that we can find what we need when we need it? Yes, of course we use lab notebooks and Google Docs, but it's currently a bit of a mish-mash. What is the best way to do this? I don't know, but here are a few of the important features I think such a system must have:
  1. It must have low overhead. If you have to take off your gloves in the heat of an experiment to go make an entry in your Google Doc log, there's a high probability it's not going to happen. Paper notebooks are great at this, computers bad, tablets good. I initially bought netbooks for the lab for this purpose, but nobody used them because they were so crappy and took like 38 minutes just to wake from sleep. Maybe an iPad?
  2. It must be searchable. This is the singular failure of the paper notebook, which is otherwise so wonderful in so many ways. If you can search quickly through your old experiments to see which experiments used so and so reagent, it really makes you more productive. This is killer, and Google Docs rules for this. Even on a tablet, though, I don't know how you can combine handwriting with searchability, unless you have good optical character recognition.
  3. It must be flexible. It has to somehow be able to quickly capture a large number of different types of experiments. Most electronic lab notebooks are so structured and unwieldy, with so many things to click to define your experiment, that they're just not worth the hassle. Again, freeform, like the paper notebook, makes the most sense here.
So far, it seems like some sort of tablet app that can translate handwriting to searchable text, with a cloud organization option, would make the most sense. Anybody know of such a thing?
