How good is ‘good enough’ for research software?

There are two linked problems which seem to face me with every piece of software I write for research use:

  1. When is the software complete enough to write a paper on it?
  2. How should the versions and project description be managed?

I think that although similar questions arise for software written for general use, the answers are quite different for research software.

In a general software project it’s not unreasonable to set out with an idea of what you want the software to do and to have a list of features which, once complete, would mark the project as ready for a stable release.  Following the standard conventions then works reasonably well:

  • Alpha release – some features not implemented, any feature could be subject to change
  • Beta release – all (or nearly all) features implemented but bugs may still exist
  • Stable release – all features should be complete and code should be well tested to eliminate major bugs

On our projects page we used to put our software into these categories, but soon found that they really didn’t work well for research-based software.  Most crucially, in normal software development, progressing from alpha to beta to stable indicates an increase in quality, but in research software quality peaks at beta, and ‘stable’ is probably a bad sign, since it tends to mean that nobody is actively developing the software any more.  For research software we now use a system more like the following:

  • Alpha – Software is still experimental.  It’s not been widely tested so treat all derived results with a healthy dose of scepticism and double check everything.  File formats and major concepts could still be subject to radical change.
  • Beta – The way the program works is mostly stable, the released functionality has been pretty well tested, and results should mostly be reliable.  Specific details are still subject to change and more functionality could be added at any point, but saved files should continue to work.
  • Legacy – Generally used for projects which aren’t being worked on any more.  If the functionality of the last release does what you want, then great, but no new functionality will be added, although obvious bugs will be fixed.
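As a rough illustration, these three categories could be recorded in a machine-readable form, for example to generate the status labels on a projects page. This is only a sketch of my reading of the descriptions above: the names `Status`, `Policy`, `POLICIES` and `needs_warning` are hypothetical, and `None` marks a property that the descriptions do not state either way.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    ALPHA = "alpha"
    BETA = "beta"
    LEGACY = "legacy"


@dataclass(frozen=True)
class Policy:
    # None means the category description does not say either way.
    results_mostly_reliable: Optional[bool]
    saved_files_keep_working: Optional[bool]
    new_features_expected: bool


# Assumption: one possible encoding of the three categories listed above.
POLICIES = {
    Status.ALPHA: Policy(
        results_mostly_reliable=False,   # "double check everything"
        saved_files_keep_working=False,  # file formats may change radically
        new_features_expected=True,      # still experimental
    ),
    Status.BETA: Policy(
        results_mostly_reliable=True,
        saved_files_keep_working=True,
        new_features_expected=True,
    ),
    Status.LEGACY: Policy(
        results_mostly_reliable=None,    # not stated in the description
        saved_files_keep_working=None,   # not stated in the description
        new_features_expected=False,     # only obvious bugs get fixed
    ),
}


def needs_warning(status: Status) -> bool:
    """True when users should double check every derived result."""
    return POLICIES[status].results_mostly_reliable is False
```

With this encoding, `needs_warning(Status.ALPHA)` is true and `needs_warning(Status.BETA)` is false, which matches the idea that only alpha software should carry a sceptical-use warning.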

The most successful and active projects therefore follow a pattern of continuous rolling development internally, adapting to the ongoing requirements of the research.  Public releases tend to happen either when a reasonably serious bug is found, or when a bunch of new features are complete and everything in the menus actually works.

The problem with this kind of development is that no project is ever ‘finished’, which means there’s no obvious point at which to pause and say ‘It’s time to write this up as a paper’.  There’s always one more thing you want to get finished before publishing.  Our group is a core facility rather than a research group, so our focus is on providing a service and we’re not really under pressure to produce our own papers, which makes the problem worse.  We’ve had a few projects now where we kept deferring writing a paper to the point where the project moved into the ‘legacy’ phase and it was too late.

My new resolution therefore is to be less cautious about this.  As soon as a project is producing scientifically useful results and is in a state of completion where I’d be happy for other people to use it without putting warning stickers all over it, we should aim to get a paper out on it.  This will help to promote the work that we’re doing and get more people involved, and may also help to focus our attention on exactly which features we want to complete before committing to the publication.


Published: September 12, 2010

Bioinformatics Computing