Choosing the best format for raw sequence data

Introduction: In the current Illumina pipeline, raw sequence data is generated in qseq files but can optionally be converted to the more standard FastQ format for use with other analysis programs. The FastQ files produced are uncompressed text files and take up a considerable amount of space in our storage system. We’ve therefore been thinking…

Published: June 16, 2011

Bioinformatics, Computing

Adding custom chromosome name mappings into SeqMonk

When loading data into SeqMonk, the program has to try to connect the chromosome names used in your data file with those which are present in the genome on which your project is based. In many cases there won’t be an exact match between the two – many mapping programs report file names in their…

Published: June 11, 2011

Bioinformatics

Interpreting the duplicate sequence plot in FastQC

Background: The one analysis module which seems to elicit more questions than any other is the duplicate sequence plot. Of all of the plots which the program generates, it’s probably the one which causes the most warnings / errors in otherwise nice-looking data. I’m happy to admit that it’s not always immediately obvious what…

Published: May 23, 2011

Bioinformatics

Selecting large random integers in Perl

We had a very odd bug in a simulation we were writing recently.  We were supposed to be sampling from a large pool of possible data, but were getting a very weird distribution of values. After much debugging we found a most unusual cause. Here is the pop quiz – read through the short script…

Published: April 11, 2011

Computing

Recovered from defacing

Anyone visiting the site in the last couple of days may have found that I appeared to have simultaneously developed a fanatical interest in Middle Eastern politics and a very poor taste in music. Whilst it’s very convenient to have a simple-to-use blogging engine such as WordPress, I guess the downside…

Published: April 10, 2011

Computing

First Impression of Antigua Winds SS490LQ Soprano Saxophone

Introduction: I’d put off adding a soprano sax to my existing tenor and alto for a number of years, since I had heard from a few sources that, whilst it was possible to get a good sound out of a cheap alto/tenor, you really needed to spend a decent amount of money to get…

Published: March 6, 2011

Music

Creating the ideal simple recording setup for live acoustic concerts

The Brief: I regularly perform with a number of groups, ranging from small ensembles to choirs and full orchestras. For a long time now I’ve been wanting to have a simple recording setup which I could use to make high-quality recordings of these events, mostly for personal use but also with an eye towards…

Published: February 6, 2011

Technology

The practical experience of owning a Kindle

I’ve now had my Kindle for about two weeks, and during that time I’ve experienced the highs and lows of Kindle ownership. Overall I must say that I am very happy with my purchase. Far from being a gimmick, I have found the Kindle to be a genuinely useful piece of technology. It’s been getting…

Published: January 15, 2011

Technology

How good is ‘good enough’ for research software

There are two linked problems which seem to face me with every piece of software I write for research use:

- When is the software complete enough to write a paper on it
- How to manage the versions and project description

I think that although similar questions arise within software written for general use, their answers…

Published: September 12, 2010

Bioinformatics, Computing

Where do you analyse next gen sequence data?

We had an interesting discussion at the Bioinfo-core workshop at ISMB2010. The discussion centred around the best way to handle the logistics of making sequence data available to those using a sequencing service. The problem is that the data is so big that even if you have a large central store you run the risk of…

Published: July 13, 2010

Bioinformatics, Computing
