Introduction In the current Illumina pipeline raw sequence data is generated in qseq files, but can optionally be converted to the more standard FastQ format for use with other analysis programs. The FastQ files produced are uncompressed text files and take up a considerable amount of space in our storage system. We’ve therefore been thinking…
- Articles posted by simon
Adding custom chromosome name mappings into SeqMonk
When loading data into SeqMonk the program has to try to connect the chromosome names used in your data file with those which are present in the genome on which your project is based. In many cases there won’t be an exact match between the two – many mapping programs report file names in their…
Interpreting the duplicate sequence plot in FastQC
Background The one analysis module which seems to elicit more questions than any other is the duplicate sequence plot. Of all of the plots which the program generates it’s probably the one which causes the most warnings / errors in otherwise nice looking data. I’m happy to admit that it’s not always immediately obvious what…
Published: May 23, 2011
Selecting large random integers in Perl
We had a very odd bug in a simulation we were writing recently. We were supposed to be sampling from a large pool of possible data, but were getting a very weird distribution of values. After much debugging we found a most unusual cause. Here is the pop quiz – read through the short script…
Recovered from defacing
Anyone visiting the site in the last couple of days may have found that I appeared to have simultaneously developed a fanatical interest in Middle Eastern politics, and a very poor taste in music. Whilst it’s very convenient to have a simple-to-use blogging engine such as WordPress, I guess the downside…
First Impression of Antigua Winds SS490LQ Soprano Saxophone
Introduction I’d put off adding a soprano sax to my existing tenor and alto for a number of years since I had heard from a few sources that whilst it was possible to get a good sound out of a cheap alto/tenor, you really needed to spend a decent amount of money to get…
Creating the ideal simple recording setup for live acoustic concerts
The Brief I regularly perform with a number of groups ranging from small ensembles, to choirs and full orchestras. For a long time now I’ve been wanting to have a simple recording setup which I could use to make high quality recordings of these events, mostly for personal use but also with an eye towards…
Published: February 6, 2011
The practical experience of owning a Kindle
I’ve now had my Kindle for about two weeks, and during that time I’ve experienced the highs and lows of Kindle ownership. Overall I must say that I am very happy with my purchase. Far from being a gimmick, I have found the Kindle to be a genuinely useful piece of technology. It’s been getting…
How good is ‘good enough’ for research software
There are two linked problems which seem to face me with every piece of software I write for research use: when is the software complete enough to write a paper on it, and how to manage the versions and project description. I think that although similar questions arise within software written for general use, their answers…
Where do you analyse next gen sequence data?
We had an interesting discussion at the Bioinfo-core workshop at ISMB2010. The discussion centred around the best way to handle the logistics of making sequence data available to those using a sequencing service. The problem is that the data is so big that even if you have a large central store you run the risk of…