Defensive strategies to get through software paper reviews

Over the last few years I’ve had to review dozens of software application papers. I’ve also been involved in writing a few of these (and have consciously decided not to write papers for some of our software because of the amount of hassle involved). In many ways software papers are some of the worst…

Published: May 25, 2016

Bioinformatics

Key points for developing training

At the ISMB workshop for Education in Bioinformatics, Gabriella Rustici gave an interesting talk on her “Top 10 tips for setting up a bioinformatics training course”. I agreed with pretty much all of the points, and she covered many of the issues raised elsewhere (make course objectives clear, use well supported software,…

Published: July 12, 2015

Bioinformatics

Merging STDERR and STDOUT in a gridengine submission

We recently hit a problem when trying to run a fairly simple script through grid engine. The script used an internal redirection to merge together STDERR and STDOUT before doing a final transformation on the result. A simplified version of what the script did is something like this: echo hello 2>&1 | sed s/^/prefix:/  …

Published: July 8, 2014

Computing

Fast subset selection by row name in R

Introduction: One of the best features of R is the simple way you can use a number of different strategies to create subsets of large data tables. The basic selection mechanisms you have are that you can subset a data frame by providing: a set of column or row indices; a set of boolean values…

Published: December 5, 2013

Computing

A new way to look at duplication in FastQC v0.11

Introduction: After a long gestation we’ll be releasing a new version of FastQC in the near future to address some of the common problems and confusions we’ve encountered in the current version. I’ll write more about this in future posts but wanted to start with the most common complaint, that the duplicate sequence plot was…

Published: September 3, 2013

Bioinformatics

Generating R reports with vector images from markdown with knitr

Introduction: One really nice addition to a standard R environment is the ability to create reports which combine R code, comments and embedded graphical output. The original mechanism for doing this was Sweave, but more recently a second system called knitr has emerged which seems to be more flexible, and this is what I’ve been…

Published: June 21, 2013

Computing

Should you buy a nanopore sequencer?

This morning Twitter is awash with posts discussing the newly announced nanopore sequencers from Oxford Nanopore. Speculation has been rife for some time about the potential specifications of the first sequencers to be produced by the company, and it certainly appears that the company have fulfilled the expectations placed upon them. I’m not going to…

Published: February 18, 2012

Technology

Moving over to Casava 1.8

Introduction: Illumina have recently released an updated version of their downstream analysis software CASAVA. This is the analysis pipeline which runs after the sequencer has processed the raw data down to base call files and provides a variety of functionalities from creating usable base calls to alignment and variant calling. CASAVA 1.8 makes some major…

Published: September 16, 2011

Computing

Importing RNA-Seq data into SeqMonk

Introduction: Mapped RNA-Seq data coming from eukaryotes is probably the most complicated data type to import into SeqMonk due to its relative complexity and the abundance of options with which you are presented. Depending on exactly what sort of information you want to know about your data, different data import options will be useful, so…

Published: September 4, 2011

Bioinformatics

Want to improve your science? Get a dog.

Actually the dog is somewhat irrelevant – it’s what comes with it which matters.  One of the side-effects of dog ownership is that you get to spend an hour or so a day out walking, which means you have an hour or so with your own thoughts and no distractions. I’m sure everyone has experienced…

Published: June 19, 2011

Computing
