Name	APPS/BIO/BWA_0.6.1
Description	BWA short read mapper
Status	Production
Last update	2011-12-12

BWA Runtime Environment home page

Version information

Only this development version is currently available.

Interface definition

The runtime environment sets the following environment variables:

BWA_DIR points to the BWA/bin base directory
PATH is extended so that it includes $BWA_DIR
BWADB is set to a suitable temporary directory for unpacking the datasets
BWA_NUM_CPUS is set to the number of CPUs allocated to the job. Remember to pass it to the mapper when running, as in bwa aln -t $BWA_NUM_CPUS ...
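
A job script can verify that these variables are in place before doing any work. The following is a minimal sketch, not part of the runtime environment itself; the function name check_bwa_env is hypothetical, and it only checks that the three BWA-specific variables listed above are non-empty.

```shell
#!/bin/sh
# check_bwa_env is a hypothetical helper, not provided by the runtime environment.
# It reports which of the variables promised by the interface are unset or empty.
check_bwa_env() {
    missing=""
    for name in BWA_DIR BWADB BWA_NUM_CPUS; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing runtime environment variables:$missing" >&2
        return 1
    fi
    return 0
}
```

Calling check_bwa_env || exit 1 near the top of a job script would make a misconfigured runtime environment fail early with a readable message in the stderr file.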

Examples

Here is a simple test case for the runtime environment.

Download the example files here.

The job description file bwa.xrsl

&
(executable=runbwa.sh)
(jobname=bwa_1)
(stdout=std_1.out)
(stderr=std_1.err)
(gmlog=gridlog_1)
(walltime=24h)
(memory=8000)
(disk=4000)
(runtimeenvironment>="APPS/BIO/BWA_0.6.1")
(inputfiles=
( "query.fastq" "query.fastq" )
( "S_pyogenes_m2.EB1_s_pyogenes_m2.dna.toplevel.fa" "S_pyogenes_m2.EB1_s_pyogenes_m2.dna.toplevel.fa" )
)
(outputfiles=
  ( "query.fastq.sam" "query.fastq.sam" )
)

The job script, runbwa.sh, is very simple:


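#!/bin/sh
echo "Hello BWA!"
genome="S_pyogenes_m2.EB1_s_pyogenes_m2.dna.toplevel.fa"
query="query.fastq"
# 1. Index the reference genome
bwa index "$genome"
# 2. Map the reads using all CPUs allocated to the job
bwa aln -t "$BWA_NUM_CPUS" "$genome" "$query" > "$query.sai"
# 3. Convert the .sai alignment file to SAM format
bwa samse "$genome" "$query.sai" "$query" > "$query.sam"
# Report the exit code of bwa samse to ARC
exitcode=$?
echo "Bye BWA!"
exit $exitcode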

The run consists of three steps: (1) indexing the reference genome (bwa index), (2) mapping the reads (bwa aln), and (3) converting the resulting .sai file to SAM format (bwa samse). The exit code from bwa samse is used as the exit code of the script, so that ARC knows whether the job has succeeded or failed. Note that only the last step's exit code is reported; a failure in an earlier step is detected only if it also causes bwa samse to fail.
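
If a failure in any of the three steps should fail the job, the steps can be wrapped so that the first non-zero exit code is returned immediately. This is a sketch of one possible variant, not the script shipped with the example files; the function name map_reads is hypothetical, and the fallback to one thread when BWA_NUM_CPUS is unset is an assumption.

```shell
#!/bin/sh
# map_reads is a hypothetical wrapper around the three BWA steps.
# It stops at the first failing step and returns that step's exit code.
map_reads() {
    genome=$1
    query=$2
    # 1. Index the reference genome.
    bwa index "$genome" || return $?
    # 2. Map the reads (assumption: default to 1 thread if BWA_NUM_CPUS is unset).
    bwa aln -t "${BWA_NUM_CPUS:-1}" "$genome" "$query" > "$query.sai" || return $?
    # 3. Convert the .sai file to SAM format.
    bwa samse "$genome" "$query.sai" "$query" > "$query.sam" || return $?
}
```

In a job script this could be used as map_reads "$genome" "$query" followed by exit $?, so that ARC sees the exit code of whichever step failed first.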

System administrator guide for installing the RE

BWA source code

Source code and installation instructions for the BWA software itself can be found on the BWA home page.

Here is an example of installing BWA 0.6.1.

Get the package:

wget -O bwa-0.6.1.tar.bz2 http://sourceforge.net/projects/bio-bwa/files/bwa-0.6.1.tar.bz2/download
bunzip2 bwa-0.6.1.tar.bz2
tar xvf bwa-0.6.1.tar

And compile:

cd bwa-0.6.1
make

Download the runtime environment script template for SLURM.

Modify the scripts as needed and save the main script in your ARC runtime directory as APPS/BIO/BWA_0.6.1. If you wish to use the grid_bwa submission tool, make sure that the SAMTOOLS runtime environment is also installed on your cluster.

As long as the interface requirements are satisfied, the implementation details do not matter. Some adaptation is needed in any case to accommodate differences between cluster environments (batch queue system, temporary directory location, etc.).
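
To illustrate what satisfying the interface might look like, here is a minimal sketch of a runtime environment script. It is not the SLURM template mentioned above. The install location /opt/bwa-0.6.1, the use of TMPDIR for the scratch directory, and the use of SLURM_CPUS_PER_TASK for the CPU count are all assumptions to be adapted to the local cluster, and the names BWA_INSTALL_DIR and setup_bwa_env are hypothetical. The sketch also assumes that ARC sources the script with a stage number as its first argument and that stage 1 runs on the worker node just before the job executable.

```shell
#!/bin/sh
# Sketch of a runtime environment script for APPS/BIO/BWA_0.6.1.
# ASSUMPTIONS: install path, TMPDIR usage and SLURM_CPUS_PER_TASK are
# site-specific guesses, not taken from the original template.
BWA_INSTALL_DIR="${BWA_INSTALL_DIR:-/opt/bwa-0.6.1}"   # hypothetical path

setup_bwa_env() {
    # Directory that holds the bwa binary.
    BWA_DIR="$BWA_INSTALL_DIR"
    PATH="$BWA_DIR:$PATH"
    # Per-job scratch directory for unpacking datasets.
    BWADB="${TMPDIR:-/tmp}/bwadb_$$"
    mkdir -p "$BWADB"
    # Fall back to a single CPU when the batch system gives no hint.
    BWA_NUM_CPUS="${SLURM_CPUS_PER_TASK:-1}"
    export BWA_DIR PATH BWADB BWA_NUM_CPUS
}

# Only set up the environment in stage 1 (assumed: just before job execution).
case "${1:-}" in
    1) setup_bwa_env ;;
esac
```

Keeping the variable setup in one function makes it easy to replace the SLURM-specific line with the equivalent for another batch system without touching the rest of the script.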

Contact information

Contact kimmo.mattila@csc.fi if you have questions specific to using grid_bwa. For questions related to sequence analysis, contact your local BWA guru.