Name | APPS/BIO/BWA_0.6.1 |
---|---|
Description | BWA short read mapper |
Status | Production |
Last update | 2011-12-12 |
Only this development version is currently available.
The runtime environment sets the following environment variables:
Here is a simple test case for the runtime environment.
Download the example files here.
The job description file, bwa.xrsl:

    &
    (executable=runbwa.sh)
    (jobname=bwa_1)
    (stdout=std_1.out)
    (stderr=std_1.err)
    (gmlog=gridlog_1)
    (walltime=24h)
    (memory=8000)
    (disk=4000)
    (runtimeenvironment>="APPS/BIO/BWA_0.6.1")
    (inputfiles=
      ( "query.fastq" "query.fastq" )
      ( "S_pyogenes_m2.EB1_s_pyogenes_m2.dna.toplevel.fa" "S_pyogenes_m2.EB1_s_pyogenes_m2.dna.toplevel.fa" )
    )
    (outputfiles=
      ( "query.fastq.sam" "query.fastq.sam" )
    )
The job script, runbwa.sh, is very simple:

    #!/bin/sh
    echo "Hello BWA!"
    genome="S_pyogenes_m2.EB1_s_pyogenes_m2.dna.toplevel.fa"
    query="query.fastq"
    bwa index $genome
    bwa aln -t $BWA_NUM_CPUS $genome $query > $query.sai
    bwa samse $genome $query.sai $query > $query.sam
    exitcode=$?
    echo "Bye BWA!"
    exit $exitcode
The actual run consists of three steps: (1) indexing the reference genome (bwa index), (2) running the mapping task (bwa aln), and (3) converting the sai-formatted result file to SAM format (bwa samse). The exit code of bwa samse is used as the exit code of the script, so that ARC knows whether the job has succeeded or failed.
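The job script reports only the status of its last step, so a failure in bwa index or bwa aln is not reflected in the exit code. The following is a minimal sketch of a variant that stops at the first failing step and returns that step's status. The function name map_reads and the fallback to one thread when BWA_NUM_CPUS is unset are assumptions made for illustration; they are not part of BWA or of the runtime environment.

```shell
#!/bin/sh
# Sketch only: run the three BWA steps and stop at the first failure.
# map_reads is a hypothetical helper name.
map_reads() {
    genome=$1
    query=$2
    # Assumption: use a single thread if the runtime environment
    # has not set BWA_NUM_CPUS.
    threads=${BWA_NUM_CPUS:-1}

    # 1. Index the reference genome.
    bwa index "$genome" || return $?
    # 2. Map the reads; the result is written in sai format.
    bwa aln -t "$threads" "$genome" "$query" > "$query.sai" || return $?
    # 3. Convert the sai file to SAM format.
    bwa samse "$genome" "$query.sai" "$query" > "$query.sam" || return $?
}
```

In a job script, this could be called as `map_reads "$genome" "$query"` followed by `exit $?`, so that ARC sees the status of whichever step failed.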
Source and installation instructions for the BWA software itself can be found on the BWA home page.
Here is an example of installing BWA 0.6.1.
Get the package:
    wget -O bwa-0.6.1.tar.bz2 http://sourceforge.net/projects/bio-bwa/files/bwa-0.6.1.tar.bz2/download
    bunzip2 bwa-0.6.1.tar.bz2
    tar xvf bwa-0.6.1.tar
And compile:
    cd bwa-0.6.1
    make
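After the build, the bwa executable is located in the source directory. As a minimal sketch, the helper below checks that the executable exists before adding its directory to PATH. The name add_bwa_to_path is hypothetical, and the directory argument is wherever bwa-0.6.1 was unpacked and built.

```shell
#!/bin/sh
# Sketch only: add a freshly built BWA to PATH after checking the binary.
# add_bwa_to_path is a hypothetical helper name.
add_bwa_to_path() {
    dir=$1
    if [ ! -x "$dir/bwa" ]; then
        echo "no executable bwa found in $dir" >&2
        return 1
    fi
    PATH="$dir:$PATH"
    export PATH
}
```

For example, `add_bwa_to_path "$PWD/bwa-0.6.1"` could be run from the directory where the package was unpacked.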
Download the runtime environment script template for SLURM.
Modify the scripts as needed and save the main script in your ARC runtime directory as APPS/BIO/BWA_0.6.1. If you wish to use the grid_bwa submission tool, make sure that the SAMTOOLS runtime environment is also installed on your cluster.
As long as the interface requirements are satisfied, the implementation does not really matter. Some adaptation is needed anyway to accommodate differences between cluster environments (batch queue systems, temporary directory locations, etc.).
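To illustrate what such an adaptation might look like, here is a minimal sketch of the logic of a runtime environment script; it is not the downloadable template. ARC calls a runtime environment script with a stage number as its first argument: 0 on the front end while the batch job is being prepared, 1 on the worker node before the job starts, and 2 after it finishes. The wrapper function bwa_rte, the install location /opt/bwa-0.6.1, and the mapping of BWA_NUM_CPUS to SLURM's SLURM_CPUS_ON_NODE variable are all assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: stage handling for a BWA runtime environment.
# bwa_rte is a hypothetical wrapper so the stages can be tried by hand;
# a real script would act on its own first argument instead.
bwa_rte() {
    case "$1" in
        0)  # Front end, while the batch job script is being created.
            ;;
        1)  # Worker node, just before the job executable starts.
            BWA_HOME=${BWA_HOME:-/opt/bwa-0.6.1}   # assumed install location
            PATH="$BWA_HOME:$PATH"
            # Assumption: take the thread count from SLURM, default to 1.
            BWA_NUM_CPUS=${SLURM_CPUS_ON_NODE:-1}
            export BWA_HOME PATH BWA_NUM_CPUS
            ;;
        2)  # Worker node, after the job executable has finished.
            ;;
        *)  return 1
            ;;
    esac
}
```

The only interface the job script relies on here is that bwa is on PATH and BWA_NUM_CPUS is set; how a site provides them can differ.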
Contact kimmo.mattila@csc.fi if you have questions specific to grid_bwa. For questions related to sequence analysis, contact your local BWA guru.