EMBOSS at CSC

EMBASSY: MIRA: emira

emira

Wiki

The master copies of the EMBOSS documentation are available on the EMBOSS Wiki at http://emboss.open-bio.org/wiki/Appdocs.

Please help by correcting and extending the Wiki pages.

Function

MIRA fragment assembly program

Description

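emira runs the MIRA fragment assembly program from within EMBOSS. It takes sequencing reads (in the sample session below, FASTA sequences with FASTA quality values and optional XML trace information) and assembles them into contigs. The results can be saved as CAF, FASTA, GAP4 directed assembly, phrap ACE, HTML, and plain-text files.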
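The sample session under Usage prints a "Contig statistics" block each time a contig is built. As a minimal sketch, assuming the screen output has been captured into a string, those blocks could be summarised as follows. The function name `parse_contig_stats` and the choice of fields are illustrative and are not part of emira or MIRA.

```python
import re

def parse_contig_stats(log_text):
    """Return one dict per 'Contig statistics' block found in log_text."""
    # Field labels are copied from the sample session output.
    fields = {
        "id": r"Contig id:\s*(\d+)",
        "length": r"Contig length:\s*(\d+)",
        "coverage": r"Avg\. contig coverage:\s*([\d.]+)",
        "reads": r"Num reads:\s*(\d+)",
    }
    contigs = []
    # A block starts at the "Contig statistics" banner and ends at the
    # next line made only of dashes.
    for block in re.findall(r"Contig statistics -+\n(.*?)\n-{10,}", log_text, re.S):
        entry = {}
        for key, pattern in fields.items():
            match = re.search(pattern, block)
            if match:
                value = match.group(1)
                entry[key] = float(value) if key == "coverage" else int(value)
        contigs.append(entry)
    return contigs
```

The same contig id can appear more than once in a log, because a contig may be rebuilt within a pass and every pass builds the contigs again. A caller who wants only the final state would keep the last entry for each id.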

Algorithm

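As the sample session below shows, emira passes its options to MIRA as quick switches and extended parameters. MIRA then loads the reads, screens them for possible overlaps with its SKIM algorithm, confirms candidate overlaps by Smith-Waterman alignment, and builds contigs over several passes (four with -genome accurate). After each contig is built, possibly misassembled repeats are marked, and a contig with grave misassemblies is rebuilt.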
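The sample session under Usage lists a "Bases per hash (bph)" setting of 16 for the SKIM step, which suggests a hash-based screen for subsequences shared between reads. The toy sketch below is not MIRA's implementation. It only illustrates the general idea of indexing fixed-length subsequences (k-mers) and counting how many each pair of reads has in common. The function names, the `min_shared` threshold, and the forward-strand-only comparison are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

def kmers(seq, k):
    """Yield every length-k subsequence of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def candidate_overlaps(reads, k=16, min_shared=1):
    """Return {(name_a, name_b): shared_kmer_count} for read pairs
    that share at least min_shared distinct k-mers (forward strand only)."""
    index = defaultdict(set)  # k-mer -> names of the reads that contain it
    for name, seq in reads.items():
        for kmer in kmers(seq.upper(), k):
            index[kmer].add(name)

    shared = defaultdict(int)
    for names in index.values():
        # Every pair of reads containing this k-mer gets one more shared hit.
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}
```

A real assembler would also compare reverse complements and would verify each candidate pair by alignment. The session log shows such a step as "Aligning possible forward matches" and "Aligning possible complement matches".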

Usage

Here is a sample session with emira


% emira -setparam fasta -project cjejuni_demo -genome accurate -mxti -rns tigr -orh 
MIRA fragment assembly program

This is MIRA V2.8.3 (production version).

Please cite: Chevreux, B., Wetter, T. and Suhai, S. (1999), Genome Sequence
Assembly Using Trace Signals and Additional Sequence Information.
Computer Science and Biology: Proceedings of the German Conference on
Bioinformatics (GCB) 99, pp. 45-56.

Mail questions, bug reports, ideas or suggestions to:
	bach@chevreux.org

Compiled in boundtracking mode.
Compiled in bugtracking mode.

Parsing parameters: -genomeaccurate -fasta -GE:project=cjejuni_demo -GE:mxti=yes -OUT:orh=yes -GE:rns=tigr

Using quickmode switch -genomeaccurate : 
	-GE:uti=yes
	-AS:mrl=40:nop=4:sep=yes:rbl=4:sd=yes:sdlpo=yes:ugpf=yes
	-DP:ure=yes:rewl=30:rewme=2:feip=0;leip=0:tpae=no
	-CL:pvc=yes:pvcmla=18:qc=no:mbc=no:emlc=yes:mlcr=25:smlc=30
	-SK:bph=16:hss=4:pr=45:mhpr=200
	-AL:bip=20:bmin=25:bmax=130:mo=15:ms=30:mrs=65:egp=yes:egpl=low
	-CO:rodirs=25:mr=yes:asir=no:mrpg=2:emea=25
	    amgb=yes:amgbemc=yes:amgbnbs=yes
	-ED:ace=no

Using quickmode switch fasta : -GE:lj=fasta



Parameters parsed without error, perfect.

Used parameter settings:
  General (-GE):
	Project name (pro)                      : cjejuni_demo
	Load job (lj)                           : FASTA file (fasta)
	Filecheck only (fo)                     : No
	External quality (eq)                   : from SCF (scf)
	    Ext. qual. override (eqo)           : No
	    Discard reads on e.q. error (droeqe): No
	Read naming scheme (rns)                : TIGR (tigr)
	Merge with XML trace info (mxti)        : Yes
	Use template information (uti)          : Yes

	EST-assembly start step (ess)           : 1

  Assembly options (-AS):
	Minimum read length (mrl)               : 40
	Number of passes (nop)                  : 4
	    Skim each pass (sep)                : Yes
	Maximum number of RMB break loops (rbl) : 4
	Spoiler detection (sd)                  : Yes
	    Last pass only (sdlpo)              : Yes
	Base default quality (bdq)              : Yes

	Use genomic pathfinder (ugpf)           : Yes

	Use emergency search stop (uess)        : Yes
	    ESS partner depth (esspd)           : 500
	Use emergency blacklist (uebl)          : Yes
	Use max. contig build time (umcbt)      : No
	    Build time in seconds (bts)         : 10000

  Strain and backbone options (-SB):
	Load straindata (lsd)                   : No
	Load backbone (lb)                      : No
	    Start backbone usage in pass (sbuip): 3
	    Backbone strain name (bsn)          : (none)
	    Backbone file type (bft)            : FASTA file (fasta)
	    Backbone rail length (brl)          : 2500
	    Backbone base quality (bbq)         : 0
	    Also build new contigs (abnc)       : Yes

  Dataprocessing options (-DP):
	Use read extensions (ure)               : Yes
	    Read extension window length (rewl) : 30
	    Read extension w. maxerrors (rewme) : 2
	    First extension in pass (feip)      : 0
	    Last extension in pass (leip)       : 0
	Tag poly A/T at ends (tpae)             : No
	    Polybase window length (pbwl)       : 7
	    Polybase window maxerrors (pbwme)   : 2
	    Polyb. window grace distance (pbwgc): 9

  Clipping options (-CL):
	Possible vector leftover clip (pvc) : Yes
	    maximum len allowed (pvcmla)    : 18
	Quality clip (qc)                   : No
	    Minimum quality (qcmq)          : 20
	    Window length (qcwl)            : 30
	Masked bases clip (mbc)             : No
	    Gap size (mbcgs)                : 20
	    Max front gap (mbcmfg)          : 40
	    Max end gap (mbcmeg)            : 60
	Ensure minimum left clip (emlc)     : Yes
	    Minimum left clip req. (mlcr)   : 25
	    Set minimum left clip to (smlc) : 30

  Parameters for SKIM algorithm (-SK):
	Bases per hash (bph)             : 16
	Hash save stepping (hss)         : 4
	Percent required (pr)            : 45
	Maximum hashes in memory (mhim)  : 15000000
	Max hits per read (mhpr)         : 200

  Align parameters for Smith-Waterman align (-AL):
	Bandwidth in percent (bip)         : 20
	Bandwidth max (bmax)               : 130
	Bandwidth min (bmin)               : 25
	Minimum score (ms)                 : 30
	Minimum overlap (mo)               : 15
	Minimum relative score in % (mrs)  : 65
	Extra gap penalty (egp)            : Yes
	    extra gap penalty level (egpl) : low
	    Max. egp in percent (megpp)    : 100

  Contig parameters (-CO):
	Name prefix (np)                                         : cjejuni_demo
	Error analysis (an)                                      : SCF signal (signal)
	Reject on drop in relative alignment score (%)           : 25
	Max. error rate in dangerous zones in % (dmer)           : 1
	Mark repeats (mr)                                        : Yes
	    Assume SNP instead of repeats (asir)                 : No
	    Minimum reads per group needed for tagging (mrpg)    : 2
	    Minimum neighbour quality needed for tagging (mnq)   : 20
	    Minimum Group Quality needed for RMB Tagging (mgqrt) : 30
	    End-read Marking Exclusion Area in bases (emea)      : 25
	    Also mark gap bases (amgb)                           : Yes
	        Also mark gap bases - even multicolumn (amgbemc) : Yes
	        Also mark gap bases - need both strands (amgbnbs): Yes
	Default template insert size minimum (dismin)            : 500
	Default template insert size maximum (dismax)            : 5000

  Edit options (-ED):
	Automatic contig editing (ace)        : No
	Strict editing mode (sem)             : No
	Confirmation threshold in percent (ct): 50

  Directories (-DI):
	When loading EXP   files: 
	When loading SCF   files: 
	For writing log files   : cjejuni_demo_log
	For writing gap4 DA res.: cjejuni_demo_out

  Input files (-FI):
	When loading EXP fofn                    : cjejuni_demo_in.fofn
	When loading project from PHD            : cjejuni_demo_in.phd.1
	When loading project from CAF            : cjejuni_demo_in.caf
	When loading sequences from FASTA        : cjejuni_demo_in.fasta
	When loading qualities from FASTA quality: cjejuni_demo_in.fasta.qual
	When loading straindata                  : cjejuni_demo_straindata_in.txt
	When loading XML trace info files        : cjejuni_demo_traceinfo_in.xml

	When loading backbone from CAF           : cjejuni_demo_backbone_in.caf
	When loading backbone from GenBank       : cjejuni_demo_backbone_in.gbf
	When loading backbone from FASTA         : cjejuni_demo_backbone_in.fasta

  Output files (-OUTPUT/-OUT):
    Result files:
	Saved as CAF                       (orc): Yes
	Saved as FASTA                     (orf): Yes
	Saved as GAP4 (directed assembly)  (org): Yes
	Saved as phrap ACE                 (ora): Yes
	Saved as HTML                      (orh): Yes
	Saved as Transposed Contig Summary (ors): Yes
	Saved as simple text format        (ort): Yes

    Temporary result files:
	Saved as CAF                      (otc): No
	Saved as FASTA                    (otf): No
	Saved as GAP4 (directed assembly) (otg): No
	Saved as phrap ACE                (ota): No
	Saved as HTML                     (oth): No
	Saved as Transposed Contig Summary(ots): No
	Saved as simple text format       (ott): No

    Extended temporary result files:
	Saved as CAF                      (oetc): No
	Saved as FASTA                    (oetf): No
	Saved as GAP4 (directed assembly) (oetg): No
	Saved as phrap ACE                (oeta): No
	Saved as HTML                     (oeth): No
	Save also singlets               (oetas): No

    Alignment output customisation:
	TEXT characters per line          (tcpl): 60
	HTML characters per line          (hcpl): 60
	TEXT characters per line         (tegfc): ' '
	HTML characters per line         (hegfc): ' '

    File / directory names:
	CAF             : cjejuni_demo_out.caf
	FASTA           : cjejuni_demo_out.unpadded.fasta
	FASTA quality   : cjejuni_demo_out.unpadded.fasta.qual
	FASTA (padded)  : cjejuni_demo_out.padded.fasta
	FASTA qual.(pad): cjejuni_demo_out.padded.fasta.qual
	GAP4 (directory): cjejuni_demo_out.gap4da
	ACE             : cjejuni_demo_out.ace
	HTML            : cjejuni_demo_out.html
	Simple text     : cjejuni_demo_out.txt
	TCS overview    : cjejuni_demo_out.tcs

Creating directory cjejuni_demo_log ... done.
Creating directory cjejuni_demo_results ... done.
Creating directory cjejuni_demo_info ... done.
Localtime: Thu Jul 15 12:00:00 2010

Loading data normal (probably Sanger type) from FASTA file cjejuni_demo_in.fasta
Counting sequences in FASTA file:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Loading sequence data from FASTA file:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Loading quality data from FASTA quality file:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Done.
There haven been 544 reads given, 544 of which have quality accounted for.
Localtime: Thu Jul 15 12:00:00 2010

Checking SCF files (loading qualities only if needed):
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Done.
0 SCF files loaded ok.
544 SCF files were not found (see 'cjejuni_demo_log/cjejuni_demo_info_scfreadfail.0' for a list of names).


Localtime: Thu Jul 15 12:00:00 2010

Merging data from XML trace info file cjejuni_demo_traceinfo_in.xml ...Num reads: 496
Building hash table ... done.
Localtime: Thu Jul 15 12:00:00 2010

Generated 288 unique template ids for 544 valid reads.
Done merging XML data, matched 496 reads.


Localtime: Thu Jul 15 12:00:00 2010

Checking SCF files (loading qualities only if needed):
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Done.
0 SCF files loaded ok.
544 SCF files were not found (see 'cjejuni_demo_log/cjejuni_demo_info_scfreadfail.0' for a list of names).


Starting minimum left vector clip ... done.
Pool has 544 reads .
Checking reads for trace data:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
No SCF data present in any read, automatic contig editing is now switched off.
544 reads with valid data for assembly.

For the reads that are neither backbones nor rails:
- 0 reads have not enough good bases for assembly.
- 544 reads used for assembly.
- 0 reads have no real quality (see miralog.noqualities).
- mean length of good parts of used reads: 626
Localtime: Thu Jul 15 12:00:00 2010

Generated 288 unique template ids for 544 valid reads.
Localtime: Thu Jul 15 12:00:00 2010

Generated 0 unique strain ids for 544 reads.

Localtime: Thu Jul 15 12:00:00 2010


Searching for possible overlaps:
Localtime: Thu Jul 15 12:00:00 2010
We will get 1 partitions.
Progressend: 1088
Now running partitioned skimmer with 1 partitions:

Working on partition 1/1
Will contain read IDs 0 to 543
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Total megahubs: 0

Skim summary:
	accepted: 4243
	possible: 4607
	permbans: 0

Hits chosen: 4243

Localtime: Thu Jul 15 12:00:00 2010

Pre-assembly alignment search for read extension and / or vector clipping:
Making alignments.
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible forward matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible complement matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Calculating possible vector leftovers ... done.

Loading confirmed overlaps from disk (will need approximately 1.2 M.):
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Sorting confirmed overlaps (this may take a while) ... done.

Generating clusters:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Pre-assembly read extension:

Localtime: Thu Jul 15 12:00:00 2010

Searching possible read extensions:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Changed length of 258 sequences.
Mean length gained in these sequences: 73.2713 bases.
Pre-assembly vector clipping

Performing vector clipping ... done.
Pool has 544 reads .
Checking reads for trace data:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
No SCF data present in any read, automatic contig editing is now switched off.
544 reads with valid data for assembly.

For the reads that are neither backbones nor rails:
- 0 reads have not enough good bases for assembly.
- 544 reads used for assembly.
- 0 reads have no real quality (see miralog.noqualities).
- mean length of good parts of used reads: 660
Localtime: Thu Jul 15 12:00:00 2010

Generated 288 unique template ids for 544 valid reads.
Localtime: Thu Jul 15 12:00:00 2010

Generated 0 unique strain ids for 544 reads.

Localtime: Thu Jul 15 12:00:00 2010


Searching for possible overlaps:
Localtime: Thu Jul 15 12:00:00 2010
We will get 1 partitions.
Progressend: 1088
Now running partitioned skimmer with 1 partitions:

Working on partition 1/1
Will contain read IDs 0 to 543
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Total megahubs: 0

Skim summary:
	accepted: 4512
	possible: 4913
	permbans: 0

Hits chosen: 4512

Localtime: Thu Jul 15 12:00:00 2010



Pass: 1
Making alignments.
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible forward matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible complement matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Calculating possible vector leftovers ... done.

Loading confirmed overlaps from disk (will need approximately 1.3 M.):
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Sorting confirmed overlaps (this may take a while) ... done.

Generating clusters:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Localtime: Thu Jul 15 12:00:00 2010

Building new contig 1
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 544
+[1] t+t++++a+aaaaar
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 1
Contig length: 2467
Avg. contig coverage: 2.36
Consensus contains:	A: 701	C: 457	G: 592	T: 690	N: 0
			IUPAC: 7	Funny: 0	*: 20

Num reads: 7
Avg. read length: 833
Reads contain 5780 bases, 0 Ns and 55 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 2
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 537
+[1] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[61] ++++++++++++++++++++++++++++++++++++++++++++++++++++++t+++++
[120] ++++++++++++++++++++++++++++++++++++++++++a+++a+++++++++++++
[178] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[238] ++++++++++++a+++++a+++++++++++++++++++++++++++++++++++++++++
[296] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[356] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[416] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[476] +++++a++++++++a+a++++++++++++++++a++++++++++++++++++++
RL1
[526] aaaThat's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 2
Contig length: 40028
Avg. contig coverage: 8.66
Consensus contains:	A: 13590	C: 5845	G: 6941	T: 13404	N: 0
			IUPAC: 24	Funny: 0	*: 224

Num reads: 526
Avg. read length: 659
Reads contain 343983 bases, 0 Ns and 2661 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found
 - 1 Strong RMB
 - 3 Weak RMB
 - 0 SNP
positions tagged.Transfering contig RMB permanent pair bans.
Transfering tags to readpool.

The previously assembled contig had grave misassemblies, rebuilding contig 2 now.
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 537
+[1] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[61] ++++++++++++++++++++++++++++++++++++++++++++++++++++++t+++++
[120] ++++++++++++++++++++++++++++++++++++++++++a+++a+++++++++++++
[178] +++++++++++++++++++++++++++++++++++p+++p++++++++++++++++++++
[236] +++++++++a+++++a++++++++++++++++++++++++++++++++++++++++++++
[294] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[354] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[414] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[474] +++++++++a++++p+a+p+++++++++a+++++a+++++++++++++++++++++
RL1
[524] aaapThat's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 2
Contig length: 40021
Avg. contig coverage: 8.62
Consensus contains:	A: 13590	C: 5845	G: 6951	T: 13404	N: 0
			IUPAC: 14	Funny: 0	*: 217

Num reads: 524
Avg. read length: 658
Reads contain 342555 bases, 0 Ns and 2577 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found
 - 0 Strong RMB
 - 3 Weak RMB
 - 0 SNP
positions tagged.Transfering contig RMB permanent pair bans.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 3
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 13
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 3
Contig length: 805
Avg. contig coverage: 1
Consensus contains:	A: 303	C: 146	G: 115	T: 241	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 805
Reads contain 805 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 4
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 12
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 4
Contig length: 788
Avg. contig coverage: 1
Consensus contains:	A: 285	C: 152	G: 124	T: 227	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 788
Reads contain 788 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 5
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 11
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 5
Contig length: 788
Avg. contig coverage: 1
Consensus contains:	A: 254	C: 118	G: 133	T: 283	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 788
Reads contain 788 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 6
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 10
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 6
Contig length: 786
Avg. contig coverage: 1
Consensus contains:	A: 281	C: 129	G: 138	T: 238	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 786
Reads contain 786 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 7
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 9
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 7
Contig length: 865
Avg. contig coverage: 1
Consensus contains:	A: 314	C: 149	G: 103	T: 299	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 865
Reads contain 865 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 8
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 8
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 8
Contig length: 963
Avg. contig coverage: 1
Consensus contains:	A: 215	C: 286	G: 205	T: 257	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 963
Reads contain 963 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 9
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 7
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 9
Contig length: 1052
Avg. contig coverage: 1
Consensus contains:	A: 308	C: 286	G: 166	T: 292	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 1052
Reads contain 1052 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 10
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 6
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 10
Contig length: 563
Avg. contig coverage: 1
Consensus contains:	A: 195	C: 71	G: 110	T: 187	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 563
Reads contain 563 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 11
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 5
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 11
Contig length: 893
Avg. contig coverage: 1
Consensus contains:	A: 251	C: 177	G: 136	T: 329	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 893
Reads contain 893 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 12
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 4
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 12
Contig length: 478
Avg. contig coverage: 1
Consensus contains:	A: 116	C: 160	G: 101	T: 101	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 478
Reads contain 478 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 13
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 3
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 13
Contig length: 869
Avg. contig coverage: 1
Consensus contains:	A: 286	C: 245	G: 93	T: 245	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 869
Reads contain 869 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 14
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 2
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 14
Contig length: 973
Avg. contig coverage: 1
Consensus contains:	A: 266	C: 228	G: 254	T: 225	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 973
Reads contain 973 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 15
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 1
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 15
Contig length: 972
Avg. contig coverage: 1
Consensus contains:	A: 284	C: 230	G: 123	T: 335	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 972
Reads contain 972 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Saving project statistics to file: cjejuni_demo_log/cjejuni_demo_info_contigstats_pass.1.txt
Saving read tag list to file: cjejuni_demo_log/cjejuni_demo_info_readtaglist.1.txt
Saving contig tag list to file: cjejuni_demo_log/cjejuni_demo_info_consensustaglist.1.txt
Saving project contig<->read list to file: cjejuni_demo_log/cjejuni_demo_info_contigreadlist_pass.1.txt


Pass: 2

Performing vector clipping ... done.
Pool has 544 reads .
Checking reads for trace data:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
No SCF data present in any read, automatic contig editing is now switched off.
544 reads with valid data for assembly.

For the reads that are neither backbones nor rails:
- 0 reads have not enough good bases for assembly.
- 544 reads used for assembly.
- 0 reads have no real quality (see miralog.noqualities).
- mean length of good parts of used reads: 660
Localtime: Thu Jul 15 12:00:00 2010

Generated 288 unique template ids for 544 valid reads.
Localtime: Thu Jul 15 12:00:00 2010

Generated 0 unique strain ids for 544 reads.

Localtime: Thu Jul 15 12:00:00 2010


Searching for possible overlaps:
Localtime: Thu Jul 15 12:00:00 2010
We will get 1 partitions.
Progressend: 1088
Now running partitioned skimmer with 1 partitions:

Working on partition 1/1
Will contain read IDs 0 to 543
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Total megahubs: 0

Skim summary:
	accepted: 4512
	possible: 4913
	permbans: 0

Hits chosen: 4512

Localtime: Thu Jul 15 12:00:00 2010

Making alignments.
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible forward matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible complement matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Calculating possible vector leftovers ... done.

Loading confirmed overlaps from disk (will need approximately 1.3 M.):
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Sorting confirmed overlaps (this may take a while) ... done.

Generating clusters:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Localtime: Thu Jul 15 12:00:00 2010

Building new contig 1
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 544
+[1] t+t++++a+aaaaar
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 1
Contig length: 2467
Avg. contig coverage: 2.36
Consensus contains:	A: 701	C: 457	G: 592	T: 690	N: 0
			IUPAC: 7	Funny: 0	*: 20

Num reads: 7
Avg. read length: 833
Reads contain 5780 bases, 0 Ns and 55 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 2
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 537
+[1] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[61] +++++++++++++++++++++++++++++++++++++++++++++++++++t++++++++
[120] ++++++++++++++++++++++++++++++++++++++a+a++++++aa+++++++++++
[176] +++++++++++++++++++++++++++++++++++++p++++++++++p+++++++++++
[234] ++++++++++a+++++a+++++++++++++++++++++++++++++++++++++++++++
[292] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[352] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[412] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[472] +++++++++++++++++p+++++++a+++a+++++++++++++++++++++++++
RL1
[524] aapaThat's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 2
Contig length: 40021
Avg. contig coverage: 8.62
Consensus contains:	A: 13590	C: 5845	G: 6951	T: 13404	N: 0
			IUPAC: 14	Funny: 0	*: 217

Num reads: 524
Avg. read length: 658
Reads contain 342548 bases, 0 Ns and 2577 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found
 - 0 Strong RMB
 - 3 Weak RMB
 - 0 SNP
positions tagged.Transfering contig RMB permanent pair bans.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 3
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 13
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 3
Contig length: 805
Avg. contig coverage: 1
Consensus contains:	A: 303	C: 146	G: 115	T: 241	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 805
Reads contain 805 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 4
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 12
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 4
Contig length: 788
Avg. contig coverage: 1
Consensus contains:	A: 285	C: 152	G: 124	T: 227	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 788
Reads contain 788 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 5
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 11
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 5
Contig length: 788
Avg. contig coverage: 1
Consensus contains:	A: 254	C: 118	G: 133	T: 283	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 788
Reads contain 788 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 6
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 10
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 6
Contig length: 786
Avg. contig coverage: 1
Consensus contains:	A: 281	C: 129	G: 138	T: 238	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 786
Reads contain 786 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 7
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 9
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 7
Contig length: 865
Avg. contig coverage: 1
Consensus contains:	A: 314	C: 149	G: 103	T: 299	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 865
Reads contain 865 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 8
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 8
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 8
Contig length: 963
Avg. contig coverage: 1
Consensus contains:	A: 215	C: 286	G: 205	T: 257	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 963
Reads contain 963 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 9
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 7
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 9
Contig length: 1052
Avg. contig coverage: 1
Consensus contains:	A: 308	C: 286	G: 166	T: 292	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 1052
Reads contain 1052 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 10
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 6
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 10
Contig length: 563
Avg. contig coverage: 1
Consensus contains:	A: 195	C: 71	G: 110	T: 187	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 563
Reads contain 563 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 11
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 5
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 11
Contig length: 893
Avg. contig coverage: 1
Consensus contains:	A: 251	C: 177	G: 136	T: 329	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 893
Reads contain 893 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 12
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 4
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 12
Contig length: 478
Avg. contig coverage: 1
Consensus contains:	A: 116	C: 160	G: 101	T: 101	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 478
Reads contain 478 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 13
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 3
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 13
Contig length: 869
Avg. contig coverage: 1
Consensus contains:	A: 286	C: 245	G: 93	T: 245	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 869
Reads contain 869 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 14
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 2
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 14
Contig length: 973
Avg. contig coverage: 1
Consensus contains:	A: 266	C: 228	G: 254	T: 225	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 973
Reads contain 973 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 15
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 1
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 15
Contig length: 972
Avg. contig coverage: 1
Consensus contains:	A: 284	C: 230	G: 123	T: 335	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 972
Reads contain 972 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Saving project statistics to file: cjejuni_demo_log/cjejuni_demo_info_contigstats_pass.2.txt
Saving read tag list to file: cjejuni_demo_log/cjejuni_demo_info_readtaglist.2.txt
Saving contig tag list to file: cjejuni_demo_log/cjejuni_demo_info_consensustaglist.2.txt
Saving project contig<->read list to file: cjejuni_demo_log/cjejuni_demo_info_contigreadlist_pass.2.txt


Pass: 3

Performing vector clipping ... done.
Pool has 544 reads .
Checking reads for trace data:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
No SCF data present in any read, automatic contig editing is now switched off.
544 reads with valid data for assembly.

For the reads that are neither backbones nor rails:
- 0 reads have not enough good bases for assembly.
- 544 reads used for assembly.
- 0 reads have no real quality (see miralog.noqualities).
- mean length of good parts of used reads: 660
Localtime: Thu Jul 15 12:00:00 2010

Generated 288 unique template ids for 544 valid reads.
Localtime: Thu Jul 15 12:00:00 2010

Generated 0 unique strain ids for 544 reads.

Localtime: Thu Jul 15 12:00:00 2010


Searching for possible overlaps:
Localtime: Thu Jul 15 12:00:00 2010
We will get 1 partitions.
Progressend: 1088
Now running partitioned skimmer with 1 partitions:

Working on partition 1/1
Will contain read IDs 0 to 543
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Total megahubs: 0

Skim summary:
	accepted: 4498
	possible: 4913
	permbans: 14

Hits chosen: 4498

Localtime: Thu Jul 15 12:00:00 2010

Making alignments.
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible forward matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible complement matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Calculating possible vector leftovers ... done.

Loading confirmed overlaps from disk (will need approximately 1.3 M.):
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Sorting confirmed overlaps (this may take a while) ... done.

Generating clusters:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Localtime: Thu Jul 15 12:00:00 2010

Building new contig 1
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 544
+[1] t+t++++a+aaaaar
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 1
Contig length: 2467
Avg. contig coverage: 2.36
Consensus contains:	A: 701	C: 457	G: 592	T: 690	N: 0
			IUPAC: 7	Funny: 0	*: 20

Num reads: 7
Avg. read length: 833
Reads contain 5780 bases, 0 Ns and 55 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 2
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 537
+[1] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[61] +++++++++++++++++++++++++++++++++++++++++++++++++++t++++++++
[120] ++++++++++++++++++++++++++++++++++++++a+a++++++aa+++++++++++
[176] +++++++++++++++++++++++++++++++++++++p++++++++++p+++++++++++
[234] ++++++++++a+++++a+++++++++++++++++++++++++++++++++++++++++++
[292] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[352] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[412] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[472] +++++++++++++++++p+++++++a+++a+++++++++++++++++++++++++
RL1
[524] aapaThat's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 2
Contig length: 40021
Avg. contig coverage: 8.62
Consensus contains:	A: 13590	C: 5845	G: 6951	T: 13404	N: 0
			IUPAC: 14	Funny: 0	*: 217

Num reads: 524
Avg. read length: 658
Reads contain 342548 bases, 0 Ns and 2577 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found
 - 0 Strong RMB
 - 3 Weak RMB
 - 0 SNP
positions tagged.Transfering contig RMB permanent pair bans.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 3
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 13
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 3
Contig length: 805
Avg. contig coverage: 1
Consensus contains:	A: 303	C: 146	G: 115	T: 241	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 805
Reads contain 805 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 4
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 12
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 4
Contig length: 788
Avg. contig coverage: 1
Consensus contains:	A: 285	C: 152	G: 124	T: 227	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 788
Reads contain 788 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 5
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 11
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 5
Contig length: 788
Avg. contig coverage: 1
Consensus contains:	A: 254	C: 118	G: 133	T: 283	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 788
Reads contain 788 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 6
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 10
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 6
Contig length: 786
Avg. contig coverage: 1
Consensus contains:	A: 281	C: 129	G: 138	T: 238	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 786
Reads contain 786 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 7
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 9
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 7
Contig length: 865
Avg. contig coverage: 1
Consensus contains:	A: 314	C: 149	G: 103	T: 299	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 865
Reads contain 865 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 8
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 8
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 8
Contig length: 963
Avg. contig coverage: 1
Consensus contains:	A: 215	C: 286	G: 205	T: 257	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 963
Reads contain 963 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 9
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 7
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 9
Contig length: 1052
Avg. contig coverage: 1
Consensus contains:	A: 308	C: 286	G: 166	T: 292	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 1052
Reads contain 1052 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 10
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 6
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 10
Contig length: 563
Avg. contig coverage: 1
Consensus contains:	A: 195	C: 71	G: 110	T: 187	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 563
Reads contain 563 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 11
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 5
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 11
Contig length: 893
Avg. contig coverage: 1
Consensus contains:	A: 251	C: 177	G: 136	T: 329	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 893
Reads contain 893 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 12
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 4
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 12
Contig length: 478
Avg. contig coverage: 1
Consensus contains:	A: 116	C: 160	G: 101	T: 101	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 478
Reads contain 478 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 13
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 3
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 13
Contig length: 869
Avg. contig coverage: 1
Consensus contains:	A: 286	C: 245	G: 93	T: 245	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 869
Reads contain 869 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 14
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 2
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 14
Contig length: 973
Avg. contig coverage: 1
Consensus contains:	A: 266	C: 228	G: 254	T: 225	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 973
Reads contain 973 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 15
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 1
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 15
Contig length: 972
Avg. contig coverage: 1
Consensus contains:	A: 284	C: 230	G: 123	T: 335	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 972
Reads contain 972 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Saving project statistics to file: cjejuni_demo_log/cjejuni_demo_info_contigstats_pass.3.txt
Saving read tag list to file: cjejuni_demo_log/cjejuni_demo_info_readtaglist.3.txt
Saving contig tag list to file: cjejuni_demo_log/cjejuni_demo_info_consensustaglist.3.txt
Saving project contig<->read list to file: cjejuni_demo_log/cjejuni_demo_info_contigreadlist_pass.3.txt


Localtime: Thu Jul 15 12:00:00 2010
Hunting contig join spoiler ... done.
Localtime: Thu Jul 15 12:00:00 2010


Pass: 4

Performing vector clipping ... done.
Pool has 544 reads .
Checking reads for trace data:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
No SCF data present in any read, automatic contig editing is now switched off.
544 reads with valid data for assembly.

For the reads that are neither backbones nor rails:
- 0 reads have not enough good bases for assembly.
- 544 reads used for assembly.
- 0 reads have no real quality (see miralog.noqualities).
- mean length of good parts of used reads: 660
Localtime: Thu Jul 15 12:00:00 2010

Generated 288 unique template ids for 544 valid reads.
Localtime: Thu Jul 15 12:00:00 2010

Generated 0 unique strain ids for 544 reads.

Localtime: Thu Jul 15 12:00:00 2010


Searching for possible overlaps:
Localtime: Thu Jul 15 12:00:00 2010
We will get 1 partitions.
Progressend: 1088
Now running partitioned skimmer with 1 partitions:

Working on partition 1/1
Will contain read IDs 0 to 543
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Total megahubs: 0

Skim summary:
	accepted: 4498
	possible: 4913
	permbans: 14

Hits chosen: 4498

Localtime: Thu Jul 15 12:00:00 2010

Making alignments.
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible forward matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Aligning possible complement matches:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 
Localtime: Thu Jul 15 12:00:00 2010

Calculating possible vector leftovers ... done.

Loading confirmed overlaps from disk (will need approximately 1.3 M.):
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Sorting confirmed overlaps (this may take a while) ... done.

Generating clusters:
 [0%] ....|.... [10%] ....|.... [20%] ....|.... [30%] ....|.... [40%] ....|.... [50%] ....|.... [60%] ....|.... [70%] ....|.... [80%] ....|.... [90%] ....|.... [100%] 

Localtime: Thu Jul 15 12:00:00 2010

Building new contig 1
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 544
+[1] t+t++++a+aaaaar
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 1
Contig length: 2467
Avg. contig coverage: 2.36
Consensus contains:	A: 701	C: 457	G: 592	T: 690	N: 0
			IUPAC: 7	Funny: 0	*: 20

Num reads: 7
Avg. read length: 833
Reads contain 5780 bases, 0 Ns and 55 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 2
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 537
+[1] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[61] +++++++++++++++++++++++++++++++++++++++++++++++++++t++++++++
[120] ++++++++++++++++++++++++++++++++++++++a+a++++++aa+++++++++++
[176] +++++++++++++++++++++++++++++++++++++p++++++++++p+++++++++++
[234] ++++++++++a+++++a+++++++++++++++++++++++++++++++++++++++++++
[292] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[352] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[412] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[472] +++++++++++++++++p+++++++a+++a+++++++++++++++++++++++++
RL1
[524] aapaThat's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 2
Contig length: 40021
Avg. contig coverage: 8.62
Consensus contains:	A: 13590	C: 5845	G: 6951	T: 13404	N: 0
			IUPAC: 14	Funny: 0	*: 217

Num reads: 524
Avg. read length: 658
Reads contain 342548 bases, 0 Ns and 2577 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Marking possibly misassembled repeats ...done. Found
 - 0 Strong RMB
 - 3 Weak RMB
 - 0 SNP
positions tagged.Transfering contig RMB permanent pair bans.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 3
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 13
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 3
Contig length: 805
Avg. contig coverage: 1
Consensus contains:	A: 303	C: 146	G: 115	T: 241	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 805
Reads contain 805 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 4
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 12
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 4
Contig length: 788
Avg. contig coverage: 1
Consensus contains:	A: 285	C: 152	G: 124	T: 227	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 788
Reads contain 788 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 5
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 11
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 5
Contig length: 788
Avg. contig coverage: 1
Consensus contains:	A: 254	C: 118	G: 133	T: 283	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 788
Reads contain 788 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 6
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 10
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 6
Contig length: 786
Avg. contig coverage: 1
Consensus contains:	A: 281	C: 129	G: 138	T: 238	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 786
Reads contain 786 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 7
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 9
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 7
Contig length: 865
Avg. contig coverage: 1
Consensus contains:	A: 314	C: 149	G: 103	T: 299	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 865
Reads contain 865 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 8
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 8
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 8
Contig length: 963
Avg. contig coverage: 1
Consensus contains:	A: 215	C: 286	G: 205	T: 257	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 963
Reads contain 963 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 9
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 7
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 9
Contig length: 1052
Avg. contig coverage: 1
Consensus contains:	A: 308	C: 286	G: 166	T: 292	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 1052
Reads contain 1052 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 10
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 6
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 10
Contig length: 563
Avg. contig coverage: 1
Consensus contains:	A: 195	C: 71	G: 110	T: 187	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 563
Reads contain 563 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 11
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 5
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 11
Contig length: 893
Avg. contig coverage: 1
Consensus contains:	A: 251	C: 177	G: 136	T: 329	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 893
Reads contain 893 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 12
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 4
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 12
Contig length: 478
Avg. contig coverage: 1
Consensus contains:	A: 116	C: 160	G: 101	T: 101	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 478
Reads contain 478 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 13
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 3
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 13
Contig length: 869
Avg. contig coverage: 1
Consensus contains:	A: 286	C: 245	G: 93	T: 245	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 869
Reads contain 869 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 14
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 2
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 14
Contig length: 973
Avg. contig coverage: 1
Consensus contains:	A: 266	C: 228	G: 254	T: 225	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 973
Reads contain 973 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Building new contig 15
Localtime: Thu Jul 15 12:00:00 2010
Unused reads: 1
+
RL1
That's it for this contig.


Finished building the contig.
Localtime: Thu Jul 15 12:00:00 2010

-------------- Contig statistics ----------------
Contig id: 15
Contig length: 972
Avg. contig coverage: 1
Consensus contains:	A: 284	C: 230	G: 123	T: 335	N: 0
			IUPAC: 0	Funny: 0	*: 0

Num reads: 1
Avg. read length: 972
Reads contain 972 bases, 0 Ns and 0 gaps.
-------------------------------------------------
Localtime: Thu Jul 15 12:00:00 2010
Saving of extra temporary singlets disabled.
Marking possibly misassembled repeats ...done. Found none.
Transfering reads to readpool.
Localtime: Thu Jul 15 12:00:00 2010

Saving project statistics to file: cjejuni_demo_log/cjejuni_demo_info_contigstats_pass.4.txt
Saving read tag list to file: cjejuni_demo_log/cjejuni_demo_info_readtaglist.4.txt
Saving contig tag list to file: cjejuni_demo_log/cjejuni_demo_info_consensustaglist.4.txt
Saving project contig<->read list to file: cjejuni_demo_log/cjejuni_demo_info_contigreadlist_pass.4.txt



Assembly finished, saving final results.


Localtime: Thu Jul 15 12:00:00 2010
Saving project statistics to file: cjejuni_demo_info/cjejuni_demo_info_contigstats.txt
Localtime: Thu Jul 15 12:00:00 2010
Saving read tag list to file: cjejuni_demo_info/cjejuni_demo_info_readtaglist.txt
Localtime: Thu Jul 15 12:00:00 2010
Saving contig tag list to file: cjejuni_demo_info/cjejuni_demo_info_consensustaglist.txt
Localtime: Thu Jul 15 12:00:00 2010
Saving project contig<->read list to file: cjejuni_demo_info/cjejuni_demo_info_contigreadlist.txt
Localtime: Thu Jul 15 12:00:00 2010
Saving contigs to file: cjejuni_demo_results/cjejuni_demo_out.caf
Localtime: Thu Jul 15 12:00:00 2010
Saving contigs to directory: cjejuni_demo_results/cjejuni_demo_out.gap4da
(first deleting old directory)
(now creating new directory)
(saving contigs)
Done.
Localtime: Thu Jul 15 12:00:00 2010
Saving contigs to FASTA file: cjejuni_demo_results/cjejuni_demo_out.unpadded.fasta
Saving padded contigs to FASTA file: cjejuni_demo_results/cjejuni_demo_out.padded.fasta
Saving contig qualities to FASTA quality file: cjejuni_demo_results/cjejuni_demo_out.unpadded.fasta.qual
Saving padded contig qualities to FASTA quality file: cjejuni_demo_results/cjejuni_demo_out.padded.fasta.qual
Localtime: Thu Jul 15 12:00:00 2010
Saving contigs TCS to file: cjejuni_demo_results/cjejuni_demo_out.tcs
Localtime: Thu Jul 15 12:00:00 2010
Saving SNP analysis to file: cjejuni_demo_info/cjejuni_demo_info_snpanalysis.txt
Saving contigs to file: cjejuni_demo_results/cjejuni_demo_out.txt
Localtime: Thu Jul 15 12:00:00 2010
Saving contigs to file: cjejuni_demo_results/cjejuni_demo_out.ace
Localtime: Thu Jul 15 12:00:00 2010
Saving contigs to file: cjejuni_demo_results/cjejuni_demo_out.html
Localtime: Thu Jul 15 12:00:00 2010


End of assembly process, thank you for using MIRA.
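
The contig statistics blocks printed in the log above follow a fixed layout, so they can be collected with a short script. The following is a minimal Python sketch and is not part of EMBOSS or MIRA. The function name parse_contig_stats is hypothetical, the sample text is abridged from contigs 4 and 5 above, and other MIRA versions may format these blocks differently.

```python
import re

# Fields of interest inside a "Contig statistics" block (labels as printed
# in the sample log above; other MIRA versions may differ).
FIELD_RE = re.compile(
    r"^(Contig id|Contig length|Avg\. contig coverage|Num reads|"
    r"Avg\. read length):\s*(\S+)"
)


def parse_contig_stats(log_text):
    """Return one dict per 'Contig statistics' block found in log_text."""
    contigs = []
    current = None  # dict being filled while inside a statistics block
    for raw in log_text.splitlines():
        line = raw.strip()
        if "Contig statistics" in line:
            current = {}
            contigs.append(current)
        elif current is not None and line and set(line) == {"-"}:
            current = None  # a line of dashes closes the block
        elif current is not None:
            match = FIELD_RE.match(line)
            if match:
                key, value = match.groups()
                current[key] = float(value) if "." in value else int(value)
    return contigs


SAMPLE = """\
-------------- Contig statistics ----------------
Contig id: 4
Contig length: 788
Avg. contig coverage: 1
Num reads: 1
Avg. read length: 788
-------------------------------------------------
-------------- Contig statistics ----------------
Contig id: 5
Contig length: 788
Avg. contig coverage: 1
Num reads: 1
Avg. read length: 788
-------------------------------------------------
"""

for contig in parse_contig_stats(SAMPLE):
    print(contig["Contig id"], contig["Contig length"], contig["Num reads"])
```

Each block becomes a plain dictionary keyed by the label printed in the log, which keeps the sketch independent of any particular downstream tool.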

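
The "Saving ... to file:" lines near the end of the log name every result file and directory that was written. As a rough Python sketch (the helper name list_saved_outputs is hypothetical, and the regular expression assumes the exact wording of this particular log), the paths can be pulled out like this:

```python
import re

# Matches e.g. "Saving contigs to FASTA file: some/path" and
# "Saving contigs to directory: some/path". The wording is taken from the
# sample log; it is an assumption that other MIRA versions phrase it the same.
SAVE_RE = re.compile(r"^Saving (.+?) to (?:\w+ )*(file|directory): (\S+)$")


def list_saved_outputs(log_text):
    """Return (description, kind, path) tuples for each matching log line."""
    outputs = []
    for raw in log_text.splitlines():
        match = SAVE_RE.match(raw.strip())
        if match:
            outputs.append(match.groups())
    return outputs


SAMPLE = """\
Saving of extra temporary singlets disabled.
Saving contigs to file: cjejuni_demo_results/cjejuni_demo_out.caf
Saving contigs to directory: cjejuni_demo_results/cjejuni_demo_out.gap4da
Saving contigs to FASTA file: cjejuni_demo_results/cjejuni_demo_out.unpadded.fasta
Saving SNP analysis to file: cjejuni_demo_info/cjejuni_demo_info_snpanalysis.txt
"""

for description, kind, path in list_saved_outputs(SAMPLE):
    print(f"{kind:9} {path}  ({description})")
```

Lines such as "Saving of extra temporary singlets disabled." do not match the pattern and are skipped.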

Command line arguments

MIRA fragment assembly program
Version: EMBOSS:6.3.0

   Standard (Mandatory) qualifiers:
   -project            string     [mira] Default is mira. Defines the project
                                  name for this assembly. The project name
                                  automatically influences the name of input
                                  and output files or directories. E.g. in the
                                  default setting, the file names for the
                                  output of the assembly in FASTA format would
                                  be mira_out.fasta and mira_out.fasta.qual.
                                  Setting the project name to 'MyProject'
                                  would generate MyProject_out.fasta and
                                  MyProject_out.fasta.qual. (Any string)

   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers:
   -paramsfile         infile     Loads parameters from the filename given.
                                  Allows a maximum of 10 levels of recursion,
                                  i.e. a -params option appearing within a
                                  file that loads other parameter files
   -setparam           menu       [unspecified] Sets parameters suited for
                                  loading sequences from FASTA, PHD or CAF
                                  files. The default is not to specify the
                                  type of input file. (Values: unspecified
                                  (Unspecified); fasta (Fasta); phd (PHD); caf
                                  (CAF))
   -expdir             directory  [.] Defines the directory where mira should
                                  search for experiment files (EXP).
   -scfdir             directory  [.] Defines the directory where mira should
                                  search for SCF files
   -feifile            infile     [mira_in.fofn] Defines the file of filenames
                                  where the names of the EXP files of a
                                  project are located.
   -fpifile            infile     [mira_in.fofn] Defines the file of filenames
                                  where the names of the PHD files of a
                                  project are located.
   -pifile             infile     [mira_in.phd] Defines the PHD file to load
                                  sequences of a project from.
   -faifile            infile     [mira_in.fasta] Defines the FASTA file to
                                  load sequences of a project from.
   -fqifile            infile     [mira_in.fasta.qual] Defines the fasta file
                                  to load base qualities of a project from.
                                  The order of reads in the quality file does
                                  not need to be the same as in the fasta or
                                  fofn projects, although it saves a bit of
                                  time if it is.
   -cifile             infile     [mira_in.caf] Defines the file to load a CAF
                                  project from. Filename must end with
                                  '.caf'.
   -sdifile            infile     [mira_straindata_in.txt] Defines the file to
                                  load straindata from. Only used in EST
                                  projects (miraEST).
   -xtiifile           infile     [mira_xmltraceinfo_in.xml] Defines the file
                                  to load a trace info file in XML format
                                  from. This can be used both when merging XML
                                  data to loaded files or when loading a
                                  project from an XML trace info file.
   -genome             menu       [normal] Quality grades of de-novo genome
                                  assembly. Draft is quick-and-dirty, suited
                                  to get a first look at approximate coverage
                                  of a running project. Should not be used for
                                  anything else. Normal is the default
                                  parameter set of mira that is able to tackle
                                  most genomes. A bit slower than the draft
                                  version, but includes such options as read
                                  extension and vector remnant clipping.
                                  Accurate is still slower than the normal
                                  mode but should be used for genomes that
                                  pose a problem to the normal mode. (Values:
                                  draft (Draft); normal (Normal); accurate
                                  (Accurate))
   -mapping            menu       [normal] Works like the -genome switch,
                                  except that it is to be used when performing
                                  mapping assemblies against given backbone
                                  sequences. (Values: draft (Draft); normal
                                  (Normal); accurate (Accurate))
   -clipping           menu       [medium] Three clipping grade modifiers,
                                  from light clipping when working with well
                                  preprocessed sequences to heavy clipping
                                  when the sequences that are being assembled
                                  had only sloppy or no preprocessing. Note 1
                                  - the light version is already included in
                                  the -genome and -mapping switches. Note 2 -
                                  it is recommended that you perform a
                                  thorough preprocessing (clipping sequencing
                                  vector stretches, clipping of low quality
                                  bases, tagging standard repeats etc.) before
                                  assembling sequences. The clipping routines
                                  of mira are more optimised to cope with the
                                  last remnants of wrongly preprocessed
                                  sequences than with sequences having had no
                                  pre-processing at all. (Values: light
                                  (Light); medium (Medium); heavy (Heavy))
   -highlyrepetitive   boolean    [N] A modifier switch for genome data that
                                  is deemed to be highly repetitive. The
                                  assemblies will run slower due to more
                                  iterative cycles that give mira a chance to
                                  resolve nasty repeats.
   -highqualitydata    boolean    [N] A modifier switch when the sequences
                                  that are used are of exceptional quality.
                                  mira will then bump up a few quality
                                  parameters which should lead to fewer false
                                  positives in the repeat and SNP detection
                                  routines.
   -estmode            boolean    [N] Switches mira to a good initial preset
                                  for assembling EST data. Note that this is
                                  not needed (and even counterproductive) when
                                  used with miraEST.
   -horrid             boolean    [N] Sets a number of parameters useful when
                                  dealing with really horrid data sets. Useful
                                  means that parameters are chosen so that
                                  time and memory consumption do not explode
                                  beyond all hope of the program returning.
                                  Note that MIRA will in most cases return
                                  useful assemblies with this switch, but
                                  these might not be as optimised as with
                                  normal operation. The definition of 'horrid'
                                  is a bit flexible, for example, (a) a
                                  genomic project with more than 2,000 reads
                                  that all seem to align partly to each other
                                  but have different repetitive structures or
                                  (b) EST clusters with a few thousand almost
                                  identical reads.
   -borg               boolean    [N] Sets several parameters to have mira try
                                  to assemble as many reads as possible. Will
                                  probably slow down the assembly process and
                                  use more memory. 'We are MIRA of borg. You
                                  will be assembled, resistance is futile!'
   -lj                 menu       [fofnexp] Defines whether to load and
                                  assemble EXP files from a file of filenames
                                  ('mira_in.fofn'), load and assemble FASTA
                                  sequences ('mira_in.fasta') and their
                                  qualities ('mira_in.fasta.qual'), load and
                                  assemble sequences or qualities from a phd
                                  file ('mira_in.phd') or to load a project
                                  from a CAF file ('mira_in.caf') and assemble
                                  or eventually reassemble it. N.B. fofnphd
                                  is not currently available. (Values: fofnexp
                                  (EXP files from a file of filenames); fasta
                                  (Load and assemble FASTA); caf (Load and
                                  assemble CAF); phd (Load and assemble PHD);
                                  fofnphd (PHD files from a file of
                                  filenames))
   -fo                 boolean    [N] If set to 'Y', the project will not be
                                  assembled and no assembly output files will
                                  be produced. Instead, the project files will
                                  only be loaded. This switch is useful for
                                  checking consistency of input files.
   -mxti               boolean    [N] Some file formats above (FASTA, PHD or
                                  even CAF and EXP) possibly don't contain all
                                  the info necessary or useful for each read
                                  of an assembly. Should additional
                                  information, such as clipping positions,
                                  be available in an XML trace info file
                                  in NCBI format (see File formats), then set
                                  this option to 'Y' and it will be merged to
                                  the data loaded. Please note, quality
                                  clippings given here will override quality
                                  clippings loaded earlier or performed by
                                  mira. Minimum clippings will still be made
                                  by the program, though.
   -rns                menu       [sanger] Defines the centre naming scheme
                                  for read suffixes. Currently, only Sanger
                                  Institute and TIGR naming schemes are
                                  supported out of the box. How to choose?
                                  Please read the documentation available at
                                  the different centres or ask your sequence
                                  provider. In a nutshell, the Sanger scheme
                                  is
                                  'somename.[pqsfrw][12][bckdeflmnpt][a|b|c|...'
                                  (e.g. U13a08f10.p1ca), TIGR scheme is
                                  'somenameTF*|TR*|TA*' (e.g. GCPBN02TF or
                                  GCPDL68TABRPT103A58B). (Values: sanger
                                  (Sanger); tigr (TIGR))
   -eq                 menu       [SCF] Defines the source format for reading
                                  qualities from external sources. Normally
                                  takes effect only when these are not present
                                  in the format of the load_job project (EXP
                                  and FASTA can have them, CAF and PHD must
                                  have them). (Values: none (None); SCF (SCF))
   -eqo                boolean    [N] Only takes effect when 'lj' is fofnexp.
                                  Defines whether or not the qualities from
                                  the external source override the possibly
                                  loaded qualities from the load job project.
                                  This might be of use in case some
                                  post-processing software fiddles around with
                                  the quality values of the input file but
                                  one wants to have the original ones.
   -[no]droeqe         boolean    [Y] Defines whether a read is excluded from
                                  the assembly when there is a major mismatch
                                  between the external quality source and the
                                  sequence (e.g. the base sequence read from
                                  an SCF file does not match the originally
                                  read base sequence). If the read is kept, it
                                  uses the qualities it had before trying to
                                  load the external qualities (either default
                                  qualities or the ones loaded from the
                                  original source).
   -[no]uti            boolean    [Y] Two reads sequenced from the same clone
                                  template form a read pair with a known
                                  minimum and maximum distance. This feature
                                  will definitely help with contigs
                                  containing lots of repeats. Set this to 'Y'
                                  if your data contains information on insert
                                  sizes. Information on insert sizes can be
                                  given via the SI tag in EXP files (for each
                                  read pair individually), or for the whole
                                  project using dismin and dismax.
   -ess                integer    [1] Controls the starting step of the EST
                                  assembly and is therefore only useful in
                                  miraEST. EST assembly is a three-step
                                  process, each with different settings to the
                                  assembly engine, with the result of each
                                  step being saved to disk. If results of
                                  previous steps are present in a directory,
                                  one can easily 'play around' with different
                                  settings for subsequent steps by reusing the
                                  results of the previous steps and directly
                                  starting with step two or three. (Integer
                                  from 1 to 4)
   -[no]ps             boolean    [Y] Controls whether date and time are
                                  printed out during the assembly. Suppressing
                                  it isn't useful in normal operation, only
                                  when debugging or benchmarking.
   -lsd                boolean    [N] Straindata is a key value file, one read
                                  per line. First the name of the read, then
                                  the strain name of the organism the read
                                  comes from. It is used by the program to
                                  differentiate different types of SNPs
                                  appearing in organisms and classifying them.
   -lb                 boolean    [N] A backbone is a sequence (or a previous
                                  assembly) that is used as a template for the
                                  current assembly. The current assembly
                                  process will first assemble reads to loaded
                                  backbone contigs before creating new
                                  contigs. This feature is helpful for
                                  assembling against previous (and possibly
                                  already edited) assembly iterations, or to
                                  make a comparative assembly of two very
                                  closely related organisms. Please read 'very
                                  closely related' as in 'only SNP mutations
                                  or short indels present'.
   -sbuip              integer    [3] When assembling against backbones, this
                                  parameter defines the pass iteration (see
                                  nop) from which the backbones will actually
                                  be used. In the passes preceding this
                                  number, the non-backbone reads will be
                                  assembled together as if no backbones
                                  existed. This allows mira to correctly spot
                                  repetitive stretches that differ by single
                                  bases and tag them accordingly. Rule of
                                  thumb - if backbones belong to the same
                                  strain as the reads to assemble, set to 1.
                                  If backbones are a different strain, then
                                  set sbuip to 1 lower than nop (example - nop
                                  4 and sbuip 3). (Integer 1 or more)
   -bsn                string     Defines the name of the strain that the
                                  backbone sequences have. (Any string)
   -bft                menu       [fasta] Defines the filetype of the backbone
                                  file given. Currently (2.8.1) only FASTA,
                                  CAF and GBF files are supported. When GBF
                                  (GenBank files, also named .gbk) files are
                                  loaded, the features within these files are
                                  automatically transformed into
                                  Staden-compatible tags and get passed
                                  through the assembly. (Values: fasta
                                  (Fasta); caf (CAF); gbf (GenBank))
   -brl                integer    [2500] Parameter for the internal sectioning
                                  size of the backbone. Extremely repetitive
                                  sequences may require reducing the default
                                  value, but the default value should work
                                  well in 99.9% of all cases. (Integer from
                                  1000 to 3000)
   -bbq                integer    [-1] Defines the default quality that the
                                  backbone sequences have if they came without
                                  quality values in their files (like in GBF
                                  format or when FASTA is used without .qual
                                  files). A value of -1 causes mira to use the
                                  same default quality for backbones as for
                                  reads. (Integer from -1 to 100)
   -[no]abnc           boolean    [Y] The standard mode of the assembler is to
                                  assemble available reads to a backbone and
                                  make new contigs with the remaining reads.
                                  If this option is set to 'N', the reads that
                                  cannot be assembled into existing contigs
                                  are put as singlets into the assembly, not
                                  forming new contigs.
   -mrl                integer    [40] Minimum length that reads must have to
                                  be considered for the assembly. Shorter
                                  sequences will be filtered out at the
                                  beginning of the process and won't be
                                  present in the final project. (Integer 20 or
                                  more)
   -nop                integer    [3] Defines how many iterations of the whole
                                  assembly process are done. Rule of thumb -
                                  for quick and dirty assembly use 1 (not
                                  recommended). For assembly using read
                                  extensions and / or automatic contig editing
                                  (-ure and -ace) use at least 2. The
                                  recommended setting is 3 or higher, as some
                                  knowledge generated by the assembler can be
                                  used only from the third iteration on. More
                                  than 3 passes might be useful for projects
                                  containing many repetitive elements. See
                                  also -rbl and -mr for parameters that affect
                                  the assembly and disentanglement of
                                  possible repeats. (Integer 1 or more)
   -[no]sep            boolean    [Y] Defines whether the skim algorithm (and
                                  with it also the recalculation of
                                  Smith-Waterman alignments) is called in
                                  between each main pass. If set to 'N',
                                  skimming is done only when needed by the
                                  workflow, either when read extensions are
                                  searched for (-ure) or when possible vector
                                  leftovers are to be clipped (-pvc). Setting
                                  this option to 'Y' is highly recommended,
                                  setting it to 'N' is only for quick and
                                  dirty assemblies.
   -rbl                integer    [2] Defines the maximum number of times a
                                  contig can be rebuilt during main assembly
                                  passes (-nop) if misassemblies, due to
                                  possible repeats, are found. (Integer 1 or
                                  more)
   -[no]sd             boolean    [Y] Default is 'Y' for mira and 'N' for
                                  miraEST. A spoiler is either a chimeric
                                  read or a read with long parts of
                                  unclipped vector sequence still included
                                  (too long for the -pvc vector leftover
                                  clipping routines). A spoiler typically
                                  prevents contigs being joined; MIRA will
                                  cut spoilers back so that they do no
                                  more harm to the assembly. Recommended
                                  for mid-to-high coverage genomic
                                  assemblies; not recommended for
                                  assemblies of ESTs, as splice variants
                                  might be lost. At least two assembly
                                  passes (-nop) must be run for this
                                  option to take effect.
   -[no]sdlpo          boolean    [Y] Defines whether the spoiler detection
                                  algorithms are run only for the last pass or
                                  for all passes (-nop). Takes effect only if
                                  spoiler detection (-sd) is on.
   -bdq                integer    [10] Defines the default base quality of
                                  reads that have no quality read from a file.
                                  (Integer 0 or more)
   -[no]ugpf           boolean    [Y] MIRA has two different pathfinder
                                  algorithms it chooses from to find its way
                                  through the (more or less) complete set of
                                  possible sequence overlaps; a genomic and an
                                  EST pathfinder. The genomic looks a bit
                                  into the future of the assembly and tries to
                                  stay on safe grounds using a maximum of
                                  information already present in the contig
                                  that is being built. The EST version, on the
                                  contrary, will directly jump at the complex
                                  cases posed by very similar repetitive
                                  sequences and try to solve those first; it
                                  is willing to fall down to brute force when
                                  really bad cases (such as coverage with
                                  thousands of sequences) are encountered.
                                  Generally, the genomic pathfinder will also
                                  work quite well with EST sequences (but
                                  might get slowed down a lot in pathological
                                  cases), while the EST algorithm does not
                                  work so well on genomes. If in doubt,
                                  leave as 'Y' for genome projects and set to
                                  'N' for EST projects.
   -[no]uess           boolean    [Y] Another important switch if you plan to
                                  assemble non-normalised EST libraries, where
                                  some ESTs may reach coverages of several
                                  hundreds or thousands of reads. This switch
                                  lets MIRA save a lot of computational time
                                  when aligning those extremely high coverage
                                  areas (but only there), at the expense of
                                  some accuracy.
   -esspd              integer    [500] Defines the number of potential
                                  partners a read must have before MIRA
                                  switches into emergency search stop mode
                                  for that read. (Integer 1 or more)
   -umcbt              boolean    [N] Defines whether there is an upper limit
                                  of time to be used to build one contig. Set
                                  this to 'Y' in EST assemblies where you
                                  think that extremely high coverages occur.
                                  Less useful for assembly of genomic
                                  sequences.
   -bts                integer    [10000] Depending on -umcbt above, this
                                  number defines the time in seconds allotted
                                  to building one contig. (Integer 1 or more)
   -[no]ure            boolean    [Y] Defines whether MIRA uses its read
                                  extension routines. These analyse
                                  Smith-Waterman alignments between reads
                                  (see -rewl and -rewme) and can bring back
                                  bases that were clipped away (for example
                                  by -qc) when there is enough evidence
                                  supporting their correctness.
   -rewl               integer    [30] Only takes effect when -ure is set to
                                  'Y'. The read extension routines use a
                                  sliding window approach on Smith-Waterman
                                  alignments. This parameter defines the
                                  window length. (Integer 1 or more)
   -rewme              integer    [2] Only takes effect when -ure is set to
                                  'Y'. The read extension routines use a
                                  sliding window approach on Smith-Waterman
                                  alignments. This parameter defines the
                                  maximum number of errors
                                  (disagreements) between two alignments in
                                  the given window. (Integer 1 or more)
   -feip               integer    [0] Only takes effect when -ure is set to
                                  'Y'. The read extension routines can be
                                  called before assembly and/or after each
                                  assembly pass (see -nop). This parameter
                                  defines the first pass in which the read
                                  extension routines are called. The default
                                  of 0 tells mira to extend the reads the
                                  first time before the first assembly pass.
                                  (Integer 0 or more)
   -leip               integer    [0] Only takes effect when -ure is set to
                                  'Y'. The read extension routines can be
                                  called before assembly and/or after each
                                  assembly pass (see -nop). This parameter
                                  defines the last pass in which the read
                                  extension routines are called. The default
                                  of 0 tells mira to extend the reads the last
                                  time before the first assembly pass.
                                  (Integer 0 or more)
   -tpae               boolean    [N] This option is useful in EST assembly.
                                  Poly-AT stretches at the end of reads that
                                  were not correctly masked or clipped in
                                  pre-processing steps from external programs
                                  get tagged here. The assembler will not use
                                  these stretches for critical operations.
                                  Additionally, the tags do provide a good
                                  visual anchor when looking at the assembly
                                  with different programs.
   -pbwl               integer    [7] Only takes effect when -tpae is set to
                                  'Y'. Defines the window length within which
                                  all bases (except the maximum number of
                                  errors allowed) must be either A or T to be
                                  considered a polybase stretch. (Integer 1 or
                                  more)
   -pbwme              integer    [2] Only takes effect when -tpae is set to
                                  'Y'. Defines the maximum number of errors
                                  allowed in a given window length such that a
                                  stretch is considered to be a polybase
                                  stretch. The distribution of these errors is
                                  not important. (Integer 1 or more)
   -pbwgd              integer    [9] Only takes effect when -tpae is set to
                                  'Y'. Defines the number of bases from the
                                  end of a sequence (if masked, from the end
                                  of the masked area) within which a polybase
                                  stretch is looked for without finding one.
                                  (Integer 1 or more)
   -[no]pvc            boolean    [Y] Mira will try to identify possible
                                  sequencing vector relicts present at the
                                  start of a sequence and clip them away.
                                  These relicts are usually a few bases long
                                  and were not correctly removed from the
                                  sequence in data pre-processing steps of
                                  external programs. You might want to turn
                                  off this option if you know (or think) that
                                  your data contains a lot of repeats and the
                                  option below to fine tune the clipping
                                  behaviour does not give the expected
                                  results.
   -pvcmla             integer    [18] The clipping of possible vector relicts
                                  option works quite well. Unfortunately the
                                  bounds of repeats or differences in EST
                                  splice variants sometimes show the same
                                  alignment behaviour as possible sequencing
                                  vector relicts and could therefore also be
                                  clipped. To stop the vector clipping from
                                  mistakenly clipping repetitive regions or
                                  EST splice variants, this option puts an
                                  upper bound to the number of bases a
                                  potential clip is allowed to have. If the
                                  number of bases is below or equal to this
                                  threshold then the bases are clipped. If the
                                  number of bases exceeds the threshold then
                                  the clip is NOT performed. Setting the value
                                  to 0 turns off the threshold i.e. clips are
                                  then always performed if a potential vector
                                  is found. (Integer 0 or more)
   -qc                 boolean    [N] Default is 'N', but is automatically set
                                  to 'Y' when using the setparam options
                                  'fasta' or 'phd' (can be turned off again by
                                  subsequent options afterwards). This will
                                  let mira perform its own quality clipping
                                  before sequences are entered into the
                                  assembly. The clip function performed is a
                                  sequence end window quality clip with back
                                  iteration to get a maximum number of bases
                                  as useful sequence. Note that the bases
                                  clipped away here can still be used
                                  afterwards if there is enough evidence
                                  supporting their correctness when the option
                                  -ure is turned on.
   -qcmq               integer    [20] This is the minimum quality required of
                                  bases in a window in order to be accepted.
                                  Please be cautious and don't use extreme
                                  values here, because then the clipping will
                                  be too lax or too harsh. Values below 15 and
                                  higher than 35 are disallowed. (Integer
                                  from 15 to 35)
   -qcwl               integer    [30] This is the length of a window in bases
                                  for the quality clip. (Integer 10 or more)
   -[no]mbc            boolean    [Y] This will let mira perform a 'clipping'
                                  of bases that were masked out (replaced with
                                  the character X). It is generally not a
                                  good idea to use mask bases to remove
                                  unwanted portions of a sequence; the EXP
                                  file format and the NCBI traceinfo format
                                  have excellent possibilities to circumvent
                                  this. But because a lot of pre-processing
                                  software is built around cross_match,
                                  scylla- and phrap-style base masking, the
                                  need arose for mira to be able to handle
                                  this too. mira will look at the start and
                                  end of each sequence to see whether there
                                  are masked bases that should be 'clipped'.
   -mbcgs              integer    [20] While performing the clip of masked
                                  bases, mira will look if it can merge larger
                                  chunks of masked bases that are a maximum
                                  of -mbcgs apart. (Integer 0 or more)
   -mbcmfg             integer    [40] While performing the clip of masked
                                  bases at the start of a sequence, mira will
                                  allow up to this number of unmasked bases in
                                  front of a masked stretch. (Integer 0 or
                                  more)
   -mbcmeg             integer    [60] While performing the clip of masked
                                  bases at the end of a sequence, mira will
                                  allow up to this number of unmasked bases
                                  behind a masked stretch. (Integer 0 or more)
   -[no]emlc           boolean    [Y] If on, ensures a minimum left clip on
                                  each read according to the parameters in
                                  -mlcr & -smlc
   -mlcr               integer    [25] If -emlc is 'Y', checks whether there
                                  is a left clip whose length is at least the
                                  size specified here. (Integer 0 or more)
   -smlc               integer    [30] If -emlc is 'Y' and the actual left
                                  clip is < -mlcr, then set the left clip of
                                  read to the value given here. (Integer 0 or
                                  more)
   -bph                integer    [14] Default is 14 on 32 bit systems and 16
                                  on 64 bit systems. Controls the number of
                                  consecutive bases n which are used as a word
                                  hash. The higher the value the faster the
                                  search. The lower the value the more weak
                                  matches are found. Values below 10 are not
                                  recommended. (Integer 1 or more)
   -hss                integer    [4] This is a parameter controlling the
                                  stepping increments with which hashes are
                                  generated. This allows for a more
                                  fine-grained search as matches are now found
                                  with at least n+s (see -bph) equal bases
                                  instead of the SSAHA 2n. The higher the
                                  value the faster the search. The lower the
                                  value the more weak matches are found.
                                  (Integer 1 or more)
   -pr                 integer    [50] Controls the relative percentage of
                                  exact word matches in an approximate overlap
                                  that has to be reached to accept the
                                  overlap as a possible match. Increasing this
                                  number will decrease the number of possible
                                  alignments that have to be checked by
                                  Smith-Waterman later on in the assembly, but
                                  it might also lead to the rejection of
                                  weaker overlaps (i.e. overlaps that contain
                                  a higher number of mismatches). (Integer 1
                                  or more)
   -mhpr               integer    [200] Controls the maximum number of
                                  possible hits one read can transport to
                                  the Smith-Waterman alignment
                                  phase. If more potential hits are found,
                                  only the best ones are taken. This is an
                                  important option for tackling projects that
                                  contain extreme assembly conditions. For
                                  example, 5000 reads that are all very
                                  similar would generate around 40 to 50
                                  million possible alignments (forward and
                                  reverse complement). Setting this parameter
                                  to 200 reduces the number of alignments to
                                  check to around 1.5-2 million. As the
                                  assembly increases in passes (-nop),
                                  different combinations of possible hits will
                                  be checked, always the probably best ones
                                  first. So the accuracy of the assembly
                                  should only suffer when lowering this number
                                  too much. (Integer 1 or more)
   -bip                integer    [15] The banded Smith-Waterman alignment
                                  uses this percentage number to compute the
                                  bandwidth it has to use when computing the
                                  alignment matrix. E.g. expected overlap is
                                  150 bases, bip=10 -> the banded SW will
                                  compute a band of 15 bases to each side of
                                  the expected alignment diagonal, thus
                                  allowing up to 15 unbalanced inserts /
                                  deletes in the alignment. Increasing this
                                  number will find more non-optimal
                                  alignments but will also increase SW
                                  runtime (growth between linear and
                                  quadratic); decreasing it works the other
                                  way round (it might miss a few bad
                                  alignments but gains speed). (Integer
                                  from 1 to 100)
   -bmin               integer    [25] Minimum bandwidth in bases to each
                                  side. (Integer 1 or more)
   -bmax               integer    [50] Maximum bandwidth in bases to each
                                  side. (Integer 1 or more)
   -mo                 integer    [15] Minimum number of overlapping bases
                                  needed in an alignment of two sequences to
                                  be accepted. (Integer 1 or more)
   -ms                 integer    [15] Describes the minimum score of an
                                  overlap to be taken into account for
                                  assembly. mira uses a default scoring scheme
                                  for SW align. Each match counts 1, a match
                                  with an N counts 0, each mismatch with a
                                  non-N base -1 and each gap -2. Use a bigger
                                  score to weed out a number of chance
                                  matches, a lower score to perhaps find the
                                  single (short) alignment that might join two
                                  contigs together (at the expense of
                                  computing time and memory). (Integer 1 or
                                  more)
   -mrs                integer    [65] Describes the min percentage of
                                  matching between two reads to be considered
                                  for assembly. Increasing this number will
                                  save memory but one might lose possible
                                  alignments. A maximum of 80 is probably
                                  sensible here. Decreasing below 55 will
                                  probably make memory and time consumption
                                  explode. (Integer from 1 to 100)
   -egp                boolean    [N] Defines whether or not to increase
                                  penalties applied to alignments containing
                                  long gaps. Setting this to 'Y' might help in
                                  projects with frequent repeats. On the
                                  other hand, it is definitely disturbing
                                  when assembling very long reads containing
                                  multiple long indels in the called base
                                  sequence ... although this should not happen
                                  in the first place and is a sure sign of
                                  problems lying ahead. When in doubt, set it
                                  to 'Y' for EST projects and de-novo genome
                                  assembly, set it to 'N' for assembly of
                                  closely related strains (assembly against a
                                  backbone). When set to 'N', it is
                                  recommended to have -amgb and -amgbemc both
                                  set to 'Y'.
   -egpl               menu       [low] Has no effect if the extra gap
                                  penalty (-egp) is off. Defines an extra
                                  penalty applied to
                                  'long' gaps. There are these predefined
                                  levels - 1. low - use this if you expect
                                  your base caller frequently misses two or
                                  more bases. 2. medium - use this if your
                                  base caller is expected to frequently miss
                                  one to two bases. 3. high - use this if your
                                  base caller does not frequently miss more
                                  than one base. For some stages of the EST
                                  assembly process, a special value 'est' is
                                  used. (Values: low (Low); medium (Medium);
                                  high (High); est (EST split splices))
   -megpp              integer    [100] Has no effect if the extra gap
                                  penalty (-egp) is off. Defines the
                                  maximum extra penalty in
                                  percent applied to 'long' gaps. (Integer
                                  from 1 to 100)
   -np                 string     [mira] Contigs will have this string
                                  prepended to their names. (Any string)
   -an                 menu       [signal] When adding reads to a contig,
                                  dangerous regions can get an extra integrity
                                  check. none = no extra check. text = check
                                  is only text-based. signal = check is signal
                                  based, if the SCF trace is not available,
                                  fallback is 'text'. For the time being, only
                                  regions tagged as ALUS or REPT in the
                                  experiment file are considered dangerous.
                                  (Values: none (None); text (Text); signal
                                  (Signal))
   -rodirs             integer    [15] When adding reads to a contig, reject
                                  the reads if the drop in the quality of the
                                  consensus is > the given value in %. Lower
                                  values mean stricter checking. This value is
                                  doubled should a read be entered that has a
                                  template partner (a read pair) at the right
                                  distance. (Integer from 1 to 100)
   -dmer               integer    [1] When adding reads to a contig, reject
                                  the reads if the error in zones known as
                                  dangerous exceeds the given value in %.
                                  Lower values mean stricter checking in these
                                  danger zones. For the time being, only
                                  regions tagged as ALUS or REPT in the
                                  experiment file are considered dangerous.
                                  (Integer from 1 to 100)
   -[no]mr             boolean    [Y] One of the most important switches in
                                  MIRA. If set to 'Y', MIRA will try to
                                  resolve misassemblies due to repeats by
                                  identifying single base stretch differences
                                  and tag those critical bases as RMB (Repeat
                                  Marker Base, weak or strong). This switch is
                                  also needed when MIRA is run in EST mode to
                                  identify possible inter-, intra- and
                                  intra-and-interorganism SNPs.
   -asir               boolean    [N] Only takes effect when -mr is set to
                                  'Y'; the effect also depends on whether
                                  strain data (see -lsd) is present or
                                  not. Usually, mira will mark bases that
                                  differentiate between repeats, when a
                                  conflict occurs between reads that belong to
                                  one strain. If the conflict occurs between
                                  reads belonging to different strains they
                                  are marked as SNP. However, if this switch
                                  is set to 'Y', then conflicts within a
                                  strain are also marked as SNP. This switch
                                  is mainly used in assemblies of ESTs; it
                                  should not be set for genomic assembly.
   -mrpg               integer    [2] Only takes effect when -mr is set to
                                  'Y'. This defines the minimum number of
                                  reads in a group that are needed for the RMB
                                  (Repeat Marker Bases) or SNP detection
                                  routines to be triggered. A group is defined
                                  by the reads carrying the same nucleotide
                                  for a given position, i.e., an assembly with
                                  mrpg=2 will need at least two times two
                                  reads with the same nucleotide (having at
                                  least a quality as defined in -mgqrt) to be
                                  recognised as repeat marker or a SNP.
                                  Setting this to a low number increases
                                  sensitivity, but might produce a few false
                                  positives, resulting in reads being thrown
                                  out of contigs because of falsely identified
                                  possible repeat markers (or wrongly
                                  recognised as SNP). (Integer 2 or more)
   -mgqrt              integer    [30] Only takes effect when -mr is set to
                                  'Y'. This defines the minimum quality of a
                                  group of bases to be taken into account as
                                  potential repeat marker. The lower the
                                  number, the more sensitive you get, but
                                  lowering below 25 is not recommended as a
                                  lot of wrongly called bases can have a
                                  quality approaching this value and you'd end
                                  up with a lot of false positives. The
                                  higher the overall coverage of your project
                                  the better, and the higher you can set this
                                  number. A value of 35 will probably remove
                                  all false positives, a value of 40 will
                                  probably never show false positives.
                                  (Integer 25 or more)
   -emea               integer    [15] Only takes effect when -mr is set to
                                  'Y'. Using the end of sequences of Sanger
                                  type shotgun sequencing is always a bit
                                  risky, as wrongly called bases tend to crowd
                                  there or some sequencing vector relicts
                                  hang around. It is even more risky to use
                                  these stretches for detecting possible
                                  repeats, so one can define an exclusion area
                                  where the bases are not used when
                                  determining whether a mismatch is due to
                                  repeats or not. (Integer 0 or more)
   -[no]amgb           boolean    [Y] Determines whether columns containing
                                  gap bases (indels) are also tagged.
   -[no]amgbemc        boolean    [Y] Only takes effect when -amgb is set to
                                  'Y'. Determines whether multiple columns
                                  containing gap bases (indels) are also
                                  tagged.
   -[no]amgbnbs        boolean    [Y] Only takes effect when -amgb is set to
                                  'Y'. Determines whether, for tagging
                                  columns containing gap bases, both strands
                                  need to have a gap. Setting this to 'N' is
                                  not recommended except when working in
                                  desperately low coverage situations.
   -dismin             integer    [500] The minimum distance that read pairs
                                  may be apart. There is an additional error
                                  margin of 10% subtracted from this value
                                  during internal computations. (Integer 0 or
                                  more)
   -dismax             integer    [5000] The maximum distance that read pairs
                                  may be apart. There is an additional error
                                  margin of 10% added to this value during
                                  internal computations. (Integer 0 or more)
   -ace                boolean    [N] Once contigs have been built, mira can
                                  call a built-in version of the automatic
                                  contig editor EdIt. EdIt will try to resolve
                                  discrepancies in the contig by performing
                                  trace analysis and correct even hard to
                                  resolve errors. This option is always
                                  useful, but especially in conjunction with
                                  -nop and -ure. Notice: the current
                                  development version has a memory leak in the
                                  editor, therefore the option is not
                                  automatically turned on.
   -[no]sem            boolean    [Y] If set to 'Y' the automatic editor will
                                  not take error hypotheses with a low
                                  probability into account, even if all the
                                  requirements to make an edit are fulfilled.
   -ct                 integer    [50] The higher this value, the more strict
                                  the automatic editor will apply its internal
                                  rule set. Going below 40 is not
                                  recommended. (Integer from 1 to 100)
   -[no]orc            boolean    [Y] Output CAF results
   -[no]org            boolean    [Y] Output GAP4 results
   -[no]orf            boolean    [Y] Output FASTA results
   -ora                boolean    [N] Output ACE results
   -[no]ort            boolean    [Y] Output TXT results
   -[no]ors            boolean    [Y] Output TCS results
   -orh                boolean    [N] Output HTML results
   -otc                boolean    [N] Output temporary CAF results
   -otg                boolean    [N] Output temporary GAP4 results
   -otf                boolean    [N] Output temporary FASTA results
   -ota                boolean    [N] Output temporary ACE results
   -ott                boolean    [N] Output temporary TXT results
   -ots                boolean    [N] Output temporary TCS results
   -oth                boolean    [N] Output temporary HTML results
   -oetc               boolean    [N] Output extra temporary CAF results
   -oetg               boolean    [N] Output extra temporary GAP4 results
   -oetf               boolean    [N] Output extra temporary FASTA results
   -oeta               boolean    [N] Output extra temporary ACE results
   -oett               boolean    [N] Output extra temporary TXT results
   -oeth               boolean    [N] Output extra temporary HTML results
   -tcpl               integer    [60] When producing an output in text format
                                  (-ort|ott|oett), this parameter defines how
                                  many bases each line of an alignment should
                                  contain. (Integer 1 or more)
   -hcpl               integer    [60] When producing an output in HTML format
                                  (-orh|oth|oeth), this parameter defines how
                                  many bases each line of an alignment should
                                  contain. (Integer 1 or more)
   -gapfda             string     [gap4da] Defines the extension of the
                                  directory where mira will write the result
                                  of an assembly ready to import into the
                                  Staden package (GAP4) in Direct Assembly
                                  format. The name of the directory will then
                                  be _. (Any string)
   -log                string     [miralog] Defines the directory where mira
                                  will write some log files to. Note that the
                                  name of the actual project will be
                                  prepended. (Any string)
   -co                 string     [mira_out.caf] Defines the file in CAF
                                  format to save an assembled project to.
                                  Filename must end with '.caf'. (Any string)

   Associated qualifiers:

   "-expdir" associated qualifiers
   -extension          string     Default file extension

   "-scfdir" associated qualifiers
   -extension          string     Default file extension

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
   -version            boolean    Report version number and exit
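
The 10% error margins described for -dismin and -dismax can be illustrated with a short calculation. This is only a sketch in Python: the function name is hypothetical, and it assumes the margin is taken as a plain percentage of each value, which the help text suggests but does not spell out.

```python
def effective_insert_range(dismin=500, dismax=5000, margin_percent=10):
    """Return (lower, upper) read-pair distance bounds after the error margin.

    Assumption: the margin is a simple percentage of each value; any rounding
    done internally by mira is not documented here.
    """
    lower = dismin * (100 - margin_percent) / 100  # margin subtracted from -dismin
    upper = dismax * (100 + margin_percent) / 100  # margin added to -dismax
    return lower, upper


if __name__ == "__main__":
    # With the defaults listed above (500 and 5000)
    print(effective_insert_range())
```

Under these assumptions, the defaults give an accepted distance range of 450 to 5500 bases.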

Qualifier

Type

Description

Allowed values

Default

Standard (Mandatory) qualifiers

-project

string

Default is mira. Defines the project name for this assembly. The project name automatically influences the name of input and output files or directories. E.g. in the default setting, the file names for the output of the assembly in FASTA format would be mira_out.fasta and mira_out.fasta.qual. Setting the project name to 'MyProject' would generate MyProject_out.fasta and MyProject_out.fasta.qual.

Any string

mira
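
As a small illustration of how the project name feeds into file names, the following Python sketch builds the two FASTA output names mentioned in the description. The helper name is hypothetical, and only the FASTA case given in the text is covered.

```python
def fasta_output_names(project="mira"):
    """Build the FASTA output file names derived from a project name.

    Follows the pattern given in the -project description:
    <project>_out.fasta and <project>_out.fasta.qual
    """
    fasta = f"{project}_out.fasta"
    return fasta, fasta + ".qual"


if __name__ == "__main__":
    print(fasta_output_names())             # default project name
    print(fasta_output_names("MyProject"))  # the example from the description
```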

Additional (Optional) qualifiers

(none)

Advanced (Unprompted) qualifiers

-paramsfile

infile

Loads parameters from the filename given. Allows a maximum of 10 levels of recursion, i.e. a -params option appearing within a file that loads other parameter files

Input file

Required

-setparam

list

Sets parameters suited for loading sequences from FASTA, PHD or CAF files. The default is not to specify the type of input file.

unspecified	(Unspecified)
fasta	(Fasta)
phd	(PHD)
caf	(CAF)

unspecified

-expdir

Directory

-feifile

infile

Defines the file of filenames where the names of the EXP files of a project are located.

Input file

mira_in.fofn

-fpifile

infile

Defines the file of filenames where the names of the PHD files of a project are located.

Input file

mira_in.fofn

-pifile

infile

Defines the PHD file to load sequences of a project from.

Input file

mira_in.phd

-faifile

infile

Defines the FASTA file to load sequences of a project from.

Input file

mira_in.fasta

-fqifile

infile

Defines the fasta file to load base qualities of a project from. The order of reads in the quality file does not need to be the same as in the fasta or fofn projects (although it saves a bit of time if it is).

Input file

mira_in.fasta.qual

-cifile

infile

Defines the file to load a CAF project from. Filename must end with '.caf'.

Input file

mira_in.caf

-sdifile

infile

Defines the file to load straindata from. Only used in EST projects (miraEST).

Input file

mira_straindata_in.txt

-xtiifile

infile

Defines the file to load a trace info file in XML format from. This can be used both when merging XML data to loaded files or when loading a project from an XML trace info file.

Input file

mira_xmltraceinfo_in.xml

-genome

list

Quality grades of de-novo genome assembly. Draft is quick-and-dirty, suited to get a first look at the approximate coverage of a running project. It should not be used for anything else. Normal is the default parameter set of mira that is able to tackle most genomes. A bit slower than the draft version, but includes such options as read extension and vector remnant clipping. Accurate is slower still than the normal mode but should be used for genomes that pose a problem to the normal mode.

draft	(Draft)
normal	(Normal)
accurate	(Accurate)

normal

-mapping

list

Works like the -genome switch, except that it is to be used when performing mapping assemblies against given backbone sequences.

draft	(Draft)
normal	(Normal)
accurate	(Accurate)

normal

-clipping

list

Three clipping grade modifiers, from light clipping when working with well preprocessed sequences to heavy clipping when the sequences that are being assembled had only sloppy or no preprocessing. Note 1 - the light version is already included in the -genome and -mapping switches. Note 2 - it is recommended that you perform a thorough preprocessing (clipping sequencing vector stretches, clipping of low quality bases, tagging standard repeats etc.) before assembling sequences. The clipping routines of mira are more optimised to cope with the last remnants of wrongly preprocessed sequences than with sequences having had no pre-processing at all.

light	(Light)
medium	(Medium)
heavy	(Heavy)

medium

-highlyrepetitive

boolean

A modifier switch for genome data that is deemed to be highly repetitive. The assemblies will run slower due to more iterative cycles that give mira a chance to resolve nasty repeats.

Boolean value Yes/No

-highqualitydata

boolean

A modifier switch for when the sequences that are used are of exceptional quality. mira will then bump up a few quality parameters, which should lead to fewer false positives in the repeat and SNP detection routines.

Boolean value Yes/No

-estmode

boolean

Switches mira to a good initial preset for assembling EST data. Note that this is not needed (and even counterproductive) when used with miraEST.

Boolean value Yes/No

-horrid

boolean

Sets a number of parameters useful when dealing with really horrid data sets. Useful means that parameters are chosen so that time and memory consumption do not explode beyond all hope of the program returning. Note that MIRA will in most cases return useful assemblies with this switch, but these might not be as optimised as with normal operation. The definition of 'horrid' is a bit flexible, for example, (a) a genomic project with more than 2000 reads that all seem to align partly to each other but have different repetitive structures or (b) EST clusters with a few thousand almost similar reads.

Boolean value Yes/No

-borg

boolean

Sets several parameters to have mira try to assemble as many reads as possible. Will probably slow down the assembly process and use more memory. 'We are MIRA of borg. You will be assembled, resistance is futile!'

Boolean value Yes/No

-lj

list

Defines whether to load and assemble EXP files from a file of filenames ('mira_in.fofn'), load and assemble FASTA sequences ('mira_in.fasta') and their qualities ('mira_in.fasta.qual'), load and assemble sequences or qualities from a phd file ('mira_in.phd') or to load a project from a CAF file ('mira_in.caf') and assemble or eventually reassemble it. N.B. fofnphd is not currently available.

fofnexp	(EXP files from a file of filenames)
fasta	(Load and assemble FASTA)
caf	(Load and assemble CAF)
phd	(Load and assemble PHD)
fofnphd	(PHD files from a file of filenames)

fofnexp

-fo

boolean

If set to 'Y', the project will not be assembled and no assembly output files will be produced. Instead, the project files will only be loaded. This switch is useful for checking consistency of input files.

Boolean value Yes/No

-mxti

boolean

Some file formats above (FASTA, PHD or even CAF and EXP) possibly don't contain all the info necessary or useful for each read of an assembly. Should additional information, such as clipping positions etc., be available in an XML trace info file in NCBI format (see File formats), then set this option to 'Y' and it will be merged into the data loaded. Please note, quality clippings given here will override quality clippings loaded earlier or performed by mira. Minimum clippings will still be made by the program, though.

Boolean value Yes/No

-rns

list

Defines the centre naming scheme for read suffixes. Currently, only Sanger Institute and TIGR naming schemes are supported out of the box. How to choose? Please read the documentation available at the different centres or ask your sequence provider. In a nutshell, the Sanger scheme is 'somename.[pqsfrw][12][bckdeflmnpt][a|b|c|...' (e.g. U13a08f10.p1ca), TIGR scheme is 'somenameTF*|TR*|TA*' (e.g. GCPBN02TF or GCPDL68TABRPT103A58B).

sanger	(Sanger)
tigr	(TIGR)

sanger

-eq

list

Defines the source format for reading qualities from external sources. Normally takes effect only when these are not present in the format of the load_job project (EXP and FASTA can have them, CAF and PHD must have them).

none	(None)
SCF	(SCF)

SCF

-eqo

boolean

Only takes effect when 'lj' is fofnexp. Defines whether or not the qualities from the external source override the possibly loaded qualities from the load job project. This might be of use in case some post-processing software fiddles around with the quality values of the input file but one wants to have the original ones.

Boolean value Yes/No

-[no]droeqe

boolean

Defines whether a read is excluded from the assembly when there is a major mismatch between the external quality source and the sequence (e.g. the base sequence read from an SCF file does not match the originally read base sequence). If the read is not excluded, it keeps the qualities it had before the attempt to load the external qualities (either default qualities or the ones loaded from the original source).

Boolean value Yes/No

Yes

-[no]uti

boolean

Two reads sequenced from the same clone template form a read pair with a known minimum and maximum distance. This feature will definitely help for contigs containing lots of repeats. Set this to 'Y' if your data contains information on insert sizes. Information on insert sizes can be given via the SI tag in EXP files (for each read pair individually), or for the whole project using -dismin and -dismax.

Boolean value Yes/No

Yes

-ess

integer

Controls the starting step of the EST assembly and is therefore only useful in miraEST. EST assembly is a three step process, each step with different settings for the assembly engine, and with the result of each step being saved to disk. If results of previous steps are present in a directory, one can easily 'play around' with different settings for subsequent steps by reusing the results of the previous steps and directly starting with step two or three.

Integer from 1 to 4

-[no]ps

boolean

Controls whether date and time are printed out during the assembly. Suppressing it isn't useful in normal operation, only when debugging or benchmarking.

Boolean value Yes/No

Yes

-lsd

boolean

Straindata is a key value file, one read per line. First the name of the read, then the strain name of the organism the read comes from. It is used by the program to differentiate different types of SNPs appearing in organisms and classifying them.

Boolean value Yes/No
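
The straindata layout described above (one read per line, read name first, strain name second) can be sketched with a few lines of Python. This is an illustration only: the function name and the read and strain names are made up, and the sketch assumes the two fields are separated by whitespace, which the description does not state explicitly.

```python
def parse_straindata(text):
    """Parse straindata text into a {read_name: strain_name} dictionary.

    Assumption: fields are separated by whitespace; blank lines are skipped.
    """
    strains = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # First field is the read name, the remainder is the strain name
        read_name, strain = line.split(None, 1)
        strains[read_name] = strain
    return strains


if __name__ == "__main__":
    # Hypothetical read and strain names, for illustration only
    example = "read001 strainA\nread002 strainB\n\nread003 strainA\n"
    print(parse_straindata(example))
```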

-lb

boolean

A backbone is a sequence (or a previous assembly) that is used as a template for the current assembly. The current assembly process will first assemble reads to loaded backbone contigs before creating new contigs. This feature is helpful for assembling against previous (and already possibly edited) assembly iterations, or to make a comparative assembly of two very closely related organisms. Please read 'very closely related' as in 'only SNP mutations or short indels present'.

Boolean value Yes/No

-sbuip

integer

When assembling against backbones, this parameter defines the pass iteration (see -nop) from which on the backbones will be really used. In the passes preceding this number, the non-backbone reads will be assembled together as if no backbones existed. This allows mira to correctly spot repetitive stretches that differ by single bases and tag them accordingly. Rule of thumb - if backbones belong to the same strain as the reads to assemble, set to 1. If backbones are a different strain, then set -sbuip to 1 lower than -nop (example - nop 4 and sbuip 3).

Integer 1 or more

-bsn

string

Defines the name of the strain that the backbone sequences have.

Any string

-bft

list

Defines the filetype of the backbone file given. Currently (2.8.1) only FASTA, CAF and GBF files are supported. When GBF (GenBank files, also named .gbk) files are loaded, the features within these files are automatically transformed into Staden-compatible tags and get passed through the assembly.

fasta	(Fasta)
caf	(CAF)
gbf	(GenBank)

fasta

-brl

integer

Parameter for the internal sectioning size of the backbone. Extremely repetitive sequences may require reducing the default value, but the default value should work well in 99.9% of all cases.

Integer from 1000 to 3000

2500

-bbq

integer

Defines the default quality that the backbone sequences have if they came without quality values in their files (like in GBF format or when FASTA is used without .qual files). A value of -1 causes mira to use the same default quality for backbones as for reads.

Integer from -1 to 100

-1

-[no]abnc

boolean

The standard mode of the assembler is to assemble available reads to a backbone and make new contigs with the remaining reads. If this option is set to 'N', the reads that cannot be assembled into existing contigs are put as singlets into the assembly, not forming new contigs.

Boolean value Yes/No

Yes

-mrl

integer

Minimum length that reads must have to be considered for the assembly. Shorter sequences will be filtered out at the beginning of the process and won't be present in the final project.

Integer 20 or more

-nop

integer

Defines how many iterations of the whole assembly process are done. Rule of thumb - for quick and dirty assembly use 1 (not recommended). For assembly using read extensions and / or automatic contig editing (-ure and -ace) use at least 2. The recommended setting is 3 or higher, as some knowledge generated by the assembler can be used only from the third iteration on. More than 3 passes might be useful for projects containing many repetitive elements. See also -rbl and -mr for parameters that affect the assembly and disentanglement of possible repeats.

Integer 1 or more

-[no]sep

boolean

Defines whether the skim algorithm (and with it also the recalculation of Smith-Waterman alignments) is called in between each main pass. If set to 'N', skimming is done only when needed by the workflow, either when read extensions are searched for (-ure) or when possible vector leftovers are to be clipped (-pvc). Setting this option to 'Y' is highly recommended, setting it to 'N' is only for quick and dirty assemblies.

Boolean value Yes/No

Yes

-rbl

integer

Defines the maximum number of times a contig can be rebuilt during main assembly passes (-nop) if misassemblies, due to possible repeats, are found.

Integer 1 or more

-[no]sd

boolean

Default is 'Y' for mira and 'N' for miraEST. A spoiler can be either a chimeric read or a read with long parts of unclipped vector sequence still included (too long for the -pvc vector leftover clipping routines). A spoiler typically prevents contigs being joined; MIRA will cut spoilers back so that they no longer harm the assembly. Recommended for mid-to-high coverage genomic assemblies; not recommended for assemblies of ESTs, as splice variants might be lost. A minimum of two assembly passes (-nop) must be run for this option to take effect.

Boolean value Yes/No

Yes

-[no]sdlpo

boolean

Defines whether the spoiler detection algorithms are run only for the last pass or for all passes (-nop). Takes effect only if spoiler detection (-sd) is on.

Boolean value Yes/No

Yes

-bdq

integer

Defines the default base quality of reads that have no quality read from a file.

Integer 0 or more

-[no]ugpf

boolean

MIRA has two different pathfinder algorithms it chooses from to find its way through the (more or less) complete set of possible sequence overlaps; a genomic and an EST pathfinder. The genomic pathfinder looks a bit into the future of the assembly and tries to stay on safe ground, using a maximum of the information already present in the contig that is being built. The EST version, on the contrary, will directly jump at the complex cases posed by very similar repetitive sequences and try to solve those first; it is willing to fall back to brute force when really bad cases (such as coverage with thousands of sequences) are encountered. Generally, the genomic pathfinder will also work quite well with EST sequences (but might get slowed down a lot in pathological cases), while the EST algorithm does not work so well on genomes. If in doubt, leave as 'Y' for genome projects and set to 'N' for EST projects.

Boolean value Yes/No

Yes

-[no]uess

boolean

Another important switch if you plan to assemble non-normalised EST libraries, where some ESTs may reach coverages of several hundreds or thousands of reads. This switch lets MIRA save a lot of computational time when aligning those extremely high coverage areas (but only there), at the expense of some accuracy.

Boolean value Yes/No

Yes

-esspd

integer

Defines the number of potential partners a read must have for MIRA switching into emergency search stop mode for that read.

Integer 1 or more

500

-umcbt

boolean

Defines whether there is an upper limit of time to be used to build one contig. Set this to 'Y' in EST assemblies where you think that extremely high coverages occur. Less useful for assembly of genomic sequences.

Boolean value Yes/No

-bts

integer

Depending on -umcbt above, this number defines the time in seconds allotted to building one contig.

Integer 1 or more

10000

-[no]ure

boolean

Boolean value Yes/No

Yes

-rewl

integer

Only takes effect when -ure is set to 'Y'. The read extension routines use a sliding window approach on Smith-Waterman alignments. This parameter defines the window length.

Integer 1 or more

-rewme

integer

Only takes effect when -ure is set to 'Y'. The read extension routines use a sliding window approach on Smith-Waterman alignments. This parameter defines the maximum number of errors (disagreements) between two alignments in the given window.

Integer 1 or more

-feip

integer

Only takes effect when -ure is set to 'Y'. The read extension routines can be called before assembly and/or after each assembly pass (see -nop). This parameter defines the first pass in which the read extension routines are called. The default of 0 tells mira to extend the reads the first time before the first assembly pass.

Integer 0 or more

-leip

integer

Only takes effect when -ure is set to 'Y'. The read extension routines can be called before assembly and/or after each assembly pass (see -nop). This parameter defines the last pass in which the read extension routines are called. The default of 0 tells mira to extend the reads the last time before the first assembly pass.

Integer 0 or more

-tpae

boolean

This option is useful in EST assembly. Poly-AT stretches at the end of reads that were not correctly masked or clipped in pre-processing steps from external programs get tagged here. The assembler will not use these stretches for critical operations. Additionally, the tags do provide a good visual anchor when looking at the assembly with different programs.

Boolean value Yes/No

-pbwl

integer

Only takes effect when -tpae is set to 'Y'. Defines the window length within which all bases (except the maximum number of errors allowed) must be either A or T to be considered a polybase stretch.

Integer 1 or more

-pbwme

integer

Only takes effect when -tpae is set to 'Y'. Defines the maximum number of errors allowed in a given window length such that a stretch is considered to be a polybase stretch. The distribution of these errors is not important.

Integer 1 or more

-pbwgd

integer

Only takes effect when -tpae is set to 'Y'. Defines the number of bases from the end of a sequence (if masked, from the end of the masked area) within which a polybase stretch is looked for without finding one.

Integer 1 or more

-[no]pvc

boolean

Mira will try to identify possible sequencing vector relicts present at the start of a sequence and clip them away. These relicts are usually a few bases long and were not correctly removed from the sequence in data pre-processing steps of external programs. You might want to turn off this option if you know (or think) that your data contains a lot of repeats and the option below to fine tune the clipping behaviour does not give the expected results.

Boolean value Yes/No

Yes

-pvcmla

integer

The clipping of possible vector relicts option works quite well. Unfortunately the bounds of repeats or differences in EST splice variants sometimes show the same alignment behaviour as possible sequencing vector relicts and could therefore also be clipped. To stop the vector clipping from mistakenly clipping repetitive regions or EST splice variants, this option puts an upper bound to the number of bases a potential clip is allowed to have. If the number of bases is below or equal to this threshold then the bases are clipped. If the number of bases exceeds the threshold then the clip is NOT performed. Setting the value to 0 turns off the threshold i.e. clips are then always performed if a potential vector is found.

Integer 0 or more
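
The threshold rule for -pvcmla can be summarised in a few lines of Python. This is a sketch of the decision as described above, not mira's actual code; the function and argument names are hypothetical.

```python
def should_clip(candidate_length, pvcmla):
    """Decide whether a potential vector relict of the given length is clipped.

    pvcmla == 0 switches the threshold off, so every potential clip is made;
    otherwise the clip is made only if it is no longer than the threshold.
    """
    if pvcmla == 0:
        return True
    return candidate_length <= pvcmla


if __name__ == "__main__":
    for length in (10, 18, 19):
        print(length, should_clip(length, pvcmla=18))
```

With a threshold of 18, candidate clips of 10 and 18 bases are performed and a candidate of 19 bases is left alone.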

-qc

boolean

Default is 'N', but is automatically set to 'Y' when using the setparam options 'fasta' or 'phd' (can be turned off again by subsequent options afterwards). This will let mira perform its own quality clipping before sequences are entered into the assembly. The clip function performed is a sequence end window quality clip with back iteration to get a maximum number of bases as useful sequence. Note that the bases clipped away here can still be used afterwards if there is enough evidence supporting their correctness when the option -ure is turned on.

Boolean value Yes/No

-qcmq

integer

This is the minimum quality required of bases in a window in order to be accepted. Please be cautious and don't use extreme values here, because then the clipping will be too lax or too harsh. Values below 15 and higher than 35 are disallowed.

Integer from 15 to 35

-qcwl

integer

This is the length of a window in bases for the quality clip.

Integer 10 or more

-[no]mbc

boolean

This will let mira perform a 'clipping' of bases that were masked out (replaced with the character X). It is generally not a good idea to use mask bases to remove unwanted portions of a sequence; the EXP file format and the NCBI traceinfo format have excellent possibilities to circumvent this. But because a lot of pre-processing software is built around cross_match, scylla- and phrap-style base masking, the need arose for mira to be able to handle this too. mira will look at the start and end of each sequence to see whether there are masked bases that should be 'clipped'.

Boolean value Yes/No

Yes

-mbcgs

integer

While performing the clip of masked bases, mira will look if it can merge larger chunks of masked bases that are a maximum of -mbcgs apart.

Integer 0 or more

-mbcmfg

integer

While performing the clip of masked bases at the start of a sequence, mira will allow up to this number of unmasked bases in front of a masked stretch.

Integer 0 or more

-mbcmeg

integer

While performing the clip of masked bases at the end of a sequence, mira will allow up to this number of unmasked bases behind a masked stretch.

Integer 0 or more

-[no]emlc

boolean

If on, ensures a minimum left clip on each read according to the parameters in -mlcr & -smlc

Boolean value Yes/No

Yes

-mlcr

integer

If -emlc is 'Y', checks whether there is a left clip whose length is at least the size specified here.

Integer 0 or more

-smlc

integer

If -emlc is 'Y' and the actual left clip is < -mlcr, then set the left clip of read to the value given here.

Integer 0 or more
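
The interplay of -emlc, -mlcr and -smlc can be written down as a small Python sketch. It follows the descriptions above literally; the function name is hypothetical, and the example values 25 and 30 are the ones shown for the -genome accurate quickmode in the sample session.

```python
def ensure_min_left_clip(left_clip, emlc=True, mlcr=25, smlc=30):
    """Return the left clip of a read after the minimum-left-clip check.

    If -emlc is on and the existing left clip is shorter than -mlcr,
    the left clip is set to -smlc; otherwise it is left unchanged.
    """
    if emlc and left_clip < mlcr:
        return smlc
    return left_clip


if __name__ == "__main__":
    for clip in (0, 10, 25, 40):
        print(clip, "->", ensure_min_left_clip(clip))
```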

-bph

integer

Default is 14 on 32 bit systems and 16 on 64 bit systems. Controls the number of consecutive bases n which are used as a word hash. The higher the value the faster the search. The lower the value the more weak matches are found. Values below 10 are not recommended.

Integer 1 or more

-hss

integer

This is a parameter controlling the stepping increments with which hashes are generated. This allows for a more fine-grained search as matches are now found with at least n+s (see -bph) equal bases instead of the SSAHA 2n. The higher the value the faster the search. The lower the value the more weak matches are found.

Integer 1 or more
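
To make the roles of -bph (word length n) and -hss (step size s) concrete, the Python sketch below cuts a sequence into words of n bases, starting a new word every s bases. It illustrates only the stepping idea; how mira stores and compares the hashes is not described here, and the function name is hypothetical.

```python
def hash_words(sequence, bph=16, hss=4):
    """Return (position, word) pairs of length bph, taken every hss bases."""
    last_start = len(sequence) - bph
    return [(i, sequence[i:i + bph]) for i in range(0, last_start + 1, hss)]


if __name__ == "__main__":
    # Small values chosen so that the output stays readable
    for position, word in hash_words("ACGTACGTAC", bph=4, hss=2):
        print(position, word)
```

A smaller step produces more words per read, which matches the note above that lower values find more weak matches at the cost of speed.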

-pr

integer

Controls the relative percentage of exact word matches in an approximate overlap that has to be reached to accept the overlap as a possible match. Increasing this number will decrease the number of possible alignments that have to be checked by Smith-Waterman later on in the assembly, but it might also lead to the rejection of weaker overlaps (i.e. overlaps that contain a higher number of mismatches).

Integer 1 or more

-mhpr

integer

Controls the maximum number of possible hits that one read can pass on to the Smith-Waterman alignment phase. If more potential hits are found, only the best ones are taken. This is an important option for tackling projects that contain extreme assembly conditions. For example, 5000 reads that are all very similar would generate around 40 to 50 million possible alignments (forward and reverse complement). Setting this parameter to 200 reduces the number of alignments to check to around 1.5-2 million. As the assembly increases in passes (-nop), different combinations of possible hits will be checked, always the probably best ones first. So the accuracy of the assembly should only suffer when lowering this number too much.

Integer 1 or more

200
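
The figures quoted for -mhpr can be reproduced roughly with a back-of-the-envelope calculation. The Python sketch below is not how mira counts alignments; it assumes, for illustration only, that every read can hit every other read in both orientations, which gives upper bounds in the same range as the numbers in the description.

```python
def all_pair_alignments(n_reads):
    """Upper bound: every read against every other read, in both orientations."""
    return n_reads * (n_reads - 1) * 2


def capped_alignments(n_reads, mhpr):
    """Upper bound when each read passes on at most mhpr hits per orientation."""
    return n_reads * min(mhpr, n_reads - 1) * 2


if __name__ == "__main__":
    print(all_pair_alignments(5000))     # uncapped upper bound
    print(capped_alignments(5000, 200))  # with -mhpr 200
```

For 5000 reads this gives just under 50 million alignments without a cap and 2 million with -mhpr set to 200.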

-bip

integer

The banded Smith-Waterman alignment uses this percentage number to compute the bandwidth it has to use when computing the alignment matrix. E.g. expected overlap is 150 bases, bip=10 -> the banded SW will compute a band of 15 bases to each side of the expected alignment diagonal, thus allowing up to 15 unbalanced inserts / deletes in the alignment. INCREASING AND DECREASING THIS NUMBER - increasing will find more non-optimal alignments but will also increase SW runtime, with growth somewhere between linear and quadratic; decreasing will work the other way round (it might miss a few bad alignments but gain speed).

Integer from 1 to 100

-bmin

integer

Minimum bandwidth in bases to each side.

Integer 1 or more

-bmax

integer

Maximum bandwidth in bases to each side.

Integer 1 or more
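
The interplay of -bip, -bmin and -bmax can be sketched as follows. This is a minimal illustration that assumes the percentage-derived bandwidth is simply clamped to the range given by -bmin and -bmax and that fractions are rounded down; both points are assumptions, and the function name `band_width` is hypothetical.

```python
def band_width(expected_overlap, bip, bmin=None, bmax=None):
    """Bandwidth in bases to each side of the expected alignment diagonal.

    ASSUMPTIONS: integer rounding (floor) and simple clamping to
    [bmin, bmax]; the document does not spell out either detail.
    """
    width = expected_overlap * bip // 100
    if bmin is not None:
        width = max(width, bmin)  # never narrower than -bmin
    if bmax is not None:
        width = min(width, bmax)  # never wider than -bmax
    return width


# Example from the -bip description: 150 bases overlap, bip=10
print(band_width(150, 10))
# Values set by the quickmode switch in the sample session: bip=20, bmin=25, bmax=130
print(band_width(100, 20, bmin=25, bmax=130))
print(band_width(1000, 20, bmin=25, bmax=130))
```

Under these assumptions, short overlaps are widened to -bmin and very long overlaps are capped at -bmax, so the band stays within a predictable range.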

-mo

integer

Minimum number of overlapping bases needed in an alignment of two sequences to be accepted.

Integer 1 or more

-ms

integer

Describes the minimum score an overlap must reach to be taken into account for assembly. MIRA uses a default scoring scheme for Smith-Waterman alignments: each match counts 1, a match with an N counts 0, each mismatch with a non-N base counts -1 and each gap counts -2. Use a higher score to weed out chance matches, or a lower score to perhaps find the single (short) alignment that might join two contigs together (at the expense of computing time and memory).

Integer 1 or more
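
The scoring scheme described for -ms can be written down directly. The sketch below scores two already-aligned rows of equal length; the function name `overlap_score`, the use of '-' as the gap character and the example sequences are assumptions made for illustration.

```python
def overlap_score(row_a, row_b, gap_char="-"):
    """Score two aligned rows: match +1, N 0, mismatch -1, gap -2.

    ASSUMPTION: '-' marks a gap; the rows are already aligned.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == gap_char or b == gap_char:
            score -= 2          # gap
        elif a == "N" or b == "N":
            score += 0          # a match with an N counts 0
        elif a == b:
            score += 1          # exact match
        else:
            score -= 1          # mismatch between two non-N bases
    return score


score = overlap_score("ACGTNACGT", "ACGTAAC-T")
print(score, score >= 30)  # compare against a minimum score such as ms=30
```

A short overlap like this one cannot reach a minimum score of 30, which is how a higher -ms value filters out chance matches.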

-mrs

integer

Describes the min percentage of matching between two reads to be considered for assembly. Increasing this number will save memory but one might lose possible alignments. A maximum of 80 is probably sensible here. Decreasing below 55 will probably make memory and time consumption explode.

Integer from 1 to 100

-egp

boolean

Defines whether or not to increase penalties applied to alignments containing long gaps. Setting this to 'Y' might help in projects with frequent repeats. On the other hand, it is definitely a hindrance when assembling very long reads containing multiple long indels in the called base sequence, although this should not happen in the first place and is a sure sign of problems lying ahead. When in doubt, set it to 'Y' for EST projects and de-novo genome assembly, and set it to 'N' for assembly of closely related strains (assembly against a backbone). When set to 'N', it is recommended to have -amgb and -amgbemc both set to 'Y'.

Boolean value Yes/No

-egpl

list

Has no effect if the extra gap penalty (-egp) is off. Defines an extra penalty applied to 'long' gaps. The predefined levels are: low (use this if you expect your base caller to frequently miss two or more bases); medium (use this if your base caller is expected to frequently miss one to two bases); high (use this if your base caller does not frequently miss more than one base). For some stages of the EST assembly process, a special value 'est' is used.

low	(Low)
medium	(Medium)
high	(High)
est	(EST split splices)

low

-megpp

integer

Has no effect if the extra gap penalty (-egp) is off. Defines the maximum extra penalty in percent applied to 'long' gaps.

Integer from 1 to 100

100

-np

string

Contigs will have this string prepended to their names.

Any string

mira

-an

list

When adding reads to a contig, dangerous regions can get an extra integrity check. none = no extra check. text = the check is text-based only. signal = the check is signal-based; if the SCF trace is not available, it falls back to 'text'. For the time being, only regions tagged as ALUS or REPT in the experiment file are considered dangerous.

none	(None)
text	(Text)
signal	(Signal)

signal

-rodirs

integer

When adding reads to a contig, reject a read if it causes the quality of the consensus to drop by more than the given percentage. Lower values mean stricter checking. This value is doubled if the read being added has a template partner (a read pair) at the right distance.

Integer from 1 to 100
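
The rejection rule for -rodirs, including the doubling for reads with a well-placed template partner, can be sketched as below. The function and parameter names are hypothetical, and how MIRA measures the drop in consensus quality is not described here, so the drop is simply passed in as a percentage. The default of 25 is the value set by the quickmode switch in the sample session.

```python
def reject_read(quality_drop_percent, rodirs=25, partner_at_right_distance=False):
    """Return True if a read should be rejected when added to a contig.

    ASSUMPTION: quality_drop_percent is the drop in consensus quality
    (in %) caused by adding the read, computed elsewhere.
    """
    # The threshold is doubled for reads whose template partner
    # (read pair) lies at the right distance.
    limit = rodirs * 2 if partner_at_right_distance else rodirs
    return quality_drop_percent > limit


print(reject_read(30))                                  # exceeds 25
print(reject_read(30, partner_at_right_distance=True))  # within doubled limit of 50
```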

-dmer

integer

When adding reads to a contig, reject the reads if the error in zones known as dangerous exceeds the given value in %. Lower values mean stricter checking in these danger zones. For the time being, only regions tagged as ALUS or REPT in the experiment file are considered dangerous.

Integer from 1 to 100

-[no]mr

boolean

One of the most important switches in MIRA. If set to 'Y', MIRA will try to resolve misassemblies due to repeats by identifying single base stretch differences and tag those critical bases as RMB (Repeat Marker Base, weak or strong). This switch is also needed when MIRA is run in EST mode to identify possible inter-, intra- and intra-and-interorganism SNPs.

Boolean value Yes/No

Yes

-asir

boolean

Only takes effect when -mr is set to 'Y'; the effect also depends on whether strain data (see -lsd) is present or not. Usually, MIRA will mark bases that differentiate between repeats when a conflict occurs between reads that belong to one strain. If the conflict occurs between reads belonging to different strains, they are marked as SNP. However, if this switch is set to 'Y', then conflicts within a strain are also marked as SNP. This switch is mainly used in assemblies of ESTs; it should not be set for genomic assembly.

Boolean value Yes/No

-mrpg

integer

Only takes effect when -mr is set to 'Y'. This defines the minimum number of reads in a group that are needed for the RMB (Repeat Marker Bases) or SNP detection routines to be triggered. A group is defined by the reads carrying the same nucleotide for a given position, i.e., an assembly with mrpg=2 will need at least two times two reads with the same nucleotide (having at least a quality as defined in -mgqrt) to be recognised as repeat marker or a SNP. Setting this to a low number increases sensitivity, but might produce a few false positives, resulting in reads being thrown out of contigs because of falsely identified possible repeat markers (or wrongly recognised as SNP).

Integer 2 or more
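
The group rule behind -mrpg can be illustrated with a small sketch. It groups the reads covering one position by their nucleotide and checks whether at least two groups are large enough and good enough. The function name, the example data, the illustrative mgqrt value of 30 and, in particular, the way a group's quality is derived (here: its best base quality) are all assumptions; MIRA's own calculation is not described in this document.

```python
from collections import defaultdict


def column_is_marker_candidate(column, mrpg=2, mgqrt=30):
    """column: list of (base, quality) pairs, one per read at this position.

    ASSUMPTION: a group's quality is taken as its best base quality.
    """
    groups = defaultdict(list)
    for base, quality in column:
        groups[base.upper()].append(quality)
    # Keep only groups with enough reads and sufficient quality.
    strong = [
        base for base, quals in groups.items()
        if len(quals) >= mrpg and max(quals) >= mgqrt
    ]
    # At least two such groups are needed ("two times two reads" for mrpg=2).
    return len(strong) >= 2


print(column_is_marker_candidate([("A", 40), ("A", 35), ("G", 38), ("G", 30)]))
print(column_is_marker_candidate([("A", 40), ("A", 35), ("G", 38)]))
```

The second call fails the check because the G group contains only one read, which is the situation a higher -mrpg value is meant to filter out.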

-mgqrt

integer

Only takes effect when -mr is set to 'Y'. This defines the minimum quality of a group of bases to be taken into account as potential repeat marker. The lower the number, the more sensitive you get, but lowering below 25 is not recommended as a lot of wrongly called bases can have a quality approaching this value and you'd end up with a lot of false positives. The higher the overall coverage of your project the better, and the higher you can set this number. A value of 35 will probably remove all false positives, a value of 40 will probably never show false positives.

Integer 25 or more

-emea

integer

Only takes effect when -mr is set to 'Y'. Using the ends of sequences from Sanger-type shotgun sequencing is always a bit risky, as wrongly called bases tend to cluster there and remnants of sequencing vector may remain. It is even more risky to use these stretches for detecting possible repeats, so one can define an exclusion area in which the bases are not used when determining whether a mismatch is due to repeats or not.

Integer 0 or more

-[no]amgb

boolean

Determines whether columns containing gap bases (indels) are also tagged.

Boolean value Yes/No

Yes

-[no]amgbemc

boolean

Only takes effect when -amgb is set to 'Y'. Determines whether multiple columns containing gap bases (indels) are also tagged.

Boolean value Yes/No

Yes

-[no]amgbnbs

boolean

Only takes effect when -amgb is set to 'Y'. Determines whether both strands need to have a gap before columns containing gap bases are tagged. Setting this to 'N' is not recommended except when working in desperately low coverage situations.

Boolean value Yes/No

Yes

-dismin

integer

The minimum distance that read pairs may be apart. There is an additional error margin of 10% subtracted from this value during internal computations.

Integer 0 or more

500

-dismax

integer

The maximum distance that read pairs may be apart. There is an additional error margin of 10% added to this value during internal computations.

Integer 0 or more

5000
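
The 10% error margins on -dismin and -dismax widen the accepted distance window. The sketch below shows the resulting range for the default values; the function names and the use of integer rounding are assumptions, since the document only states that 10% is subtracted from the minimum and added to the maximum.

```python
def pair_distance_window(dismin=500, dismax=5000, margin_percent=10):
    """Effective (lower, upper) bounds after applying the error margin.

    ASSUMPTION: integer arithmetic; MIRA's exact rounding is not documented here.
    """
    lower = dismin - dismin * margin_percent // 100
    upper = dismax + dismax * margin_percent // 100
    return lower, upper


def pair_distance_ok(distance, dismin=500, dismax=5000):
    """True if a read pair distance falls inside the widened window."""
    lower, upper = pair_distance_window(dismin, dismax)
    return lower <= distance <= upper


print(pair_distance_window())
print(pair_distance_ok(460), pair_distance_ok(5600))
```

With the defaults, a pair 460 bases apart is still accepted although it is below -dismin, while a pair 5600 bases apart is outside the widened upper bound.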

-ace

boolean

Once contigs have been built, MIRA can call a built-in version of the automatic contig editor EdIt. EdIt will try to resolve discrepancies in the contig by performing trace analysis and can correct even hard-to-resolve errors. This option is always useful, but especially in conjunction with -nop and -ure. Note: the current development version has a memory leak in the editor, therefore the option is not turned on automatically.

Boolean value Yes/No

-[no]sem

boolean

If set to 'Y' the automatic editor will not take error hypotheses with a low probability into account, even if all the requirements to make an edit are fulfilled.

Boolean value Yes/No

Yes

-ct

integer

The higher this value, the more strictly the automatic editor will apply its internal rule set. Going below 40 is not recommended.

Integer from 1 to 100

-[no]orc

boolean

Output CAF results

Boolean value Yes/No

Yes

-[no]org

boolean

Output GAP4 results

Boolean value Yes/No

Yes

-[no]orf

boolean

Output FASTA results

Boolean value Yes/No

Yes

-ora

boolean

Output ACE results

Boolean value Yes/No

-[no]ort

boolean

Output TXT results

Boolean value Yes/No

Yes

-[no]ors

boolean

Output TCS results

Boolean value Yes/No

Yes

-orh

boolean

Output HTML results

Boolean value Yes/No

-otc

boolean

Output temporary CAF results

Boolean value Yes/No

-otg

boolean

Output temporary GAP4 results

Boolean value Yes/No

-otf

boolean

Output temporary FASTA results

Boolean value Yes/No

-ota

boolean

Output temporary ACE results

Boolean value Yes/No

-ott

boolean

Output temporary TXT results

Boolean value Yes/No

-ots

boolean

Output temporary TCS results

Boolean value Yes/No

-oth

boolean

Output temporary HTML results

Boolean value Yes/No

-oetc

boolean

Output extra temporary CAF results

Boolean value Yes/No

-oetg

boolean

Output extra temporary GAP4 results

Boolean value Yes/No

-oetf

boolean

Output extra temporary FASTA results

Boolean value Yes/No

-oeta

boolean

Output extra temporary ACE results

Boolean value Yes/No

-oett

boolean

Output extra temporary TXT results

Boolean value Yes/No

-oeth

boolean

Output extra temporary HTML results

Boolean value Yes/No

-tcpl

integer

When producing an output in text format (-ort|ott|oett), this parameter defines how many bases each line of an alignment should contain.

Integer 1 or more

-hcpl

integer

When producing an output in HTML format (-orh|oth|oeth), this parameter defines how many bases each line of an alignment should contain.

Integer 1 or more

-gapfda

string

Defines the extension of the directory where mira will write the result of an assembly ready to import into the Staden package (GAP4) in Direct Assembly format. The name of the directory will then be <projectname>_.<extension>

Any string

gap4da

-log

string

Defines the directory where mira will write some log files to. Note that the name of the actual project will be prepended.

Any string

miralog

-co

string

Defines the file in CAF format to save an assembled project to. Filename must end with '.caf'.

Any string

mira_out.caf

Associated qualifiers

"-expdir" associated directory qualifiers

-extension

string

Default file extension

Any string

"-scfdir" associated directory qualifiers

-extension

string

Default file extension

Any string

General qualifiers

-auto

boolean

Turn off prompts

Boolean value Yes/No

-stdout

boolean

Write first file to standard output

Boolean value Yes/No

-filter

boolean

Read first file from standard input, write first file to standard output

Boolean value Yes/No

-options

boolean

Prompt for standard and additional values

Boolean value Yes/No

-debug

boolean

Write debug output to program.dbg

Boolean value Yes/No

-verbose

boolean

Report some/full command line options

Boolean value Yes/No

-help

boolean

Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose

Boolean value Yes/No

-warning

boolean

Report warnings

Boolean value Yes/No

-error

boolean

Report errors

Boolean value Yes/No

-fatal

boolean

Report fatal errors

Boolean value Yes/No

-die

boolean

Report dying program messages

Boolean value Yes/No

-version

boolean

Report version number and exit

Boolean value Yes/No

Input file format

emira reads any normal sequence USAs.

Output file format

emira writes its results into directories named after the project, as listed under the output files for the usage example.

Output files for usage example

File: EdIt.log

Directory: cjejuni_demo_info

This directory contains output files.

Directory: cjejuni_demo_log

This directory contains output files.

Directory: cjejuni_demo_results

This directory contains output files.

Data files

**************** EDIT HERE ****************

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with status 0.

Known bugs

None.

Author(s)

This program is an EMBOSS wrapper for a program written by Bastien Chevreux as part of the MIRA package.

History

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.

Comments

None