




Qclip
NAME
qclip -- an Experiment File sequence clipper
SYNOPSIS
Usage when confidence values are available (default mode):
qclip [-c] [-vt] [-m minimum_extent] [-M maximum_extent] [-w window_length] [-q average_quality]
Usage when confidence values are not available or are to be ignored:
qclip [-n] [-vt] [-m minimum_extent] [-M maximum_extent] [-s start_offset] [-R r_length] [-r r_unknown] [-L l_length] [-l l_unknown]
DESCRIPTION
Qclip is a simple program to decide how much of the 5' and 3' ends of a
sequence, stored as an Experiment File, should be clipped off, i.e. marked
to be ignored during assembly. The decision is made either by analysing
the average confidence levels stored in the Experiment File (or an
associated trace file), or by counting the number of unknown bases
(e.g. - or N) found within windows slid along the sequence.
Large numbers of files can be processed in a single run and each file
argument is assumed to be a valid Experiment File. The sequence is read
from the Experiment File SQ record and the trace is read using the LN and
LT identifiers; clipping is performed and QL and QR identifiers are
appended to the file.
For the default mode of clipping by confidence levels, the program first
finds the region of highest average quality. A window is then slid from
this point both rightwards and leftwards until the average quality over
that window length (specified with the -w argument) drops below the
average_quality argument (specified with the -q argument). The exact
position of the clip point within that window is determined by
successively decreasing the window length.
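The confidence level mode described above can be illustrated with a short
Python sketch. This is not the qclip source: the function names, the
0-based return convention and the choice to start both scans on the best
window are assumptions made for illustration, and the sketch does not
handle a sequence in which no window reaches the threshold.

```python
def window_total(conf, start, length):
    """Sum of the confidence values in conf[start:start + length]."""
    return sum(conf[start:start + length])


def best_window_start(conf, w):
    """Start of the first window of length w with the highest total."""
    best_start, best_total = 0, window_total(conf, 0, w)
    for i in range(1, len(conf) - w + 1):
        total = window_total(conf, i, w)
        if total > best_total:
            best_start, best_total = i, total
    return best_start


def scan_right(conf, start, w, q):
    """Slide rightwards; shrink the window each time its average drops below q."""
    n = len(conf)
    pos = start
    for length in range(w, 0, -1):
        # Comparing totals avoids floating point: average >= q.
        while pos + length <= n and window_total(conf, pos, length) >= q * length:
            pos += 1
    return pos  # 0-based index of the first clipped base on the right


def scan_left(conf, end, w, q):
    """Mirror image of scan_right; the window ends at pos and extends leftwards."""
    pos = end
    for length in range(w, 0, -1):
        while (pos - length + 1 >= 0
               and window_total(conf, pos - length + 1, length) >= q * length):
            pos -= 1
    return pos  # 0-based index of the last clipped base on the left, or -1


def clip_by_confidence(conf, w=30, q=10):
    """Return (left, right); the retained bases are conf[left + 1:right]."""
    n = len(conf)
    if n == 0:
        return -1, 0
    w = max(1, min(w, n))  # never use a window longer than the sequence
    start = best_window_start(conf, w)
    return scan_left(conf, start + w - 1, w, q), scan_right(conf, start, w, q)
```

The defaults w=30 and q=10 mirror the documented default arguments.
Shrinking the window one base at a time is what lets the scan settle on
an exact boundary instead of stopping a full window length short of it.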
When confidence values are not available, or when the -n argument is
used, only the sequence base calls are analysed. In this case the right
clip position is calculated by sliding a window of length r_length
rightwards along the sequence, starting from base start_offset, and
stopping when a window containing at least r_unknown unknown bases is
found. The left clip position is calculated by sliding a window leftwards
from base start_offset. The algorithm used is identical to that for the
right clip position except that the l_unknown and l_length parameters are
used.
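The unknown base mode can be sketched in Python as follows. This is an
illustration only, not the qclip source: the function names are
hypothetical, only - and N are treated as unknown, and reporting the
inner edge of the first failing window as the clip point is an assumption,
since the text above does not say where within that window the clip falls.

```python
UNKNOWN_BASES = set("-N")  # assumption: only these symbols count as unknown


def count_unknown(seq, start, length):
    """Number of unknown bases in seq[start:start + length]."""
    return sum(1 for base in seq[start:start + length] if base in UNKNOWN_BASES)


def right_clip(seq, start_offset, r_length, r_unknown):
    """Slide a window rightwards from start_offset.

    Returns the 0-based start of the first window holding at least
    r_unknown unknown bases, or len(seq) if no such window exists.
    """
    for pos in range(start_offset, len(seq) - r_length + 1):
        if count_unknown(seq, pos, r_length) >= r_unknown:
            return pos
    return len(seq)


def left_clip(seq, start_offset, l_length, l_unknown):
    """Slide a window leftwards from start_offset.

    The window ends at pos and covers seq[pos - l_length + 1:pos + 1].
    Returns the 0-based end of the first window holding at least
    l_unknown unknown bases, or -1 if no such window exists.
    """
    for pos in range(start_offset, l_length - 2, -1):
        if count_unknown(seq, pos - l_length + 1, l_length) >= l_unknown:
            return pos
    return -1


def clip_by_unknown(seq, start_offset, r_length, r_unknown, l_length, l_unknown):
    """Return (left, right) clip points as 0-based indices."""
    return (left_clip(seq, start_offset, l_length, l_unknown),
            right_clip(seq, start_offset, r_length, r_unknown))
```

No default values are given for these parameters here, so the sketch
requires all of them to be passed explicitly.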
The default arguments are "-c -m 0 -M 9999 -w 30 -q 10".
OPTIONS
-v
- Enable verbose output. This outputs information on which files are currently being clipped.
-t
- Test mode. The QL and QR information is written to stdout instead of being appended to the Experiment file.
-c
- Clip by confidence levels. This is the default mode of operation.
-n
- Clip by unknown base calls, even when confidence values are available.
-m
extent - If the clip algorithm returns a QL clip value of less than extent, use extent as the QL value.
-M
extent - If the clip algorithm returns a QR clip value of more than extent, use extent as the QR value.
-w
- Only used for the confidence level clipping mode. The window length over which to compute the average confidence value.
-q
- Only used for the confidence level clipping mode. The minimum average confidence in any given window for this window to be considered as good quality sequence.
-s
offset - Only used for the unknown base clipping mode. Force the first window to start the calculations from position offset in the sequence. This can be useful to avoid poor data at the 5' end of a sequence.
-R
length - Only used for the unknown base clipping mode. Set the length of the first rightwards window to length.
-r
unknown - Only used for the unknown base clipping mode. Stop sliding the first rightwards window when the number of unknown bases within the current window is greater than or equal to unknown.
-L
length - Only used for the unknown base clipping mode. Set the length of the second window, which is slid leftwards to find the left clip position, to length. Setting this value to zero prevents the second window calculations from being performed.
-l
unknown - Only used for the unknown base clipping mode. Stop sliding the second (leftwards) window when the number of unknown bases within the current window is greater than or equal to unknown.
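The effect of the -m and -M options on the computed clip values can be
sketched in Python as follows. The function name apply_extents is
hypothetical; the defaults are taken from the documented default
arguments (-m 0 -M 9999).

```python
def apply_extents(ql, qr, minimum_extent=0, maximum_extent=9999):
    """Bound the clip values as described for the -m and -M options.

    A QL value below minimum_extent is raised to minimum_extent, and a
    QR value above maximum_extent is lowered to maximum_extent.
    """
    return max(ql, minimum_extent), min(qr, maximum_extent)
```

For instance, with -m 20 a computed QL of 5 becomes 20, while a computed
QL of 35 is left unchanged.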
EXAMPLE
To clip a batch of sequences listed in the `fofn' file with a minimum left clip value of 20 bases use:
qclip -m 20 `cat fofn`
SEE ALSO
See section ExperimentFile(4).





This page is maintained by staden-package. Last generated on 25 April 2003.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/manpages_unix_13.html