




Records
The assembly program gap4 (see section Gap4 introduction) cannot operate to full effect unless it is given all the necessary data. For example, gap4 contains many functions that analyse the positions and relative orientations of readings from the same template in order to check the correctness of the assembly and determine the contig order. However, if the records that name templates, give their estimated lengths, and define the primers used to obtain readings from them are missing, none of these analyses can be performed reliably. One way to ensure that all the necessary fields are present is to use the program pregap4 (see section Pregap4 introduction).
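As a rough illustration of the kind of check this implies, the sketch below reports which template-related records are absent from an experiment file. It is not how gap4 or pregap4 work internally. It assumes that each record line starts with its two-letter type followed by whitespace, and that the three records in question are TN (template name), SI (insertion length) and PR (primer type) from the list in this section. The function names and the example values are hypothetical.

```python
# Record types assumed (from the list in this section) to carry the template
# information: TN = Template Name, SI = Sequencing vector Insertion length,
# PR = PRimer type. This mapping is an assumption made for illustration.
TEMPLATE_RECORDS = {"TN", "SI", "PR"}


def record_types(lines):
    """Collect the two-letter record types that start each line.

    Assumes a record line begins with two uppercase letters followed by
    whitespace; any other line (for example sequence data) is skipped.
    """
    found = set()
    for line in lines:
        code = line[:2]
        if len(line) > 2 and code.isalpha() and code.isupper() and line[2].isspace():
            found.add(code)
    return found


def missing_template_records(lines):
    """Return the template-related record types that are not present."""
    return sorted(TEMPLATE_RECORDS - record_types(lines))


# Made-up example: the values after each record type are placeholders.
example = [
    "ID   read1",
    "TN   template1",
    "PR   1",
]
print(missing_template_records(example))  # ['SI']
```

Because the check works on a set of record types, it does not depend on where in the file each record appears.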
In the descriptions below, records marked * are those read into the database during normal assembly; those marked ** are extra items required when entering pre-assembled data; those marked *** are read from SCF files (see section SCF introduction) after the experiment file has been read to obtain the SCF file name; and the record marked **** is an extra item required for Directed Assembly.
The order of records in the file is not important. They are listed here in alphabetical order with, where possible, the origin of their names. Several are redundant, and no group is likely to make use of them all. Others can be added in the future: initially they might be of local use only, but if their use becomes wider they can be added to the standard set. Standard EMBL records such as FT are assumed to be included.
- AC: ACcession number
- AP: Assembly Position ****
- AQ: Average Quality for bases 100..200
- AV: Accuracy values for externally assembled data **, ***
- BC: Base Calling software
- CC: Comment line
- CF: Cloning vector sequence File
- CH: Special CHemistry
- CL: Cloning vector Left end
- CN: Clone Name
- CR: Cloning vector Right end
- CS: Cloning vector Sequence present in sequence *
- CV: Cloning Vector type
- DR: Direction of Read
- DT: DaTe of experiment
- EN: Entry Name
- EX: EXperimental notes
- FM: sequencing vector Fragmentation Method
- ID: IDentifier *
- LE: was Library Entry, but now identifies a well in a microtitre dish
- LI: was subclone LIbrary, but now identifies a microtitre dish
- LN: Local format trace file Name *
- LT: Local format trace file Type *
- MC: MaChine on which experiment ran
- MN: Machine generated trace file Name
- MT: Machine generated trace file Type
- ON: Original base Numbers (positions) **
- OP: OPerator
- PC: Position in Contig **
- PD: Primer data (the sequence of a primer)
- PN: Primer Name
- PR: PRimer type *
- PS: Processing Status
- QL: poor Quality sequence present at Left (5') end *
- QR: poor Quality sequence present at Right (3') end *
- RS: Reference Sequence for numbering and mutation detection
- SC: Sequencing vector Cloning site
- SE: SEnse (i.e. whether complemented) **
- SF: Sequencing vector sequence File
- SI: Sequencing vector Insertion length *
- SL: Sequencing vector sequence present at Left (5') end *
- SP: Sequencing vector Primer site (relative to cloning site)
- SQ: SeQuence *
- SR: Sequencing vector sequence present at Right (3') end *
- SS: Screening Sequence
- ST: STrands *
- SV: Sequencing Vector type *
- TG: Gel reading Tag *
- TC: Contig Tag *
- TN: Template Name *
- WT: Wild type trace
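
The star markers in the list above can be collected into a small lookup table, which makes it easy to ask which records a given route into the database relies on. The following is only a sketch: the table is transcribed from the list in this section, and the names MARKERS, USAGE and records_with are hypothetical, not part of any Staden package interface.

```python
# Star markers transcribed from the record list above.
# Records that carry no marker in the list are omitted.
MARKERS = {
    "AP": {"****"},
    "AV": {"**", "***"},
    "CS": {"*"},
    "ID": {"*"},
    "LN": {"*"},
    "LT": {"*"},
    "ON": {"**"},
    "PC": {"**"},
    "PR": {"*"},
    "QL": {"*"},
    "QR": {"*"},
    "SE": {"**"},
    "SI": {"*"},
    "SL": {"*"},
    "SQ": {"*"},
    "SR": {"*"},
    "ST": {"*"},
    "SV": {"*"},
    "TG": {"*"},
    "TC": {"*"},
    "TN": {"*"},
}

# Meaning of each marker, as described in the text of this section.
USAGE = {
    "*": "read into the database during normal assembly",
    "**": "extra item required when entering pre-assembled data",
    "***": "read from SCF files",
    "****": "extra item required for Directed Assembly",
}


def records_with(marker):
    """Return the record types that carry the given marker, sorted by name."""
    if marker not in USAGE:
        raise ValueError(f"unknown marker: {marker!r}")
    return sorted(code for code, marks in MARKERS.items() if marker in marks)


for marker, meaning in USAGE.items():
    print(f"{marker:<4} {meaning}: {', '.join(records_with(marker))}")
```

Note that AV appears under two markers, so a set of markers per record is used rather than a single value.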





This page is maintained by staden-package. Last generated on 25 April 2003.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/formats_unix_19.html