




The GDatabase Structure
#define GAP_DB_VERSION 2

#define GAP_DNA     0
#define GAP_PROTEIN 1

typedef struct {
    GCardinal version;          /* Database version - GAP_DB_VERSION */
    GCardinal maximum_db_size;  /* MAXDB */
    GCardinal actual_db_size;   /* */
    GCardinal max_gel_len;      /* 4096 */
    GCardinal data_class;       /* GAP_DNA or GAP_PROTEIN */

    /* Used counts */
    GCardinal num_contigs;      /* number of contigs used */
    GCardinal num_readings;     /* number of readings used */

    /* Bitmaps */
    GCardinal Nfreerecs;        /* number of bits */
    GCardinal freerecs;         /* record no. of freerecs bitmap */

    /* Arrays */
    GCardinal Ncontigs;         /* elements in array */
    GCardinal contigs;          /* record no. of array of type GContigs */
    GCardinal Nreadings;        /* elements in array */
    GCardinal readings;         /* record no. of array of type GReading */
    GCardinal Nannotations;     /* elements in array */
    GCardinal annotations;      /* record no. of array of type GAnnotation */
    GCardinal free_annotations; /* head of list of free annotations */
    GCardinal Ntemplates;       /* elements in array */
    GCardinal templates;        /* record no. of array of type GTemplates */
    GCardinal Nclones;          /* elements in array */
    GCardinal clones;           /* record no. of array of type GClones */
    GCardinal Nvectors;         /* elements in array */
    GCardinal vectors;          /* record no. of array of type GVectors */

    GCardinal contig_order;     /* record no. of array of type GCardinal */

    GCardinal Nnotes;           /* elements in array */
    GCardinal notes_a;          /* records that are GT_Notes */
    GCardinal notes;            /* Unpositional annotations */
    GCardinal free_notes;       /* SINGLY linked list of free notes */
} GDatabase;
This is always the first record in the database. It contains information about the Gap4 database as a whole and can be viewed as the root from which all other records are eventually referenced. Care must be taken when dealing with counts of contigs and readings, as there are two copies: one for the number used and one for the number allocated.
The structure contains the database record numbers of several arrays. These arrays in turn contain the record numbers of structures. Most other structures, and indeed most functions within Gap4, then refer to structure numbers (e.g. a reading number) and not to record numbers. The conversion from one to the other is done by accessing the arrays listed in the GDatabase structure.
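The conversion from a structure number to a record number can be sketched in isolation. The following is a minimal illustration and not Gap4 code: demo_db and record_of_contig are hypothetical names, a plain C array stands in for the array held in the database, GCardinal is assumed to be a signed 32-bit integer, and structure numbers are assumed to start at 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Gap4's GCardinal; assumed here to be a signed 32-bit integer. */
typedef int32_t GCardinal;

/* Hypothetical, simplified view of the two fields involved in the lookup. */
typedef struct {
    GCardinal Ncontigs;        /* elements allocated in the contigs array */
    const GCardinal *contigs;  /* in-memory copy of the array of record numbers */
} demo_db;

/*
 * Map a 1-based contig number to the record number stored for it.
 * Returns 0 when the contig number lies outside the allocated range;
 * 0 serves as an "invalid" marker only within this sketch.
 */
GCardinal record_of_contig(const demo_db *db, GCardinal contig_num)
{
    assert(db != NULL && db->contigs != NULL);  /* caller error, not bad data */

    if (contig_num < 1 || contig_num > db->Ncontigs)
        return 0;

    /* Structure numbers count from 1, array indices from 0. */
    return db->contigs[contig_num - 1];
}
```

With an array holding the record numbers 17, 23, 42, 8 and 99, record_of_contig(&db, 5) yields 99, the fifth element. The bounds check uses the allocated count rather than the used count, so whether an allocated but unused entry holds a meaningful record number is left open here.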
For instance, to read the structure for contig number 5 we could do the following.
GContigs c;
GT_Read(io, arr(GCardinal, io->contigs, 5-1), &c, sizeof(c), GT_Contigs);
In the above code, io->contigs is the array of GCardinals whose record number is contained within the contigs element of the GDatabase structure. In practice, this is hidden away by simply calling "contig_read(io, 5, c)" instead.
- version
- Database record format version control. The current version is held within the GAP_DB_VERSION macro.
- maximum_db_size
- actual_db_size
- These are essentially redundant as Gap4 can support any number of readings up to maximum_db_size, and maximum_db_size can be anything the user desires. It is specifiable using the -maxdb command line argument to gap4.
- max_gel_len
- This is currently hard coded as 4096 (but is relatively easy to change).
- data_class
- This specifies whether the database contains DNA or protein sequences. In the current implementation only DNA is supported.
- num_contigs
- num_readings
- These specify the number of used contigs and readings. They may be different from the number of records allocated.
- Nfreerecs
- freerecs
- freerecs is the record number of a bitmap with a single element per record in the database. Each free bit in the bitmap corresponds to a free record. The Nfreerecs variable holds the number of bits allocated in the freerecs bitmap.
- Ncontigs
- contigs
- contigs is the record number of an array of GCardinals. Each element of the array is the record number of a GContigs structure. Ncontigs is the number of elements allocated in the contigs array. Note that this is different from num_contigs, which is the number of elements used.
- Nreadings
- readings
- readings is the record number of an array of GCardinals. Each element of the array is the record number of a GReadings structure. Nreadings is the number of elements allocated in the readings array. Note that this is different from num_readings, which is the number of elements used.
- Nannotations
- annotations
- free_annotations
- annotations is the record number of an array of GCardinals. Each element of the array is the record number of a GAnnotations structure. Nannotations is the number of elements allocated in the annotations array. free_annotations is the record number of the first free annotation, which forms the head of a linked list of free annotations.
- Ntemplates
- templates
- templates is the record number of an array of GCardinals. Each element of the array is the record number of a GTemplates structure. Ntemplates is the number of elements allocated in the templates array.
- Nclones
- clones
- clones is the record number of an array of GCardinals. Each element of the array is the record number of a GClones structure. Nclones is the number of elements allocated in the clones array.
- Nvectors
- vectors
- vectors is the record number of an array of GCardinals. Each element of the array is the record number of a GVectors structure. Nvectors is the number of elements allocated in the vectors array.
- contig_order
- This is the record number of an array of GCardinals of size Ncontigs. Each element of the array is a contig number, and the index of the element gives the position of that contig. The contigs are therefore displayed in the order in which they appear in this array.
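The relationship between the contig_order array and display position, together with the distinction between used and allocated counts, can be sketched as follows. This is an illustration under stated assumptions rather than Gap4 code: position_of_contig and contig_at_position are hypothetical helpers, the array is a plain in-memory copy, positions are 0-based array indices, contig numbers start at 1, and only the first num_contigs entries are assumed to be meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Gap4's GCardinal; assumed here to be a signed 32-bit integer. */
typedef int32_t GCardinal;

/*
 * Return the 0-based display position of contig_num, or -1 if it is not
 * found. Only the first num_contigs (used) entries are searched; entries
 * beyond that, up to the allocated size, are ignored in this sketch.
 */
GCardinal position_of_contig(const GCardinal *order, GCardinal num_contigs,
                             GCardinal contig_num)
{
    assert(order != NULL);

    for (GCardinal i = 0; i < num_contigs; i++) {
        if (order[i] == contig_num)
            return i;
    }
    return -1;
}

/*
 * Return the contig number shown at a 0-based display position, or 0 if
 * the position is outside the used range (0 is not a valid contig number
 * under the 1-based numbering assumed here).
 */
GCardinal contig_at_position(const GCardinal *order, GCardinal num_contigs,
                             GCardinal pos)
{
    assert(order != NULL);

    if (pos < 0 || pos >= num_contigs)
        return 0;
    return order[pos];
}
```

For an array allocated with four elements of which three are used, holding 3, 1, 2, contig 3 is displayed first (position 0) and contig 2 last (position 2). Reordering the display then amounts to permuting the used entries of this array, without touching the contig structures themselves.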





This page is maintained by staden-package. Last generated on 25 April 2003.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/scripting_113.html