




Count Sequence Composition
When a sequence is read into the program, its composition is displayed in the Output Window as a simple check that the data has been read correctly. The values can also be requested from the "Statistics" menu, where a dialogue allows subsections of the sequence to be analysed. For a DNA sequence the results are displayed as shown below.
============================================================
Wed 12 Nov 17:10:25 1997: sequence composition
------------------------------------------------------------
A 1966 (24.17%) C 1996 (24.54%) G 2185 (26.86%) T 1987 (24.43%) - 0 (0.00%)

Or for protein sequences:

============================================================
Mon 14 Oct 17:11:04 2002: sequence composition
------------------------------------------------------------
Sequence MYSA_DROME: 1 to 2411
Protein
AA      A      B      C      D      E      F      G      H      I      K      L      M      N
N     201      0     30    150    281     74    127     45    126    233    243     43    121
%     8.3    0.0    1.2    6.2   11.7    3.1    5.3    1.9    5.2    9.7   10.1    1.8    5.0
M   14287      0   3094  17263  36281  10891   7246   6171  14258  29865  27498   5642  13807

AA      P      Q      R      S      T      V      W      Y      Z      X      *      -
N      55    167    141     96     93    108     14     63      0      0      0      0
%     2.3    6.9    5.8    4.0    3.9    4.5    0.6    2.6    0.0    0.0    0.0    0.0
M    5341  21398  22022   8360   9403  10706   2607  10280      0      0      0      0
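The nucleotide counts and percentages in the DNA report can be reproduced with a few lines of Python. This is a minimal sketch, not the program's own code: the function names `base_composition` and `format_composition` are hypothetical, and it assumes that percentages are taken over the symbols A, C, G, T and "-" only.

```python
from collections import Counter


def base_composition(seq, alphabet="ACGT-"):
    """Return {symbol: (count, percent)} for each symbol in `alphabet`.

    Assumption: percentages are relative to the number of characters
    that belong to `alphabet`; other characters are ignored.
    """
    counts = Counter(seq.upper())
    total = sum(counts[s] for s in alphabet)
    return {
        s: (counts[s], 100.0 * counts[s] / total if total else 0.0)
        for s in alphabet
    }


def format_composition(comp):
    """Format the composition on one line, e.g. 'A 2 (50.00%) C 1 (25.00%) ...'."""
    return " ".join(f"{s} {n} ({p:.2f}%)" for s, (n, p) in comp.items())


if __name__ == "__main__":
    # Synthetic sequence with the same base counts as the DNA example.
    example = "A" * 1966 + "C" * 1996 + "G" * 2185 + "T" * 1987
    print(format_composition(base_composition(example)))
```

Passing a slice such as `seq[start:end]` gives the composition of a subsection, in the spirit of the dialogue described above.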
Count Dinucleotide Frequencies
This routine counts the dinucleotide frequencies in the selected region of the sequence and calculates the distribution expected from the base composition alone. Both the observed and the expected values are reported as percentages. The output looks like:
          A                C                G                T
     Obs  Expected    Obs  Expected    Obs  Expected    Obs  Expected
A    7.91   5.84      5.64   5.93      5.05   6.49      5.57   5.91
C    5.91   5.93      5.14   6.02      7.38   6.59      6.10   5.99
G    6.11   6.49      7.56   6.59      6.30   7.22      6.90   6.56
T    4.24   5.91      6.18   5.99      8.14   6.56      5.86   5.97
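The observed and expected values can be sketched as follows. This is an illustration under stated assumptions, not the routine's actual implementation: the function name `dinucleotide_table` is hypothetical, pairs are assumed to be counted in overlapping fashion (positions 1-2, 2-3, and so on), and the expected percentage of a pair is assumed to be the product of the frequencies of its two bases.

```python
from collections import Counter
from itertools import product


def dinucleotide_table(seq, bases="ACGT"):
    """Return {pair: (observed_percent, expected_percent)} for all 16 pairs.

    Assumptions: overlapping pairs are counted, pairs containing a symbol
    outside `bases` are skipped, and expected = freq(first) * freq(second).
    """
    seq = seq.upper()
    pairs = Counter(
        a + b for a, b in zip(seq, seq[1:]) if a in bases and b in bases
    )
    n_pairs = sum(pairs.values())
    singles = Counter(c for c in seq if c in bases)
    n_bases = sum(singles.values())

    table = {}
    for a, b in product(bases, repeat=2):
        obs = 100.0 * pairs[a + b] / n_pairs if n_pairs else 0.0
        if n_bases:
            exp = 100.0 * (singles[a] / n_bases) * (singles[b] / n_bases)
        else:
            exp = 0.0
        table[a + b] = (obs, exp)
    return table


if __name__ == "__main__":
    for pair, (obs, exp) in dinucleotide_table("ACGTTGCAACGG").items():
        print(f"{pair}  {obs:6.2f}  {exp:6.2f}")
```

The product assumption agrees with the example figures: a sequence that is 24.17% A gives an expected AA value of 0.2417 × 0.2417 ≈ 5.84%, the value shown in the table.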





This page is maintained by staden-package. Last generated on 25 April 2003.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/spin_unix_11.html