Sixteen P1 and TAC clones assigned to Arabidopsis thaliana chromosome 5 were sequenced, and their sequence features were analyzed using various computer programs. The total length of the sequences determined was 1,013,767 bp. Together with the nucleotide sequences of the 109 clones previously reported, the sequenced regions of chromosome 5 now total 9,072,622 bp, which presumably covers approximately one-third of the chromosome. A similarity search against reported gene sequences predicted a total of 225 protein-coding genes and/or gene segments in the newly sequenced regions, indicating an average gene density of one gene per 4.5 kb. Introns were identified in 72.4% of the potential protein-coding genes for which the entire gene structure was predicted; the average number of introns per gene was 3.3, and the average intron length was 163 bp. These sequence features are essentially identical to those of the previously reported sequences. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
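The gene density quoted above follows directly from the reported totals, and the arithmetic can be checked with a minimal Python sketch. The variable and function names here are illustrative assumptions and are not part of the original analysis; only the three input figures come from the text.

```python
# Figures quoted in the abstract (the names are hypothetical, chosen for illustration).
new_sequence_bp = 1_013_767      # length of the 16 newly sequenced clones
predicted_genes = 225            # predicted protein-coding genes and/or gene segments
total_sequenced_bp = 9_072_622   # new sequence plus the 109 previously reported clones


def gene_density_kb(sequence_bp: int, gene_count: int) -> float:
    """Return the average sequence length per gene, in kilobases."""
    return sequence_bp / gene_count / 1000


density_kb = gene_density_kb(new_sequence_bp, predicted_genes)

# Sequence contributed by the previously reported clones, by subtraction.
previously_reported_bp = total_sequenced_bp - new_sequence_bp

print(f"Average density: one gene per {density_kb:.1f} kb")
print(f"Previously reported sequence: {previously_reported_bp:,} bp")
```

Rounded to one decimal place, the density agrees with the stated value of one gene per 4.5 kb.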
- Arabidopsis thaliana chromosome 5
- Gene prediction
- Genomic sequence
- P1 genomic library
- TAC genomic library