TY - JOUR
T1 - Structural analysis of Arabidopsis thaliana chromosome 5. V. Sequence features of the regions of 1,381,565 bp covered by twenty one physically assigned P1 and TAC clones
AU - Kaneko, Takakazu
AU - Kotani, Hirokazu
AU - Nakamura, Yasukazu
AU - Sato, Shusei
AU - Asamizu, Erika
AU - Miyajima, Nobuyuki
AU - Tabata, Satoshi
PY - 1998
Y1 - 1998
N2 - The nucleotide sequences of 21 P1 and TAC clones that have been precisely localized to the fine physical map of Arabidopsis thaliana chromosome 5 were determined, and their sequence features were analyzed. The total length of the regions sequenced in this study was 1,381,565 bp, bringing the total length of the chromosome 5 sequences determined so far to 6,691,670 bp together with the regions of the 69 clones previously reported. By computer-aided analyses including similarity search against protein and EST databases and gene modeling with computer programs, a total of 337 potential protein-coding genes and/or gene segments were identified on the basis of similarity to the reported gene sequences. The average density of the genes and/or gene segments thus assigned was 1 gene / 4,100 bp. Introns were identified in 76.7% of the potential protein genes for which the entire gene structure was predicted, and the average number per gene and the average length of the introns were 3.9 and 176 bp, respectively. These sequence features are essentially identical to those in the previously reported sequences. The numbers of the Arabidopsis ESTs matched to each of the predicted genes have been counted to monitor the transcription level. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
AB - The nucleotide sequences of 21 P1 and TAC clones that have been precisely localized to the fine physical map of Arabidopsis thaliana chromosome 5 were determined, and their sequence features were analyzed. The total length of the regions sequenced in this study was 1,381,565 bp, bringing the total length of the chromosome 5 sequences determined so far to 6,691,670 bp together with the regions of the 69 clones previously reported. By computer-aided analyses including similarity search against protein and EST databases and gene modeling with computer programs, a total of 337 potential protein-coding genes and/or gene segments were identified on the basis of similarity to the reported gene sequences. The average density of the genes and/or gene segments thus assigned was 1 gene / 4,100 bp. Introns were identified in 76.7% of the potential protein genes for which the entire gene structure was predicted, and the average number per gene and the average length of the introns were 3.9 and 176 bp, respectively. These sequence features are essentially identical to those in the previously reported sequences. The numbers of the Arabidopsis ESTs matched to each of the predicted genes have been counted to monitor the transcription level. The sequence data and gene information are available on the World Wide Web database KAOS (Kazusa Arabidopsis data Opening Site) at http://www.kazusa.or.jp/arabi/.
KW - Arabidopsis thaliana chromosome 5
KW - Gene prediction
KW - Genomic sequence
KW - P1 genomic library
KW - TAC genomic library
UR - http://www.scopus.com/inward/record.url?scp=0032580052&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0032580052&partnerID=8YFLogxK
U2 - 10.1093/dnares/5.2.131
DO - 10.1093/dnares/5.2.131
M3 - Article
C2 - 9679202
AN - SCOPUS:0032580052
SN - 1340-2838
VL - 5
SP - 131
EP - 145
JO - DNA Research
JF - DNA Research
IS - 2
ER -