TY - JOUR
T1 - Survey of the genetic information carried in the genome of Eucalyptus camaldulensis
AU - Hirakawa, Hideki
AU - Nakamura, Yasukazu
AU - Kaneko, Takakazu
AU - Isobe, Sachiko
AU - Sakai, Hiroe
AU - Kato, Tomohiko
AU - Hibino, Takashi
AU - Sasamoto, Shigemi
AU - Watanabe, Akiko
AU - Yamada, Manabu
AU - Nakayama, Shinobu
AU - Fujishiro, Tsunakazu
AU - Kishida, Yoshie
AU - Kohara, Mitsuyo
AU - Tabata, Satoshi
AU - Sato, Shusei
PY - 2011
Y1 - 2011
AB - The genetic information in the genome of Eucalyptus camaldulensis was investigated by sequencing the genome and the cDNA using a combination of the conventional Sanger method and next-generation sequencing methods, followed by intensive bioinformatics analyses. The total length of the non-redundant genomic sequences thus obtained was 654,922,307 bp consisting of 81,246 scaffolds and 121,194 singlets. These sequences accounted for approximately 92% of the gene-containing regions with an average G + C content of 33.6%. A total of 77,121 complete and partial structures of protein-encoding genes have been deduced. Comparison of the genes mapped on the KEGG pathways or located in the KOG classification with those in other plant species revealed the characteristics of the E. camaldulensis genome, and it was found that 23 pathways contained enzymes present only in the E. camaldulensis genome. Polymorphism analysis using microsatellite markers developed from the genomic sequence data obtained was performed with six Eucalyptus species collected from various parts of the world to estimate their genetic diversity, and the usefulness of these markers was demonstrated. The genomic sequence and accompanying information presented here are expected to serve as valuable resources for the acceleration of fundamental and applied research with Eucalyptus, especially in the fields of paper production and industrial materials. Further information on the genomic and cDNA sequences and microsatellite markers is available at http://www.kazusa.or.jp/eucaly/.
KW - Eucalyptus camaldulensis
KW - Genetic diversity
KW - Genome sequencing
KW - Microsatellite markers
KW - cDNA sequencing
UR - http://www.scopus.com/inward/record.url?scp=84857171896&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84857171896&partnerID=8YFLogxK
U2 - 10.5511/plantbiotechnology.11.1027b
DO - 10.5511/plantbiotechnology.11.1027b
M3 - Article
AN - SCOPUS:84857171896
SN - 1342-4580
VL - 28
SP - 471
EP - 480
JO - Plant Biotechnology
JF - Plant Biotechnology
IS - 5
ER -