TY - JOUR
T1 - Genome-wide analysis reveals strong correlation between CpG islands with nearby transcription start sites of genes and their tissue specificity
AU - Yamashita, Riu
AU - Suzuki, Yutaka
AU - Sugano, Sumio
AU - Nakai, Kenta
N1 - Funding Information: We thank H. Wakaguri, T. Ebata, and the members of Dynacom for technical support in the database construction. We also thank M.J.L. de Hoon, K. Tsuritani, and Y. Makita for helpful discussion and reading of the manuscript. This work was supported by Grants-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Science, Sports, and Culture of Japan.
PY - 2005/5/9
Y1 - 2005/5/9
N2 - It has been envisaged that CpG islands are often observed near the transcriptional start sites (TSS) of housekeeping genes. However, neither the precise positions of CpG islands relative to TSS of genes nor the correlation between the presence of the CpG islands and the expression specificity of these genes is well understood. Using thousands of sequences with known TSS in human and mouse, we found that there is a clear peak in the distribution of CpG islands around TSS in the genes of these two species. Thus, we classified human (mouse) genes into 6600 (2948) CpG+ genes and 2619 (1830) CpG- ones, based on the presence of a CpG island within the -100 to +100 region. We estimated the degree of each gene being a housekeeper by the number of cDNA libraries where its ESTs were detected. Then, the tendency that a gene lacking CpG islands around its TSS is expressed with a higher degree of tissue specificity turned out to be evolutionarily conserved. We also confirmed this tendency by analyzing the gene ontology annotation of classified genes. Since no such clear correlation was found in the control data (mRNAs, pre-mRNAs, and chromosome banding pattern), we concluded that the effect of a CpG island near the TSS should be more important than the global GC content of the region where the gene resides.
KW - CpG islands
KW - Housekeeping genes
KW - Isochores
KW - Tissue specificity
UR - http://www.scopus.com/inward/record.url?scp=19344368143&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=19344368143&partnerID=8YFLogxK
U2 - 10.1016/j.gene.2005.01.012
DO - 10.1016/j.gene.2005.01.012
M3 - Article
C2 - 15784181
AN - SCOPUS:19344368143
SN - 0378-1119
VL - 350
SP - 129
EP - 136
JO - Gene
JF - Gene
IS - 2
ER -