Clustering by phenotype and genome-wide association study in autism

Akira Narita, Masato Nagai, Satoshi Mizuno, Soichi Ogishima, Gen Tamiya, Masao Ueki, Rieko Sakurai, Satoshi Makino, Taku Obara, Mami Ishikuro, Chizuru Yamanaka, Hiroko Matsubara, Yasutaka Kuniyoshi, Keiko Murakami, Fumihiko Ueno, Aoi Noda, Tomoko Kobayashi, Mika Kobayashi, Takuma Usuzaki, Hisashi Ohseto, Atsushi Hozawa, Masahiro Kikuya, Hirohito Metoki, Shigeo Kure, Shinichi Kuriyama

Research output: Contribution to journal › Article › peer-review

19 Citations (Scopus)


Autism spectrum disorder (ASD) has phenotypically and genetically heterogeneous characteristics. A simulation study demonstrated that attempts to categorize patients with a complex disease into more homogeneous subgroups could have more power to elucidate hidden heritability. We conducted cluster analyses using the k-means algorithm with a cluster number of 15 based on phenotypic variables from the Simons Simplex Collection (SSC). As a preliminary study, we conducted a conventional genome-wide association study (GWAS) with a data set of 597 ASD cases and 370 controls. In the second step, we divided cases based on the clustering results and conducted GWAS in each of the subgroups vs controls (cluster-based GWAS). We also conducted cluster-based GWAS on another SSC data set of 712 probands and 354 controls in the replication stage. In the preliminary study, which was conducted in conventional GWAS design, we observed no significant associations. In the second step of cluster-based GWASs, we identified 65 chromosomal loci, which included 30 intragenic loci located in 21 genes and 35 intergenic loci that satisfied the threshold of P < 5.0 × 10⁻⁸. Some of these loci were located within or near previously reported candidate genes for ASD: CDH5, CNTN5, CNTNAP5, DNAH17, DPP10, DSCAM, FOXK1, GABBR2, GRIN2A5, ITPR1, NTM, SDK1, SNCA, and SRRM4. Of these 65 significant chromosomal loci, rs11064685 located within the SRRM4 gene had a significantly different distribution in the cases vs controls in the replication cohort. These findings suggest that clustering may successfully identify subgroups with relatively homogeneous disease etiologies. Further cluster validation and replication studies are warranted in larger cohorts.
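
The two-step design described in the abstract (cluster the cases by phenotype, then test each cluster against the shared controls) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which association test or software was used, so the 1-df allelic chi-square test below is a stand-in, and the function names `kmeans`, `allelic_chi2_p`, and `cluster_based_gwas` are hypothetical. Genotypes are assumed to be coded as minor-allele counts (0, 1, 2), and no covariates or quality-control steps are modeled.

```python
import math
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    # Initialise centers from k randomly chosen data points.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def allelic_chi2_p(case_genotypes, control_genotypes):
    """P value of a 1-df allelic chi-square test on a 2x2 allele-count table."""
    a = sum(case_genotypes)                  # minor alleles in cases
    b = 2 * len(case_genotypes) - a          # major alleles in cases
    c = sum(control_genotypes)               # minor alleles in controls
    d = 2 * len(control_genotypes) - c       # major alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:                           # monomorphic SNP or empty group
        return 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def cluster_based_gwas(case_phenotypes, case_genotypes, control_genotypes,
                       k, alpha=5.0e-8):
    """Test every SNP in each phenotype cluster of cases vs all controls."""
    labels = kmeans(case_phenotypes, k)
    n_snps = len(control_genotypes[0])
    hits = []
    for cluster in range(k):
        members = [g for g, lab in zip(case_genotypes, labels) if lab == cluster]
        if not members:
            continue
        for snp in range(n_snps):
            p = allelic_chi2_p([g[snp] for g in members],
                               [g[snp] for g in control_genotypes])
            if p < alpha:
                hits.append((cluster, snp, p))
    return hits
```

Calling `cluster_based_gwas(phenotypes, case_genotypes, control_genotypes, k=15)` would mirror the cluster number reported in the abstract. Note that running one genome-wide scan per cluster multiplies the number of tests, which is one reason replication in an independent data set matters for this design.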

Original language: English
Article number: 290
Journal: Translational Psychiatry
Issue number: 1
Publication status: Published - 2020 Dec 1

