TY - CHAP
T1 - Systematic exploration of an efficient amino acid substitution matrix
T2 - MIQS
AU - Tomii, Kentaro
AU - Yamada, Kazunori
N1 - Funding Information:
This work was partially supported by Platform Project for Supporting in Drug Discovery and Life Science Research (Platform for Drug Discovery, Informatics, and Structural Life Science) from the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) and Japan Agency for Medical Research and Development (AMED). We thank Drs. Somlata Gupta, Kumiko Nakada-Tsukui, and Tomoyoshi Nozaki of NIID for discussions related to IMD/IBAR domains in E. histolytica. We thank Toshiyuki Oda for conducting the HHblits search.
Publisher Copyright:
© Springer Science+Business Media New York 2016.
PY - 2016/8/1
Y1 - 2016/8/1
N2 - Comparing amino acid sequences to find similarities between proteins is a fundamental analysis for inferring protein structure and function. In this study, we improve amino acid substitution matrices for identifying distantly related proteins. We systematically sampled and benchmarked substitution matrices generated from a principal component analysis (PCA) subspace based on a set of typical existing matrices. Based on the benchmark results, we used kernel density estimation (KDE) to identify a region of highly sensitive matrices in the PCA subspace. Using the PCA subspace, we deduced a novel sensitive matrix, called MIQS, which detects distantly related proteins better than existing matrices. This approach to deriving an efficient amino acid substitution matrix might influence many fields of protein sequence analysis. MIQS is available at http://csas.cbrc.jp/Ssearch/.
AB - Comparing amino acid sequences to find similarities between proteins is a fundamental analysis for inferring protein structure and function. In this study, we improve amino acid substitution matrices for identifying distantly related proteins. We systematically sampled and benchmarked substitution matrices generated from a principal component analysis (PCA) subspace based on a set of typical existing matrices. Based on the benchmark results, we used kernel density estimation (KDE) to identify a region of highly sensitive matrices in the PCA subspace. Using the PCA subspace, we deduced a novel sensitive matrix, called MIQS, which detects distantly related proteins better than existing matrices. This approach to deriving an efficient amino acid substitution matrix might influence many fields of protein sequence analysis. MIQS is available at http://csas.cbrc.jp/Ssearch/.
KW - Amino acid substitution matrix
KW - Pairwise alignment
KW - Protein sequence comparison
KW - Remote homology detection
UR - http://www.scopus.com/inward/record.url?scp=84985036838&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84985036838&partnerID=8YFLogxK
U2 - 10.1007/978-1-4939-3572-7_11
DO - 10.1007/978-1-4939-3572-7_11
M3 - Chapter
C2 - 27115635
AN - SCOPUS:84985036838
T3 - Methods in Molecular Biology
SP - 211
EP - 223
BT - Methods in Molecular Biology
PB - Humana Press Inc.
ER -