TY - JOUR
T1 - Variational Bayesian mixture model on a subspace of exponential family distributions
AU - Watanabe, Kazuho
AU - Akaho, Shotaro
AU - Omachi, Shinichiro
AU - Okada, Masato
N1 - Funding Information:
Manuscript received June 30, 2008; revised May 21, 2009; accepted July 30, 2009. First published September 18, 2009; current version published November 04, 2009. This work was supported by the Grant-in-Aid for Scientific Research on Priority Areas “Deepening and Expansion of Statistical Mechanical Informatics (DEX-SMI)” and for Young Scientists (Startup) 20800012.
Publisher Copyright:
© 2009 IEEE.
PY - 2009/11/1
Y1 - 2009/11/1
AB - Exponential principal component analysis (e-PCA) has been proposed to reduce the dimension of the parameters of probability distributions, using the Kullback information as a distance between two distributions. It also provides a framework for dealing with data types, such as binary and integer data, for which the Gaussian assumption on the data distribution is inappropriate. In this paper, we introduce a latent variable model for e-PCA. Assuming a discrete distribution on the latent variable leads to mixture models with a constraint on their parameters. This provides a framework for clustering on the lower-dimensional subspace of exponential family distributions. We derive a learning algorithm for these mixture models based on the variational Bayes (VB) method. Although an intractable integration is required to implement the algorithm for a subspace, an approximation technique based on Laplace's method allows us to carry out clustering on an arbitrary subspace. Combined with the estimation of the subspace, the resulting algorithm performs simultaneous dimensionality reduction and clustering. Numerical experiments on synthetic and real data demonstrate its effectiveness for extracting the structure of data as a visualization technique, as well as its high generalization ability as a density estimation model.
KW - Exponential family
KW - Exponential principal component analysis (e-PCA)
KW - Mixture model
KW - Variational Bayes (VB) method
UR - http://www.scopus.com/inward/record.url?scp=75549089874&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=75549089874&partnerID=8YFLogxK
U2 - 10.1109/TNN.2009.2029694
DO - 10.1109/TNN.2009.2029694
M3 - Article
C2 - 19770092
AN - SCOPUS:75549089874
SN - 1045-9227
VL - 20
SP - 1783
EP - 1796
JO - IEEE Transactions on Neural Networks
JF - IEEE Transactions on Neural Networks
IS - 11
M1 - 5247016
ER -