TY - JOUR
T1 - A state space representation of VAR models with sparse learning for dynamic gene networks
AU - Kojima, Kaname
AU - Yamaguchi, Rui
AU - Imoto, Seiya
AU - Yamauchi, Mai
AU - Nagasaki, Masao
AU - Yoshida, Ryo
AU - Shimamura, Teppei
AU - Ueno, Kazuko
AU - Higuchi, Tomoyuki
AU - Gotoh, Noriko
AU - Miyano, Satoru
PY - 2010/1
Y1 - 2010/1
N2 - We propose a state space representation of the vector autoregressive model and its sparse learning based on L1 regularization to achieve efficient estimation of dynamic gene networks from time course microarray data. The proposed method overcomes drawbacks of both the vector autoregressive model and the state space model: the former assumes equal time intervals and cannot separate observation noise from system noise, and the latter assumes a modular network structure. However, in a simple implementation the proposed model requires large inverse matrices to be calculated many times during the parameter estimation process based on the EM algorithm. This limits the applicability of the proposed method to a relatively small gene set. We thus introduce a new calculation technique for the EM algorithm that does not require the calculation of inverse matrices. The proposed method is applied to time course microarray data of lung cells treated by stimulating EGF receptors and dosing with an anticancer drug, Gefitinib. By comparing the estimated network with the control network estimated using non-treated lung cells, genes perturbed by the anticancer drug could be found; their up- and down-stream genes in the estimated networks may be related to side effects of the drug.
AB - We propose a state space representation of the vector autoregressive model and its sparse learning based on L1 regularization to achieve efficient estimation of dynamic gene networks from time course microarray data. The proposed method overcomes drawbacks of both the vector autoregressive model and the state space model: the former assumes equal time intervals and cannot separate observation noise from system noise, and the latter assumes a modular network structure. However, in a simple implementation the proposed model requires large inverse matrices to be calculated many times during the parameter estimation process based on the EM algorithm. This limits the applicability of the proposed method to a relatively small gene set. We thus introduce a new calculation technique for the EM algorithm that does not require the calculation of inverse matrices. The proposed method is applied to time course microarray data of lung cells treated by stimulating EGF receptors and dosing with an anticancer drug, Gefitinib. By comparing the estimated network with the control network estimated using non-treated lung cells, genes perturbed by the anticancer drug could be found; their up- and down-stream genes in the estimated networks may be related to side effects of the drug.
UR - http://www.scopus.com/inward/record.url?scp=77954653646&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954653646&partnerID=8YFLogxK
U2 - 10.1142/9781848165786_0006
DO - 10.1142/9781848165786_0006
M3 - Article
C2 - 20238419
AN - SCOPUS:77954653646
SN - 0919-9454
VL - 22
SP - 56
EP - 68
JO - Genome informatics. International Conference on Genome Informatics
JF - Genome informatics. International Conference on Genome Informatics
ER -