Abstract
In gene-gene interaction analysis using single nucleotide polymorphism (SNP) data, empty cells arise in the genotype contingency table more frequently than in single-SNP association studies. Empty cells lead to unidentifiable regression coefficients in regression model fitting. It is unclear whether the degrees of freedom (d.f.) for testing interactions are reduced for such sparse contingency tables. Boolean Operation-based Screening and Testing (BOOST) is an exhaustive gene-gene interaction search method in which a fixed d.f. of four (the most conservative choice) is used in the chi-squared null distribution for the likelihood ratio test for gene-gene interactions under a logistic regression model. In this paper, the choice of d.f. is investigated theoretically by introducing a decomposition of type I error. An adaptive method using the observed d.f. can be less conservative than the fixed d.f. method, thereby enhancing power. In simulated data, type I error rates for the adaptive method were usually better controlled under various scenarios for Gaussian linear regression and logistic regression, including prospective and retrospective sampling designs, as well as for artificial data that mimic actual genome-wide SNPs. When the adaptive method was applied to public datasets generated from simulations, it exhibited an improvement in power over the fixed method.
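To make the contrast between the fixed and adaptive d.f. concrete, the following Python sketch may help. It is not the paper's implementation. It rests on two assumptions that the abstract does not spell out: a genotype combination counts as empty when no subject carries it, and the observed d.f. is the number of identifiable interaction terms, computed as the rank of the full-model design matrix minus the rank of the main-effects design matrix over the non-empty cells. The function names (`matrix_rank`, `interaction_df`, `chi2_sf`) and the example counts and test statistic are hypothetical.

```python
import math
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            if m[i][col] != 0:
                factor = m[i][col] / m[rank][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def interaction_df(counts):
    """Assumed 'observed d.f.': identifiable interaction terms in a 3x3 table.

    counts[i][j] is the number of subjects with genotype i at the first SNP
    and genotype j at the second SNP (cases and controls pooled).
    """
    cells = [(i, j) for i in range(3) for j in range(3) if counts[i][j] > 0]
    if not cells:
        return 0
    # Intercept plus dummy-coded main effects, one row per non-empty cell.
    main = [[1, int(i == 1), int(i == 2), int(j == 1), int(j == 2)]
            for i, j in cells]
    # Full model appends the four interaction dummies.
    full = [row + [int(i == a and j == b) for a in (1, 2) for b in (1, 2)]
            for row, (i, j) in zip(main, cells)]
    return matrix_rank(full) - matrix_rank(main)


def chi2_sf(x, k):
    """Upper-tail probability of a chi-squared variable with integer d.f. k."""
    if k <= 0:
        raise ValueError("d.f. must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if k % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, k // 2):
            term *= h / i
            total += term
        return math.exp(-h) * total
    total = sum(h ** (j - 0.5) / math.gamma(j + 0.5)
                for j in range(1, (k - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * total


# Hypothetical pooled genotype counts with one empty cell (both minor homozygotes).
example_counts = [[120, 60, 8], [70, 30, 3], [9, 4, 0]]
lrt_statistic = 8.5  # hypothetical likelihood ratio test statistic

observed_df = interaction_df(example_counts)
p_fixed = chi2_sf(lrt_statistic, 4)
# If no interaction term is identifiable there is nothing to test.
p_adaptive = chi2_sf(lrt_statistic, observed_df) if observed_df > 0 else 1.0
```

For the same test statistic, a chi-squared reference distribution with fewer d.f. yields a smaller p-value, which is why the fixed choice of four is the most conservative option and why an adaptive choice can gain power when cells are empty.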
Original language | English |
---|---|
Pages (from-to) | 4934-4948 |
Number of pages | 15 |
Journal | Statistics in Medicine |
Volume | 33 |
Issue number | 28 |
Publication status | Published - 2014 Dec 10 |
Keywords
- Decomposition of type I error
- Degrees of freedom
- Gene-gene interaction
- Prospective sampling
- Retrospective sampling
ASJC Scopus subject areas
- Epidemiology
- Statistics and Probability