ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Mathematics/Management 02 July 2022

An empirical Bayes method for genetic association analysis using case-control mother-child pair data

Cite this:
https://doi.org/10.52396/JUSTC-2022-0007
More Information
  • Author Bio:

    Yanan Zhao is currently a Ph.D. student under the supervision of Prof. Hong Zhang at the University of Science and Technology of China. Her research interests focus on empirical Bayes procedures and statistical genetics.

    Hong Zhang is a Full Professor at the University of Science and Technology of China (USTC). He received his Bachelor’s degree in Mathematics and his Ph.D. degree in Statistics from USTC in 1997 and 2003, respectively. His major research interests include statistical genetics, causal inference, and machine learning.

  • Corresponding author: E-mail: zhangh@ustc.edu.cn
  • Received Date: 08 January 2022
  • Accepted Date: 30 March 2022
  • Available Online: 02 July 2022
  • Case-control mother-child pair data are often used to investigate the effects of maternal and child genetic variants and environmental risk factors on obstetric and early-life phenotypes. The retrospective likelihood can fully utilize available information, such as Mendelian inheritance and the conditional independence between maternal environmental risk factors (covariates) and the child’s genotype given the maternal genotype, thus effectively improving statistical inference. Such a method is robust to some extent if no assumption is imposed on the relationship between the maternal genotype and covariates. Statistical efficiency can be considerably improved by assuming independence between the maternal genotype and covariates, but false-positive findings are inflated if the independence assumption is violated. In this study, two empirical Bayes (EB) estimators are derived by appropriately weighting the above retrospective-likelihood-based estimators, which intuitively balances statistical efficiency and robustness. The asymptotic normality of the two EB estimators is established, which can be used to construct confidence intervals and association tests for genetic effects and gene-environment interactions. Simulations and real-data analyses are conducted to demonstrate the performance of the new method.
    Based on a dependent-model estimator and an independent-model estimator, we obtain two Bayes-type estimators that balance robustness and efficiency.
    • The retrospective likelihood method is employed to improve statistical efficiency by fully utilizing available information.
    • An efficient estimator and a robust estimator are combined by the empirical Bayes method to construct two novel Bayes-type estimators.
    • We establish the asymptotic properties of the proposed estimators.
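
The weighting idea in the highlights above can be illustrated with a minimal sketch. This page does not give the weights used by the two EB estimators, so the code below is not the authors' method. It assumes a generic scalar shrinkage rule in which the squared difference between the robust and efficient estimates stands in for the squared bias. The names `eb_shrinkage` and `wald_p_value` are hypothetical, and the standard error passed to the Wald test is assumed to come from an asymptotic variance formula that is not shown here.

```python
import math


def eb_shrinkage(beta_robust, beta_efficient, var_robust):
    """Weighted average of a robust and an efficient scalar estimate.

    Assumption (not from the paper): the squared difference between the
    two estimates is used as a crude estimate of the squared bias caused
    by wrongly assuming gene-covariate independence.
    var_robust must be positive.
    """
    tau2 = (beta_robust - beta_efficient) ** 2
    # The weight on the robust estimate grows as evidence of bias grows.
    weight = tau2 / (tau2 + var_robust)
    return weight * beta_robust + (1.0 - weight) * beta_efficient


def wald_p_value(estimate, se):
    """Two-sided Wald p-value for H0: parameter = 0, using normality."""
    z = abs(estimate / se)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the study.
    combined = eb_shrinkage(beta_robust=0.40, beta_efficient=0.25, var_robust=0.04)
    print(combined)
    print(wald_p_value(combined, se=0.15))
```

When the two estimates agree, the weight on the robust estimate is zero and the rule returns the efficient estimate. When they differ by much more than the robust estimate's standard error, the rule moves toward the robust estimate, which is the efficiency-robustness balance described above.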

  • [1]
    Goddard K A, Tromp G, Romero R, et al. Candidate-gene association study of mothers with pre-eclampsia, and their infants, analyzing 775 SNPs in 190 genes. Human Heredity, 2007, 63 (1): 1–16. doi: 10.1159/000097926
    [2]
    Kanayama N, Takahashi K, Matsuura T, et al. Deficiency in p57Kip2 expression induces preeclampsia-like symptoms in mice. Molecular Human Reproduction, 2002, 8 (12): 1129–1135. doi: 10.1093/molehr/8.12.1129
    [3]
    Saftlas A F, Beydoun H, Triche E. Immunogenetic determinants of preeclampsia and related pregnancy disorders: A systematic review. Obstetrics and Gynecology, 2005, 106 (1): 162–172. doi: 10.1097/01.AOG.0000167389.97019.37
    [4]
    Wangler M F, Chang A S, Moley K H, et al. Factors associated with preterm delivery in mothers of children with Beckwith-Wiedemann syndrome: A case cohort study from the BWS registry. American Journal of Medical Genetics Part A, 2005, 134 (2): 187–191. doi: 10.1002/ajmg.a.30595
    [5]
    Goldenberg R L, Culhane J F, Iams J D, et al. Epidemiology and causes of preterm birth. The Lancet, 2008, 371 (9606): 75–84. doi: 10.1016/S0140-6736(08)60074-4
    [6]
    Zhang G, Feenstra B, Bacelis J, et al. Genetic associations with gestational duration and spontaneous preterm birth. The New England Journal of Medicine, 2017, 377 (12): 1156–1167. doi: 10.1056/NEJMoa1612665
    [7]
    Hong X, Hao K, Ji H, et al. Genome-wide approach identifies a novel gene-maternal pre-pregnancy BMI interaction on preterm birth. Nature Communications, 2017, 8 (1): 15608. doi: 10.1038/ncomms15608
    [8]
    Chen J, Zheng H, Wilson M L. Likelihood ratio tests for maternal and fetal genetic effects on obstetric complications. Genetic Epidemiology, 2009, 33 (6): 526–538. doi: 10.1002/gepi.20405
    [9]
    Fu W, Li M, Sun K, et al. Testing maternal-fetal genotype incompatibility with mother-offspring pair data. Journal of Proteomics and Genomics Research, 2013, 1 (2): 40–56. doi: 10.14302/issn.2326-0793.jpgr-12-160
    [10]
    Chen J, Lin D, Hochner H. Semiparametric maximum likelihood methods for analyzing genetic and environmental effects with case-control mother-child pair data. Biometrics, 2012, 68 (3): 869–877. doi: 10.1111/j.1541-0420.2011.01728.x
    [11]
    Lin D, Weinberg C R, Feng R, et al. A multi-locus likelihood method for assessing parent-of-origin effects using case-control mother-child pairs. Genetic Epidemiology, 2013, 37 (2): 152–162. doi: 10.1002/gepi.21700
    [12]
    Prentice R L, Pyke R. Logistic disease incidence models and case-control studies. Biometrika, 1979, 66 (3): 403–411. doi: 10.1093/biomet/66.3.403
    [13]
    Shi M, Umbach D M, Vermeulen S H, et al. Making the most of case-mother/control-mother studies. American Journal of Epidemiology, 2008, 168 (5): 541–547. doi: 10.1093/aje/kwn149
    [14]
    Zhang H, Mukherjee B, Arthur V, et al. An efficient and computationally robust statistical method for analyzing case-control mother-offspring pair genetic association studies. Annals of Applied Statics, 2020, 14 (2): 560–584. doi: 10.1214/19-AOAS1298
    [15]
    Chen Y H, Chatterjee N, Carroll R J. Shrinkage estimators for robust and efficient inference in haplotype-based case-control studies. Journal of the American Statistical Association, 2009, 104 (485): 220–233. doi: 10.1198/jasa.2009.0104
    [16]
    Owen A B. Empirical Likelihood. New York: Chapman and Hall/ CRC, 2001.
    [17]
    Zhang H, Chatterjee N, Rader D, et al. Adjustment of nonconfounding covariates in case-control genetic association studies. Annals of Applied Statistics, 2018, 12 (1): 200–221. doi: 10.1214/17-AOAS1065
    [18]
    Casella G, Berger R L. Statistical Inference. 2nd edition. Boston, MA: Cengage Learning, 2001.
    [19]
    Mukherjee B, Chatterjee N. Exploiting gene-environment independence for analysis of case-control studies: An empirical Bayes-type shrinkage estimator to trade-off between bias and efficiency. Biometrics, 2008, 64 (3): 685–694. doi: 10.1111/j.1541-0420.2007.00953.x
    [20]
    Zhang K, Zhang H, Hochner H, et al. Covariate adjusted inference of parent-of-origin effects using case-control mother-child paired multilocus genotype data. Genetic Epidemiology, 2021, 45 (8): 830–847. doi: 10.1002/gepi.22428
    [21]
    Engel S A M, Erichsen H C, Savitz D A, et al. Risk of spontaneous preterm birth is associated with common proinflammatory cytokine polymorphisms. Epidemiology, 2005, 16 (4): 469–477. doi: 10.1097/01.ede.0000164539.09250.31
    [22]
    Frey H A, Stout M J, Pearson L N, et al. Genetic variation associated with preterm birth in African-American women. American Journal of Obstetrics and Gynecology, 2016, 215 (2): 235.e1–235.e8. doi: 10.1016/j.ajog.2016.03.008
    [23]
    Haataja R, Karjalainen M K, Luukkonen A, et al. Mapping a new spontaneous preterm birth susceptibility gene, IGF1R, using linkage, haplotype sharing, and association analysis. PLoS Genetics, 2011, 7 (2): e1001293. doi: 10.1371/journal.pgen.1001293
    [24]
    Menon R, Velez D R, Simhan H, et al. Multilocus interactions at maternal tumor necrosis factor-α, tumor necrosis factor receptors, interleukin-6 and interleukin-6 receptor genes predict spontaneous preterm labor in European-American women. American Journal of Obstetrics and Gynecology, 2006, 194 (6): 1616–1624. doi: 10.1016/j.ajog.2006.03.059
    [25]
    Hendler I, Goldenberg R L, Mercer B M, et al. The preterm prediction study: Association between maternal body mass index and spontaneous and indicated preterm birth. American Journal of Obstetrics and Gynecology, 2005, 192 (3): 882–886. doi: 10.1016/j.ajog.2004.09.021
    [26]
    Frayling T M, Timpson N J, Weedon M N, et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science, 2007, 316 (5826): 889–894. doi: 10.1126/science.1141634
    [27]
    Purcell S, Neale B, Todd-Brown K, et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics, 2007, 81 (3): 559–575. doi: 10.1086/519795
    [28]
    Hamilton B E, Martin J A, Ventura S J. Births: Preliminary data for 2005. National Vital Statistics Reports, 2006, 55 (11): 1–18.
    [29]
    Slattery M M, Morrison J J. Preterm delivery. The Lancet, 2002, 360 (9344): 1489–1497. doi: 10.1016/S0140-6736(02)11476-0
    [30]
    Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nature Genetics, 2008, 40 (6): 695–701. doi: 10.1038/ng.f.136
    [31]
    Lee S, Abecasis G R, Boehnke M, et al. Rare-variant association analysis: Study designs and statistical tests. American Journal of Human Genetics, 2014, 95 (1): 5–23. doi: 10.1016/j.ajhg.2014.06.009
    [32]
    Schork N J, Murray S S, Frazer K A, et al. Common vs. rare allele hypotheses for complex diseases. Current Opinion in Genetics and Development, 2009, 19 (3): 212–219. doi: 10.1016/j.gde.2009.04.010
    [33]
    Ionita-Laza I, Lee S, Makarov V, et al. Family-based association tests for sequence data, and comparisons with population-based association tests. European Journal of Human Genetics, 2013, 21 (10): 1158–1162. doi: 10.1038/ejhg.2012.308
    [34]
    Jiang D, McPeek M S. Robust rare variant association testing for quantitative traits in samples with related individuals. Genetic Epidemiology, 2014, 38 (1): 10–20. doi: 10.1002/gepi.21775
    [35]
    Wang X, Lee S, Zhu X, et al. GEE-based SNP set association test for continuous and discrete traits in family-based association studies. Genetic Epidemiology, 2013, 37 (8): 778–786. doi: 10.1002/gepi.21763
    [36]
    Wang X, Zhang Z, Morris N, et al. Rare variant association test in family-based sequencing studies. Briefings in Bioinformatics, 2016, 18 (6): 954–961. doi: 10.1093/bib/bbw083

    Figure 1. Type-I error rates of the significance tests of $ \beta_{mx_1} $ with $ \rho=0 $ (HWE) or $ \rho=0.1 $ (HWD), various sample sizes ($ n_1=n_0=150 $, $ n_1=n_0=300 $, and $ n_1=n_0=1000 $), and various $ \eta $ values ($ 0 $ through $ \log(2.5) $). The other parameters were fixed: $ \beta_{c}=\log(1.8) $, $ \beta_{m}=\log(1.3) $, $ \beta_{x_1}=\beta_{x_2}=-\log(1.2) $, $ \beta_{cx_1}=\beta_{cx_2}=-\log(1.5) $, and $ \beta_{mx_2}=\log(1.2) $.

    Figure 2. Power of the significance tests of the maternal gene-environment interaction ($ \beta_{mx_1} $) with HWE ($ \rho=0 $), various sample sizes ($ n_1=n_0=150 $, $ n_1=n_0=300 $, and $ n_1=n_0=1000 $), and various $ \eta $ values ($ 0 $, $ \log(1.2) $, and $ \log(1.5) $). The other parameters were fixed: $ \beta_{c}=\log(1.8) $, $ \beta_{m}=\log(1.3) $, $ \beta_{x_1}=\beta_{x_2}=-\log(1.2) $, $ \beta_{cx_1}=\beta_{cx_2}=-\log(1.5) $, and $ \beta_{mx_2}=\log(1.2) $.

    Figure 3. Power of the significance tests of the maternal gene-environment interaction ($ \beta_{mx_1} $) with HWD ($ \rho=0.1 $), various sample sizes ($ n_1=n_0=150 $, $ n_1=n_0=300 $, and $ n_1=n_0=1000 $), and various $ \eta $ values ($ 0 $, $ \log(1.2) $, and $ \log(1.5) $). The other parameters were fixed: $ \beta_{c}=\log(1.8) $, $ \beta_{m}=\log(1.3) $, $ \beta_{x_1}=\beta_{x_2}=-\log(1.2) $, $ \beta_{cx_1}=\beta_{cx_2}=-\log(1.5) $, and $ \beta_{mx_2}=\log(1.2) $.

