Identifying lineage effects when controlling for population structure improves power in bacterial association studies.
Earle SG., Wu CH., Charlesworth J., Stoesser N., Gordon NC., Walker TM., Spencer CCA., Iqbal Z., Clifton DA., Hopkins KL., Woodford N., Smith EG., Ismail N., Llewelyn MJ., Peto TE., Crook DW., McVean G., Walker AS., Wilson DJ.
Bacteria pose unique challenges for genome-wide association studies because of strong structuring into distinct strains and substantial linkage disequilibrium across the genome (1,2). Although methods developed for human studies can correct for strain structure (3,4), this risks considerable loss of power because genetic differences between strains often contribute substantial phenotypic variability (5). Here, we propose a new method that captures lineage-level associations even when locus-specific associations cannot be fine-mapped. We demonstrate its ability to detect genes and genetic variants underlying resistance to 17 antimicrobials in 3,144 isolates from four taxonomically diverse clonal and recombining bacteria: Mycobacterium tuberculosis, Staphylococcus aureus, Escherichia coli and Klebsiella pneumoniae. Strong selection, recombination and penetrance confer high power to recover known antimicrobial resistance mechanisms and reveal a candidate association between the outer membrane porin nmpC and cefazolin resistance in E. coli. Hence, our method pinpoints locus-specific effects where possible and boosts power by detecting lineage-level differences when fine-mapping is intractable.