Lecture 6 - Mixed Effects Models and LDSC
lecture
bios25328
Lecture 6 - Methods for correcting population structure in genetic association studies
Find the lecture notes here.
Learning Objectives
Population Structure and Association Studies
- Explain why correcting for population structure is necessary in genetic association studies
- Describe the assumptions and limitations of the Genomic Control method
- Outline the approach of inferring latent sub-populations and performing association within them
- Understand how Principal Components Analysis (PCA) is used to adjust for population structure
Mixed Effects Models (MEMs)
- Explain the concept of a Mixed Effects Model in the context of genetic association studies
- Define the components of the MEM equation ($Y = X\beta + u + \epsilon$) and the role of the genetic relatedness matrix ($K$)
- Describe how MEMs account for population structure, family structure, and cryptic relatedness
LD Score Regression
- Define Linkage Disequilibrium (LD) Score and explain how it’s calculated
- Explain the principle behind LD Score Regression and how it distinguishes between polygenicity and confounding
- Interpret the components of the LD Score Regression equation ($E[\chi^2 \mid \ell_j] = N h^2 \ell_j / M + N a + 1$)
- Understand how LD Score Regression can be used to estimate SNP-based heritability
- Explain how LD Score Regression can be extended to estimate genetic correlation between traits
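For reference, the quantities named in the objectives above can be written out in standard notation. This is only a sketch: the variance components $\sigma_g^2$ and $\sigma_e^2$, the SNP indices $j, k$, and the SNP correlation $r_{jk}$ are symbols assumed here for illustration and do not appear in the objectives themselves.

```latex
% Mixed effects model: fixed effects X\beta, random genetic effect u, residual \epsilon
Y = X\beta + u + \epsilon, \qquad
u \sim \mathcal{N}(0, \sigma_g^2 K), \qquad
\epsilon \sim \mathcal{N}(0, \sigma_e^2 I)

% Implied covariance of the phenotype across individuals
\operatorname{Var}(Y) = \sigma_g^2 K + \sigma_e^2 I

% LD score of SNP j: sum of squared correlations with all M SNPs
\ell_j = \sum_{k=1}^{M} r_{jk}^2
```

Under this notation, $K$ determines how strongly the phenotypes of any two individuals covary through $u$, which is how relatedness enters the model.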
Summary of Lecture Notes
Methods for Correcting Population Structure
- Genomic Control
  - Older method that assumes population stratification inflates test statistics uniformly across the genome
  - Uses a correction factor (λ) estimated from the distribution of test statistics
  - GWAS results are considered acceptable if λ is close to 1
  - Now largely superseded by LD score regression
- Inferring Latent Sub-populations
  - Identifies underlying sub-populations (e.g., using STRUCTURE)
  - Performs association analysis within each sub-population
  - Followed by meta-analysis of the results
- Principal Components Analysis (PCA)
  - Uses genetic principal components as covariates in regression models
  - Accounts for population structure through linear combinations of genetic variants
- Mixed Effects Models (MEMs)
  - Incorporate a random effect term to account for genetic relatedness
  - Use the genetic relatedness matrix (K) to define the covariance structure
  - Effectively correct for inflation in test statistics
  - Capture population structure, family structure, and cryptic relatedness
- LD Score Regression
  - Distinguishes between inflation due to confounding and true polygenicity
  - Regresses the association test statistic (χ²) on the LD score
  - Intercept (Na + 1) estimates inflation from confounding
  - Slope (Nh²/M) is proportional to SNP-based heritability
  - Can estimate genetic correlations between traits
- Mathematical basis includes random effect variance structure and LD Score regression equation
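The LD Score Regression bullets above can be illustrated with a small simulation. The sketch below is a toy example under loud assumptions, not the procedure used by any particular software: all parameter values (`M`, `N`, `h2`, `a`) are hypothetical, the LD scores are drawn from a gamma distribution rather than computed from genotypes, SNPs are treated as independent, and the fit is plain unweighted least squares, whereas real analyses typically use weighted regression with block-jackknife standard errors. It also computes the genomic control λ on the same data to show why a single inflation factor cannot separate polygenicity from confounding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation settings (assumptions, not values from the lecture)
M = 500_000   # number of SNPs
N = 50_000    # GWAS sample size
h2 = 0.3      # true SNP-based heritability
a = 2e-6      # confounding term, so the true intercept is N*a + 1 = 1.1

# Toy LD scores; in practice these come from a reference panel
ld = rng.gamma(shape=2.0, scale=50.0, size=M)

# Model: E[chi2 | l_j] = N*h2*l_j/M + N*a + 1
expected = N * h2 * ld / M + N * a + 1

# A scaled 1-df chi-square has the required mean for each SNP
chi2 = expected * rng.chisquare(df=1, size=M)

# Unweighted least squares of chi2 on LD score (polyfit returns slope first)
slope, intercept = np.polyfit(ld, chi2, deg=1)

h2_hat = slope * M / N        # slope = N*h2/M
a_hat = (intercept - 1) / N   # intercept = N*a + 1

# Genomic control: median chi2 divided by the median of a 1-df chi-square
lambda_gc = np.median(chi2) / 0.4549364

print(f"estimated h2:        {h2_hat:.3f}")
print(f"estimated intercept: {intercept:.3f}")
print(f"genomic control lambda: {lambda_gc:.2f}")
```

In this setup the fitted intercept stays near its true value while λ is well above 1, because most of the inflation comes from polygenic signal that scales with LD score rather than from confounding.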
Applications
- Examples of applying LD Score regression to estimate heritability
- Genetic correlation analysis between diseases (e.g., schizophrenia, cancers, psychiatric disorders)
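The genetic correlation application follows the same logic as the single-trait regression, with the product of z-scores from two studies replacing χ². The block below is a sketch of the commonly cited cross-trait form. The symbols are assumptions introduced here for illustration: $z_{1j}, z_{2j}$ are the z-scores of SNP $j$ in the two studies, $N_1, N_2$ the sample sizes, $N_s$ the number of overlapping samples, $\rho_g$ the genetic covariance, $\rho$ the phenotypic correlation among overlapping samples, and $h_1^2, h_2^2$ the SNP-based heritabilities of the two traits.

```latex
% Cross-trait LD score regression: the slope carries the genetic covariance,
% the intercept absorbs sample overlap
E[z_{1j} z_{2j} \mid \ell_j]
  = \frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j
  + \frac{\rho\, N_s}{\sqrt{N_1 N_2}}

% Genetic correlation: genetic covariance scaled by the two heritabilities
r_g = \frac{\rho_g}{\sqrt{h_1^2\, h_2^2}}
```

Because sample overlap affects only the intercept, the slope-based estimate of $\rho_g$ can be computed from summary statistics of studies that share participants.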