lab 3
bios25328
lab
In this lab, you will run all the steps of a GWAS analysis using the Marees et al. tutorial.
Instructions
- Read and summarize in a few sentences this GWAS tutorial paper https://onlinelibrary.wiley.com/doi/10.1002/mpr.1608
- Download plink from LINK (choose the version for your operating system; if you are running on posit.cloud, choose the Linux version even if you access posit from a different operating system)
- Create a GitHub account if you don’t already have one
- Git clone the tutorial from the command line (run git clone https://github.com/MareesAT/GWA_tutorial.git on the terminal). You will need to unzip files that look like *.zip
- Run the QC and Association components of the tutorial
- 1_Main_script_QC_GWAS.txt and 3_Main_script_association_GWAS.txt
- you may want to download the hapmap data from here https://uchicago.box.com/s/hawatrohw6fthytguaww83njrf5i6ace
- you may need to download plink (https://www.cog-genomics.org/plink/1.9/)
- The tutorial is designed to be run step by step, but because 2_Population_stratification.txt is computationally time-consuming, you can skip it and download the files you need to run the associations here https://uchicago.box.com/s/ux2xkab6zhth0csazixqoj98xtjh7h0x
- 1_Main_script_QC_GWAS.txt and 3_Main_script_association_GWAS.txt
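Before running the tutorial scripts, it can help to confirm which operating system your terminal is actually running on and whether plink is reachable. The following is a minimal sketch, assuming the executable is named plink and is on your PATH; if you unpacked it into the working directory instead, you would call it as ./plink and this check will report it as not found.

```shell
# Report the operating system of the terminal you are typing in.
# On posit.cloud this is expected to print "Linux", whatever your laptop runs.
os="$(uname -s)"
echo "Terminal OS: $os"

# Check whether an executable called plink can be found on the PATH.
# (Assumption: the binary is named "plink"; adjust if yours differs.)
if command -v plink >/dev/null 2>&1; then
  plink_status="found at $(command -v plink)"
else
  plink_status="not found on PATH"
fi
echo "plink: $plink_status"
```

If the first line prints Linux, download the Linux build of plink; Darwin indicates macOS.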
Notes
Some modifications to the code may be needed. Use them if you encounter errors.
In Relatedness.R, change lines 7 and 15 to
legend(1,1, xjust=1, yjust=1, legend=levels(factor(relatedness$RT)), pch=16, col=c(4,3))
legend(0.02,1, xjust=1, yjust=1, legend=levels(factor(relatedness$RT)), pch=16, col=c(4,3))
On line 31 of 2_Main_script_MDS.txt, replace
plink --bfile ALL.2of4intersection.20100804.genotypes --set-missing-var-ids @:#[b37]\$1,\$2 --make-bed --out ALL.2of4intersection.20100804.genotypes_no_missing_IDs
with
plink --bfile ALL.2of4intersection.20100804.genotypes --set-missing-var-ids '@:#[b37]$1,$2' --make-bed --out ALL.2of4intersection.20100804.genotypes_no_missing_IDs
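The point of the single quotes is to stop the shell from interpreting characters in the ID template before plink sees it. The sketch below illustrates this with printf instead of plink, so it runs without any data; show_quoting is a hypothetical helper, and A and T are made-up stand-ins for allele codes.

```shell
# Hypothetical helper: print the same template under two quoting styles.
show_quoting() {
  # Single quotes: the template reaches the program exactly as typed.
  printf 'single: %s\n' '@:#[b37]$1,$2'
  # Double quotes: the shell replaces $1 and $2 with this function's arguments.
  printf 'double: %s\n' "@:#[b37]$1,$2"
}

show_quoting A T
```

With single quotes the literal $1 and $2 survive, which is what plink needs in order to substitute the allele names itself.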
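On line 144 of 2_Main_script_MDS.txt, the one-line sed command fails with BSD sed (the default on macOS), as the first attempt below shows. Replace it with the two-line form that follows, which breaks the line right after 1i\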
(base) haekyungim@Im-Lab-016 1_QC_GWAS % cat race_1kG14.txt racefile_own.txt | sed -e '1i\FID IID race' > racefile.txt
sed: 1: "1i\FID IID race
": extra characters after \ at the end of i command
(base) haekyungim@Im-Lab-016 1_QC_GWAS % cat race_1kG14.txt racefile_own.txt | sed -e '1i\
FID IID race' > racefile.txt
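If the sed form is still troublesome, the header can be added without sed at all. This is a sketch of an alternative, not the tutorial's own command: it writes the header line first and then appends the two files. The temporary directory and the two data rows are made up for illustration; in the tutorial you would use the real race_1kG14.txt and racefile_own.txt in your working directory.

```shell
# Toy stand-ins for the tutorial's input files (made-up rows).
workdir="$(mktemp -d)"
printf 'fam1 ind1 EUR\n' > "$workdir/race_1kG14.txt"
printf 'fam2 ind2 OWN\n' > "$workdir/racefile_own.txt"

# Write the header, then the concatenated files, into racefile.txt.
{
  printf 'FID IID race\n'
  cat "$workdir/race_1kG14.txt" "$workdir/racefile_own.txt"
} > "$workdir/racefile.txt"

# Show the result: the header followed by the two data rows.
cat "$workdir/racefile.txt"
```

Because it relies only on printf and cat, this form behaves the same with GNU and BSD tools.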
Install qqman in R, then in Manhattan_plot.R and QQ_plot.R comment out the original install.packages and library lines and load the package with a plain library call:
##install.packages("qqman",repos="http://cran.cnr.berkeley.edu/",lib="~" ) # location of installation can be changed but has to correspond with the library location
##library("qqman",lib.loc="~")
library("qqman")
Questions
Using the output from the tutorial or using the commands you learned from it, answer the following questions. Show the command you used to create the result.
- How many individuals are in the genotype file you downloaded? (5 pts)
- Explain the contents of .fam, .bim, .bed files (5 pts)
- Write the captions for the figures generated by the commands in 1_Main_script_QC_GWAS.txt and 3_Main_script_association_GWAS.txt (5 pts per figure caption)
- Explain what you accomplished with the tutorial and explain the results/figures you obtained. (20 pts)