Phasing of Genotyping Data
Principal Investigator(s): Prof. Karsten Suhre
With the advent of the new Affymetrix Genotyping 500k Arrays (http://www.affymetrix.com/products/arrays/specific/500k.affx), research on human genome variation has taken a major leap forward. With these devices it is now possible to genotype 500,000 single nucleotide polymorphisms (SNPs) in a single experiment at moderate cost, yielding for the first time dense whole-genome coverage over large cohorts. However, deducing haplotypes (genomic allele variants) from genotype data (SNPs) requires very time-consuming computational methods, which are largely based on Gibbs sampling approaches. The state of the art in this domain is the fastPHASE software (http://www.stat.washington.edu/stephens/software.html), which we will use in this project.
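To illustrate why phasing is computationally demanding, the following minimal Python sketch enumerates every pair of haplotypes that is consistent with one unphased genotype. It is not the fastPHASE algorithm, only a brute-force illustration. The function name and the 0/1/2 genotype coding (number of copies of one allele at each SNP) are assumptions made for this example. A genotype with k heterozygous SNPs is compatible with 2^(k-1) distinct haplotype pairs for k of at least 1, which is why exhaustive enumeration is infeasible at the scale of 500,000 SNPs and statistical methods are needed.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: sequence of ints in {0, 1, 2}, the number of copies of one
    allele at each SNP (hypothetical coding chosen for this illustration).
    Returns a list of (haplotype_a, haplotype_b) tuples of 0/1 alleles.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are unambiguous: 0 -> allele 0 on both, 2 -> allele 1 on both.
    base = [g // 2 for g in genotype]

    if not het_sites:
        hap = tuple(base)
        return [(hap, hap)]

    # Fix the first heterozygous site so that mirrored pairs are not counted twice.
    first, rest = het_sites[0], het_sites[1:]
    pairs = []
    for choice in product((0, 1), repeat=len(rest)):
        hap_a, hap_b = list(base), list(base)
        hap_a[first], hap_b[first] = 1, 0
        for site, allele in zip(rest, choice):
            hap_a[site] = allele
            hap_b[site] = 1 - allele
        pairs.append((tuple(hap_a), tuple(hap_b)))
    return pairs


if __name__ == "__main__":
    # Two heterozygous SNPs (positions 0 and 2) give two possible phasings.
    for hap_a, hap_b in compatible_haplotype_pairs([1, 2, 1, 0]):
        print(hap_a, hap_b)
```

Phasing software chooses among these candidates by borrowing information across individuals in the cohort rather than by listing them all.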
The scientific background of our project is the KORA study (http://www.gsf.de/KORA/), which has monitored the health status of several thousand volunteer participants since the 1980s. Recently, we genotyped 1644 of these participants using the Affymetrix 500k arrays, so that we are now in a position to use these data for whole-genome association studies, with a focus on multigenic diseases such as diabetes and coronary heart disease. By phasing the genotype data we will be able to move from looking at single SNPs to the medically much more relevant level of allele variants and haplotypes (see http://www.hapmap.org/). We expect to detect new associations between genomic variants in the KORA population and complex phenotypic traits.