Quality control of samples used for genome-wide association study

  • Nguyễn Ngọc Trung (Vietnam National University Ho Chi Minh City)
  • Lê Gia Hoàng Linh (University of Medicine and Pharmacy at Ho Chi Minh City)
  • Trần Quang Nam (University of Medicine and Pharmacy at Ho Chi Minh City)
  • Mai Phương Thảo (University of Medicine and Pharmacy at Ho Chi Minh City)
  • Hoàng Anh Vũ (University of Medicine and Pharmacy at Ho Chi Minh City)
  • Đỗ Đức Minh (University of Medicine and Pharmacy at Ho Chi Minh City)

Keywords

GenomeStudio, PLINK, Genome-wide association study, quality control

Abstract

Objective: Genome-wide association studies (GWAS) are an effective tool for investigating the genetic contribution to the etiology of complex multifactorial diseases. However, because microarray BeadChips carry a large number of single nucleotide polymorphisms, quality control of GWAS samples is essential. In this study, bioinformatic tools were used to assess the quality of microarray data from 503 patients with type 2 diabetes and 494 controls.

Subjects and methods: A total of 997 subjects (494 controls and 503 type 2 diabetes cases) were genotyped using the Infinium Global Screening Array (GSA), which contains 644,303 genetic markers. Using GenomeStudio and PLINK, quality control criteria were set for sample quality, polymorphism quality, sex concordance, heterozygosity, and relatedness.

Results: Samples with Call Rate < 0.98, a sex mismatch, excessive heterozygosity, or potential relatedness, and polymorphisms with GenTrain Score < 0.7, Cluster Sep Score < 0.3, or Call Freq < 0.95, were considered not qualified. In total, 213 samples and 264,390 polymorphisms were excluded from the data.

Conclusion: Using the quality thresholds described above, we performed quality control for a GWAS of 503 patients with type 2 diabetes and 494 controls. These quality control steps are crucial for accurate genome analysis and for polygenic risk score calculation.
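The numeric thresholds quoted in the abstract can be read as simple pass/fail rules. The following is a minimal Python sketch of that reading, not the authors' GenomeStudio or PLINK workflow: the `Sample` and `Marker` record types, their field names, and the example values are all hypothetical. It assumes that Call Rate and sex concordance apply per sample, while GenTrain Score, Cluster Sep Score, and Call Freq apply per marker. Heterozygosity and relatedness checks are left out because the abstract gives no cutoffs for them.

```python
from dataclasses import dataclass

# Thresholds quoted in the abstract; values below them fail QC.
MIN_SAMPLE_CALL_RATE = 0.98
MIN_GENTRAIN_SCORE = 0.7
MIN_CLUSTER_SEP = 0.3
MIN_CALL_FREQ = 0.95


@dataclass
class Sample:
    """Hypothetical per-sample record (field names are assumptions)."""
    sample_id: str
    call_rate: float
    reported_sex: str  # sex recorded in the study metadata
    inferred_sex: str  # sex estimated from the genotype data


@dataclass
class Marker:
    """Hypothetical per-marker record (field names are assumptions)."""
    name: str
    gentrain_score: float
    cluster_sep: float
    call_freq: float


def sample_passes(s: Sample) -> bool:
    # Keep a sample only if its call rate is high enough and its sexes agree.
    return s.call_rate >= MIN_SAMPLE_CALL_RATE and s.reported_sex == s.inferred_sex


def marker_passes(m: Marker) -> bool:
    # Keep a marker only if all three marker-level statistics meet their cutoffs.
    return (
        m.gentrain_score >= MIN_GENTRAIN_SCORE
        and m.cluster_sep >= MIN_CLUSTER_SEP
        and m.call_freq >= MIN_CALL_FREQ
    )


# Made-up records for illustration only; they are not data from the study.
samples = [
    Sample("S1", 0.995, "F", "F"),  # passes both sample checks
    Sample("S2", 0.970, "M", "M"),  # call rate below 0.98
    Sample("S3", 0.990, "F", "M"),  # reported and inferred sex disagree
]
markers = [
    Marker("demo_marker_1", 0.85, 0.60, 0.99),  # passes all marker checks
    Marker("demo_marker_2", 0.65, 0.60, 0.99),  # GenTrain Score below 0.7
    Marker("demo_marker_3", 0.85, 0.20, 0.99),  # Cluster Sep Score below 0.3
    Marker("demo_marker_4", 0.85, 0.60, 0.90),  # Call Freq below 0.95
]

kept_samples = [s.sample_id for s in samples if sample_passes(s)]
kept_markers = [m.name for m in markers if marker_passes(m)]
print("Samples kept:", kept_samples)
print("Markers kept:", kept_markers)
```

Writing each rule as a small predicate keeps the cutoffs in one place, so a stricter or looser threshold can be tried by changing a single constant.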