- HPC-Medical & Bioinformatics Applications Group, Centre for Development of Advanced Computing, SP Pune University Campus, Pune 411007, INDIA
Can J Biotech, Volume 1, Special Issue-Supplement, Page 218, DOI: https://doi.org/10.24870/cjb.2017-a203
Variant calling in datasets from large populations is a major challenge because it is difficult to produce a consistent set of calls at all possible sites, particularly when the data are of low coverage. A further challenge is the computational cost of variant calling, which increases exponentially with the number of samples. The 1000 Genomes Project provides data for 26 ethnic groups spread across the globe, with the aim of capturing genetic variants with frequencies of at least 1% in the population. The sequenced samples vary in coverage, from low (2-4X) to high (50X).
The present work covers variant calling for a South Asian population, GIH (Gujarati Indian from Houston, Texas). The main objective is to call genetic variants in the GIH population using three strategies: joint calling, multi-sample pooled calling and single-sample calling. The predicted variants promise to provide clues for finding biological markers of complex multi-gene diseases.
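One way to relate the call sets produced by the three strategies is to measure how much they overlap. The following is a minimal sketch, not the procedure used in this work. It assumes each strategy's output has already been reduced to a set of (chromosome, position, reference allele, alternate allele) tuples. The variant records, the strategy labels and the helper names `jaccard` and `pairwise_concordance` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical call sets, one per strategy; each variant is keyed as
# (chrom, pos, ref, alt). These records are made up for illustration.
calls = {
    "joint": {
        ("chr1", 1000, "A", "G"),
        ("chr1", 2000, "C", "T"),
        ("chr2", 500, "G", "A"),
    },
    "pooled": {
        ("chr1", 1000, "A", "G"),
        ("chr2", 500, "G", "A"),
    },
    "single": {
        ("chr1", 1000, "A", "G"),
        ("chr1", 3000, "T", "C"),
    },
}


def jaccard(a, b):
    """Jaccard index: number of shared variants divided by the union size."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_concordance(call_sets):
    """Return the Jaccard index for every pair of strategies."""
    return {
        (x, y): jaccard(call_sets[x], call_sets[y])
        for x, y in combinations(sorted(call_sets), 2)
    }


# Variants reported by all three strategies.
shared_by_all = set.intersection(*calls.values())

if __name__ == "__main__":
    for pair, score in pairwise_concordance(calls).items():
        print(pair, round(score, 3))
    print("shared by all:", sorted(shared_by_all))
```

The Jaccard index is used here only because it is simple. A real comparison would also need to normalise variant representation and account for genotype differences, which this sketch ignores.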