Classification of enhancer promoter interaction pairs based on expression patterns and distances involved in disease manifestation in human

  1. Abhishek Das,
  2. Subhadeep Das,
  3. Samrat Ghosh,
  4. Sucheta Tripathy*

Authors Affiliation(s)

  • Structural Biology and Bioinformatics Division, CSIR – Indian Institute of Chemical Biology, Kolkata 700032, INDIA

Can J Biotech, Volume 1, Special Issue, Page 107, DOI: https://doi.org/10.24870/cjb.2017-a93

*Corresponding author: tsucheta@gmail.com, tsucheta@iicb.res.in

Abstract

Enhancers, non-coding regions of the genome, regulate the transcription of the genes they interact with. Different regions act as enhancers in different cell lines. Enhancer-promoter interaction (EPI) models suggest that an enhancer helps assemble transcription factors together with RNA polymerase II and interacts with promoters to increase the expression of the corresponding genes. During transcription, the enhancer itself is transcribed, giving rise to small RNAs known as enhancer RNAs. Annotated enhancer regions that contain Transcription Start Sites (TSS) are also defined as active enhancers. Three human cell lines were studied: GM12878, K562 and H1-hESC, which are normal, cancerous and stem cell lines, respectively. The K-medoids algorithm was used to segregate the EPIs in all the cell lines. Three clusters were derived on the basis of the expression of the enhancer, the expression of its interacting promoter and the distance between the two. Statistical t-test analysis showed that all clusters differed from one another. Cluster-1 (expression of enhancer mean (eeMean)=59.12, median (eeMedian)=12.09) and cluster-2 (eeMean=1799.9, eeMedian=1468) differed in enhancer expression. Cluster-2 (distance mean=20521, eeMean=1799.9; distance median=7984, eeMedian=1468.5) differed from cluster-3 (distance mean=180798, distance median=162626) in distance and in the expression of TSS at the enhancer. Finally, cluster-1 (distance mean=18030, distance median=6966) and cluster-3 (distance mean=180798, distance median=162626) differed in distance. RNA-seq analysis showed 7 genes upregulated in K562 compared with GM12878. Further, the EPI distributions of MYC, RAD23B and insulin-like growth factors showed a similar pattern in K562 and H1-hESC, and they were present in cluster-1.
In contrast, the EPIs of MDN1, CDKN1C and eukaryotic translation elongation factor 2 were present in cluster-1 for K562, but in both cluster-1 and cluster-3 for H1-hESC. The EPIs of erythrocyte membrane protein were segregated into cluster-1 and cluster-3 for both K562 and H1-hESC, whereas all of these interactions were absent in GM12878. Overall, these results suggest that enhancer activities that are absent in the normal cell line are mainly responsible for carcinogenesis in the K562 cell line.
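
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state how features were scaled, how medoids were initialised, or which t-test variant was used, so the column standardisation, the farthest-first initialisation, the Welch form of the t statistic and all function names below are assumptions made for illustration. The toy values are invented and are not data from the study.

```python
import math
from statistics import mean, variance


def standardize(rows):
    """Rescale each column to zero mean and unit variance (an assumed
    preprocessing step, so that distance does not dominate expression)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [math.sqrt(variance(c)) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)),
                key=lambda j: math.dist(p, points[medoids[j]]))
            for p in points]


def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; returns (medoid indices, cluster labels)."""
    # Farthest-first initialisation: deterministic, chosen here for
    # reproducibility (an assumption, not taken from the abstract).
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(math.dist(points[i], points[m]) for m in medoids)))
    for _ in range(max_iter):
        labels = assign(points, medoids)
        updated = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:              # keep the old medoid for an empty cluster
                updated.append(medoids[j])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            updated.append(min(
                members,
                key=lambda c: sum(math.dist(points[c], points[m]) for m in members)))
        if updated == medoids:           # converged
            break
        medoids = updated
    return medoids, assign(points, medoids)


def welch_t(a, b):
    """Welch's t statistic for two samples (unequal variances allowed)."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b))


# Invented toy EPIs: (enhancer expression, promoter expression, distance).
toy = ([(10.0 + i, 50.0 + i, 5000.0 + 100 * i) for i in range(5)]            # low expression, short range
       + [(1500.0 + 10 * i, 60.0 + i, 6000.0 + 100 * i) for i in range(5)]   # high expression, short range
       + [(12.0 + i, 55.0 + i, 160000.0 + 1000 * i) for i in range(5)])      # low expression, long range

medoids, labels = k_medoids(standardize(toy), k=3)


def enhancer_expr(cluster):
    """Raw enhancer expression values of the EPIs assigned to one cluster."""
    return [toy[i][0] for i, lab in enumerate(labels) if lab == cluster]


for j in range(3):
    dist = [toy[i][2] for i, lab in enumerate(labels) if lab == j]
    print(f"cluster {j}: n={len(dist)}, eeMean={mean(enhancer_expr(j)):.1f}, "
          f"distance mean={mean(dist):.0f}")
print("Welch t, enhancer expression, cluster 0 vs 1:",
      round(welch_t(enhancer_expr(0), enhancer_expr(1)), 2))
```

Medoids, unlike k-means centroids, are always actual data points, so each cluster is represented by a real EPI; this also makes the method less sensitive to the heavy-tailed expression values suggested by the gap between the reported means and medians.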