Global transcriptome analysis in rice (Oryza sativa L.) through RNA-Seq analysis

Narottam Dey

Author Affiliation(s)

Department of Biotechnology, Visva-Bharati University, Santiniketan 731235, WB, INDIA

Can J Biotech, Volume 1, Special Issue-Supplement, Page 290, DOI: https://doi.org/10.24870/cjb.2017-a274

Presenting author: narottam.dey@visva-bharati.ac.in

Abstract

The NCBI-SRA database, one of the most significant and effective public repositories of short reads generated through high-throughput NGS analysis, is at present a valuable global resource for the study of raw transcripts. Its data are being used to validate experimental results and to determine genetic variants, and have opened up new avenues of bioinformatics research.

In the present work, a publicly available transcriptome sequence dataset (BioProject: PRJNA272732), generated from leaf tissue of a rice mutant for a heat stress transcription factor (OsHsfA2e) under well-watered and drought-stressed conditions at the vegetative stage, was used to study differential gene expression through recent bioinformatics pipelines for RNA-Seq analysis. The sequenced reads were processed through an RNA-Seq analysis pipeline based on the negative binomial distribution, and the results were visualized with an R-based package (DESeq). Transcripts showing significant differential expression were analyzed further for gene ontology and pathway enrichment. Of the different pipelines, the most common is the Tuxedo pipeline, in which reads from two or more conditions are first mapped to the reference genome with TopHat and then assembled into transfrags for each replicate with Cufflinks, followed by quantification against the merged annotation with CuffDiff. Finally, the generated files were indexed and visualized with CummeRbund to facilitate exploration of genes identified by CuffDiff as differentially expressed, spliced, or transcriptionally regulated. In this pipeline, expression is reported as FPKM (fragments per kilobase of transcript per million fragments mapped). In an alternative approach, the reads that fall into annotated genes were used to generate read counts for each condition. The counts were then analyzed statistically in R with two dedicated packages (DESeq and edgeR). All the in silico analyses carried out on the mentioned BioProject will be presented at the upcoming conference.
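
The FPKM unit mentioned above can be illustrated with a minimal sketch. It assumes the standard definition, FPKM = fragments x 10^9 / (transcript length in bp x total mapped fragments). The function name, gene names, and all numbers are hypothetical and are not taken from the BioProject; Cufflinks itself estimates FPKM with a more elaborate statistical model, so this shows only the arithmetic behind the unit.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments mapped.

    Standard definition: count * 1e9 / (length_bp * total_mapped_fragments).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example values (not real data from PRJNA272732).
counts = {"geneA": 500, "geneB": 1500}       # fragments mapped to each gene
lengths_bp = {"geneA": 1000, "geneB": 3000}  # transcript lengths in base pairs
total_mapped = 2_000_000                     # total mapped fragments in the library

fpkm_values = {g: fpkm(counts[g], lengths_bp[g], total_mapped) for g in counts}

for gene, value in fpkm_values.items():
    print(f"{gene}: {value:.1f} FPKM")
```

In this made-up example geneB has three times as many fragments as geneA but is also three times as long, so both receive the same FPKM. This length normalization is what distinguishes FPKM from the raw read counts used by the count-based DESeq and edgeR approach.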

References

Anders, S., McCarthy, D.J., Chen, Y., Okoniewski, M., Smyth, G.K., Huber, W. and Robinson, M.D. (2013) Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature Protocols 8: 1765-1786.