
A tf-idf based topic model for identifying lncRNAs from genomic background

Publication Type : Conference Proceedings

Publisher : SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing, p.40 - 46.

Source : SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing, p.40 - 46 (2018)

Url : https://dl.acm.org/doi/10.1145/3167132.3167133

ISBN : 9781450351911

Campus : Coimbatore

School : School of Engineering

Department : Computer Science

Year : 2018

Abstract : Developments in high-throughput technologies have identified a large number of long non-coding RNAs (lncRNAs), whose functional characterization remains an open problem. Available research confirms that lncRNAs play a major role in genetic and epigenetic regulation, and that their expression levels are significantly associated with complex diseases such as cancers. Identifying lncRNAs and characterizing their function is therefore an important task in RNA bioinformatics. Despite their abundance in the cell, lncRNAs are poorly conserved at the sequence level, which makes their analysis challenging. Many machine-learning-based models have been developed in the literature for the identification and analysis of lncRNAs. This paper proposes a topic-model-based method for identifying lncRNAs. To investigate the applicability of topic models to lncRNA analysis, this work develops a latent Dirichlet allocation (LDA) based topic model to group lncRNAs from a collection of transcriptome sequences. Features derived from transformed k-mer patterns and the secondary structure of lncRNA sequences are used as input to the topic model. The results are promising compared with classic algorithms and indicate that topic models are a reasonable approach to lncRNA analysis.
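
As a rough illustration of the "transformed k-mer patterns" mentioned in the abstract, the sketch below treats each sequence as a document of overlapping k-mer "words" and weights them with tf-idf. The abstract does not give the paper's exact transformation, so the choice of k = 3, the plain log(N/df) weighting, the function names `kmers` and `tfidf`, and the toy sequences are all assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers ("words")."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf(sequences, k=3):
    """Return one {k-mer: tf-idf weight} dict per sequence.

    Assumed weighting: (count / total k-mers) * log(N / document frequency).
    """
    docs = [Counter(kmers(s, k)) for s in sequences]
    n_docs = len(docs)
    # Document frequency: number of sequences containing each k-mer.
    df = Counter(kmer for doc in docs for kmer in doc)
    weights = []
    for doc in docs:
        total = sum(doc.values())
        weights.append({
            kmer: (count / total) * math.log(n_docs / df[kmer])
            for kmer, count in doc.items()
        })
    return weights


# Toy sequences, invented for illustration only.
toy = ["AUGGCUAUGG", "AUGCCCGGGA", "UUUAUGGCAA"]
vectors = tfidf(toy, k=3)
```

Under this weighting, a k-mer that occurs in every sequence (here `AUG`) gets weight zero, while k-mers specific to one sequence are emphasized. How such weighted features are combined with secondary-structure features and passed to LDA is not described in the abstract.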

Cite this Research Publication : Manu Madhavan and Gopakumar G., “A tf-idf based topic model for identifying lncRNAs from genomic background”, SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing. pp. 40 - 46, 2018.
