Publication Type : Conference Paper
Source : 2022 IEEE 3rd Global Conference for Advancement in Technology, GCAT 2022
Url : https://ieeexplore.ieee.org/abstract/document/9971894
Campus : Amritapuri
School : School of Computing
Center : AI (Artificial Intelligence) and Distributed Systems
Year : 2022
Abstract : Gene expression data is biological data on the quantities of various transcription factors and other chemicals inside a cell at a particular time. It is obtained from DNA microarray studies. The amounts of the many chemical components captured by gene expression data reveal a range of facts about the cell's health. The difficulty with gene expression data is that it contains noise and missing values and has extremely high dimensionality: each gene in an organism's genome contributes a value, so there are thousands of features, while the number of samples is considerably smaller. This leads to errors in computational analysis due to the curse of dimensionality. We use feature selection to address these issues, choosing the genes most relevant to the subject being studied from the large number of genes whose values are provided. Our idea is to use the Boruta feature selection algorithm, a wrapper approach built around random forests, to select a set of features from the many samples produced by gene expression profiles.
Cite this Research Publication : Kavitha K.R., Sajith S., Variar N.H., An Efficient Boruta-Based Feature Selection and Classification of Gene Expression Data, 2022 IEEE 3rd Global Conference for Advancement in Technology, GCAT 2022.
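
Illustrative sketch : The abstract names Boruta but does not describe its mechanics. The following is a minimal sketch of the shadow-feature comparison at the core of Boruta, not the authors' implementation. To stay dependency-free it makes several assumptions: absolute Pearson correlation stands in for random forest importance, a fixed hit-rate threshold stands in for Boruta's statistical test, and the data is a synthetic toy set rather than real gene expression profiles. The names `abs_correlation` and `shadow_feature_hits` are hypothetical.

```python
import random
import statistics


def abs_correlation(xs, ys):
    """Absolute Pearson correlation (assumed stand-in for forest importance)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def shadow_feature_hits(columns, labels, n_rounds=50, seed=0):
    """Count how often each feature outscores the best shuffled (shadow) copy."""
    rng = random.Random(seed)
    # The stand-in score is deterministic, so real scores are computed once.
    real_scores = [abs_correlation(col, labels) for col in columns]
    hits = [0] * len(columns)
    for _ in range(n_rounds):
        shadow_best = 0.0
        for col in columns:
            shadow = col[:]      # copy the feature column
            rng.shuffle(shadow)  # shuffling breaks any link to the labels
            shadow_best = max(shadow_best, abs_correlation(shadow, labels))
        for i, score in enumerate(real_scores):
            if score > shadow_best:
                hits[i] += 1
    return hits


# Synthetic toy data: 40 samples, one informative feature, four noise features.
data_rng = random.Random(42)
labels = [i % 2 for i in range(40)]
informative = [y + data_rng.gauss(0, 0.1) for y in labels]
noise = [[data_rng.gauss(0, 1) for _ in labels] for _ in range(4)]
columns = [informative] + noise

n_rounds = 50
hits = shadow_feature_hits(columns, labels, n_rounds=n_rounds)
# Simplified acceptance rule (assumption): keep features that win in >= 90% of rounds.
selected = [i for i, h in enumerate(hits) if h >= 0.9 * n_rounds]
print("hits per feature:", hits)
print("selected feature indices:", selected)
```

A feature is kept only if it repeatedly scores higher than the best shuffled copy, which is how the shadow comparison separates informative genes from noise when features far outnumber samples.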