Publication Type : Book Chapter
Publisher : Springer, Singapore
Source : In: Raj J.S., Palanisamy R., Perikos I., Shi Y. (eds) Intelligent Sustainable Systems. Lecture Notes in Networks and Systems, vol 213. Springer, Singapore. https://doi.org/10.1007/978-981-16-2422-3_52
Url : https://link.springer.com/chapter/10.1007/978-981-16-2422-3_52
Campus : Kochi
School : School of Computing
Department : Computer Science
Year : 2021
Abstract : There are several diseases that the world faces presently and a critical one is Diabetes mellitus. The current diagnostic practice involves various tests at a lab or a hospital and a treatment based on the outcome of the diagnosis. This study proposes a machine learning model to classify a patient as diabetic or not, utilizing the popular PIMA Indian Dataset. The dataset contains features like Pregnancy, Blood Pressure, Skin Thickness, Age and Diabetes Pedigree Function along with regular factors like Glucose, BMI and Insulin. The objective of this study is to make use of several pre-processing techniques resulting in improved accuracy over simple models. The study compares different classification models namely GaussianNB, Logistic Regression, KNN, Decision Tree Classifier, Random Forest Classifier, Gradient Boosting Classifier in several ways. Initially, missing values in the significant features are replaced by computing median of the input variables based on the outcome of whether the patient is diabetic or not. After this, feature engineering is performed by adding new features which are obtained by categorizing the existing features based on its range. Finally, Hyperparameter tuning is carried out to optimize the model. Performance metrics such as Accuracy and area under the ROC Curve (AUC) is used to validate the effectiveness of the proposed framework. Results indicate that XGBoosting Classifier is concluded as the optimum model with 88% accuracy and AUC value of 0.948. The performance of the model is evaluated using Confusion Matrix and ROC Curve.
Cite this Research Publication : James D.E., Vimina E.R. (August 2021), "Machine Learning-Based Early Diabetes Prediction," In: Raj J.S., Palanisamy R., Perikos I., Shi Y. (eds) Intelligent Sustainable Systems. Lecture Notes in Networks and Systems, vol 213. Springer, Singapore. https://doi.org/10.1007/978-981-16-2422-3_52