Syllabus
Unit I
R Programming Basics: Conditional expressions, for, if-else, do and while loops, defining functions, vectorization and functionals, data frames, R scripts, lists, repeat loops, vector indexing, sorting and ordering, factors, creating matrices and basic matrix operations in R, 2D/3D plotting.
Unit II
Data Handling: Creating, importing/exporting and merging of datasets, Data Collection, Sampling methods, classification of data, Dealing with missing values. Data Visualization: Bar and Pie charts – histogram, frequency polygon – Box plot – Stem and leaf plot.
Unit III
Data Analysis: Measures of Central tendency and dispersion – Mean, median, mode, absolute, quartile and standard deviations, skewness and kurtosis for both grouped and ungrouped data. Association of attributes. Generating random samples from standard distributions (such as Bernoulli, Poisson, Normal, Exponential etc.).
Unit IV
Curve fitting and interpolation – Fitting of straight lines and curves – Correlation, regression, fitting of simple linear lines, polynomials and logarithmic functions – Interpolation and extrapolation methods – Binomial expansion, Newton and Gauss methods.
Unit V
Supervised Learning using R (Regression/Classification): Naïve Bayes, Linear models: Linear Regression, Logistic Regression, Generalized Linear Models – Case Study.
Objectives and Outcomes
Course Outcomes:
CO1: Understanding the fundamentals of R software
CO2: Exploring and implementing graphical visualization using R
CO3: Exploring and implementing basic Statistics using R
CO4: Understanding correlation and regression and visualising them using R
CO5: Exploring and implementing supervised learning through classification/regression using R
CO-PO Mapping:
|     | PO1 | PO2 | PO3 | PO4 | PO5 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 |
|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|------|------|------|
| CO1 | 3   | 3   | 3   | 2   | 2   | 2   | 2   |     |     |     |      | 3    | 2    |
| CO2 | 3   | 3   | 3   | 2   | 2   | 2   | 2   |     |     |     |      | 3    | 2    |
| CO3 | 3   | 3   | 3   | 2   | 2   | 3   | 3   |     |     |     |      | 3    | 2    |
| CO4 | 3   | 2   | 3   | 2   | 2   | 2   | 3   |     |     |     |      | 3    | 2    |
| CO5 | 2   | 2   | 2   | 2   | 3   | 2   | 3   |     |     |     |      | 3    | 2    |
Text Books / References
Text Books / Reference Books:
- Rafael A. Irizarry, Introduction to Data Science: Data Analysis and Prediction Algorithms with R, CRC Press, 2019.
- Douglas C. Montgomery and George C. Runger, Applied Statistics and Probability for Engineers, John Wiley and Sons Inc., 2005.
- Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012.
- EMC Education Services, Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, John Wiley, 2015.
- John Hopcroft and Ravi Kannan, Foundations of Data Science, eBook, Publisher, 2013.