PROFESSIONAL ELECTIVES
Electives in Data Science
Course Name | Foundations of Data Science |
Course Code | 23CSE351 |
Program | B. Tech. in Computer Science and Engineering (CSE) |
Credits | 3 |
Campus | Amritapuri, Coimbatore, Bengaluru, Amaravati, Chennai |
Introduction to Data Science, Causality and Experiments. Data Preprocessing: data cleaning, data reduction, data transformation, data discretization. Exploratory Data Analysis in Python: visualizing categorical data, numerical data, summary statistics of data, overlaid graphs. Random Variables: random variables, functions of random variables. Probability Distributions: discrete and continuous distributions. Sampling: sampling concepts, the Central Limit Theorem and applications, sample means and sample sizes.
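As an illustration of the sampling topics in this unit, the following is a minimal sketch of a Central Limit Theorem simulation using only the Python standard library. The exponential population, the sample sizes, and the helper name `sample_means` are assumptions chosen for illustration and are not taken from the course material. The sketch compares the observed spread of sample means with the predicted value, the population SD divided by the square root of the sample size.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, seed=0):
    """Draw repeated random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical skewed population: simulated exponential values with rate 1.0.
_rng = random.Random(42)
population = [_rng.expovariate(1.0) for _ in range(10_000)]

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

for n in (5, 50, 500):
    means = sample_means(population, n, num_samples=2_000)
    observed_sd = statistics.stdev(means)
    predicted_sd = pop_sd / n ** 0.5  # SD of the sample mean for size n
    print(
        f"n={n:4d}  mean of means={statistics.fmean(means):.3f}  "
        f"SD of means={observed_sd:.3f}  predicted={predicted_sd:.3f}"
    )
```

Even though the population is strongly skewed, the sample means should cluster around the population mean, and their spread should shrink as the sample size grows.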
Descriptive Statistics: central tendency, dispersion, variance, covariance, kurtosis, five-point summary. Distributions, Bayes' Theorem, Error Probabilities, Permutation Testing. Hypothesis and Inference: p-values, hypothesis testing, assessing models, decisions and uncertainty, comparing samples, chi-squared test, A/B testing.
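The permutation testing and A/B testing topics in this unit can be illustrated with a short sketch. The code below is one possible two-sided permutation test on the difference of group means, written with the standard library only. The function name `permutation_test` and the conversion counts are made-up illustrative values, not course data.

```python
import random
import statistics


def permutation_test(group_a, group_b, num_permutations=5_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns the observed difference (mean of B minus mean of A) and the
    fraction of shuffles whose difference is at least as extreme.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(group_b) - statistics.fmean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(num_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffling the pooled values simulates relabeling.
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[n_a:]) - statistics.fmean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / num_permutations


# Made-up A/B data: 1 = converted, 0 = did not convert.
variant_a = [1] * 20 + [0] * 180  # 20 of 200 visitors converted
variant_b = [1] * 36 + [0] * 164  # 36 of 200 visitors converted

observed_diff, p_value = permutation_test(variant_a, variant_b)
print(f"observed difference = {observed_diff:.3f}, empirical p-value = {p_value:.4f}")
```

A small empirical p-value suggests the observed difference would be unusual if the variant labels did not matter. The threshold for "small" is a decision made before running the test.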
Linear Regression: building the regression model (least squares line), predictions using regression models, uncertainties in regression coefficients, checking assumptions and transforming data. Web scraping. Introduction to Data Visualization Tools: Tableau, Power BI.
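For the least squares line named in this unit, a minimal sketch of the closed-form fit is shown below. The slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept makes the line pass through the point of means. The function names and the toy data are assumptions for illustration.

```python
import statistics


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing squared vertical error."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return slope * x + intercept


# Toy data lying exactly on y = 2x + 1, so the fit is easy to verify by hand.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = least_squares_line(xs, ys)
residuals = [y - predict(x, slope, intercept) for x, y in zip(xs, ys)]
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print("prediction at x=6:", predict(6, slope, intercept))
```

On real data the residuals would not be zero. Plotting them against x is a common first check of the linearity and constant-spread assumptions. In Python 3.10 and later, `statistics.linear_regression` computes the same slope and intercept.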
Course Objectives
Course Outcomes
CO1: Understand the statistical foundations of data science.
CO2: Apply pre-processing techniques over raw data to enable further analysis.
CO3: Conduct exploratory data analysis and create insightful visualizations to identify patterns.
CO4: Identify machine learning algorithms for regression/classification tasks and use them to derive insights.
CO5: Analyze the degree of certainty of predictions using statistical tests and models.
CO-PO Mapping
PO/PSO | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 |
CO | ||||||||||||||
CO1 | 1 | 2 | 2 | |||||||||||
CO2 | 1 | 1 | 1 | 3 | 2 | 2 | ||||||||
CO3 | 3 | 1 | 1 | 2 | 3 | 2 | 2 | |||||||
CO4 | 3 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | ||||||
CO5 | 3 | 3 | 1 | 3 | 2 | 2 | 2 |
Evaluation Pattern: 70:30
Assessment | Internal | End Semester |
Midterm | 20 | |
*Continuous Assessment Theory (CAT) | 10 | |
*Continuous Assessment Lab (CAL) | 40 | |
**End Semester | | 30 (50 Marks; 2 hours exam) |
* CAT – Can be Quizzes, Assignments, and Tutorials
* CAL – Can be Lab Assessments, Projects, and Reports
**End Semester can be theory examination/ lab-based examination/ project presentation
Textbook(s)
Ani Adhikari, John DeNero, and David Wagner, “Computational and Inferential Thinking: The Foundations of Data Science”, 2nd Edition, e-book, 2021. https://inferentialthinking.com/chapters/intro.html
Reference(s)
William Navidi, “Statistics for Engineers and Scientists”, Fifth Edition, McGraw Hill, 2020.
Galit Shmueli, Peter C. Bruce, Inbal Yahav, Nitin R. Patel, Kenneth C. Lichtendahl Jr., “Data Mining for Business Analytics: Concepts, Techniques and Applications in R”, Wiley India, 2018.
Rachel Schutt & Cathy O’Neil, “Doing Data Science”, First Edition, O’Reilly, 2013.
Joel Grus, “Data Science from Scratch”, Second Edition, O’Reilly Media, Inc., 2019.
Wes McKinney, “Python for Data Analysis”, Third Edition, O’Reilly, 2022.