PROFESSIONAL ELECTIVES
Electives in Data Science
Course Name | Database Management Systems for Data Science |
Course Code | 23CSE354 |
Program | B. Tech. in Computer Science and Engineering (CSE) |
Credits | 3 |
Campus | Amritapuri, Coimbatore, Bengaluru, Amaravati, Chennai |
Overview of Database and Database management systems – SQL – SQL for data science – Analysis with SQL – Data Analysis Workflow – Database types – Preparing data for analysis – Types of data – SQL query structure – Profiling – Distributions – Data quality – Deduplication with GROUP BY and DISTINCT – Data cleaning – Dealing with Nulls: coalesce, nullif, nvl Functions – Missing Data – Preparing: Shaping Data – BI, Visualization, Statistics, ML – Pivoting with CASE Statements – Unpivoting with UNION Statements – pivot and unpivot Functions
Time Series Analysis – Date, Datetime, and Time Manipulations – Trending the Data – Cohorts – Cohort Analysis – Analysis Framework – Rolling Time Windows – Sparse Data – Analyzing with Seasonality – Retention – SQL for a Basic Retention Curve – Adjusting Time Series to Increase Retention Accuracy – Cohorts Derived from the Time Series – Defining the Cohort from a Separate Table – Dealing with Sparse Cohorts – Defining Cohorts from Dates Other Than the First Date – Related Cohort Analyses – Survivorship – Returnship, or Repeat Purchase Behavior – Cumulative Calculations – Cross-Section Analysis, Through a Cohort Lens
Text Analysis with SQL – What Is Text Analysis – Why SQL Is a Good Choice for Text Analysis – When SQL Is Not a Good Choice – The UFO Sightings Data Set – Text Characteristics – Text Parsing – Text Transformations – Finding Elements Within Larger Blocks of Text – Wildcard Matches: LIKE, ILIKE – Exact Matches: IN, NOT IN – Regular Expressions – Constructing and Reshaping Text – Concatenation – Reshaping Text – Database and cloud – Built-in functions – Python support for accessing databases.
SQL for anomaly detection – Experiment Analysis with SQL – Correlation Is Not Causation – Experiments with Binary Outcomes: The Chi-Squared Test – Experiments with Continuous Outcomes: The t-Test – Challenges – Variant Assignment – Outliers – Time Boxing – Pre-/Post-Analysis – Natural Experiment Analysis
Course Objectives
Course Outcomes
CO1: Understand how to use SQL queries for data preparation, data cleaning, and profiling of data stored in databases.
CO2: Apply SQL features to output data for business intelligence tools used to create reports and dashboards.
CO3: Conduct time series analysis and cohort analysis to calculate rolling time windows and identify seasonal patterns, repeat behaviour, and cumulative actions.
CO4: Carry out text analysis using SQL functions.
CO5: Analyse data using experiment analysis techniques.
CO-PO Mapping
PO/PSO | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 |
CO | ||||||||||||||
CO1 | 3 | 3 | 2 | 3 | 3 | 3 | 3 | 2 | ||||||
CO2 | 1 | 3 | 3 | 3 | 3 | 3 | 2 | 3 | 2 | |||||
CO3 | 2 | 3 | 2 | 3 | 2 | 2 | 2 | 2 | 3 | 2 | ||||
CO4 | 1 | 1 | 1 | 2 | 3 | 2 | ||||||||
CO5 | 1 | 1 |
Evaluation Pattern: 70:30
Assessment | Internal | End Semester |
Midterm | 20 | – |
*Continuous Assessment (Theory) (CAT) | 10 | – |
*Continuous Assessment (Lab) (CAL) | 40 | – |
**End Semester | – | 30 (50 Marks; 2 hours exam) |
*CAT – Can be Quizzes, Assignments, and Reports
*CAL – Can be Lab Assessments, Project, and Report
**End Semester can be theory examination/ lab-based examination/ project presentation
Textbook(s)
Cathy Tanimura, “SQL for Data Analysis: Advanced Techniques for Transforming Data into Insights”, O’Reilly Media, 2021.
Richard Machina, “SQL Programming For Beginners: The Guide With Step by Step Processes on Data Analysis”, 2020.
Reference(s)
Anthony DeBarros, “Practical SQL: A Beginner’s Guide to Storytelling with Data”, 2nd Edition, No Starch Press, 2022.
Upom Malik, Matt Goldwasser, Benjamin Johnston, “SQL for Data Analytics: Perform fast and efficient data analysis with the power of SQL”, Packt Publishing, 2019.
Silberschatz, A., Korth, H. F. and Sudarshan, S., “Database System Concepts”, 6th Edition, TMH, 2010.
Elmasri, R. and Navathe, S. B., “Fundamentals of Database Systems”, 5th Edition, Addison Wesley, 2006.
Date, C. J., “An Introduction to Database Systems”, 8th Edition, Addison Wesley, 2003.
Ramakrishnan, R. and Gehrke, J., “Database Management Systems”, 3rd Edition, McGraw-Hill, 2003.