
Course Detail

Course Name: Data Wrangling
Course Code: 24CSC544
Program: Integrated M. Sc. Mathematics and Computing
Credits: 3
Campus: Coimbatore

Syllabus

INTRODUCTION TO DATA WRANGLING: What Is Data Wrangling? - Importance of Data Wrangling - How Is Data Wrangling Performed? - Tasks of Data Wrangling - Data Wrangling Tools - Introduction to Python - Python Basics - Data Meant to Be Read by Machines - CSV Data - JSON Data - XML Data.

WORKING WITH EXCEL FILES AND PDFS: Installing Python Packages - Parsing Excel Files - Getting Started with Parsing - PDFs and Problem Solving in Python - Programmatic Approaches to PDF Parsing - Converting PDF to Text - Parsing PDFs Using pdfminer - Acquiring and Storing Data - Databases: A Brief Introduction - Relational Databases: MySQL and PostgreSQL - Non-Relational Databases: NoSQL - When to Use a Simple File - Alternative Data Storage.

DATA CLEANUP: Why Clean Data? - Data Cleanup Basics - Identifying Values for Data Cleanup - Formatting Data - Finding Outliers and Bad Data - Finding Duplicates - Fuzzy Matching - RegEx Matching - Normalizing and Standardizing the Data - Saving the Data - Determining Suitable Data Cleanup - Scripting the Cleanup - Testing with New Data.

DATA EXPLORATION AND ANALYSIS: Exploring Data - Importing Data - Exploring Table Functions - Joining Numerous Datasets - Identifying Correlations - Identifying Outliers - Creating Groupings - Analyzing Data - Separating and Focusing the Data - Presenting Data - Visualizing the Data - Charts - Time-Related Data - Maps - Interactives - Words - Images, Video, and Illustrations - Presentation Tools - Publishing the Data - Open Source Platforms.

WEB SCRAPING: What to Scrape and How - Analyzing a Web Page - Network/Timeline - Interacting with JavaScript - In-Depth Analysis of a Page - Getting Pages - Reading a Web Page - Reading a Web Page with LXML - XPath - Advanced Web Scraping - Browser-Based Parsing - Screen Reading with Selenium - Screen Reading with Ghost.py - Spidering the Web - Building a Spider with Scrapy - Crawling Whole Websites with Scrapy.

Text Books / References

Text Books:

  1. Jacqueline Kazil & Katharine Jarmul, "Data Wrangling with Python", O’Reilly Media, Inc., 2016.
  2. Tirthajyoti Sarkar, Shubhadeep, "Data Wrangling with Python: Creating actionable data from raw sources", Packt Publishing Ltd, 2019.
  3. Stefanie Molin, "Hands-On Data Analysis with Pandas", Packt Publishing Ltd, 2019.
  4. Allan Visochek, "Practical Data Wrangling", Packt Publishing Ltd, 2017.

