### Lesson 1: Introduction to Data Science

- Introduction to Data Science
- What is a Data Scientist
- Pi-Chaun (Data Scientist @ Google): What is Data Science?
- Gabor (Data Scientist @ Twitter): What is Data Science?
- Problems Solved by Data Science
- Pandas
- Dataframes
- Create a New Dataframe

### Lesson 2: Data Wrangling

- What is Data Wrangling?
- Acquiring Data
- Common Data Formats
- What are Relational Databases?
- Aadhaar Data
- Aadhaar Data and Relational Databases
- Introduction to Database Schemas
- APIs
- Data in JSON Format
- How to Access an API efficiently
- Missing Values
- Easy Imputation
- Impute using Linear Regression
- Tip of the Imputation Iceberg

### Lesson 3: Data Analysis

- Statistical Rigor
- Kurt (Data Scientist @ Twitter): Why is Stats Useful?
- Introduction to Normal Distribution
- T Test
- Welch T Test
- Non-Parametric Tests
- Non-Normal Data
- Stats vs. Machine Learning
- Different Types of Machine Learning
- Prediction with Regression
- Cost Function
- How to Minimize Cost Function
- Coefficients of Determination

### Lesson 4: Data Visualization

- Effective Information Visualization
- Napoleon's March on Russia
- Don (Principal Data Scientist @ AT&T): Communicating Findings
- Rishiraj (Principal Data Scientist @ AT&T): Communicating Findings Well
- Visual Encodings
- Perception of Visual Cues
- Plotting in Python
- Data Scales
- Visualizing Time Series Data

### Lesson 5: MapReduce

- Big Data and MapReduce
- Basics of MapReduce
- Mapper
- Reducer
- MapReduce with Aadhaar Data
- MapReduce with Subway Data


The Introduction to Data Science class will survey the foundational topics in data science, namely:

* Data Manipulation
* Data Analysis with Statistics and Machine Learning
* Data Communication with Information Visualization
* Data at Scale -- Working with Big Data

The class will focus on breadth and present the topics briefly instead of focusing on a single topic in depth. This will give you the opportunity to sample and apply the basic techniques of data science.

This course is also a part of our Data Analyst Nanodegree.