Python for Data Preprocessing


Before beginning any analysis, it is important to get an overall feel for how your dataset looks and how usable it is. This workshop introduces you to preparing data for analysis. After completing it, you will be able to use Python to preprocess datasets. Specifically, you’ll learn:

  • How to handle missing data
  • How to manage bad data entries
  • How to estimate data characteristics and summary statistics
  • How to identify outliers and estimate data distribution
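
As a taste of what these four topics look like in practice, here is a minimal sketch that uses only Python's standard library. The sample values and the helper name `to_float` are made up for illustration, and the workshop's own notebooks may use different tools and techniques (for example, filling in missing values rather than dropping them).

```python
import statistics

# Hypothetical raw column: readings with a missing value (None),
# a bad entry ("n/a"), a number stored as text, and one very large value.
raw = [12.0, 11.5, None, "n/a", 12.4, 11.9, 58.0, 12.1, "11.7"]


def to_float(value):
    """Return value as a float, or None if it is missing or unparseable."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# 1. Bad entries: coerce everything to float; unparseable values become None.
parsed = [to_float(v) for v in raw]

# 2. Missing data: drop the None values (imputation is another option).
clean = [v for v in parsed if v is not None]

# 3. Summary statistics for the cleaned column.
summary = {
    "count": len(clean),
    "mean": statistics.mean(clean),
    "median": statistics.median(clean),
    "stdev": statistics.stdev(clean),
}

# 4. Outliers: flag points more than 1.5 * IQR beyond the quartiles.
q1, _, q3 = statistics.quantiles(clean, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in clean if v < low or v > high]

print(summary)
print("outliers:", outliers)
```

With this sample, seven usable values remain after cleaning, and the 1.5 × IQR rule flags 58.0 as the only outlier. The IQR rule is one common convention among several; the right threshold depends on the data.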

You will learn these techniques in the context of solving real-world data science problems. This workshop features:

  • Two (2) lesson modules with video lectures and interactive Jupyter notebooks
  • Four (4) lesson exercises
  • One (1) end-of-workshop project that demonstrates data preprocessing in the real world

This is a self-paced workshop: its assignments and quizzes have no specified due dates. You can progress at your own pace or at the speed set by your instructor. The workshop is well suited to young or beginning data scientists and requires no prior programming knowledge, although the suggested prerequisites below provide helpful background.

Suggested prerequisites: Python for Data Science and Python for Data Visualizations


Math and Technology @ Work is learning simplified.