Python for Data Science

About Us

We work on open-source data projects and offer education, consulting, and training in machine learning and statistics. From time to time, we release our open data work as free and open-source materials (projects), which we share through our data lab.

Introduction

Do you want to see the data analysis process in action using Python?

The book introduces the data analysis process using the Python data ecosystem and an interesting open dataset. The intended audience includes SQL and R users, new and experienced Python users, and people new to data analysis. Pandas excels at data analysis on small- to medium-sized datasets. The book values simple code over complex code, often using constructs inspired by functional programming, including mapping, reducing, and lambda functions.
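To give a flavor of the functional style mentioned above, here is a minimal sketch of mapping, reducing, and lambda functions applied to a pandas column. The data, column names, and variable names are invented for illustration and are not taken from the book or its dataset.

```python
from functools import reduce

import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune"], "temp_c": [4.0, 19.0, 31.0]}
)

# Mapping: apply a lambda to every element of a column.
df["temp_f"] = df["temp_c"].map(lambda c: c * 9 / 5 + 32)

# Reducing: fold a column into a single value with a lambda.
total_c = reduce(lambda acc, x: acc + x, df["temp_c"], 0.0)
mean_c = total_c / len(df)

print(df)
print(mean_c)
```

In everyday pandas code the built-in `df["temp_c"].mean()` would be the usual choice; the explicit `reduce` is shown only to make the construct visible.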

Learning outcomes of the book

  • understand the strengths of the Python data ecosystem for data analytics
  • gain knowledge of aspects of the data analysis process
  • understand the role of data analysis in scientific inquiry
  • let the data inform research questions and hypotheses
  • see the data analysis process as an iterative one

Topics covered in the book

  • data preparation and carpentry
  • the data analysis process
  • interactive querying and reporting
  • advanced grouping and aggregating operations
  • data-driven decision-making and insight
  • exploratory visualization
  • working with dates and times
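As a rough illustration of two of the topics listed above (grouping and aggregating, and working with dates and times), the following sketch parses date strings and summarizes a column by month. The data and every name in it are hypothetical and are not drawn from the book.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
sales = pd.DataFrame(
    {
        "date": ["2021-01-05", "2021-01-20", "2021-02-03", "2021-02-17"],
        "region": ["north", "south", "north", "north"],
        "amount": [100, 150, 200, 50],
    }
)

# Working with dates: parse the strings, then extract the month number.
sales["date"] = pd.to_datetime(sales["date"])
sales["month"] = sales["date"].dt.month

# Grouping and aggregating: total and largest sale per month.
monthly = sales.groupby("month")["amount"].agg(["sum", "max"])

print(monthly)
```

The result has one row per month, with `sum` and `max` columns that could feed a report or an exploratory plot.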