Do you want to see the data analysis process in action using Python?
This guide introduces the data analysis process using the Python data ecosystem and an interesting open dataset. The intended audience includes SQL and R users, experienced and new Python users, and people new to data analysis. Pandas excels at data analysis on small to medium-sized datasets. This guide favors simple code over complex code, often using constructs inspired by functional programming such as mapping, reducing, and lambda functions. If you are interested in a PDF or EPUB, you can get that here. The cleaned datasets used in the book are available at that link as well. The best place to start is the data munging notebook, which gives a little background on the dataset.
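To give a flavor of the mapping, reducing, and lambda style mentioned above, here is a minimal sketch. It assumes pandas is installed, and the small `df` table with its `city` and `sales` columns is a hypothetical stand-in, not the dataset used in this guide.

```python
from functools import reduce

import pandas as pd

# Hypothetical toy data for illustration only; not the guide's dataset.
df = pd.DataFrame({
    "city": ["austin", "boston", "austin", "denver"],
    "sales": [120, 80, 200, 50],
})

# Mapping: apply a lambda to every element of a column.
df["city"] = df["city"].map(lambda name: name.title())

# Reducing: fold a column into a single value with a lambda.
total_sales = reduce(lambda acc, value: acc + value, df["sales"], 0)

# The same total computed with the built-in pandas aggregation.
assert total_sales == df["sales"].sum()

# Group-wise totals, converted to a plain dict for easy inspection.
sales_by_city = df.groupby("city")["sales"].sum().to_dict()
print(total_sales)
print(sales_by_city)
```

In practice the built-in aggregations such as `sum` are usually the simpler choice; the explicit `reduce` is shown only to make the pattern visible.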
We work on open-source data projects and offer education, consulting, and training in machine learning and statistics. From time to time, we release our open data work as free and open-source materials (projects), which we share through our data lab. Follow us here for announcements on our latest projects.