In this post, I’ll perform some analysis using native Python functions along with some of the modules that you, as a data analyst or data scientist, will be working with extensively.
In real life, you’ll be working with more advanced modules: richer in features and easier to use. However, I personally believe it’s important to learn how to do things the hard way first, before entering the world of magical modules that Python provides.
We’ll be working with a dataset that contains information on gun deaths in the US from 2012 to 2014. Each row in the dataset represents a single fatality. The columns contain demographic and other information about the victim.
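To give a rough idea of what "the hard way" can look like for data shaped like this, here is a minimal sketch that uses only the standard library. The column names and the sample rows are made up for illustration and are not taken from the actual dataset; `load_rows` and `count_by` are hypothetical helper names, not functions from the notebooks.

```python
import csv
import io
from collections import Counter

# Hypothetical sample: the column names and values below are invented
# for illustration only and do not come from the real dataset.
SAMPLE_CSV = """year,month,intent,sex,age
2012,1,Suicide,M,34
2012,2,Homicide,F,21
2013,7,Suicide,M,60
2014,3,Accidental,M,18
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one dict per row (one per fatality)."""
    return list(csv.DictReader(io.StringIO(text)))


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    return Counter(row[column] for row in rows)


rows = load_rows(SAMPLE_CSV)
deaths_per_year = count_by(rows, "year")
print(dict(deaths_per_year))
```

With a real file, `io.StringIO(text)` would presumably be replaced by an open file handle. Note that `csv.DictReader` returns every value as a string, so numeric columns such as age would need an explicit conversion before any arithmetic.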
To make this more readable, I’ve used a Jupyter notebook and uploaded it to my GitHub repository. You can download the dataset from there as well if you’d like to practice with it.
NOTE: This is neither a tutorial nor a “how-to” guide. It’s just a completed exercise that I’m sharing.
Here are the notebooks:
Exploring Gun Deaths in the US
Summarizing Data Basics (bonus to my previous visualization post)