Analyzing The World Factbook by CIA
“The World Factbook provides information on the history, people, government, economy, geography, communications, transportation, military, and transnational issues for 267 world entities.” In this post I’m going to work on a dataset provided by the CIA. It is public information and can be obtained from https://www.cia.gov/library/publications/the-world-factbook/. The data provided, which you’ll be able to download from my notebook, is […]
Hacker News Data Analysis using Python
This is going to be a short and quick one before the weekend. I’ll be working with a dataset that has submissions to Hacker News from 2006 to 2015. Hacker News is a site where “users can submit articles from across the internet (usually about technology and startups), and others can ‘upvote’ the articles, signifying […]
US Public Schools Civil Rights Data Analysis using Python
In this post I’m going to analyze the 2013-14 Civil Rights Data Collection (CRDC). The CRDC is “a survey of all public schools and school districts in the United States. It measures student access to courses, programs, instructional and other staff, and resources — as well as school climate factors, such as […]
Star Wars Analytics using Python
Yes, you have read that correctly. In this post I’m going to clean up a dataset collected from 835 people: a survey with several questions about Star Wars 1 to 6. This will enable me to answer questions like “Does the rest of America realize that ‘The Empire Strikes Back’ is […]
Cleaning, Enriching, Analyzing & Visualizing NYC High School Data using Python
Data preparation is a very important step in any data analysis or data science project; it enables us to do proper analysis and visualization so that we can answer the questions we pose. In this post, I’ll show you how I cleaned and enriched a dataset for NYC High School Data. The datasets used can […]
Visualizing The Gender Gap In College Degrees with Python
In this post, I’ll show a visualization that I’ve created using Python of the gender gap across college degrees between 1970 and 2012 in the USA. The dataset used contains the percentage of bachelor’s degrees granted to women for that period, and is made publicly available by The Department of Education Statistics. While the visualization […]