Recognize Faces in Video with Pentaho (ML in Action)

The term “unstructured data” is used extensively nowadays in discussions of big data and data analytics. Some people describe it as data that typically cannot fit in a relational database; others describe it as data that cannot be easily processed using conventional methods and tools. Both descriptions are correct. I’m not […]

Capitalizing on IoT using Oracle Stream Analytics – Oil&Gas In Action!

Introduction: IoT is one of the main frontiers in technology today; it provides endless benefits by converting “dumb” devices into smarter and more efficient ones. It’s already a widely adopted concept and can be seen in many applications, from fitness trackers to cars to home security cameras. IoT, on one hand, enables “things” to […]

Analyzing The World Factbook by CIA

“The World Factbook provides information on the history, people, government, economy, geography, communications, transportation, military, and transnational issues for 267 world entities.” In this blog I’m going to work on a dataset provided by the CIA. It is public information and can be obtained from https://www.cia.gov/library/publications/the-world-factbook/. The data provided, which you’ll be able to download from my notebook, is […]

Hacker News Data Analysis using Python

This is going to be a short and quick one before the weekend. I’ll be working with a dataset that has submissions to Hacker News from 2006 to 2015. Hacker News is a site where “users can submit articles from across the internet (usually about technology and startups), and others can “upvote” the articles, signifying […]

US Public Schools Civil Rights Data Analysis using Python

In this post I’m going to do some analysis of the 2013-14 Civil Rights Data Collection (CRDC). The CRDC is “a survey of all public schools and school districts in the United States. It measures student access to courses, programs, instructional and other staff, and resources — as well as school climate factors, such as […]

Star Wars Analytics using Python

Yes, you have read that correctly. In this post I’m going to clean up a dataset that has been collected from 835 people; a survey that has several questions around Star Wars 1 to 6. This will enable me to answer questions like “Does the rest of America realize that ‘The Empire Strikes Back’ is […]