The Steam Engine of the 21st Century
Congratulations! If you’re reading this article, it means you’ve made it to 2020, and to a new decade. Over the past 2-3 decades, we’ve lived through magnificent changes in many areas of the world, and certain technologies have been accelerating faster than they did over the past 300 years combined. If you […]
Recognize Faces in Video with Pentaho (ML in Action)
The term “unstructured data” is used extensively nowadays in discussions of big data and data analytics. Some people describe it as data that typically cannot fit in a relational database; others describe it as data that cannot be easily processed using conventional methods and tools. Both descriptions are correct. I’m not […]
Real-Time Kafka / MapR Streams Data Ingestion into HBase / MapR-DB via PySpark
Streaming data is becoming an essential part of every data integration project nowadays; if not a core requirement, then second nature. The advantages of real-time data streaming are many. To name a few: real-time analytics and decision making, better resource utilization, data pipelining, facilitation of micro-services, and much more. Python has many modules out […]
Perfecting Lambda Architecture with Oracle Data Integrator (and Kafka / MapR Streams)
Republished by: MapR Technologies, Datafloq. Introduction: “Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch- and stream-processing methods. This approach to architecture attempts to balance latency, throughput, and fault-tolerance by using batch processing to provide comprehensive and accurate views of batch data, while simultaneously using real-time stream processing to provide views of online […]
Capitalizing on IoT using Oracle Stream Analytics – Oil&Gas In Action!
Introduction: IoT is one of the main frontiers in technology today; it provides endless benefits by converting “dumb” devices into smarter and more efficient ones. It’s already a widely adopted concept and can be seen in many applications, from fitness trackers to cars to home security cameras. IoT, on one hand, enables “things” to […]
Analyzing The World Factbook by CIA
“The World Factbook provides information on the history, people, government, economy, geography, communications, transportation, military, and transnational issues for 267 world entities.” In this blog I’m going to work on a dataset provided by the CIA, which is public information (available at https://www.cia.gov/library/publications/the-world-factbook/). The data provided, which you’ll be able to download from my notebook, is […]