Data Fusion - breaking down the barriers to real-time data analysis
Many easy-to-use, inexpensive, and often open-source streaming-data platform components have emerged:
Apache Storm, a Hadoop-compatible add-on (open-sourced by Twitter) for rapid data transformation, has been implemented by The Weather
Channel, Spotify, WebMD, and Alibaba.com.
Apache Spark, a fast and general engine for large-scale data processing, supports SQL, machine learning, and streaming-data
analysis.
Apache Kafka, an open-source message broker, is widely used for consumption of streaming data.
Amazon Kinesis, a fully managed, cloud-based service for real-time data processing over large, distributed data streams, can continuously capture large volumes of data from streaming sources.
make multiple data sources appear as one. Businesses
shouldn’t have to distinguish between “Big Data” and
other forms of data. There’s just data, period, including
non-streaming and static data.
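
To make the idea of "multiple data sources appearing as one" concrete, here is a minimal sketch in Python. It is not taken from any of the platforms listed above. The function names (`static_source`, `streaming_source`, `fused`), the record layout, and the `ts` timestamp field are all assumptions made for illustration. The sketch merges a stored table and a simulated live feed into a single time-ordered view, so the consumer never has to know which records were static and which were streamed.

```python
import heapq
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, object]  # hypothetical record shape: {"ts", "origin", "value"}


def static_source(rows: List[Record]) -> Iterator[Record]:
    """Replay an already-stored table in timestamp order."""
    yield from sorted(rows, key=lambda r: r["ts"])


def streaming_source(events: Iterable[Record]) -> Iterator[Record]:
    """Stand-in for a live feed; assumes events arrive in timestamp order."""
    for event in events:
        yield event


def fused(*sources: Iterator[Record]) -> Iterator[Record]:
    """Lazily merge several time-ordered sources into one time-ordered view."""
    return heapq.merge(*sources, key=lambda r: r["ts"])


# Illustrative data only: two stored rows and two "live" events.
historical = [
    {"ts": 3, "origin": "warehouse", "value": 30},
    {"ts": 1, "origin": "warehouse", "value": 10},
]
live = iter([
    {"ts": 2, "origin": "stream", "value": 20},
    {"ts": 4, "origin": "stream", "value": 40},
])

merged = list(fused(static_source(historical), streaming_source(live)))
for record in merged:
    print(record["ts"], record["origin"], record["value"])
```

Because `heapq.merge` is lazy, the same pattern would work with an unbounded feed, provided each source really is ordered by timestamp. A production system would replace the simulated feed with a real consumer and handle late or out-of-order events, which this sketch does not attempt.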