Batch Vs Stream Processing
Batch Processing
Intro
Ad-Hoc or Scheduled processing
Data Needs To Be Collected Before Processing Starts
Compute / Storage Resource Needs Are Known In Advance
Batch jobs are long-running and meant for large quantities of data that are not time-sensitive
Can Be Used On Compute Intensive Tasks
The Primary Performance Measure of a batch job is usually throughput
Due To The Large Volume Of Data, It Usually Requires Distributed Storage Like HDFS
Tools: Hadoop MapReduce, Spark
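The points above can be illustrated with a minimal single-machine sketch. It is not how Hadoop MapReduce or Spark are implemented; it only shows the batch pattern of collecting all input first, processing it in one run, and reporting throughput. The function name `run_batch_job` and the sample records are hypothetical, chosen for illustration.

```python
import time
from collections import Counter
from typing import Iterable, Tuple


def run_batch_job(records: Iterable[str]) -> Tuple[Counter, float]:
    """Count words over a fully collected data set.

    Returns the word counts and the throughput in records per second.
    """
    # Batch processing: the whole input is materialized before work starts.
    batch = list(records)

    start = time.perf_counter()
    counts: Counter = Counter()
    for line in batch:
        counts.update(line.split())
    elapsed = time.perf_counter() - start

    # Throughput (records per second) is the usual performance measure.
    throughput = len(batch) / elapsed if elapsed > 0 else float("inf")
    return counts, throughput


# Hypothetical sample input standing in for data collected ahead of time.
collected = ["a b a", "b c", "a"]
counts, throughput = run_batch_job(collected)
print(dict(counts), f"{throughput:.0f} records/s")
```

In a real deployment the input would typically sit in distributed storage and the counting step would be split across workers; the sketch keeps everything in memory to stay self-contained.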
Stream Processing