Batch Vs Stream Processing

Batch Processing

Intro

  • Runs ad hoc or on a schedule

  • Data must be collected before processing begins

  • Compute and storage requirements are usually known in advance

  • Batch jobs are long-running and suited to large volumes of data that are not time-sensitive

  • Can be used for compute-intensive tasks

  • The primary performance measure of a batch job is usually throughput

  • Because of the large data volume, it usually requires distributed storage such as HDFS

  • Tools: Hadoop MapReduce, Spark
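
The points above can be illustrated with a minimal, single-machine sketch in plain Python. It is not how Hadoop MapReduce or Spark work internally. It only shows the batch pattern: the full dataset is collected first, processed in chunks, and the job is judged by throughput (records per second). The function names `read_in_chunks` and `run_batch_job`, the chunk size, and the sample log lines are all hypothetical and chosen for illustration.

```python
import time
from collections import Counter
from typing import Iterator, List, Tuple


def read_in_chunks(records: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield fixed-size slices of an already-collected dataset."""
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


def run_batch_job(records: List[str], chunk_size: int = 1000) -> Tuple[Counter, float]:
    """Count words over the whole dataset and report throughput."""
    counts: Counter = Counter()
    started = time.perf_counter()
    for chunk in read_in_chunks(records, chunk_size):
        for line in chunk:
            counts.update(line.split())
    elapsed = time.perf_counter() - started
    # Throughput in records per second; guard against a zero-length timing.
    throughput = len(records) / elapsed if elapsed > 0 else float("inf")
    return counts, throughput


if __name__ == "__main__":
    # Hypothetical input: the data is fully collected before the job starts.
    data = ["error disk full", "info job started", "error timeout"] * 10_000
    counts, throughput = run_batch_job(data)
    print(counts["error"], f"{throughput:.0f} records/s")
```

In a real deployment the list would typically be replaced by files on distributed storage such as HDFS, and the loop by a framework job, but the shape of the work (collect, process in bulk, measure throughput) stays the same.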

Stream Processing
