1 00:00:01,410 --> 00:00:07,920 Let's talk about Apache Spark and Apache Flink. You see, before Apache Spark, 2 00:00:07,920 --> 00:00:17,780 people usually used Hadoop MapReduce to process everything. Spark came along and actually improved on 3 00:00:17,840 --> 00:00:26,960 MapReduce and Hadoop by doing something called in-memory processing, which essentially allowed Spark 4 00:00:27,230 --> 00:00:37,670 to run processing jobs much, much faster than MapReduce. So Apache Spark became really, really popular, 5 00:00:37,670 --> 00:00:41,600 and it's probably the go-to batch processing framework. 6 00:00:41,630 --> 00:00:50,330 So if you want to process a lot of data, well, you can use Hadoop to store that data in HDFS and 7 00:00:50,330 --> 00:00:59,900 use Apache Spark to run ETL jobs (extract, transform, load) to clean and transform that data. Now, 8 00:00:59,930 --> 00:01:07,070 up until now we had batch processing, which essentially means: give me a chunk of data and I'll process 9 00:01:07,100 --> 00:01:09,300 this data over a bit of time. 10 00:01:09,350 --> 00:01:15,890 Usually you'd run a batch processing job at the end of the night, and in the morning, after a couple of hours, 11 00:01:16,190 --> 00:01:17,020 the job is done. 12 00:01:17,450 --> 00:01:25,070 But a few years ago the idea of real-time processing started to take hold. Things like Spark Streaming 13 00:01:25,220 --> 00:01:32,060 came out, which made near-real-time processing possible. That is, instead of batch processing, where you 14 00:01:32,060 --> 00:01:38,710 wait until the end of the night to run some jobs on an amount of data, every time data comes in 15 00:01:38,750 --> 00:01:48,210 you process that data. In this space, Apache Flink really led the charge. It offers real-time stream 16 00:01:48,210 --> 00:01:49,220 processing. 17 00:01:49,230 --> 00:01:55,110 Now, although Spark is still popular, tools like Apache Flink are certain to become popular.
18 00:01:55,110 --> 00:02:00,710 Now, this idea of streaming data and stream processing is becoming more and more popular. 19 00:02:00,810 --> 00:02:05,220 It's still new, but there are some exciting things happening in this landscape. 20 00:02:05,220 --> 00:02:07,100 So let's talk about that in the next video.
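To make the batch versus stream distinction from this video a bit more concrete, here is a minimal sketch in plain Python. It deliberately does not use the Spark or Flink APIs. The function names `clean`, `process_batch`, and `process_stream`, and the sample events, are hypothetical and only there for illustration. The batch version sees the whole chunk of data and returns one result at the end, while the streaming version handles each record as it comes in and emits an updated result every time.

```python
from typing import Dict, Iterable, Iterator, List


def clean(record: str) -> str:
    """Toy 'transform' step (hypothetical): trim whitespace and lowercase."""
    return record.strip().lower()


def process_batch(records: List[str]) -> Dict[str, int]:
    """Batch style: the whole chunk of data is available up front,
    and a single result is produced once everything is processed."""
    counts: Dict[str, int] = {}
    for record in records:
        key = clean(record)
        counts[key] = counts.get(key, 0) + 1
    return counts


def process_stream(records: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Streaming style: each record is processed as it arrives,
    and an updated result is emitted after every record."""
    counts: Dict[str, int] = {}
    for record in records:
        key = clean(record)
        counts[key] = counts.get(key, 0) + 1
        yield dict(counts)  # snapshot of the running counts so far


# Hypothetical sample data standing in for incoming events.
events = [" Click", "view ", "click"]

batch_result = process_batch(events)                 # one answer, at the end
stream_results = list(process_stream(iter(events)))  # one answer per event
```

In this sketch the last streaming snapshot matches the batch result. The difference is when answers become available: once at the end for batch, and after every record for streaming.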