1 00:00:00,220 --> 00:00:02,820 So now let's talk about a service that is named after 2 00:00:02,820 --> 00:00:05,400 what it does: AWS Batch. 3 00:00:05,400 --> 00:00:08,350 So Batch is a fully managed batch processing service 4 00:00:08,350 --> 00:00:11,330 that allows you to do batch processing at any scale. 5 00:00:11,330 --> 00:00:12,500 And with the Batch service, 6 00:00:12,500 --> 00:00:14,480 you can efficiently run hundreds of thousands 7 00:00:14,480 --> 00:00:18,010 of batch computing jobs on AWS very easily. 8 00:00:18,010 --> 00:00:19,510 So what is a batch job? 9 00:00:19,510 --> 00:00:23,060 Well, a batch job is a job that has a start and an end. 10 00:00:23,060 --> 00:00:25,370 And that is as opposed to, say, a continuous 11 00:00:25,370 --> 00:00:28,210 or a streaming job that really doesn't ever end; 12 00:00:28,210 --> 00:00:29,650 it's always running. 13 00:00:29,650 --> 00:00:30,590 But a batch job, 14 00:00:30,590 --> 00:00:34,470 for example, starts at 1 a.m. and finishes at 3 a.m. 15 00:00:34,470 --> 00:00:38,150 So a batch job has a point in time when it happens, 16 00:00:38,150 --> 00:00:40,300 and so the Batch service will 17 00:00:40,300 --> 00:00:44,060 dynamically launch EC2 instances or Spot Instances 18 00:00:44,060 --> 00:00:45,930 to accommodate the load 19 00:00:45,930 --> 00:00:48,500 that you have to run these batch jobs. 20 00:00:48,500 --> 00:00:51,690 So Batch will provision the right amount of compute 21 00:00:51,690 --> 00:00:54,830 and memory for you to deal with your batch queue. 22 00:00:54,830 --> 00:00:57,570 And you just submit or schedule batch jobs 23 00:00:57,570 --> 00:01:01,580 into the batch queue and the Batch service does the rest. 24 00:01:01,580 --> 00:01:03,050 Now, how do you define a batch job? 25 00:01:03,050 --> 00:01:05,470 Well, it is simply a Docker image 26 00:01:05,470 --> 00:01:08,550 and a task definition that you run on the ECS service.
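To make "a Docker image and a task definition" a bit more concrete, here is a minimal sketch of the request parameters for a Batch job definition and a job submission. The image URI, job names, queue name, and resource values are hypothetical placeholders, not from the lecture. The dictionaries follow the shape accepted by boto3's `register_job_definition` and `submit_job` calls on the Batch client, but the sketch only builds them and never contacts AWS.

```python
import json


def build_job_definition(name, image, command, vcpus=1, memory_mib=2048):
    """Build keyword arguments shaped for a Batch register_job_definition call."""
    return {
        "jobDefinitionName": name,
        "type": "container",  # the job runs as a container, i.e. a Docker image
        "containerProperties": {
            "image": image,
            "command": list(command),
            # Batch uses these values to size the compute it provisions.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
    }


def build_submit_request(job_name, queue, job_definition_name):
    """Build keyword arguments shaped for a Batch submit_job call."""
    return {
        "jobName": job_name,
        "jobQueue": queue,
        "jobDefinition": job_definition_name,
    }


# Hypothetical values, for illustration only.
job_definition = build_job_definition(
    name="nightly-report",
    image="123456789012.dkr.ecr.us-east-1.amazonaws.com/nightly-report:latest",
    command=["python", "report.py"],
)
submit_request = build_submit_request(
    job_name="nightly-report-run",
    queue="example-queue",
    job_definition_name=job_definition["jobDefinitionName"],
)

if __name__ == "__main__":
    print(json.dumps(job_definition, indent=2))
    print(json.dumps(submit_request, indent=2))
```

With real credentials, and assuming boto3 is installed, these dictionaries could be passed as `client.register_job_definition(**job_definition)` and `client.submit_job(**submit_request)`, where `client = boto3.client("batch")`.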
27 00:01:08,550 --> 00:01:10,760 So this is pretty much saying that anything 28 00:01:10,760 --> 00:01:13,150 that can run on ECS can run on Batch. 29 00:01:13,150 --> 00:01:15,200 And so it is going to be very helpful to use Batch 30 00:01:15,200 --> 00:01:16,610 to run these batch jobs. 31 00:01:16,610 --> 00:01:18,700 And because it automatically scales 32 00:01:18,700 --> 00:01:21,960 to the right number of EC2 instances or Spot Instances 33 00:01:21,960 --> 00:01:23,260 to do these jobs, 34 00:01:23,260 --> 00:01:25,450 you get lots of cost optimizations 35 00:01:25,450 --> 00:01:27,860 and you focus a lot less on the infrastructure; 36 00:01:27,860 --> 00:01:30,090 you just focus on your batch jobs. 37 00:01:30,090 --> 00:01:32,280 So this should be more than enough for the exam, 38 00:01:32,280 --> 00:01:35,410 but I just want to show you a small diagram that I made. 39 00:01:35,410 --> 00:01:38,710 So for example, say we wanted to process images submitted 40 00:01:38,710 --> 00:01:41,740 by users into Amazon S3 in a batch way. 41 00:01:41,740 --> 00:01:44,960 So an image will be put into Amazon S3, 42 00:01:44,960 --> 00:01:47,850 and this will trigger a batch job. 43 00:01:47,850 --> 00:01:49,820 And so Batch will automatically have 44 00:01:49,820 --> 00:01:52,580 an ECS cluster made of EC2 instances 45 00:01:52,580 --> 00:01:54,880 or Spot Instances, and Batch will make sure that 46 00:01:54,880 --> 00:01:56,900 you have the right number of instances 47 00:01:56,900 --> 00:01:58,870 to accommodate the load of batch jobs 48 00:01:58,870 --> 00:02:00,340 you have in the batch queue. 49 00:02:00,340 --> 00:02:02,880 And then these instances will be running 50 00:02:02,880 --> 00:02:05,940 your Docker images that will be doing your job. 51 00:02:05,940 --> 00:02:08,010 And then maybe that job will be to insert 52 00:02:08,010 --> 00:02:09,190 the processed object.
53 00:02:09,190 --> 00:02:11,190 Maybe it's a filter on top of the image, 54 00:02:11,190 --> 00:02:13,560 written into another Amazon S3 bucket. 55 00:02:13,560 --> 00:02:14,550 So the question you may have is 56 00:02:14,550 --> 00:02:16,250 what is the difference between Batch and Lambda, 57 00:02:16,250 --> 00:02:17,950 because they look similar? 58 00:02:17,950 --> 00:02:19,710 So Lambda has a time limit, 59 00:02:19,710 --> 00:02:21,090 it's 15 minutes, 60 00:02:21,090 --> 00:02:24,230 and you only get access to a few programming languages. 61 00:02:24,230 --> 00:02:27,680 On top of that, you have limited temporary disk space 62 00:02:27,680 --> 00:02:29,080 to run your jobs, 63 00:02:29,080 --> 00:02:30,880 and it's going to be serverless, 64 00:02:30,880 --> 00:02:32,320 whereas Batch is very different. 65 00:02:32,320 --> 00:02:33,750 So Batch has no time limit, 66 00:02:33,750 --> 00:02:36,540 because it relies on EC2 instances. 67 00:02:36,540 --> 00:02:38,600 You can use any runtime that you want, as long 68 00:02:38,600 --> 00:02:40,840 as you package it as a Docker image. 69 00:02:40,840 --> 00:02:42,320 And for storage, 70 00:02:42,320 --> 00:02:45,700 you rely on the storage that comes with an EC2 instance. 71 00:02:45,700 --> 00:02:47,070 So it could be an EBS volume, 72 00:02:47,070 --> 00:02:49,637 or an EC2 instance store for disk space, 73 00:02:49,637 --> 00:02:52,810 which can be a lot more than for Lambda functions. 74 00:02:52,810 --> 00:02:55,460 And then finally, Batch is not a serverless service. 75 00:02:55,460 --> 00:02:56,450 It's a managed service, 76 00:02:56,450 --> 00:02:59,670 but it relies on actual EC2 instances being created. 77 00:02:59,670 --> 00:03:03,330 But these EC2 instances are managed by AWS, 78 00:03:03,330 --> 00:03:04,230 so we don't have to worry 79 00:03:04,230 --> 00:03:06,360 about the auto scaling and so on.
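The Lambda versus Batch comparison can be summed up as a small decision helper. This is only a sketch of the reasoning in the lecture, not an official sizing rule: the function name and its flags are hypothetical, and the only number taken from the lecture is the 15-minute Lambda time limit.

```python
LAMBDA_MAX_SECONDS = 15 * 60  # Lambda time limit mentioned in the lecture


def choose_service(expected_runtime_seconds,
                   runtime_supported_by_lambda=True,
                   needs_large_disk=False):
    """Return ("lambda" or "batch", list of reasons) for a given job profile."""
    reasons = []
    if expected_runtime_seconds > LAMBDA_MAX_SECONDS:
        reasons.append("job runs longer than the 15-minute Lambda limit")
    if not runtime_supported_by_lambda:
        reasons.append("runtime must be packaged as a Docker image")
    if needs_large_disk:
        reasons.append("job needs more disk than Lambda's temporary space")
    # Any single reason is enough to prefer Batch over Lambda.
    if reasons:
        return "batch", reasons
    return "lambda", reasons


if __name__ == "__main__":
    # A two-hour nightly job, like the 1 a.m. to 3 a.m. example.
    service, why = choose_service(2 * 60 * 60)
    print(service, why)
```

A job that fits inside every Lambda constraint stays on Lambda here because Lambda is serverless, whereas Batch relies on EC2 instances being created on your behalf.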
80 00:03:06,360 --> 00:03:07,193 So I hope that was helpful 81 00:03:07,193 --> 00:03:08,910 and I will see you in the next lecture.