1 00:00:00,010 --> 00:00:03,100 Okay, so now let's talk about AWS Limits, 2 00:00:03,100 --> 00:00:05,210 also called Quotas. 3 00:00:05,210 --> 00:00:06,910 So, you have two types of limits. 4 00:00:06,910 --> 00:00:08,650 You have API Rate Limits, 5 00:00:08,650 --> 00:00:12,120 which is how many times you can call an AWS API in a given period of time. 6 00:00:12,120 --> 00:00:16,610 So for example, the DescribeInstances API for Amazon EC2 7 00:00:16,610 --> 00:00:20,060 has a limit of 100 calls per second. 8 00:00:20,060 --> 00:00:22,610 And the GetObject on Amazon S3 9 00:00:22,610 --> 00:00:25,830 has a limit of 5,500 GET requests 10 00:00:25,830 --> 00:00:27,800 per second per prefix. 11 00:00:27,800 --> 00:00:29,600 So when we go over, 12 00:00:29,600 --> 00:00:31,920 we get an Intermittent Error 13 00:00:31,920 --> 00:00:33,520 because we'll be throttled. 14 00:00:33,520 --> 00:00:36,090 And so we should use an Exponential Backoff Strategy, 15 00:00:36,090 --> 00:00:38,410 and I'll describe this in the very next slide. 16 00:00:38,410 --> 00:00:41,080 And in case we are getting these errors consistently 17 00:00:41,080 --> 00:00:44,400 because our application has heavy usage 18 00:00:44,400 --> 00:00:47,240 and we consistently go over these limits, 19 00:00:47,240 --> 00:00:49,090 then instead we should request 20 00:00:49,090 --> 00:00:51,520 an API throttling limit increase 21 00:00:51,520 --> 00:00:53,170 to make sure that we can, for example, 22 00:00:53,170 --> 00:00:55,145 issue more than 100 calls per second 23 00:00:55,145 --> 00:00:56,680 for DescribeInstances, 24 00:00:56,680 --> 00:00:58,330 maybe we need 300, okay? 25 00:00:58,330 --> 00:01:00,540 And so we would ask AWS for this. 
26 00:01:00,540 --> 00:01:03,330 So this is for the API Rate Limits, 27 00:01:03,330 --> 00:01:05,710 and the other kind of limit we have is Service Quotas, 28 00:01:05,710 --> 00:01:06,900 also known as Service Limits, 29 00:01:06,900 --> 00:01:09,580 which is how many resources of a given type we can run. 30 00:01:09,580 --> 00:01:13,120 For example, for your On-Demand Standard Instances, 31 00:01:13,120 --> 00:01:16,670 we can run up to 1,152 32 00:01:16,670 --> 00:01:18,100 virtual CPUs, 33 00:01:18,100 --> 00:01:21,270 and if you want to run more vCPUs in your account 34 00:01:21,270 --> 00:01:23,750 then you can request a Service Limit increase 35 00:01:23,750 --> 00:01:26,660 by simply opening a ticket. 36 00:01:26,660 --> 00:01:29,490 And you can request a Service Quota increase 37 00:01:29,490 --> 00:01:32,060 by using the Service Quotas API as well 38 00:01:32,060 --> 00:01:34,530 to do this programmatically, okay? 39 00:01:34,530 --> 00:01:36,020 So we have API Rate Limits 40 00:01:36,020 --> 00:01:39,330 as well as Service Quotas for your resources. 41 00:01:39,330 --> 00:01:40,310 Now, what did I say? 42 00:01:40,310 --> 00:01:42,130 If we get Intermittent Errors 43 00:01:42,130 --> 00:01:44,080 then we should use Exponential Backoff. 44 00:01:44,960 --> 00:01:47,780 So when do we use Exponential Backoff? 45 00:01:47,780 --> 00:01:50,700 Well, when we get a ThrottlingException. 46 00:01:50,700 --> 00:01:52,200 And so this is an exam question, 47 00:01:52,200 --> 00:01:55,070 anytime you see that there is a ThrottlingException 48 00:01:55,070 --> 00:01:57,030 because we made too many API calls, 49 00:01:57,030 --> 00:01:59,770 usually the answer is to do Exponential Backoff. 50 00:01:59,770 --> 00:02:02,840 So if you're using the AWS SDK 51 00:02:02,840 --> 00:02:06,050 then this retry mechanism is already included 52 00:02:06,050 --> 00:02:08,120 in the SDK behavior. 
53 00:02:08,120 --> 00:02:12,580 But if you are using the AWS API as-is yourself 54 00:02:12,580 --> 00:02:14,410 then you are the one responsible 55 00:02:14,410 --> 00:02:16,690 for implementing the Exponential Backoff. 56 00:02:16,690 --> 00:02:19,210 And so a question in the exam may ask you 57 00:02:19,210 --> 00:02:21,590 which kind of errors you should retry 58 00:02:21,590 --> 00:02:23,440 with Exponential Backoff. 59 00:02:23,440 --> 00:02:27,150 And if you are implementing your own SDK, 60 00:02:27,150 --> 00:02:29,720 your own custom HTTP calls, 61 00:02:29,720 --> 00:02:32,240 then you must only implement the retries 62 00:02:32,240 --> 00:02:34,220 when you receive a server error 63 00:02:34,220 --> 00:02:37,814 that has an error code that starts with 5. 64 00:02:37,814 --> 00:02:40,690 So 500, 503, or any other 65 00:02:40,690 --> 00:02:41,640 5XX, 66 00:02:41,640 --> 00:02:44,830 because these server errors and Throttling Errors 67 00:02:44,830 --> 00:02:46,770 are the ones that can be retried, 68 00:02:46,770 --> 00:02:51,370 but you should not implement a retry or Exponential Backoff 69 00:02:51,370 --> 00:02:53,610 on the 4XX client errors, okay? 70 00:02:53,610 --> 00:02:56,030 The 400 errors, because that means that something 71 00:02:56,030 --> 00:02:58,840 was wrong in the request your client sent, 72 00:02:58,840 --> 00:03:00,160 and so if you keep on retrying them 73 00:03:00,160 --> 00:03:03,230 you will keep on receiving the same errors. 74 00:03:03,230 --> 00:03:05,220 So how does Exponential Backoff work? 75 00:03:05,220 --> 00:03:08,690 Well, after the first request fails we wait, say, one second, 76 00:03:08,690 --> 00:03:10,750 then we're going to double the time 77 00:03:10,750 --> 00:03:12,340 to wait until the next request. 78 00:03:12,340 --> 00:03:15,090 So two seconds, because we're doubling. 79 00:03:15,090 --> 00:03:17,170 For the next retry we're going to double again. 
80 00:03:17,170 --> 00:03:19,590 So four seconds, and we double again. 81 00:03:19,590 --> 00:03:23,350 So for the next retry, we're going to go to eight seconds, 82 00:03:23,350 --> 00:03:24,950 and then for the next retry, 83 00:03:24,950 --> 00:03:27,480 we're going to go to 16 seconds. 84 00:03:27,480 --> 00:03:29,880 So the idea is with this Exponential Backoff, 85 00:03:29,880 --> 00:03:31,630 the more we retry the more we wait, 86 00:03:31,630 --> 00:03:34,710 and so if many clients are doing this at the same time 87 00:03:34,710 --> 00:03:35,920 the result of this is that 88 00:03:35,920 --> 00:03:38,710 there's going to be less and less load on your server, 89 00:03:38,710 --> 00:03:42,010 allowing your server to serve as many requests as possible. 90 00:03:42,010 --> 00:03:44,920 And this is the whole concept of Exponential Backoff. 91 00:03:44,920 --> 00:03:45,753 So that's it. 92 00:03:45,753 --> 00:03:46,800 I hope you liked this lecture, 93 00:03:46,800 --> 00:03:48,750 and I will see you in the next lecture.
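
The retry rules from this lecture can be sketched in a few lines of Python, for the case where you make your own HTTP calls instead of using the AWS SDK. This is only an illustration, not how the SDK does it: `HttpError`, `is_retryable`, `backoff_delays`, and `call_with_backoff` are hypothetical names, the one-second starting wait and five retries are just the numbers used in the lecture, and treating an error code of `ThrottlingException` as retryable whatever its HTTP status is an assumption.

```python
import time


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status and an error code."""

    def __init__(self, status, code=""):
        super().__init__(f"HTTP {status} {code}".strip())
        self.status = status
        self.code = code


def is_retryable(err):
    # Retry 5XX server errors and throttling errors (assumption: throttling
    # is identified by its error code). Other 4XX client errors are not
    # retried, because the same request would fail the same way again.
    return 500 <= err.status <= 599 or err.code == "ThrottlingException"


def backoff_delays(base=1.0, retries=5):
    # Each wait doubles the previous one: 1, 2, 4, 8, 16 seconds by default.
    return [base * 2 ** i for i in range(retries)]


def call_with_backoff(request, base=1.0, retries=5, sleep=time.sleep):
    # `request` is any zero-argument callable that raises HttpError on failure.
    for delay in backoff_delays(base, retries):
        try:
            return request()
        except HttpError as err:
            if not is_retryable(err):
                raise
            sleep(delay)
    # Final attempt after the last wait; any error goes back to the caller.
    return request()
```

Passing `sleep` as a parameter is only there so the waits can be observed or skipped in a test. In real code you would leave the default and let the function pause between retries.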