1
00:00:00,010 --> 00:00:03,100
‫Okay, so now let's talk about AWS Limits

2
00:00:03,100 --> 00:00:05,210
‫also called Quotas.

3
00:00:05,210 --> 00:00:06,910
‫So, you have two types of limits.

4
00:00:06,910 --> 00:00:08,650
‫You have API Rate Limits

5
00:00:08,650 --> 00:00:12,120
‫which is how many times you can call an AWS API in a given time period.

6
00:00:12,120 --> 00:00:16,610
‫So for example, the DescribeInstances API for Amazon EC2

7
00:00:16,610 --> 00:00:20,060
‫has a limit of 100 calls per second.

8
00:00:20,060 --> 00:00:22,610
‫And the GetObject on Amazon S3

9
00:00:22,610 --> 00:00:25,830
‫has a limit of 5,500 GET requests

10
00:00:25,830 --> 00:00:27,800
‫per second per prefix.

11
00:00:27,800 --> 00:00:29,600
‫So when we go over,

12
00:00:29,600 --> 00:00:31,920
‫we get Intermittent Errors

13
00:00:31,920 --> 00:00:33,520
‫because we'll be throttled.

14
00:00:33,520 --> 00:00:36,090
‫And so we should use an Exponential Backoff Strategy,

15
00:00:36,090 --> 00:00:38,410
‫and I'll describe this in the very next slide.

16
00:00:38,410 --> 00:00:41,080
‫And in case we are getting these errors consistently

17
00:00:41,080 --> 00:00:44,400
‫because we are having heavy usage of our application

18
00:00:44,400 --> 00:00:47,240
‫and we consistently go over these limits,

19
00:00:47,240 --> 00:00:49,090
‫then instead we should request

20
00:00:49,090 --> 00:00:51,520
‫an API throttling limit increase

21
00:00:51,520 --> 00:00:53,170
‫to make sure that we can, for example,

22
00:00:53,170 --> 00:00:55,145
‫issue more than 100 calls per second

23
00:00:55,145 --> 00:00:56,680
‫for DescribeInstances,

24
00:00:56,680 --> 00:00:58,330
‫maybe we need 300, okay?

25
00:00:58,330 --> 00:01:00,540
‫And so we would ask AWS for this.

26
00:01:00,540 --> 00:01:03,330
‫So this is for the API Rate Limits,

27
00:01:03,330 --> 00:01:05,710
‫and the other kind of limit we have is Service Quotas,

28
00:01:05,710 --> 00:01:06,900
‫also known as Service Limits,

29
00:01:06,900 --> 00:01:09,580
‫which is how many of a given resource we can run.

30
00:01:09,580 --> 00:01:13,120
‫For example, for your On-Demand Standard Instances,

31
00:01:13,120 --> 00:01:16,670
‫we can run up to 1,152

32
00:01:16,670 --> 00:01:18,100
‫virtual CPUs,

33
00:01:18,100 --> 00:01:21,270
‫and if you want to run more vCPUs in your account

34
00:01:21,270 --> 00:01:23,750
‫then you can request a Service Limit increase

35
00:01:23,750 --> 00:01:26,660
‫simply by opening a ticket.

36
00:01:26,660 --> 00:01:29,490
‫And you can request a Service Quota increase

37
00:01:29,490 --> 00:01:32,060
‫by using the Service Quotas API as well

38
00:01:32,060 --> 00:01:34,530
‫to do this programmatically, okay?

39
00:01:34,530 --> 00:01:36,020
‫So we have API Rate Limits

40
00:01:36,020 --> 00:01:39,330
‫as well as Service Quotas for your resources.

41
00:01:39,330 --> 00:01:40,310
‫Now, what did I say?

42
00:01:40,310 --> 00:01:42,130
‫If we get Intermittent Errors

43
00:01:42,130 --> 00:01:44,080
‫then we should use Exponential Backoff.

44
00:01:44,960 --> 00:01:47,780
‫So when do we use Exponential Backoff?

45
00:01:47,780 --> 00:01:50,700
‫Well, when we get a ThrottlingException.

46
00:01:50,700 --> 00:01:52,200
‫And so this is an exam question,

47
00:01:52,200 --> 00:01:55,070
‫anytime you see that there is a ThrottlingException

48
00:01:55,070 --> 00:01:57,030
‫because we did too many API calls,

49
00:01:57,030 --> 00:01:59,770
‫usually the answer is to do Exponential Backoff.

50
00:01:59,770 --> 00:02:02,840
‫So if you're using the AWS SDK

51
00:02:02,840 --> 00:02:06,050
‫then this retry mechanism is already included

52
00:02:06,050 --> 00:02:08,120
‫in the SDK behavior.

53
00:02:08,120 --> 00:02:12,580
‫But if you are using the AWS API as-is yourself

54
00:02:12,580 --> 00:02:14,410
‫then you are the one responsible

55
00:02:14,410 --> 00:02:16,690
‫for implementing the Exponential Backoff.

56
00:02:16,690 --> 00:02:19,210
‫And so a question in the exam may ask you

57
00:02:19,210 --> 00:02:21,590
‫which kind of errors should you retry

58
00:02:21,590 --> 00:02:23,440
‫with Exponential Backoff?

59
00:02:23,440 --> 00:02:27,150
‫And if you are implementing your own SDK,

60
00:02:27,150 --> 00:02:29,720
‫or your own custom HTTP calls,

61
00:02:29,720 --> 00:02:32,240
‫then you must only implement the retries

62
00:02:32,240 --> 00:02:34,220
‫when you receive a server error

63
00:02:34,220 --> 00:02:37,814
‫with an error code in the 500 range.

64
00:02:37,814 --> 00:02:40,690
‫So 503 or whatever,

65
00:02:40,690 --> 00:02:41,640
‫5XX,

66
00:02:41,640 --> 00:02:44,830
‫because these server errors and Throttling Errors

67
00:02:44,830 --> 00:02:46,770
‫are the ones that can be retried,

68
00:02:46,770 --> 00:02:51,370
‫but you should not implement a retry or Exponential Backoff

69
00:02:51,370 --> 00:02:53,610
‫on the 4XX client errors, okay?

70
00:02:53,610 --> 00:02:56,030
‫The 400 errors, because that means that something

71
00:02:56,030 --> 00:02:58,840
‫was sent incorrectly by your client,

72
00:02:58,840 --> 00:03:00,160
‫and so if you keep on retrying them

73
00:03:00,160 --> 00:03:03,230
‫you will keep on receiving the same errors.
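As a rough sketch of that rule in Python, assuming you make your own HTTP calls: the name `should_retry` and the set of throttling error codes are illustrative assumptions, not part of any AWS SDK. Throttling is checked by error code first, since it may not always arrive with a 5XX status.
```python
# Hypothetical helper: decide whether a failed call is worth retrying.
# THROTTLING_CODES is an illustrative, incomplete set (assumption).
THROTTLING_CODES = {"ThrottlingException", "Throttling", "SlowDown"}
def should_retry(status_code, error_code=None):
    """Return True for throttling errors and 5XX server errors."""
    if error_code in THROTTLING_CODES:
        return True  # throttled: retry later with backoff
    return 500 <= status_code <= 599  # 4XX client errors are not retried
```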

74
00:03:03,230 --> 00:03:05,220
‫So how does Exponential Backoff work?

75
00:03:05,220 --> 00:03:08,690
‫Well, after the first request fails we wait, say, one second,

76
00:03:08,690 --> 00:03:10,750
‫then we're going to double the time

77
00:03:10,750 --> 00:03:12,340
‫to wait until the next request.

78
00:03:12,340 --> 00:03:15,090
‫So two seconds, because we're doubling.

79
00:03:15,090 --> 00:03:17,170
‫For the next retry we're going to double again.

80
00:03:17,170 --> 00:03:19,590
‫So four seconds and we double again.

81
00:03:19,590 --> 00:03:23,350
‫So for the next retry, we're going to go to eight seconds,

82
00:03:23,350 --> 00:03:24,950
‫and then for the next retry,

83
00:03:24,950 --> 00:03:27,480
‫we're going to go to 16 seconds.

84
00:03:27,480 --> 00:03:29,880
‫So the idea is with this Exponential Backoff,

85
00:03:29,880 --> 00:03:31,630
‫the more we retry the more we wait,

86
00:03:31,630 --> 00:03:34,710
‫and so if many clients are doing this at the same time

87
00:03:34,710 --> 00:03:35,920
‫the result of this is that

88
00:03:35,920 --> 00:03:38,710
‫there's going to be less and less load on your server,

89
00:03:38,710 --> 00:03:42,010
‫allowing your server to serve as many requests as possible.

90
00:03:42,010 --> 00:03:44,920
‫And this is the whole concept of Exponential Backoff.
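The doubling schedule described above (1, 2, 4, 8, 16 seconds) can be sketched in Python like this. It is a minimal illustration, not the AWS SDK's implementation; `backoff_delays`, `call_with_backoff`, and their parameters are hypothetical names.
```python
import time
# Hypothetical helpers; base delay and retry count follow the spoken example.
def backoff_delays(base=1.0, retries=5):
    """Return the wait before each retry: base, 2*base, 4*base, ..."""
    return [base * 2 ** attempt for attempt in range(retries)]
def call_with_backoff(call, is_retryable, base=1.0, retries=5, sleep=time.sleep):
    """Run call(); on a retryable exception, wait and try again."""
    for delay in backoff_delays(base, retries):
        try:
            return call()
        except Exception as err:
            if not is_retryable(err):
                raise  # e.g. a 4XX client error: retrying will not help
            sleep(delay)  # wait longer after each failed attempt
    return call()  # final attempt; any error propagates to the caller
```
Real implementations often also add random jitter and cap the maximum delay; both are left out here to keep the doubling visible.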

91
00:03:44,920 --> 00:03:45,753
‫So that's it.

92
00:03:45,753 --> 00:03:46,800
‫I hope you liked this lecture,

93
00:03:46,800 --> 00:03:48,750
‫and I will see you in the next lecture.