1 00:00:00,250 --> 00:00:01,400 ‫Welcome to this session 2 00:00:01,400 --> 00:00:04,330 ‫on Elastic Load Balancing and Auto Scaling Groups. 3 00:00:04,330 --> 00:00:06,830 ‫This is a section where we really see the power 4 00:00:06,830 --> 00:00:09,850 ‫of the AWS cloud and the cloud in general, 5 00:00:09,850 --> 00:00:12,460 ‫and see how these new paradigms we saw 6 00:00:12,460 --> 00:00:13,670 ‫really help us shine 7 00:00:13,670 --> 00:00:16,330 ‫and make our application scales seamlessly. 8 00:00:16,330 --> 00:00:17,750 ‫So let's discuss the concept 9 00:00:17,750 --> 00:00:20,460 ‫of Scalability and High Availability. 10 00:00:20,460 --> 00:00:22,180 ‫So if your applications can scale, 11 00:00:22,180 --> 00:00:25,460 ‫that means that it can handle greater loads by adapting. 12 00:00:25,460 --> 00:00:28,100 ‫And there are two kinds of scalability in the cloud. 13 00:00:28,100 --> 00:00:31,580 ‫There is vertical scalability and horizontal scalability, 14 00:00:31,580 --> 00:00:33,340 ‫also called elasticity. 15 00:00:33,340 --> 00:00:35,350 ‫So the scalability is going to be linked, 16 00:00:35,350 --> 00:00:37,460 ‫but different to high availability. 17 00:00:37,460 --> 00:00:40,100 ‫And these things are going to be discussed in this lecture. 18 00:00:40,100 --> 00:00:41,530 ‫So let's do a deep dive. 19 00:00:41,530 --> 00:00:43,490 ‫And we'll be using a call center as an example. 20 00:00:43,490 --> 00:00:46,990 ‫So imagine, we have a call center and we just receive calls. 21 00:00:46,990 --> 00:00:49,950 ‫Now let's see what it means to be scalable in that case. 22 00:00:49,950 --> 00:00:52,415 ‫So first, let's deal with vertical scalability. 23 00:00:52,415 --> 00:00:55,880 ‫In AWS, when you are vertically scalable, 24 00:00:55,880 --> 00:00:59,560 ‫that means that you can increase the size of the instance. 25 00:00:59,560 --> 00:01:03,370 ‫So for our call center, say we have a junior operator 26 00:01:03,370 --> 00:01:06,040 ‫and say we were able to upgrade that operator, 27 00:01:06,040 --> 00:01:07,670 ‫we would get a senior operator. 28 00:01:07,670 --> 00:01:09,240 ‫And for example, the senior operator 29 00:01:09,240 --> 00:01:12,530 ‫can handle a lot more calls than the junior operator 30 00:01:12,530 --> 00:01:14,130 ‫because it's more experienced. 31 00:01:14,130 --> 00:01:16,900 ‫So this would be what vertical scalability looks like 32 00:01:16,900 --> 00:01:17,733 ‫in a call center. 33 00:01:17,733 --> 00:01:19,330 ‫If we could upgrade obviously, 34 00:01:19,330 --> 00:01:21,820 ‫a junior operator into a senior operator. 35 00:01:21,820 --> 00:01:23,720 ‫So in AWS, for example, 36 00:01:23,720 --> 00:01:26,370 ‫say your application runs on the t2.micro, 37 00:01:26,370 --> 00:01:30,050 ‫and to do a vertical scalability for that application, 38 00:01:30,050 --> 00:01:33,600 ‫that means that now we run our application on a t2.large. 39 00:01:33,600 --> 00:01:37,400 ‫So we've changed the size of our EC2 instance. 40 00:01:37,400 --> 00:01:39,770 ‫And vertical scalability is very common 41 00:01:39,770 --> 00:01:42,000 ‫when you have a non-distributed system, 42 00:01:42,000 --> 00:01:43,530 ‫for example, a database. 43 00:01:43,530 --> 00:01:46,580 ‫If you want to give more performance to your database, 44 00:01:46,580 --> 00:01:50,270 ‫you would just increase the size of your database. 45 00:01:50,270 --> 00:01:52,930 ‫But usually with vertical scalability, 46 00:01:52,930 --> 00:01:55,890 ‫there is a limit to how much you can vertically scale 47 00:01:55,890 --> 00:01:58,030 ‫and that is a limit of the hardware. 48 00:01:58,030 --> 00:01:59,120 ‫Even though nowadays 49 00:01:59,120 --> 00:02:01,480 ‫these limits can be very, very, very high. 50 00:02:01,480 --> 00:02:04,850 ‫Okay, next is horizontal scalability. 51 00:02:04,850 --> 00:02:05,700 ‫So that means that 52 00:02:05,700 --> 00:02:09,230 ‫now instead of increasing the size of your EC2 instance, 53 00:02:09,230 --> 00:02:11,200 ‫you increase the number of instances 54 00:02:11,200 --> 00:02:13,240 ‫or systems for your application. 55 00:02:13,240 --> 00:02:15,710 ‫So back into our call center example, 56 00:02:15,710 --> 00:02:17,140 ‫we have an operator, 57 00:02:17,140 --> 00:02:20,110 ‫and we want to do horizontal scalability for that operator, 58 00:02:20,110 --> 00:02:22,410 ‫that means we will add another operator. 59 00:02:22,410 --> 00:02:24,660 ‫And if we need to handle more calls, 60 00:02:24,660 --> 00:02:27,360 ‫we will add another operator, and so on. 61 00:02:27,360 --> 00:02:30,900 ‫So maybe we can scale horizontally from one operator 62 00:02:30,900 --> 00:02:33,060 ‫all the way to six operators. 63 00:02:33,060 --> 00:02:34,900 ‫So when you have horizontal scaling, 64 00:02:34,900 --> 00:02:37,030 ‫that implies as you can see on the right hand side, 65 00:02:37,030 --> 00:02:39,600 ‫that you need to have a distributed system. 66 00:02:39,600 --> 00:02:41,250 ‫And for call center, that makes sense. 67 00:02:41,250 --> 00:02:43,710 ‫You don't need these people to be talking constantly. 68 00:02:43,710 --> 00:02:46,540 ‫For a call center, each of the individual operators 69 00:02:46,540 --> 00:02:48,410 ‫can take calls on their own. 70 00:02:48,410 --> 00:02:50,440 ‫In AWS, or for web applications, 71 00:02:50,440 --> 00:02:52,130 ‫so this is going to be very common, 72 00:02:52,130 --> 00:02:54,750 ‫so if you have a web application or a modern application, 73 00:02:54,750 --> 00:02:58,510 ‫you usually design it with horizontal scalability in mind. 74 00:02:58,510 --> 00:03:01,320 ‫And it's super easy on AWS to scale, 75 00:03:01,320 --> 00:03:04,090 ‫thanks to Amazon EC2 and auto scaling groups, 76 00:03:04,090 --> 00:03:06,030 ‫and we'll see this in the section. 77 00:03:06,030 --> 00:03:08,850 ‫So now let's talk about High Availability. 78 00:03:08,850 --> 00:03:12,510 ‫And it goes hand in hand with horizontal scaling. 79 00:03:12,510 --> 00:03:13,630 ‫High availability means 80 00:03:13,630 --> 00:03:15,910 ‫that you are running your application and system 81 00:03:15,910 --> 00:03:19,570 ‫in at least two availability zones on AWS. 82 00:03:19,570 --> 00:03:21,870 ‫But for our call center, what does it mean? 83 00:03:21,870 --> 00:03:24,850 ‫That means that we have a call center in New York, 84 00:03:24,850 --> 00:03:28,650 ‫and maybe a second call center in San Francisco. 85 00:03:28,650 --> 00:03:32,260 ‫And somehow, if one of these call centers is down, 86 00:03:32,260 --> 00:03:35,180 ‫for example, say there's a power outage in New York, 87 00:03:35,180 --> 00:03:37,940 ‫then we can still take calls in San Francisco. 88 00:03:37,940 --> 00:03:40,160 ‫And so we are highly available. 89 00:03:40,160 --> 00:03:42,410 ‫Obviously, San Francisco will be more busy, 90 00:03:42,410 --> 00:03:47,010 ‫but we are at least surviving the disaster of a power outage 91 00:03:47,010 --> 00:03:48,590 ‫in one of our buildings. 92 00:03:48,590 --> 00:03:52,330 ‫So in AWS you use two availability zones, obviously. 93 00:03:52,330 --> 00:03:53,480 ‫And the goal of it 94 00:03:53,480 --> 00:03:56,810 ‫is to usually survive a data center loss, a disaster. 95 00:03:56,810 --> 00:03:58,700 ‫And in AWS, it could be an earthquake, 96 00:03:58,700 --> 00:04:01,620 ‫that could be a power outage that could a lot of things. 97 00:04:01,620 --> 00:04:03,580 ‫Okay, so to summarize, 98 00:04:03,580 --> 00:04:06,240 ‫High Availability and Scalability for EC2. 99 00:04:06,240 --> 00:04:07,900 ‫If we have vertical scaling, 100 00:04:07,900 --> 00:04:09,710 ‫that means that we're increasing the instance size. 101 00:04:09,710 --> 00:04:11,500 ‫So we can scale up if we're increasing it 102 00:04:11,500 --> 00:04:13,700 ‫or scale down if you're decreasing it. 103 00:04:13,700 --> 00:04:15,150 ‫So we can go all the way 104 00:04:15,150 --> 00:04:18,800 ‫from a T2.nano of 0.5 gigabytes of RAM, 105 00:04:18,800 --> 00:04:21,050 ‫and one vCPU, all the way to, 106 00:04:21,050 --> 00:04:22,740 ‫and obviously they can change over time, 107 00:04:22,740 --> 00:04:26,660 ‫a u-12tb1.metal, which is a very scary name, 108 00:04:26,660 --> 00:04:31,473 ‫but has 12.3 terabytes of RAM, and 448 vCPUs. 109 00:04:32,640 --> 00:04:34,790 ‫That is for vertical scalability. 110 00:04:34,790 --> 00:04:35,980 ‫Now for horizontal, 111 00:04:35,980 --> 00:04:38,280 ‫this is when you increase the number of instances, 112 00:04:38,280 --> 00:04:39,660 ‫it's called scaling out, 113 00:04:39,660 --> 00:04:41,320 ‫or when you decrease the number of instances, 114 00:04:41,320 --> 00:04:42,890 ‫it's called scaling in. 115 00:04:42,890 --> 00:04:44,810 ‫And for this, we'll be using an auto scaling group 116 00:04:44,810 --> 00:04:45,643 ‫and a load balancer. 117 00:04:45,643 --> 00:04:48,560 ‫This is the topic of this section, 118 00:04:48,560 --> 00:04:51,210 ‫and then when we have high availability, 119 00:04:51,210 --> 00:04:54,290 ‫that means that we run the instances of the same application 120 00:04:54,290 --> 00:04:56,380 ‫across multiple availability zones, 121 00:04:56,380 --> 00:04:58,920 ‫and this will be again leveraged by an auto scaling group 122 00:04:58,920 --> 00:05:00,570 ‫in multi AZ mode. 123 00:05:00,570 --> 00:05:02,633 ‫And a load balancer in multi AZ. 124 00:05:03,500 --> 00:05:04,333 ‫One last word. 125 00:05:04,333 --> 00:05:05,810 ‫So the exam will as could you figure out 126 00:05:05,810 --> 00:05:09,760 ‫is this scalability, is it elasticity, is it agility? 127 00:05:09,760 --> 00:05:11,530 ‫And so I just wanna give you some formal definitions, 128 00:05:11,530 --> 00:05:12,700 ‫so to summarize. 129 00:05:12,700 --> 00:05:16,090 ‫Scalability is the ability for a system 130 00:05:16,090 --> 00:05:18,030 ‫to accommodate a larger load 131 00:05:18,030 --> 00:05:20,820 ‫by making the hardware stronger or scaling up, 132 00:05:20,820 --> 00:05:22,860 ‫or by adding nodes scaling out. 133 00:05:22,860 --> 00:05:25,530 ‫This is when your application can scale. 134 00:05:25,530 --> 00:05:29,370 ‫Now, elasticity is something a bit more cloud native. 135 00:05:29,370 --> 00:05:31,880 ‫This is once a system is actually scalable, 136 00:05:31,880 --> 00:05:34,340 ‫so you can either scale up or scale it out. 137 00:05:34,340 --> 00:05:36,720 ‫Elasticity means that there will be some sort 138 00:05:36,720 --> 00:05:37,980 ‫of auto scaling in it, 139 00:05:37,980 --> 00:05:39,570 ‫so that the system can scale 140 00:05:39,570 --> 00:05:41,710 ‫based on the load that it's receiving. 141 00:05:41,710 --> 00:05:43,730 ‫And in this case, we're going to pay per use, 142 00:05:43,730 --> 00:05:45,760 ‫we're going to match the demand we're receiving 143 00:05:45,760 --> 00:05:47,260 ‫with a number of servers, 144 00:05:47,260 --> 00:05:49,550 ‫and obviously, we're going to pay just the right amount 145 00:05:49,550 --> 00:05:51,670 ‫so we will optimize cost. 146 00:05:51,670 --> 00:05:54,648 ‫So in AWS, elasticity is a key concept. 147 00:05:54,648 --> 00:05:58,290 ‫And agility is absolutely not related to scalability 148 00:05:58,290 --> 00:06:00,400 ‫or elasticities, it is a distractor. 149 00:06:00,400 --> 00:06:02,740 ‫But just to remind you what it means, 150 00:06:02,740 --> 00:06:04,870 ‫agility means that the new IT resources 151 00:06:04,870 --> 00:06:06,070 ‫are only a click away, 152 00:06:06,070 --> 00:06:07,650 ‫which means that you can reduce the time 153 00:06:07,650 --> 00:06:10,020 ‫to make these resources available to your developers 154 00:06:10,020 --> 00:06:11,540 ‫from weeks to just minutes. 155 00:06:11,540 --> 00:06:13,610 ‫And so your organization is more agile, 156 00:06:13,610 --> 00:06:15,180 ‫it can iterate more quickly 157 00:06:15,180 --> 00:06:17,770 ‫and you are going faster, okay? 158 00:06:17,770 --> 00:06:20,280 ‫So that's it for this section on introduction. 159 00:06:20,280 --> 00:06:21,300 ‫I hope you liked it 160 00:06:21,300 --> 00:06:23,250 ‫and I will see you in the next lecture.