1 00:00:00,150 --> 00:00:02,040 So now let's talk about Placement Groups. 2 00:00:02,040 --> 00:00:03,830 Placement groups are a little bit more advanced 3 00:00:03,830 --> 00:00:06,410 and we want to use them when we want to have control 4 00:00:06,410 --> 00:00:09,800 over how our EC2 instances are going to be placed 5 00:00:09,800 --> 00:00:12,480 within the AWS infrastructure. 6 00:00:12,480 --> 00:00:14,640 So that strategy can be defined using 7 00:00:14,640 --> 00:00:15,970 these placement groups. 8 00:00:15,970 --> 00:00:19,920 So we don't get direct interaction with the hardware of AWS, 9 00:00:19,920 --> 00:00:23,440 but we let AWS know how we would like our EC2 instances 10 00:00:23,440 --> 00:00:25,860 to be placed relative to one another. 11 00:00:25,860 --> 00:00:27,720 So when you create a placement group, 12 00:00:27,720 --> 00:00:30,470 you have three strategies available to you. 13 00:00:30,470 --> 00:00:32,020 You have the cluster placement group 14 00:00:32,020 --> 00:00:34,900 in which your instances will be grouped together 15 00:00:34,900 --> 00:00:37,730 in a low-latency hardware setup 16 00:00:37,730 --> 00:00:39,810 within a single availability zone. 17 00:00:39,810 --> 00:00:42,240 This is going to give you high performance, but high risk. 18 00:00:42,240 --> 00:00:44,160 We'll see this in detail in a second. 19 00:00:44,160 --> 00:00:46,370 Spread means that your instances 20 00:00:46,370 --> 00:00:49,550 are going to be spread across different hardware. 21 00:00:49,550 --> 00:00:50,990 And there is a restriction on this, 22 00:00:50,990 --> 00:00:53,640 which means you can only have seven EC2 instances 23 00:00:53,640 --> 00:00:56,840 per AZ per spread placement group. 24 00:00:56,840 --> 00:00:59,230 So you would use a spread placement group 25 00:00:59,230 --> 00:01:01,620 when you have critical applications.
26 00:01:01,620 --> 00:01:04,010 Finally, the last one is a new kind of placement group 27 00:01:04,010 --> 00:01:06,480 that is really helpful, it's called partition. 28 00:01:06,480 --> 00:01:08,380 It's similar to spread, 29 00:01:08,380 --> 00:01:10,770 meaning that you want to spread your instances 30 00:01:10,770 --> 00:01:13,600 but here they're spread across many different partitions. 31 00:01:13,600 --> 00:01:15,700 And these partitions rely on different 32 00:01:15,700 --> 00:01:19,070 sets of racks of hardware within an AZ. 33 00:01:19,070 --> 00:01:21,110 What that means is that they're still spread, 34 00:01:21,110 --> 00:01:23,950 but instances in the same partition are not isolated from each other's failure, 35 00:01:23,950 --> 00:01:25,710 whereas a partition should be isolated 36 00:01:25,710 --> 00:01:27,790 from the failure of another partition. 37 00:01:27,790 --> 00:01:29,310 The idea with this is that you can scale 38 00:01:29,310 --> 00:01:31,940 to hundreds of EC2 instances per group 39 00:01:31,940 --> 00:01:34,010 and that allows you to run applications 40 00:01:34,010 --> 00:01:37,010 such as Hadoop, Cassandra, or Kafka. 41 00:01:37,010 --> 00:01:38,350 Now let's have a look at 42 00:01:38,350 --> 00:01:40,450 each of these placement groups in detail. 43 00:01:41,920 --> 00:01:45,440 For cluster, that means that all our EC2 instances 44 00:01:45,440 --> 00:01:48,380 are on the same rack, which means same hardware 45 00:01:48,380 --> 00:01:50,780 and in the same availability zone. 46 00:01:50,780 --> 00:01:52,730 So as you can see, all these instances 47 00:01:52,730 --> 00:01:54,340 are on the same hardware. 48 00:01:54,340 --> 00:01:55,890 And so why would we do this?
49 00:01:55,890 --> 00:01:58,290 Well basically, we would place them on the same rack 50 00:01:58,290 --> 00:02:00,520 because we want to have a cluster, 51 00:02:00,520 --> 00:02:02,260 we want to have super low latency, 52 00:02:02,260 --> 00:02:05,850 and we want to have maybe a 10 gigabit per second network. 53 00:02:05,850 --> 00:02:08,160 So that means that we have an amazing network, right? 54 00:02:08,160 --> 00:02:12,370 But as a drawback of this great network that we get, 55 00:02:12,370 --> 00:02:15,040 we get the con that if the rack fails, 56 00:02:15,040 --> 00:02:17,020 if there's a failure on the hardware 57 00:02:17,020 --> 00:02:20,930 then all the EC2 instances will fail at the same time. 58 00:02:20,930 --> 00:02:23,200 So we have increased our risk 59 00:02:23,200 --> 00:02:25,590 of having a failure that's going to be propagated 60 00:02:25,590 --> 00:02:27,210 across our entire stack. 61 00:02:27,210 --> 00:02:28,540 So when would we even use this? 62 00:02:28,540 --> 00:02:32,030 What's the benefit you're getting for this increased risk? 63 00:02:32,030 --> 00:02:33,450 Well, we get a great network. 64 00:02:33,450 --> 00:02:35,120 And so for this, that means that we can have 65 00:02:35,120 --> 00:02:38,250 a big data job that needs to complete very fast. 66 00:02:38,250 --> 00:02:40,010 Or maybe we have a requirement 67 00:02:40,010 --> 00:02:42,180 to have an application that needs extremely low latency 68 00:02:42,180 --> 00:02:44,170 and high network throughput, 69 00:02:44,170 --> 00:02:48,117 and we're willing to take on the risk of having this failure.
70 00:02:48,117 --> 00:02:50,060 And so this is something you have to realize, 71 00:02:50,060 --> 00:02:51,980 it's not for every kind of application 72 00:02:51,980 --> 00:02:53,050 but if your application needs 73 00:02:53,050 --> 00:02:54,920 super high bandwidth and low latency, 74 00:02:54,920 --> 00:02:56,560 then a placement group, specifically 75 00:02:56,560 --> 00:02:57,450 the cluster placement group, 76 00:02:57,450 --> 00:02:59,810 is kind of a nice way of doing it. 77 00:02:59,810 --> 00:03:01,620 Now spread is the complete opposite. 78 00:03:01,620 --> 00:03:04,260 In spread we want to minimize the failure risk. 79 00:03:04,260 --> 00:03:05,500 And so in this case, 80 00:03:05,500 --> 00:03:07,810 when we ask for a spread placement group, 81 00:03:07,810 --> 00:03:10,930 all the EC2 instances are going to be located 82 00:03:10,930 --> 00:03:12,330 on different hardware. 83 00:03:12,330 --> 00:03:13,180 So as you can see here, 84 00:03:13,180 --> 00:03:16,050 we have three AZs and we have six EC2 instances 85 00:03:16,050 --> 00:03:19,520 and each EC2 instance is on different hardware. 86 00:03:19,520 --> 00:03:20,610 So what does that mean? 87 00:03:20,610 --> 00:03:23,232 Well, what we get is that we can span across multiple AZs 88 00:03:23,232 --> 00:03:26,910 and there is a reduced risk of simultaneous failure. 89 00:03:26,910 --> 00:03:30,180 Why? Well, because if my hardware one fails, 90 00:03:30,180 --> 00:03:32,520 I'm pretty sure my hardware two will not fail. 91 00:03:32,520 --> 00:03:35,640 And so we've reduced the risk of my two instances 92 00:03:35,640 --> 00:03:38,920 in us-east-1a failing at the same time. 93 00:03:38,920 --> 00:03:40,600 And so that's the benefit from it. 94 00:03:40,600 --> 00:03:43,500 The con is that with this configuration, 95 00:03:43,500 --> 00:03:46,130 we're limited to seven instances per AZ, 96 00:03:46,130 --> 00:03:47,300 per placement group.
97 00:03:47,300 --> 00:03:49,872 So there's a limit to how big your placement group can be. 98 00:03:49,872 --> 00:03:51,910 And so you need to have an application 99 00:03:51,910 --> 00:03:55,030 that's going to be of good size, but not too big. 100 00:03:55,030 --> 00:03:57,529 The use case would be an application 101 00:03:57,529 --> 00:03:58,950 that needs to maximize high availability 102 00:03:58,950 --> 00:04:00,050 and reduce the risk. 103 00:04:00,050 --> 00:04:02,740 And in general, for critical applications 104 00:04:02,740 --> 00:04:04,330 where your instance failures 105 00:04:04,330 --> 00:04:06,820 must be isolated from one another. 106 00:04:06,820 --> 00:04:10,130 Remember here, we have a limitation of seven instances 107 00:04:10,130 --> 00:04:12,023 per AZ per placement group. 108 00:04:12,910 --> 00:04:15,090 Now for the partition placement group, 109 00:04:15,090 --> 00:04:18,170 we can have instances spread across partitions 110 00:04:18,170 --> 00:04:20,329 in multiple availability zones. 111 00:04:20,329 --> 00:04:22,800 So we can have up to seven partitions per AZ. 112 00:04:22,800 --> 00:04:24,340 So in this example, we have partition one 113 00:04:24,340 --> 00:04:27,560 and partition two in us-east-1a, 114 00:04:27,560 --> 00:04:30,720 and partition three in us-east-1b. 115 00:04:30,720 --> 00:04:33,720 And on each partition, you could have many EC2 instances. 116 00:04:33,720 --> 00:04:35,160 So in the first one, I have four 117 00:04:35,160 --> 00:04:36,210 and in the second one I have four 118 00:04:36,210 --> 00:04:38,420 and in the third one I have four as well. 119 00:04:38,420 --> 00:04:40,410 So why do we use a partition placement group? 120 00:04:40,410 --> 00:04:44,170 Well, each partition represents a rack in AWS.
121 00:04:44,170 --> 00:04:45,500 And so by having many partitions 122 00:04:45,500 --> 00:04:47,530 you're making sure that your instances 123 00:04:47,530 --> 00:04:50,180 are distributed across many hardware racks, 124 00:04:50,180 --> 00:04:51,200 and so therefore, 125 00:04:51,200 --> 00:04:54,400 they're safe from one another's rack failures. 126 00:04:54,400 --> 00:04:56,872 So you can have up to seven partitions per AZ 127 00:04:56,872 --> 00:04:59,420 and these partitions can span 128 00:04:59,420 --> 00:05:03,600 across multiple availability zones in the same region. 129 00:05:03,600 --> 00:05:07,080 You can get up to hundreds of EC2 instances with this setup. 130 00:05:07,080 --> 00:05:08,190 So this is the difference 131 00:05:08,190 --> 00:05:11,000 versus the spread type of placement group. 132 00:05:11,000 --> 00:05:13,443 And as I said, the instances in a partition 133 00:05:13,443 --> 00:05:16,482 do not share the same physical hardware rack 134 00:05:16,482 --> 00:05:19,240 with the instances in the other partitions, 135 00:05:19,240 --> 00:05:23,760 and therefore, each partition is isolated from failures in the others. 136 00:05:23,760 --> 00:05:25,570 So that means that if one goes down, 137 00:05:25,570 --> 00:05:28,230 if a partition goes down, say partition number two, 138 00:05:28,230 --> 00:05:30,760 then partition number one should be fine. 139 00:05:30,760 --> 00:05:35,122 And to know which partition these EC2 instances are on, 140 00:05:35,122 --> 00:05:38,372 there is an option to access this information 141 00:05:38,372 --> 00:05:41,472 using the metadata service. 142 00:05:41,472 --> 00:05:44,490 So when would you use a partition placement group? 143 00:05:44,490 --> 00:05:45,960 Well, when you have an application 144 00:05:45,960 --> 00:05:48,940 that can be partition aware, to distribute the data 145 00:05:48,940 --> 00:05:50,950 and your servers across partitions.
146 00:05:50,950 --> 00:05:53,150 And usually, the use cases are going to be 147 00:05:53,150 --> 00:05:56,780 big data applications, which are partition aware, 148 00:05:56,780 --> 00:06:01,780 such as HDFS, HBase, Cassandra and Apache Kafka.
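
To make the limits and the partition-aware idea from this lecture concrete, here is a minimal Python sketch. It is an illustration under assumptions, not an AWS implementation: the function names (spread_capacity, placement_group_request, assign_replicas, surviving_replicas) are hypothetical helpers, the two limits are the "seven per AZ" figures quoted in the lecture, and the request keys are only meant to mirror the GroupName, Strategy and PartitionCount parameters of boto3's create_placement_group call, which should be checked against the current documentation. Nothing here talks to AWS.

```python
# Limits as quoted in the lecture (assumed here, verify against AWS docs).
SPREAD_MAX_INSTANCES_PER_AZ = 7
PARTITION_MAX_PER_AZ = 7
STRATEGIES = ("cluster", "spread", "partition")


def spread_capacity(az_count):
    """Largest spread placement group that fits in az_count availability zones."""
    if az_count < 1:
        raise ValueError("az_count must be at least 1")
    return SPREAD_MAX_INSTANCES_PER_AZ * az_count


def placement_group_request(name, strategy, partition_count=None):
    """Build a dict shaped like the keyword arguments of a create call.

    Hypothetical helper: it only validates and returns the parameters.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    request = {"GroupName": name, "Strategy": strategy}
    if strategy == "partition" and partition_count is not None:
        if not 1 <= partition_count <= PARTITION_MAX_PER_AZ:
            raise ValueError("partition_count must be between 1 and 7")
        request["PartitionCount"] = partition_count
    return request


def assign_replicas(shards, replicas_per_shard, partition_count):
    """Round-robin each shard's replicas onto distinct partitions.

    Partitions are numbered from 1. Replicas of one shard never share a
    partition, so one rack failure costs each shard at most one replica.
    """
    if replicas_per_shard > partition_count:
        raise ValueError("not enough partitions to isolate every replica")
    layout = {}
    for index, shard in enumerate(shards):
        layout[shard] = [
            (index + replica) % partition_count + 1
            for replica in range(replicas_per_shard)
        ]
    return layout


def surviving_replicas(layout, failed_partition):
    """Partitions still holding a replica of each shard after one partition fails."""
    return {
        shard: [p for p in partitions if p != failed_partition]
        for shard, partitions in layout.items()
    }


if __name__ == "__main__":
    print(spread_capacity(3))
    print(placement_group_request("demo-group", "partition", 3))
    layout = assign_replicas(["shard-a", "shard-b"], 3, 3)
    print(layout)
    print(surviving_replicas(layout, 2))
```

The round-robin layout is one simple choice among many. Partition-aware systems such as the ones named in the lecture use their own rack-awareness settings, and on a real instance the partition number would come from the metadata service rather than from a computed index.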