1
00:00:00,150 --> 00:00:02,040
So now let's talk about Placement Groups.

2
00:00:02,040 --> 00:00:03,830
Placement groups are a little bit more advanced

3
00:00:03,830 --> 00:00:06,410
and we want to use them when we want to have control

4
00:00:06,410 --> 00:00:09,800
over how our EC2 instances are going to be placed

5
00:00:09,800 --> 00:00:12,480
within the AWS infrastructure.

6
00:00:12,480 --> 00:00:14,640
So that strategy can be defined using

7
00:00:14,640 --> 00:00:15,970
these placement groups.

8
00:00:15,970 --> 00:00:19,920
So we don't get direct interaction with the hardware of AWS,

9
00:00:19,920 --> 00:00:23,440
but we let AWS know how we would like our EC2 instances

10
00:00:23,440 --> 00:00:25,860
to be placed relative to one another.

11
00:00:25,860 --> 00:00:27,720
So when you create a placement group,

12
00:00:27,720 --> 00:00:30,470
you have three strategies available for you.

13
00:00:30,470 --> 00:00:32,020
You have the cluster placement group

14
00:00:32,020 --> 00:00:34,900
in which your instances will be grouped together

15
00:00:34,900 --> 00:00:37,730
in a low-latency hardware setup

16
00:00:37,730 --> 00:00:39,810
within a single availability zone.

17
00:00:39,810 --> 00:00:42,240
This is going to give you high performance, but high risk.

18
00:00:42,240 --> 00:00:44,160
We'll see this in detail in a second.

19
00:00:44,160 --> 00:00:46,370
Spread means that your instances

20
00:00:46,370 --> 00:00:49,550
are going to be spread across different hardware.

21
00:00:49,550 --> 00:00:50,990
And there is a restriction on this,

22
00:00:50,990 --> 00:00:53,640
which means you can only have seven EC2 instances

23
00:00:53,640 --> 00:00:56,840
per AZ per spread placement group.

24
00:00:56,840 --> 00:00:59,230
So you would use a spread placement group

25
00:00:59,230 --> 00:01:01,620
when you have critical applications.

26
00:01:01,620 --> 00:01:04,010
Finally, the last one is a new kind of placement group

27
00:01:04,010 --> 00:01:06,480
that is really helpful, it's called partition.

28
00:01:06,480 --> 00:01:08,380
It's similar to the spread,

29
00:01:08,380 --> 00:01:10,770
meaning that you want to spread your instances

30
00:01:10,770 --> 00:01:13,600
but here they're spread across many different partitions.

31
00:01:13,600 --> 00:01:15,700
And these partitions rely on different

32
00:01:15,700 --> 00:01:19,070
sets of racks of hardware within an AZ.

33
00:01:19,070 --> 00:01:21,110
What that means is that the instances are still spread,

34
00:01:21,110 --> 00:01:23,950
but they're not all isolated from one another's failures,

35
00:01:23,950 --> 00:01:25,710
but a partition should be isolated

36
00:01:25,710 --> 00:01:27,790
from the failure of another partition.

37
00:01:27,790 --> 00:01:29,310
The idea with this is that you can scale

38
00:01:29,310 --> 00:01:31,940
to hundreds of EC2 instances per group

39
00:01:31,940 --> 00:01:34,010
and that allows you to run applications

40
00:01:34,010 --> 00:01:37,010
such as Hadoop, Cassandra, or Kafka.

41
00:01:37,010 --> 00:01:38,350
Now let's have a look into

42
00:01:38,350 --> 00:01:40,450
each of these placement groups in detail.

43
00:01:41,920 --> 00:01:45,440
For cluster, that means that all our EC2 instances

44
00:01:45,440 --> 00:01:48,380
are on the same rack, which means same hardware

45
00:01:48,380 --> 00:01:50,780
and it's in the same availability zone.

46
00:01:50,780 --> 00:01:52,730
So as you can see, all these instances

47
00:01:52,730 --> 00:01:54,340
are on the same hardware.

48
00:01:54,340 --> 00:01:55,890
And so why would we do this?

49
00:01:55,890 --> 00:01:58,290
Well basically, we would place them on the same rack

50
00:01:58,290 --> 00:02:00,520
because we want to have a cluster,

51
00:02:00,520 --> 00:02:02,260
we want to have super low latency,

52
00:02:02,260 --> 00:02:05,850
and we want to have maybe a 10 gigabit per second network.

53
00:02:05,850 --> 00:02:08,160
So that means that we have an amazing network, right?

54
00:02:08,160 --> 00:02:12,370
But the drawback of this great network that we get

55
00:02:12,370 --> 00:02:15,040
is that if the rack fails,

56
00:02:15,040 --> 00:02:17,020
if there's a failure on the hardware

57
00:02:17,020 --> 00:02:20,930
then all the EC2 instances will fail at the same time.

58
00:02:20,930 --> 00:02:23,200
So we have increased our risk

59
00:02:23,200 --> 00:02:25,590
of having a failure that's gonna be propagated

60
00:02:25,590 --> 00:02:27,210
across our entire stack.

61
00:02:27,210 --> 00:02:28,540
So when would we even use this?

62
00:02:28,540 --> 00:02:32,030
What's the benefit you're getting for this increased risk?

63
00:02:32,030 --> 00:02:33,450
Well, we get a great network.

64
00:02:33,450 --> 00:02:35,120
And so for this, that means that we can have

65
00:02:35,120 --> 00:02:38,250
a big data job that will need to complete very fast.

66
00:02:38,250 --> 00:02:40,010
Or maybe we have a requirement

67
00:02:40,010 --> 00:02:42,180
to have an application that needs extremely low latency

68
00:02:42,180 --> 00:02:44,170
and high network throughput,

69
00:02:44,170 --> 00:02:48,117
and we're willing to take on the risk of this failure.

70
00:02:48,117 --> 00:02:50,060
And so this is something you have to realize,

71
00:02:50,060 --> 00:02:51,980
it's not for every kind of application

72
00:02:51,980 --> 00:02:53,050
but if your application needs

73
00:02:53,050 --> 00:02:54,920
super high bandwidth and low latency,

74
00:02:54,920 --> 00:02:56,560
a placement group, specifically

75
00:02:56,560 --> 00:02:57,450
the cluster placement group,

76
00:02:57,450 --> 00:02:59,810
is a nice way of doing it.

77
00:02:59,810 --> 00:03:01,620
Now spread is the complete opposite.

78
00:03:01,620 --> 00:03:04,260
In spread we want to minimize the failure risk.

79
00:03:04,260 --> 00:03:05,500
And so in this case,

80
00:03:05,500 --> 00:03:07,810
when we ask for a spread placement group,

81
00:03:07,810 --> 00:03:10,930
all the EC2 instances are going to be located

82
00:03:10,930 --> 00:03:12,330
on different hardware.

83
00:03:12,330 --> 00:03:13,180
So as you can see here,

84
00:03:13,180 --> 00:03:16,050
we have three AZs and six EC2 instances,

85
00:03:16,050 --> 00:03:19,520
and each EC2 instance is on different hardware.

86
00:03:19,520 --> 00:03:20,610
So what does that mean?

87
00:03:20,610 --> 00:03:23,232
Well, what we get is that we can span across multiple AZs

88
00:03:23,232 --> 00:03:26,910
and there is a reduced risk of simultaneous failure.

89
00:03:26,910 --> 00:03:30,180
Why? Well because if my hardware one fails,

90
00:03:30,180 --> 00:03:32,520
I'm pretty sure my hardware two will not fail.

91
00:03:32,520 --> 00:03:35,640
And so we've separated the risk of my two instances

92
00:03:35,640 --> 00:03:38,920
in us-east-1a failing at the same time.

93
00:03:38,920 --> 00:03:40,600
And so that's the benefit of it.

94
00:03:40,600 --> 00:03:43,500
The con is that with this configuration,

95
00:03:43,500 --> 00:03:46,130
we're limited to seven instances per AZ,

96
00:03:46,130 --> 00:03:47,300
per placement group.

97
00:03:47,300 --> 00:03:49,872
So there's a limit to how big your placement group can be.

98
00:03:49,872 --> 00:03:51,910
And so you need to have an application

99
00:03:51,910 --> 00:03:55,030
that's gonna be of good size, but not too big.

100
00:03:55,030 --> 00:03:57,529
The use case would be an application

101
00:03:57,529 --> 00:03:58,950
that needs to maximize availability

102
00:03:58,950 --> 00:04:00,050
and reduce the risk.

103
00:04:00,050 --> 00:04:02,740
And in general, for critical applications

104
00:04:02,740 --> 00:04:04,330
where your instance failures

105
00:04:04,330 --> 00:04:06,820
must be isolated from one another.

106
00:04:06,820 --> 00:04:10,130
Remember here, we have a limitation of seven instances

107
00:04:10,130 --> 00:04:12,023
per AZ per placement group.

108
00:04:12,910 --> 00:04:15,090
Now for the partition placement group,

109
00:04:15,090 --> 00:04:18,170
we can have instances spread across partitions

110
00:04:18,170 --> 00:04:20,329
in multiple availability zones.

111
00:04:20,329 --> 00:04:22,800
So we can have up to seven partitions per AZ.

112
00:04:22,800 --> 00:04:24,340
So in this example, we have partition one

113
00:04:24,340 --> 00:04:27,560
and partition two in us-east-1a,

114
00:04:27,560 --> 00:04:30,720
and partition three in us-east-1b.

115
00:04:30,720 --> 00:04:33,720
And on each partition, you could have many EC2 instances.

116
00:04:33,720 --> 00:04:35,160
So in the first one, I have four

117
00:04:35,160 --> 00:04:36,210
and in the second one I have four

118
00:04:36,210 --> 00:04:38,420
and in the third one I have four as well.

119
00:04:38,420 --> 00:04:40,410
So why do we use a partition placement group?

120
00:04:40,410 --> 00:04:44,170
Well, each partition represents a rack in AWS.

121
00:04:44,170 --> 00:04:45,500
And so by having many partitions

122
00:04:45,500 --> 00:04:47,530
you're making sure that your instances

123
00:04:47,530 --> 00:04:50,180
are distributed across many hardware racks,

124
00:04:50,180 --> 00:04:51,200
and so therefore,

125
00:04:51,200 --> 00:04:54,400
they're isolated from each other's rack failures.

126
00:04:54,400 --> 00:04:56,872
So you can have up to seven partitions per AZ

127
00:04:56,872 --> 00:04:59,420
and these partitions can span

128
00:04:59,420 --> 00:05:03,600
across multiple availability zones in the same region.

129
00:05:03,600 --> 00:05:07,080
You can get up to hundreds of EC2 instances with this setup.

130
00:05:07,080 --> 00:05:08,190
So this is the difference

131
00:05:08,190 --> 00:05:11,000
versus the spread type of placement group.

132
00:05:11,000 --> 00:05:13,443
And as I said, the instances in a partition

133
00:05:13,443 --> 00:05:16,482
do not share the same physical hardware rack

134
00:05:16,482 --> 00:05:19,240
with the instances in the other partitions,

135
00:05:19,240 --> 00:05:23,760
and therefore, each partition is isolated from failure.

136
00:05:23,760 --> 00:05:25,570
So that means that if one partition goes down,

137
00:05:25,570 --> 00:05:28,230
say partition number two,

138
00:05:28,230 --> 00:05:30,760
then partition number one should be fine.

139
00:05:30,760 --> 00:05:35,122
And to know which partition these EC2 instances are on,

140
00:05:35,122 --> 00:05:38,372
there is an option to access this information

141
00:05:38,372 --> 00:05:41,472
by using the metadata service.

142
00:05:41,472 --> 00:05:44,490
So when would you use a partition placement group?

143
00:05:44,490 --> 00:05:45,960
Well, when you have an application

144
00:05:45,960 --> 00:05:48,940
that can be partition aware and distribute the data

145
00:05:48,940 --> 00:05:50,950
and your servers across partitions.

146
00:05:50,950 --> 00:05:53,150
And usually, the use cases are going to be

147
00:05:53,150 --> 00:05:56,780
big data applications, which are partition aware,

148
00:05:56,780 --> 00:06:01,780
such as HDFS, HBase, Cassandra, and Apache Kafka.