1 00:00:00,260 --> 00:00:01,850 Now let's discuss CloudWatch Alarms. 2 00:00:01,850 --> 00:00:02,990 So alarms, as we know, 3 00:00:02,990 --> 00:00:06,060 are used to trigger notifications for any metric. 4 00:00:06,060 --> 00:00:07,990 And you can define complex alarms 5 00:00:07,990 --> 00:00:10,210 based on various options such as sampling, 6 00:00:10,210 --> 00:00:13,200 or a percentage, or max, min and so on. 7 00:00:13,200 --> 00:00:14,480 An alarm has three states: 8 00:00:14,480 --> 00:00:15,846 OK means that it's not triggered, 9 00:00:15,846 --> 00:00:18,230 INSUFFICIENT_DATA means that there's not enough data 10 00:00:18,230 --> 00:00:20,300 for the alarm to determine a state, 11 00:00:20,300 --> 00:00:23,630 and ALARM, which means that your threshold has been breached 12 00:00:23,630 --> 00:00:26,190 and therefore a notification will be sent. 13 00:00:26,190 --> 00:00:27,970 The period is the length of time 14 00:00:27,970 --> 00:00:29,750 over which the alarm evaluates the metric, 15 00:00:29,750 --> 00:00:32,750 and so it could be very, very short or very, very long, 16 00:00:32,750 --> 00:00:35,780 and it can also apply to high resolution custom metrics, 17 00:00:35,780 --> 00:00:37,210 for example, 10 seconds, 30 seconds, 18 00:00:37,210 --> 00:00:39,509 or a multiple of 60 seconds. 19 00:00:39,509 --> 00:00:41,924 Now, alarms have three main targets. 20 00:00:41,924 --> 00:00:45,210 The first one is actions on EC2 Instances, 21 00:00:45,210 --> 00:00:48,110 such as stopping, terminating, rebooting, 22 00:00:48,110 --> 00:00:50,010 or recovering an instance. 23 00:00:50,010 --> 00:00:52,450 The second one is to trigger an Auto Scaling action, 24 00:00:52,450 --> 00:00:54,844 for example, a scale out or a scale in. 
25 00:00:54,844 --> 00:00:57,330 And the last one is to send a notification 26 00:00:57,330 --> 00:00:59,366 to the SNS service, for example, 27 00:00:59,366 --> 00:01:02,060 and from the SNS service we can hook it 28 00:01:02,060 --> 00:01:04,590 to a Lambda function and have the Lambda function 29 00:01:04,590 --> 00:01:06,130 do pretty much anything we want based 30 00:01:06,130 --> 00:01:08,430 on an alarm being breached. 31 00:01:08,430 --> 00:01:10,040 So, let's talk about EC2 Instance Recovery. 32 00:01:10,040 --> 00:01:10,873 We've already seen it, 33 00:01:10,873 --> 00:01:13,410 but there's an instance status check to check the EC2 VM 34 00:01:13,410 --> 00:01:14,840 and the system status check 35 00:01:14,840 --> 00:01:16,694 to check the underlying hardware. 36 00:01:16,694 --> 00:01:18,690 And you can define a CloudWatch Alarm 37 00:01:18,690 --> 00:01:19,960 on both of these checks. 38 00:01:19,960 --> 00:01:22,592 Okay, so you will monitor a specific EC2 Instance 39 00:01:22,592 --> 00:01:25,232 and in case the alarm is breached, 40 00:01:25,232 --> 00:01:28,550 then you can start an EC2 Instance Recovery 41 00:01:28,550 --> 00:01:29,530 to make sure, for example, 42 00:01:29,530 --> 00:01:31,130 that you move your EC2 Instance 43 00:01:31,130 --> 00:01:32,610 from one host to another. 44 00:01:32,610 --> 00:01:34,030 When you do a recovery, 45 00:01:34,030 --> 00:01:35,730 you get the same private, public, 46 00:01:35,730 --> 00:01:37,480 and elastic IP, the same metadata, 47 00:01:37,480 --> 00:01:39,104 and the same placement group for your instance. 48 00:01:39,104 --> 00:01:43,060 And you can also send an alert to your SNS Topic 49 00:01:43,060 --> 00:01:46,713 to get notified that the EC2 Instance was recovered. 50 00:01:47,680 --> 00:01:49,110 Now the CloudWatch Alarm has some good stuff. 
51 00:01:49,110 --> 00:01:49,943 So, you know, first of all, 52 00:01:49,943 --> 00:01:51,010 as we've seen, 53 00:01:51,010 --> 00:01:52,540 we can create an alarm on top 54 00:01:52,540 --> 00:01:54,020 of a CloudWatch Logs Metric Filter. 55 00:01:54,020 --> 00:01:54,853 So remember, 56 00:01:54,853 --> 00:01:57,210 CloudWatch Logs has a metric filter, 57 00:01:57,210 --> 00:01:58,920 which is hooked to a CloudWatch Alarm, 58 00:01:58,920 --> 00:02:01,268 and then when we receive too many instances 59 00:02:01,268 --> 00:02:03,574 of a specific word, for example, the word error, 60 00:02:03,574 --> 00:02:08,479 the alarm triggers and sends a message to Amazon SNS. 61 00:02:08,479 --> 00:02:10,530 And so if you want to test alarm notifications, 62 00:02:10,530 --> 00:02:13,618 you can use a CLI call named set-alarm-state. 63 00:02:13,618 --> 00:02:16,360 And this is helpful when you want to trigger an alarm 64 00:02:16,360 --> 00:02:19,154 even though it didn't reach a specific threshold, 65 00:02:19,154 --> 00:02:22,340 because you want to see whether or not the alarm being 66 00:02:22,340 --> 00:02:24,680 triggered results in the correct action 67 00:02:24,680 --> 00:02:26,900 for your infrastructure. 68 00:02:26,900 --> 00:02:27,733 So that's it for alarms. 69 00:02:27,733 --> 00:02:28,566 I hope you liked it, 70 00:02:28,566 --> 00:02:31,023 and I will see you in the next lecture for some practice.
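As a companion to the lecture, here is a minimal sketch of two ideas it describes: an alarm on the system status check that triggers EC2 Instance Recovery, and the set-alarm-state CLI call used to test alarm actions. The helper names, the alarm name, the instance ID, the region, and the threshold settings are all hypothetical choices made for illustration. The code only builds the request parameters and the command line; it does not contact AWS. The dictionary keys are assumed to match the CloudWatch PutMetricAlarm API, and the CLI flags are assumed to match `aws cloudwatch set-alarm-state`.

```python
import shlex

VALID_STATES = {"OK", "ALARM", "INSUFFICIENT_DATA"}


def recovery_alarm_params(instance_id, region, topic_arn=None):
    """Build PutMetricAlarm-style parameters for an instance recovery alarm.

    Hypothetical helper: nothing is sent to AWS, the dict is just returned.
    """
    # The recover action is addressed by a per-region "automate" ARN.
    actions = [f"arn:aws:automate:{region}:ec2:recover"]
    if topic_arn:
        # Optionally notify an SNS topic as well (ARN supplied by the caller).
        actions.append(topic_arn)
    return {
        "AlarmName": f"recover-{instance_id}",  # illustrative naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",  # underlying hardware check
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,  # seconds; assumed value
        "EvaluationPeriods": 2,  # assumed value
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": actions,
    }


def set_alarm_state_command(alarm_name, state="ALARM",
                            reason="Testing alarm actions"):
    """Return the argument list for `aws cloudwatch set-alarm-state`."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {sorted(VALID_STATES)}")
    return [
        "aws", "cloudwatch", "set-alarm-state",
        "--alarm-name", alarm_name,
        "--state-value", state,
        "--state-reason", reason,
    ]


if __name__ == "__main__":
    params = recovery_alarm_params("i-0123456789abcdef0", "us-east-1")
    print(params["AlarmActions"])
    # Print a copy-pasteable command line for a manual test.
    print(shlex.join(set_alarm_state_command(params["AlarmName"])))
```

Forcing the state this way is temporary: the alarm goes back to its real state the next time it is evaluated, which is why it suits a quick check that the configured action actually fires.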