1 00:00:00,800 --> 00:00:04,810 So we left off the last video calculating our model's false positive rate, 2 00:00:04,890 --> 00:00:09,330 true positive rate and thresholds using the roc_curve function. 3 00:00:09,330 --> 00:00:13,390 Now we got back the false positive rate, but it's in an array of numbers. 4 00:00:13,410 --> 00:00:19,080 So looking at this again doesn't really make much sense, but plotting it and seeing the ROC curve, the 5 00:00:19,140 --> 00:00:21,560 actual ROC curve, will make a bit more sense. 6 00:00:21,570 --> 00:00:28,400 What we're going to do is create a function for plotting ROC curves. 7 00:00:28,410 --> 00:00:34,050 Now in your practice, when you need to make multiple plots, as you saw in the matplotlib section, you 8 00:00:34,050 --> 00:00:38,990 might want to create a function that does plots for you if you want to make more of the same plot. 9 00:00:39,150 --> 00:00:40,230 So that's what we're going to do here. 10 00:00:40,230 --> 00:00:45,340 Just see an example of how we might create a ROC curve plotting function. 11 00:00:45,360 --> 00:00:49,000 So first of all, we'll import matplotlib.pyplot. 12 00:00:50,860 --> 00:00:53,630 We're going to go as plt. Beautiful. 13 00:00:53,690 --> 00:00:58,170 I'm gonna call it, nice and simple, plot_roc_curve. 14 00:00:58,170 --> 00:01:03,910 It's gonna take the false positive rate and the true positive rate. Now we're going to come here. 15 00:01:03,930 --> 00:01:08,990 We'll leave a nice docstring so it makes a bit of communicative sense. 16 00:01:09,000 --> 00:01:26,350 Plots a ROC curve given the false positive rate (fpr) and true positive rate (tpr) of a model. 17 00:01:26,350 --> 00:01:28,160 Wonderful, nice and simple. 18 00:01:28,300 --> 00:01:37,820 Then we go: plot ROC curve. plt.plot, we're gonna pass it the false positive rate and the true positive 19 00:01:37,820 --> 00:01:42,820 rate and we'll give it a color. 
20 00:01:43,100 --> 00:01:51,820 Remember, these libraries take American spelling here. Orange, and then label equals ROC 21 00:01:51,970 --> 00:01:58,040 for the curve, and then we're gonna go: plot line with no predictive power. 22 00:01:58,040 --> 00:02:01,790 This is gonna be a baseline. You'll see it when the plot comes out, it'll make a bit more sense. That way 23 00:02:01,790 --> 00:02:08,450 we can compare our model to some other arbitrary model which just predicts nothing. 24 00:02:09,750 --> 00:02:11,470 0, 1. 25 00:02:11,500 --> 00:02:20,470 Wonderful, and the color is gonna be dark blue. Line style, we'll keep it dotted. 26 00:02:20,470 --> 00:02:25,750 Actually, this is only so it looks different to our ROC curve, and the label is Guessing. 27 00:02:25,750 --> 00:02:30,460 So this line you'll see when we plot is basically like our model was just guessing. Then we're 28 00:02:30,460 --> 00:02:32,780 gonna go customize the plot. 29 00:02:34,150 --> 00:02:37,400 plt.xlabel. 30 00:02:37,870 --> 00:02:45,650 This is going to be false positive rate (fpr), and then we're gonna go plt. 31 00:02:45,720 --> 00:02:46,170 ylabel. 32 00:02:46,170 --> 00:02:47,760 Well, it might not be hard to guess what this is. 33 00:02:47,830 --> 00:02:49,810 This is the true positive rate. 34 00:02:49,810 --> 00:02:58,300 This is because we've got fpr as the x label and tpr as the y data, true positive rate. 35 00:03:01,630 --> 00:03:11,890 And then we're gonna go: Receiver Operating Characteristic (ROC) Curve. 36 00:03:11,890 --> 00:03:15,870 Now this is the type of plot you might give someone when they're asking for the ROC, 37 00:03:15,970 --> 00:03:16,350 right? 38 00:03:16,690 --> 00:03:20,770 So this is why I'm showing you this, and you'll probably also come across 39 00:03:20,800 --> 00:03:25,430 this sort of plot when you have a look at different classification problems online. 
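For anyone following along without the video, the function narrated in the cues above might look roughly like the sketch below. It is a reconstruction from the narration, not a copy of the on-screen code. The `linestyle="--"` value, the `plt.title` and `plt.legend()` calls, the `return plt.gca()` line, the Agg backend line and the example `fpr`/`tpr` values are all assumptions added so the sketch runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line in a notebook
import matplotlib.pyplot as plt


def plot_roc_curve(fpr, tpr):
    """
    Plots a ROC curve given the false positive rate (fpr)
    and true positive rate (tpr) of a model.
    """
    # Plot the ROC curve
    plt.plot(fpr, tpr, color="orange", label="ROC")
    # Plot a line with no predictive power (baseline)
    # The exact line style is an assumption; it only needs to look different
    plt.plot([0, 1], [0, 1], color="darkblue", linestyle="--", label="Guessing")
    # Customize the plot
    plt.xlabel("False positive rate (fpr)")
    plt.ylabel("True positive rate (tpr)")
    plt.title("Receiver Operating Characteristic (ROC) Curve")
    plt.legend()
    # Returning the axes is an addition for this sketch so the plot can be inspected
    return plt.gca()


# Hypothetical example values, not the model's real output
example_fpr = [0.0, 0.1, 0.4, 1.0]
example_tpr = [0.0, 0.6, 0.9, 1.0]
ax = plot_roc_curve(example_fpr, example_tpr)
```

In a notebook you would likely finish with `plt.show()` rather than inspecting the returned axes.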
40 00:03:25,570 --> 00:03:28,920 They might use the ROC curve as an example of how the model's doing. 41 00:03:28,930 --> 00:03:31,230 We want fpr first, tpr. 42 00:03:31,270 --> 00:03:31,800 Beautiful. 43 00:03:31,810 --> 00:03:33,140 So this should work. 44 00:03:33,190 --> 00:03:34,810 Wonderful. 45 00:03:34,810 --> 00:03:38,840 What is going on here? We finally get to see what a ROC curve looks like. 46 00:03:38,980 --> 00:03:44,570 If we go to here, where the false positive rate is 0.6, the true positive rate is gonna be 1.0. 47 00:03:44,590 --> 00:03:52,990 The maximum score it can get is 1.0 up here, and so this model here going from corner to corner is 48 00:03:52,990 --> 00:03:54,100 guessing. 49 00:03:54,100 --> 00:03:57,040 Can you guess where the most ideal ROC curve might end up? 50 00:03:57,280 --> 00:04:04,360 If this is guessing, and our model is doing far better than guessing by getting about 80 percent, 85 percent, 51 00:04:04,360 --> 00:04:06,640 82 percent, something like that, 52 00:04:06,640 --> 00:04:11,030 can you guess where the most ideal curve will go? 53 00:04:11,190 --> 00:04:11,970 That's all right 54 00:04:11,970 --> 00:04:14,230 if you can't. We're going to have a look at it. 55 00:04:14,250 --> 00:04:18,200 So before we do, actually, let's have a look at the AUC score. 56 00:04:18,240 --> 00:04:23,560 We can do this: from sklearn.metrics import roc_auc_score. 57 00:04:23,570 --> 00:04:26,640 And so this is where the AUC comes into play from ROC. 58 00:04:26,670 --> 00:04:33,290 So when you hear area under the curve or receiver operating characteristic, this is the AUC score. 59 00:04:34,140 --> 00:04:35,540 And so we can do that. 60 00:04:36,000 --> 00:04:37,380 Let's just see it in action. 61 00:04:37,380 --> 00:04:41,160 This takes the test labels as well as the probabilities, 62 00:04:44,410 --> 00:04:45,830 so we can do y_probs. 63 00:04:47,820 --> 00:04:48,270 We need the 64 00:04:51,300 --> 00:04:54,160 positives. 
65 00:04:54,170 --> 00:04:54,950 There we go. 66 00:04:55,190 --> 00:04:57,570 So how did I know this? Again, docstring. 67 00:04:57,650 --> 00:04:58,000 y_true, 68 00:04:58,000 --> 00:04:59,540 y_score. 69 00:04:59,780 --> 00:05:00,700 Wonderful. 70 00:05:00,750 --> 00:05:03,570 And so what is this measuring, you might be asking. 71 00:05:03,590 --> 00:05:04,400 Well, this is, 72 00:05:04,510 --> 00:05:07,740 remember, AUC stands for area under curve. 73 00:05:08,090 --> 00:05:14,810 And so if we were to take our curve, ignore this one for a second, let's just get rid of that, 74 00:05:16,460 --> 00:05:22,660 so if we were to take our curve, this here, it's a bit of a jagged curve but it is a curve. 75 00:05:22,660 --> 00:05:28,450 And if we were to measure the area, so shading everything under here, shading all of this, and figure 76 00:05:28,450 --> 00:05:32,800 it out, then we go one across here. We're missing some space up here. 77 00:05:32,800 --> 00:05:39,970 So our area is definitely not going to be one. Our score is gonna be 0.84, so 78 00:05:39,970 --> 00:05:40,780 0.849. 79 00:05:40,780 --> 00:05:44,860 Now, the maximum score you can get, can you guess, with an AUC score? 80 00:05:44,860 --> 00:05:49,450 So area under the curve, the maximum score you can get is 1.0. 81 00:05:49,570 --> 00:05:55,120 And so that means that the curve goes straight up here to this corner and straight across here to that 82 00:05:55,120 --> 00:05:55,990 corner. 83 00:05:55,990 --> 00:06:05,100 Now let's see it in action. So we're gonna go: plot perfect ROC curve and AUC score. 84 00:06:05,950 --> 00:06:16,290 So we want to go fpr, tpr, thresholds equals roc_curve(y_test, y_test). 85 00:06:18,020 --> 00:06:20,830 Now we're gonna go plot, we're gonna use our function here, 86 00:06:20,910 --> 00:06:21,540 plot_roc_curve. 87 00:06:21,560 --> 00:06:27,770 See where it comes in handy, because you want to plot multiple ROC curves: fpr, tpr. 
88 00:06:27,800 --> 00:06:28,830 Let's do that. 89 00:06:28,850 --> 00:06:30,600 So this is a perfect ROC curve here. 90 00:06:31,660 --> 00:06:33,080 The area under the curve here, 91 00:06:33,080 --> 00:06:34,730 can you guess what it would be? 92 00:06:34,730 --> 00:06:35,560 We just discussed it. 93 00:06:35,770 --> 00:06:37,360 So perfect 94 00:06:37,400 --> 00:06:48,580 AUC score is roc_auc_score(y_test, y_test). Now, 95 00:06:48,600 --> 00:06:50,220 that's 1.0. 96 00:06:50,240 --> 00:06:55,350 And so in reality, a perfect ROC curve is very unlikely. That means you've got a perfect model. It's got 97 00:06:55,350 --> 00:06:58,430 no false positives and every positive sample is a true positive. 98 00:06:58,500 --> 00:07:01,980 If you saw a 1.0 here as a ROC AUC score, 99 00:07:01,980 --> 00:07:06,640 that's going to sort of maybe raise a flag to check if your model is predicting the right thing. 100 00:07:06,660 --> 00:07:11,370 The main detail here is that a ROC curve is comparing the true positive rate versus the false positive 101 00:07:11,370 --> 00:07:11,660 rate. 102 00:07:12,330 --> 00:07:17,660 And the main metric you can use to boil it down, rather than just being a curve, 103 00:07:17,730 --> 00:07:20,010 is the AUC score. 104 00:07:20,250 --> 00:07:22,560 So that's what you do to calculate a ROC curve. 105 00:07:22,560 --> 00:07:27,030 If someone asked you to calculate a ROC curve for your model, it's going to compare the false positive 106 00:07:27,030 --> 00:07:28,650 rate versus the true positive rate. 107 00:07:29,340 --> 00:07:32,950 So now we've seen ROC curves and area under the curve. 108 00:07:32,980 --> 00:07:38,760 Let's have a look at the next classification metric, or the next way to evaluate a classification model, 109 00:07:39,120 --> 00:07:43,410 and that is with a confusion matrix.
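
The AUC steps narrated in this video can be sketched as below, assuming scikit-learn is installed. The labels and probabilities are made-up illustrative values, not the model's real output, and `y_probs_positive` is a stand-in name for the positive-class probabilities. The extra `auc(fpr, tpr)` call is not in the narration; it is included only to show that the score really is the area under the plotted curve.

```python
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Hypothetical test labels and predicted positive-class probabilities
# (illustrative values only, not the model from the video)
y_test = [0, 0, 1, 1, 0, 1, 1, 0]
y_probs_positive = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.6]

# ROC points and AUC score for the hypothetical model
fpr, tpr, thresholds = roc_curve(y_test, y_probs_positive)
model_auc = roc_auc_score(y_test, y_probs_positive)

# auc() integrates the (fpr, tpr) points with the trapezoidal rule,
# which makes the "area under the curve" reading explicit
area_from_curve = auc(fpr, tpr)

# Perfect case: score the test labels against themselves
perfect_fpr, perfect_tpr, _ = roc_curve(y_test, y_test)
perfect_auc = roc_auc_score(y_test, y_test)

print(f"model AUC: {model_auc:.3f}")
print(f"area from curve: {area_from_curve:.3f}")
print(f"perfect AUC: {perfect_auc:.1f}")
```

Passing `perfect_fpr` and `perfect_tpr` to the plot_roc_curve function from earlier in the video should draw the curve that goes straight up to the top-left corner and then straight across.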