Now we've tuned the hyperparameters of our logistic regression model and it's getting a pretty good score here. But remember, this score, the default score for a classifier, is accuracy. What we want to do now is create some evaluation metrics around our model that go a little bit beyond accuracy. More specifically, let's evaluate our tuned machine learning model. Actually, machine learning classifier is a better name, because these evaluation metrics are specific to classification. So we're going to go beyond accuracy. More specifically, we want a ROC curve and an AUC score. We also want a confusion matrix, which is pretty fun because it actually looks like it's from The Matrix, with lots of numbers, you know, that flowing green screen. We want a classification report. What else do we want? We want precision, recall and F1 score, and it would be great if cross-validation was used where possible. So let's turn that into markdown; that's what we're going to be working on. We'll use our tuned grid search logistic regression model, as well as the best hyperparameters for it, and we'll see where they come into play in a second. So first of all, when we evaluate a model, we're always comparing how a trained model's predictions compare to the truth labels.
So what we have to do is make some predictions first, so we can compare them to the truth labels, a.k.a. the labels in the y_test dataset. So let's do that, and write a note here: to make comparisons and evaluate our trained model, first we need to make predictions. So what we'll do is make predictions with our chosen model, and there's a beautiful function called predict that we can use for that. We'll save the results to y_preds. We use gs_log_reg, which is just the trained version of our grid search model. So gs_log_reg.predict, and we're going to predict on the test data, X_test. Wonderful. Let's see them, just to make sure we're not going crazy. Beautiful. And now we need to compare them to the test dataset, so let's have a look at this here. OK, so if we go to position 0, we know it's got this one wrong, because that's supposed to be a zero there, because these are the truth labels here. We could keep going through like that, but we're not going to; we want to use some code to do it. First things first, we want a ROC curve, because remember we've got this little list up here: ROC curve and AUC score. What is a ROC curve? Well, we can look that up. So, what is a ROC curve?
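Putting that prediction step into code, a minimal sketch might look like this. The heart disease data from the video isn't reproduced here, so a synthetic dataset stands in for it; gs_log_reg mirrors the fitted GridSearchCV model from the video, with an assumed grid over C:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the heart disease dataset used in the video
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# A tuned grid search logistic regression model (gs_log_reg in the video)
gs_log_reg = GridSearchCV(LogisticRegression(max_iter=1000),
                          param_grid={"C": np.logspace(-4, 4, 20)},
                          cv=5)
gs_log_reg.fit(X_train, y_train)

# Make predictions with the trained model so we can compare them
# to the truth labels in y_test
y_preds = gs_log_reg.predict(X_test)
print(y_preds[:10])  # predicted labels
print(y_test[:10])   # truth labels
```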
We went through this process of understanding ROC curves before, so if we were to read through this, what we'd get is: the ROC curve is created by plotting the true positive rate against the false positive rate at various threshold settings. Okay, beautiful, so let's see how we do that. The ROC curve is a way of understanding how your model is performing by comparing the true positive rate to the false positive rate. And if we want to figure out what a true positive and a false positive are, we can have a look at our confusion matrix anatomy for that. So a true positive is when the model predicts 1 and the truth is 1, and a false positive is when the model predicts 1 but the truth is supposed to be 0. And a perfect model is going to get an AUC score, which we'll see in a second, of 1.0. So let's see how we do it: import the ROC curve function from sklearn.metrics. That's where it's from; that's where it lives. But we've actually already done this, as with all the other sklearn functions we've been using, right back up at the top. So this is model evaluation. Right now, in this next section, this next video, we're going to tackle everything from here.
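The true and false positive rates behind a ROC curve can be computed straight from confusion matrix counts. Here's a small sketch with toy labels (not the heart disease data) just to make the definitions concrete:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy truth labels and predictions to illustrate the rates behind a ROC curve
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])

# ravel() flattens the 2x2 matrix into (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # true positive rate: predicted 1 when the truth is 1
fpr = fp / (fp + tn)  # false positive rate: predicted 1 when the truth is 0

print(tpr, fpr)  # → 0.75 0.25
```

A ROC curve is just these two numbers recalculated at many different probability thresholds and plotted against each other.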
So we can see here we've got RandomizedSearchCV, GridSearchCV, confusion_matrix, plot_roc_curve (that's what we're about to use), precision_score, recall_score, f1_score. So these are for our classification model. So let's go, we can save ourselves a line of code; we don't actually have to import that. Come right back down here. We can just go plot_roc_curve. Now, this is a relatively new addition to the scikit-learn library, which I'm very, very happy with, because usually you had to write these on your own. This function, plot_roc_curve, actually calculates the AUC metric for us, the area under the curve metric. Let's see it in action: plot_roc_curve, gs_log_reg. And if we look at this in the documentation, it tells us what it does: pass it an estimator, which is our machine learning model, pass it X, pass it y, and it plots a receiver operating characteristic, a.k.a. ROC, curve. That's what it's going to do for us. Beautiful. So gs_log_reg, X_test, because we want to do it on the test dataset. Always evaluate your machine learning models on the test dataset. And let's see... oh, beautiful. Okay, so remember, a perfect ROC curve, as we've seen in previous videos, goes up to this corner and then across, like that. Ours is shaping up pretty well. Then we've got the area under the curve.
So if we calculated all this area under here, we'd get 0.93. Of course, a perfect model would achieve 1.0. So our model is not perfect, but it's got an AUC score of 0.93, where just tossing a coin, essentially, would average 0.5. So we're edging close to a perfect model. Not bad for a model that just came out of the box. All right, the next thing we want to do is a confusion matrix. So let's say that: confusion matrix. How could we do that? I think scikit-learn has a function... and I don't just think, I know scikit-learn has a function. We can go confusion_matrix, and we want to compare the ground truth labels with the predicted labels. OK, we could just leave it at that, but it's a bit bland. We can improve the visualization of this confusion matrix using Seaborn. That's what we want. So we want to go... I believe we already have Seaborn imported up here... yes, yes we do. We've used Seaborn before in this project. So what I might do, because I know ahead of time that we need a bigger font size, is set the font size to 1.5. That's just sns.set.
Now we're going to create a little function here, in case we want to make another confusion matrix, because scikit-learn's confusion matrix function isn't quite up to scratch as of making this video. Maybe that's an opportunity for a pull request. So we go here, nice and simple. We're just going to pass this function our test labels and our predicted labels, and then we're going to create a nice-looking plot, a nice-looking confusion matrix, using Seaborn's heatmap. We've seen this before. Heatmap, wonderful. So we go fig, ax = plt.subplots(figsize=(3, 3))... oh, shift and tab, trigger happy again... ax = sns.heatmap. Wonderful. We're going to pass it a scikit-learn function inside: confusion_matrix(y_test, y_preds). Then we want to add annot here, so annotate: yes please, annot=True. Now we're going to go cbar=False, because I've seen the colorbar and it doesn't look too great on the confusion matrix. plt.xlabel("True label"), and then we're going to add plt.ylabel("Predicted label"), and let's see it. So we're going to call our plot function on the confusion matrix: y_test, y_preds... oh, what have we done? Font size... ah, we've got font_size wrong. What is it? font_scale.
There we go, tab and autocomplete... classic. And it's giving us this little cutoff thing here, so we can fix this, I believe, by going bottom, top = ax.get_ylim(), yeah, and then we want to go ax.set_ylim, make it a little bit prettier, set_ylim, and it's going to be bottom + 0.5, top - 0.5. Now of course, if your confusion matrix came out looking fine, you might not need these lines, but it doesn't matter. There we go, that might look a little bit better. So you can see that the model gets confused, i.e. predicts the wrong label, relatively the same amount across both classes. In essence, there are four occasions here where the model predicted 0 as the label, so predicted someone didn't have heart disease when they should have been predicted as 1. So that's a false negative: it's predicting 0 instead of 1. A false negative, remember, if we come back here, is when the model predicts 0 and the truth is 1, and over here are the false positives. So we've got three instances here where the model predicts 1, that someone does have heart disease, when they actually don't.
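Put together, the little confusion matrix plotting function from this section looks roughly like this. Toy labels stand in for the real y_test and y_preds; the get_ylim/set_ylim fix at the end is only needed on the older matplotlib/seaborn combinations that cut off the top and bottom rows, and the axis labels follow the video as typed:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn.metrics import confusion_matrix

sns.set(font_scale=1.5)  # bigger annotations (font_scale, not font_size)

def plot_conf_mat(y_test, y_preds):
    """Plot a nicer-looking confusion matrix with Seaborn's heatmap."""
    fig, ax = plt.subplots(figsize=(3, 3))
    ax = sns.heatmap(confusion_matrix(y_test, y_preds),
                     annot=True,   # annotate each cell with its count
                     cbar=False)   # the colorbar doesn't look great here
    plt.xlabel("True label")
    plt.ylabel("Predicted label")
    # Fix for older versions that cut off the top and bottom rows
    bottom, top = ax.get_ylim()
    ax.set_ylim(bottom + 0.5, top - 0.5)
    return fig, ax

# Toy labels just to exercise the function
y_test = np.array([0, 1, 1, 0, 1, 0])
y_preds = np.array([0, 1, 0, 0, 1, 1])
fig, ax = plot_conf_mat(y_test, y_preds)
```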
And you can see these are both things that we want to avoid, right? Especially when thinking about something as severe or as serious as heart disease, even predicting it when it's not present. So here, if we predict 0, so no disease, when it is present, okay, that's bad. But also predicting that it is there when it's not actually there, that's also bad. So that's something you have to consider when you're building these types of models: is a false negative worse, or is a false positive worse? And again, a perfect model would have none of these, but in reality you're probably going to end up with some sort of confusion in your model; ideally, all the predictions would fall on the diagonal here. Now we've done a confusion matrix and got that there. We could share that with our boss, and we could share the ROC curve with our boss too. What's next? Classification report. Okay, and it would be great if cross-validation was used where possible. All right, so that's what we might tackle in the next video.