1 00:00:00,390 --> 00:00:05,220 We've seen how to make predictions with our machine learning model once it's learned patterns from
2 00:00:05,220 --> 00:00:08,390 the data, a.k.a. using those patterns.
3 00:00:08,430 --> 00:00:14,820 Now, how do we figure out whether those predictions are valid? Such as, could we use them in production?
4 00:00:14,850 --> 00:00:19,780 Or is our model just making things up? Those predictions, do they actually hold water?
5 00:00:19,800 --> 00:00:24,680 So what we're going to cover in this section is step four, evaluating a model.
6 00:00:24,690 --> 00:00:35,690 So we'll get rid of this, but we'll put in a little heading here: evaluating a machine learning model. Beautiful.
7 00:00:35,760 --> 00:00:40,860 Now the first place we're going to have a look at is up here, and this is the scikit-learn documentation.
8 00:00:40,920 --> 00:00:48,070 We can actually find this by going "scikit-learn evaluate a model". That should come up: 3.3,
9 00:00:48,130 --> 00:00:49,480 metrics and scoring.
10 00:00:49,480 --> 00:00:51,800 That's what we're after.
11 00:00:51,870 --> 00:00:58,050 So as you can see here, there are three different APIs for evaluating the quality of a model's predictions.
12 00:00:58,050 --> 00:00:59,880 We're going to have a look at each of these.
13 00:00:59,880 --> 00:01:02,250 So we've got the estimator score method,
14 00:01:02,250 --> 00:01:06,510 we've got the scoring parameter, and we've got metric functions.
15 00:01:09,370 --> 00:01:14,030 We could read through this, but I prefer, as you probably do too, getting hands-on with the
16 00:01:14,030 --> 00:01:14,440 code.
17 00:01:14,460 --> 00:01:22,280 Let's see it in action. All right, now to do so, we're going to bring back our heart disease classification
18 00:01:22,280 --> 00:01:22,770 problem.
19 00:01:22,790 --> 00:01:24,400 We could scroll up and copy it,
20 00:01:24,620 --> 00:01:28,690 but again, I want you to get some practice writing it out, right?
21 00:01:28,690 --> 00:01:32,830 Because this is what we're here for. We're here to practice writing machine learning code.
22 00:01:32,840 --> 00:01:36,230 So we're going to import... this is a markdown cell.
23 00:01:36,230 --> 00:01:38,320 I'm going to change that to code.
24 00:01:38,360 --> 00:01:41,540 We're going to import the RandomForestClassifier.
25 00:01:41,600 --> 00:01:45,230 We've seen this before. And then we're going to set up a random seed.
26 00:01:45,470 --> 00:01:46,830 Wonderful.
27 00:01:47,270 --> 00:01:50,810 And then we're going to create our X and y. X is our feature variables:
28 00:01:50,810 --> 00:01:52,550 heart_disease.drop.
29 00:01:52,550 --> 00:01:57,500 We've already imported the data frame, heart_disease. And axis equals one.
30 00:01:57,590 --> 00:01:58,130 Beautiful.
31 00:01:58,130 --> 00:02:00,080 And we'll create our labels:
32 00:02:00,110 --> 00:02:02,010 heart_disease,
33 00:02:02,180 --> 00:02:04,400 this is target.
34 00:02:04,460 --> 00:02:05,500 Excellent.
35 00:02:05,510 --> 00:02:07,600 Then we'll split it into train and test.
36 00:02:07,600 --> 00:02:19,850 So X_train, X_test, y_train, y_test equals train_test_split, X, y, test_size equals 0.2.
37 00:02:19,940 --> 00:02:21,080 Wonderful.
38 00:02:21,080 --> 00:02:27,100 Then we'll instantiate our random forest classifier. RandomForest... we can probably press tab here.
39 00:02:27,410 --> 00:02:28,340 We certainly can.
40 00:02:28,340 --> 00:02:29,870 And then we're going to fit it.
41 00:02:29,870 --> 00:02:30,860 That wasn't too hard, right?
42 00:02:30,870 --> 00:02:35,600 By now we're becoming experts at writing this little section of code, and I'm being realistic here.
43 00:02:35,600 --> 00:02:38,000 That's a full-blown machine learning pipeline right there.
44 00:02:38,540 --> 00:02:43,660 As long as the data's in the right format and we've got the target column, we can do this pretty quickly.
45 00:02:43,670 --> 00:02:45,690 So now we run this.
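The cell dictated above can be sketched roughly as follows. The heart_disease data frame itself isn't reproduced in this transcript, so this sketch builds a synthetic stand-in with make_classification; the feature column names, the sample count, and the seed value 42 are all assumptions for illustration, not values from the video.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Set up a random seed so the split and the forest are repeatable
# (42 is an arbitrary choice, not taken from the video).
np.random.seed(42)

# Hypothetical stand-in for the heart_disease data frame: synthetic
# features plus a binary "target" column. In the video the real data
# frame has already been imported earlier in the notebook.
features, labels = make_classification(n_samples=300, n_features=13, random_state=42)
heart_disease = pd.DataFrame(features, columns=[f"feature_{i}" for i in range(13)])
heart_disease["target"] = labels

# X holds the feature variables, y holds the labels
X = heart_disease.drop("target", axis=1)
y = heart_disease["target"]

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Instantiate the classifier and fit it to the training data
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
```

With the real data, only the stand-in block would change; the drop, split, and fit steps stay the same.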
46 00:02:45,860 --> 00:02:48,020 We can see that our model fits itself to the data.
47 00:02:48,020 --> 00:02:54,060 So basically it's finding the patterns in X train and y train, or between those two.
48 00:02:54,140 --> 00:02:58,750 And so now we can use the score method. What we might do actually is copy this.
49 00:02:58,760 --> 00:03:07,480 We go: three ways to evaluate scikit-learn models slash estimators.
50 00:03:07,490 --> 00:03:10,880 Now this is just from the documentation.
51 00:03:10,960 --> 00:03:16,220 So we want: one is the estimator score method.
52 00:03:16,220 --> 00:03:23,620 This is what we'll have a look at first. And then two is the scoring parameter.
53 00:03:23,630 --> 00:03:31,820 We'll have a look at that shortly. And then three is problem-specific metric functions.
54 00:03:31,820 --> 00:03:32,800 Beautiful.
55 00:03:32,840 --> 00:03:34,990 Now we've got a little heading, we know what we're working with.
56 00:03:35,030 --> 00:03:40,670 So first things first, we're going to check out the score method. Maybe we put another heading in
57 00:03:40,670 --> 00:03:41,610 here.
58 00:03:41,660 --> 00:03:50,860 One, two, three, four. 4.1, evaluating a model with the score method.
59 00:03:50,870 --> 00:03:56,670 Now we've already seen this one, right? Because this is basically the default. It's a way to get a quick
60 00:03:56,670 --> 00:03:59,580 sniff, a quick understanding of how our model is doing.
61 00:03:59,580 --> 00:04:07,110 So if we call clf.score, right, because that's the score method, every estimator in scikit-learn has
62 00:04:07,110 --> 00:04:08,670 this little score method.
63 00:04:08,880 --> 00:04:14,160 So once you've instantiated a machine learning model here and you've fit it to some sort of data, you can
64 00:04:14,160 --> 00:04:15,030 get its score.
65 00:04:15,330 --> 00:04:22,880 So look, we could even get its score on the training data. How does it go on here? 1.0, so it fits the training
66 00:04:22,880 --> 00:04:24,230 data perfectly.
67 00:04:24,230 --> 00:04:25,370 And then if we go here,
68 00:04:25,370 --> 00:04:30,740 score on the test data: 85 percent.
69 00:04:30,770 --> 00:04:35,610 So we've seen these figures before, right? Now what is happening here?
70 00:04:35,640 --> 00:04:36,330 Well, let's have a look.
71 00:04:36,360 --> 00:04:37,560 Let's press shift-tab.
72 00:04:37,570 --> 00:04:42,560 Remember, you can press shift-tab within any method to see what it does or see its docstring. Returns
73 00:04:42,560 --> 00:04:46,280 the mean accuracy on the given test data and labels.
74 00:04:46,500 --> 00:04:52,160 And so what's happening here is that a model that would predict perfectly would get 100 percent here,
75 00:04:52,410 --> 00:04:57,750 and actually I would say you should be skeptical of any model that always gets 100 percent, because no
76 00:04:57,750 --> 00:05:01,890 model is perfect, right? If your machine learning model is always getting its predictions right,
77 00:05:01,890 --> 00:05:06,060 I'd say there's some sort of error in your data or some sort of error in the way you've trained it.
78 00:05:06,120 --> 00:05:12,030 So our model doesn't get everything correct, but at 85 percent it's still far better than just guessing,
79 00:05:12,180 --> 00:05:12,380 right?
80 00:05:12,390 --> 00:05:16,020 Because remember, we've got two labels: heart disease or not.
81 00:05:16,230 --> 00:05:20,090 And so guessing would be just getting about 50 percent. Now,
82 00:05:20,190 --> 00:05:25,650 let's do the same as above except with some regression code, and this time I'll let you off for this video.
83 00:05:25,680 --> 00:05:31,350 We've already typed out a little machine learning pipeline. We'll come up here and we'll copy our regression
84 00:05:31,350 --> 00:05:32,070 code.
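The "far better than just guessing" comparison above can be checked directly. This is a minimal sketch, assuming synthetic two-class data in place of the heart disease data and using scikit-learn's DummyClassifier as the guesser; neither appears in the video. The "about 50 percent" figure holds when the two classes are roughly balanced, which the synthetic data is by default.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic two-class data standing in for the heart disease data
# (an assumption for illustration; sizes and seeds are arbitrary).
X, y = make_classification(
    n_samples=1000, n_features=13, n_informative=5, class_sep=2.0, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The real model
clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# A baseline that ignores the features and picks a label at random
guesser = DummyClassifier(strategy="uniform", random_state=42).fit(X_train, y_train)

# score() returns mean accuracy for every classifier, dummy or not
train_score = clf.score(X_train, y_train)
test_score = clf.score(X_test, y_test)
guess_score = guesser.score(X_test, y_test)

print(f"Model accuracy on training data: {train_score:.2f}")
print(f"Model accuracy on test data:     {test_score:.2f}")
print(f"Random guessing on test data:    {guess_score:.2f}")
```

Scoring on the training data mostly shows how well the model memorised what it has already seen, so the test score against a baseline is the more useful sniff test.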
85 00:05:32,070 --> 00:05:37,210 We know it's regression because we've got the RandomForestRegressor. And so we'll write in a little
86 00:05:37,210 --> 00:05:38,260 comment here.
87 00:05:38,260 --> 00:05:41,470 Let's do the same but for regression.
88 00:05:45,780 --> 00:05:50,920 Beautiful. What we want to do is just fit it all.
89 00:05:51,000 --> 00:05:54,810 We've already got the fit function there. I'm confusing myself.
90 00:05:54,810 --> 00:05:55,930 Beautiful.
91 00:05:56,170 --> 00:05:57,330 So the model is now fit.
92 00:05:57,390 --> 00:06:03,780 And now we'll do the score. Because we've run this cell, our X test data has been replaced with the Boston
93 00:06:04,080 --> 00:06:05,850 data frame rather than the heart disease one.
94 00:06:05,850 --> 00:06:12,500 So now we can just call it on X test and then y test. Wonderful.
95 00:06:12,610 --> 00:06:17,050 And so you might be thinking, well, these numbers are quite similar here,
96 00:06:17,050 --> 00:06:23,050 0.85 and 0.87, and you're right, they are pretty close. But in fact our regression model,
97 00:06:23,050 --> 00:06:25,580 when we call score,
98 00:06:25,880 --> 00:06:33,340 is actually using a different metric. Returns the coefficient of determination, or R squared, of the
99 00:06:33,340 --> 00:06:34,660 prediction.
100 00:06:34,660 --> 00:06:39,410 Now we'll dive a little bit deeper into some specific metrics per problem.
101 00:06:39,640 --> 00:06:46,270 But the thing to remember here is that the score function on every machine learning model has some kind
102 00:06:46,270 --> 00:06:49,450 of default evaluation metric built into it.
103 00:06:49,480 --> 00:06:51,640 So if we call the random forest regressor,
104 00:06:51,940 --> 00:06:57,590 chances are it will use the coefficient of determination as the default score metric.
105 00:06:57,600 --> 00:06:59,280 Now, if we call any regressor...
106 00:06:59,410 --> 00:07:06,760 If we go back to our machine learning map and we call any one of these estimators here in the green boxes,
107 00:07:07,300 --> 00:07:14,350 the default metric will likely be the coefficient of determination, because they will all be regression
108 00:07:14,350 --> 00:07:17,160 models. And the same goes for classification here.
109 00:07:17,280 --> 00:07:17,990 Right.
110 00:07:18,010 --> 00:07:21,660 Returns the mean accuracy on the given test data and labels.
111 00:07:22,000 --> 00:07:28,840 So for all of these classification models in the green squares here, the default evaluation metric is
112 00:07:28,930 --> 00:07:30,010 accuracy.
113 00:07:30,010 --> 00:07:37,180 And so what happens when the score method gets called? The model makes predictions on X test, creates
114 00:07:37,180 --> 00:07:44,710 y predictions like we've seen up here before, y_preds, and then it compares those predictions to the test,
115 00:07:44,710 --> 00:07:50,320 to the actual labels, and then returns back some sort of metric to compare how well our model actually
116 00:07:50,320 --> 00:07:51,600 did.
117 00:07:51,750 --> 00:07:52,170 All right.
118 00:07:52,360 --> 00:07:57,640 That's the score method in a nutshell: makes some predictions, compares them to the actual real labels,
119 00:07:57,760 --> 00:08:00,510 and then gives us an idea of how well our model is doing.
120 00:08:00,520 --> 00:08:04,900 So this is probably the first one that you'll call when you first train and fit a model.
121 00:08:05,080 --> 00:08:06,580 You call the score method.
122 00:08:06,580 --> 00:08:11,350 That's why it's listed as the first one here in 3.3, the scikit-learn documentation for
123 00:08:11,350 --> 00:08:13,380 metrics and scoring.
124 00:08:13,420 --> 00:08:18,460 So now we've seen score, let's check out the scoring parameter.
125 00:08:18,670 --> 00:08:22,030 So what we'll do here, we'll create another heading ready for the next video.
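The "predict, compare, return a metric" description of score above can be reproduced by hand. This is a sketch under assumptions: the data is synthetic (make_classification and make_regression stand in for the heart disease and Boston data), and the variable names are illustrative. It computes accuracy and R squared manually with NumPy and checks them against what score returns.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# --- Classification: score() returns mean accuracy ---
X_c, y_c = make_classification(n_samples=500, n_features=10, random_state=0)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X_c, y_c, test_size=0.2, random_state=0
)
clf = RandomForestClassifier(random_state=0).fit(Xc_train, yc_train)

clf_preds = clf.predict(Xc_test)                # 1. make predictions on the test features
manual_accuracy = np.mean(clf_preds == yc_test)  # 2. fraction that match the true labels
clf_score = clf.score(Xc_test, yc_test)         # what score() reports

# --- Regression: score() returns R squared ---
X_r, y_r = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_r, y_r, test_size=0.2, random_state=0
)
reg = RandomForestRegressor(random_state=0).fit(Xr_train, yr_train)

reg_preds = reg.predict(Xr_test)
ss_res = np.sum((yr_test - reg_preds) ** 2)       # squared error of the model
ss_tot = np.sum((yr_test - yr_test.mean()) ** 2)  # squared error of always predicting the mean
manual_r2 = 1 - ss_res / ss_tot
reg_score = reg.score(Xr_test, yr_test)          # what score() reports

print("accuracy:", manual_accuracy, "vs score():", clf_score)
print("R^2:     ", manual_r2, "vs score():", reg_score)
```

The two numbers live on different scales: accuracy is a fraction of correct labels between 0 and 1, while R squared is 1 for perfect predictions, 0 for a model no better than predicting the mean, and can go negative for a worse one. That is why similar-looking values such as 0.85 and 0.87 are not directly comparable.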
126 00:08:22,150 --> 00:08:28,510 4.2, evaluating a model using the scoring parameter.
127 00:08:29,830 --> 00:08:35,890 So before we get into that one, I would say press shift-tab here and have a read of the docstring here
128 00:08:35,890 --> 00:08:41,140 and see if you can figure out what the coefficient of determination is. And the same thing goes for the
129 00:08:41,140 --> 00:08:45,280 accuracy here, or the classification default score metric.
130 00:08:45,280 --> 00:08:49,600 Press shift-tab and have a read through here and see if you can understand what's going on.
131 00:08:49,600 --> 00:08:51,880 If the docstring doesn't really help you, try
132 00:08:51,880 --> 00:08:53,620 checking out the documentation here,
133 00:08:53,620 --> 00:08:55,030 model evaluation.
134 00:08:55,240 --> 00:08:57,550 But otherwise, I'll see you in the next video.