Alrighty, let's finish up this section on evaluating machine learning models. We can do so by tackling the third and final way to evaluate a machine learning model: metric functions. In essence, we've kind of already covered this, because all the metrics we've previously seen have their own function in Scikit-Learn. Let's see what I mean by this. So, heading 4.3: using different evaluation metrics as Scikit-Learn functions. Wonderful.

So, for classification we had accuracy, we had precision, we had recall, we had F1, and then for regression we had R squared, mean absolute error and mean squared error. All right, so let's do what we always do and write the code first, and then we'll talk.

So, from sklearn.metrics import... I'm going to do a full example, as we always do, because that's what we like to do, we like to be complete with what we're working on: accuracy_score, precision_score, recall_score, and you might be able to figure out what the last one is, f1_score, I typed it in a bit too quick. From sklearn we'll also import our model, because what we might do is create a section here, like "Classification evaluation functions". Now again, this section here is just another way to do what we've done before, but it's good to practice, it's always good to practice. So from sklearn.ensemble we import RandomForestClassifier, and then we want from sklearn.model_selection import train_test_split, np.random.seed(42), lovely, X equals heart_disease.drop("target", axis=1), we can almost write this in our sleep now, y equals heart_disease["target"].

And if you can't, that is more than okay. The reason I can write this sort of thing out is because I've had a fair bit of practice with it, and you'll be the same too. If you're starting out now, you might be looking at all these functions, or this code, and going, "Oh my goodness, there is so much to remember," but the beautiful thing is that it's all here, it's available for you, you can run it in a Jupyter notebook and you can practice as much as you like. So really, your only roadblock is just putting in the work, practicing and learning. And don't forget, learning something new, especially machine learning, takes time, and it's not going away, so you've got plenty of time.
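Here's a rough sketch of what that setup cell ends up looking like. The filename passed to read_csv is an assumption on my part (the heart_disease DataFrame was loaded earlier in the course's notebook); everything else follows what's typed out above.

```python
import numpy as np
import pandas as pd

# Evaluation metric functions
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Model and data-splitting utility
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

np.random.seed(42)

# Assumed: heart_disease was loaded earlier in the notebook from a CSV like this
heart_disease = pd.read_csv("heart-disease.csv")

# Features (everything except the label column) and labels
X = heart_disease.drop("target", axis=1)
y = heart_disease["target"]
```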
So, what we're doing here, we've seen this code before: importing some metrics, specifically accuracy_score, precision_score, recall_score and f1_score, importing a model, importing train_test_split, splitting our data into X and y, splitting it into train and test sets, instantiating a RandomForestClassifier and fitting it to the training data. Beautiful, running machine learning code. And we'll make some predictions, and then we'll go y_preds, because remember, what is an evaluation metric doing? If you said comparing our model's predictions to the truth labels, the actual labels, you would be correct.

Evaluate the classifier. So now what we're going to do is take advantage of these built-in functions here. Right, we could use the score method, we could use the scoring parameter, but we've already covered those. This is using Scikit-Learn's metric functions. So, "evaluate the classifier": what we'll do is print out something nice, maybe "Classifier metrics on the test set", wonderful. And then we're going to print out, we'll do an f-string: accuracy is going to be the accuracy_score function on y_test and y_preds. Wonderful. Then we're going to times that by 100 so it comes out as a nice, neat percentage, because I prefer that, or we'll prefer that, over the decimals. Maybe you don't. Okay, here we go.

We're getting precision now. Again, we could turn something like this into a function, and we probably will in a future video, but just for example's sake we'll type it out, we'll practice typing it out. Ah, it needs the end of the string, there we go. Now we're going to do recall. How would you do this one if I start you off with the f-string? That's right, we'll keep going: recall_score on y_test and y_preds, remember, just comparing our predictions to the test labels, the truth labels. And then finally we're going to go F1.

So this is something you might do if you're reporting to your colleague, or to your boss, or to your manager, or to the greater public, how your model is doing: you might give them all these different evaluation metrics so they can start to understand. Okay, the accuracy is a certain thing, but the precision is there so they have an idea of how many false positives there are, and the recall is there so they have an idea of how many false negatives there are.
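As a reference, here's roughly what that classification cell looks like once it's all typed out. The variable name clf, the 80/20 test split and the decimal formatting are assumptions on my part; the metric calls and print statements follow the video.

```python
# Split the data into training and test sets (80/20 split assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Instantiate a classifier and fit it to the training data
clf = RandomForestClassifier()
clf.fit(X_train, y_train)

# Make predictions on the test set
y_preds = clf.predict(X_test)

# Evaluate the classifier by comparing predictions to the truth labels
print("Classifier metrics on the test set")
print(f"Accuracy: {accuracy_score(y_test, y_preds) * 100:.2f}%")
print(f"Precision: {precision_score(y_test, y_preds):.2f}")
print(f"Recall: {recall_score(y_test, y_preds):.2f}")
print(f"F1: {f1_score(y_test, y_preds):.2f}")
```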
And the F1 score is kind of a combination of the precision and recall. So we'll hit Shift and Enter, voila. Now we've taken advantage of the third method of evaluating models, and that's by directly using functions such as accuracy_score, precision_score, recall_score and f1_score. If we were to peer into the documentation, this is a metric function, right: classification metrics. Here we go, there's a whole bunch more there if you want to check them out, but these are some of the main ones that we've covered, and the principle is still the exact same for the rest of them.

And so if you come in here, we're going to do "Regression evaluation functions", and turn that into Markdown. So, same thing again. You could almost do this yourself, I reckon, and if not, don't worry, we're about to type it out, but how would you go about it? If we look at our classification evaluation functions, what you might do is: from sklearn.metrics, import some regression functions; then import the regression model; then import train_test_split; create the data; split it into training and test sets; instantiate your regression model; make some predictions; and then evaluate them. But this time, instead of evaluating a classifier, you're evaluating the regression model using regression metrics.

But just for completeness, let's type it out again: from sklearn.metrics import r2_score, wonderful, mean absolute error, and we've seen these before, mean_absolute_error, mean_squared_error. Beautiful. Then from sklearn.ensemble import RandomForestRegressor. So this is the kind of workflow you might do for your own problems, right? If you're working on a regression problem, you might have some sort of import statement like this at the top of your notebook, and import train_test_split. In our case we've already got our data in a DataFrame; you may have some more lines of code getting your data into a proper DataFrame.

Oh, we almost forgot: np.random.seed. That said, you actually don't need a random seed, I just like to have one, and you'll see them all over the place, just so if you run the cells you get the same results as what someone else was getting. Then X equals boston_df.drop("target", axis=1) and y equals boston_df["target"]. Beautiful. And that's another reason for the random seed, right?
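For reference, here's a rough sketch of that regression setup, assuming boston_df is the Boston housing DataFrame created earlier in the notebook with the target values stored in a "target" column, as described above:

```python
# Regression metric functions
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Regression model
from sklearn.ensemble import RandomForestRegressor

np.random.seed(42)

# boston_df is assumed to exist from earlier in the notebook,
# with the house values stored in a "target" column
X = boston_df.drop("target", axis=1)
y = boston_df["target"]
```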
So, if I ran these cells, and then you took this notebook as a resource for the course, and then you wanted to compare your results to mine, without the random seed they'd probably be different, because all of the randomness in this notebook, such as train_test_split randomly splitting our data into training and test sets, would use different samples for each of us, and so we'd get different numbers, and that would cause confusion, which is not what we're about, right? We're all about communicating what we're finding.

So we're instantiating a model here, a RandomForestRegressor. Wonderful. Then we go model.fit(X_train, y_train), beautiful, and make predictions using our regression model: model.predict(X_test). Yes, yes, yes. And now, evaluate the regression model. So I'm going to go here, print "Regression model metrics on the test set". Again, you could turn this into a function and pass in your regression model as well as the metrics, but we're just going to write it out here, just for good practice: r2_score on y_test and y_preds. Wonderful. Now we just need to end the string. We can do the same for mean absolute error, so MAE equals mean... tab complete that one, of course we will, y_preds, wonderful. Okay, print, f-string, MSE, mean squared error, comparing our predictions to the actual labels, finish it off with the end of the string, and boom.

Oh, we've got an error, of course we do. This is going to give us a warning because our n_estimators is not equal to 100, and what is our other error? "Found input variables with inconsistent numbers of samples: 102..." What has happened here? 102 for y_test versus 61 for y_preds. You know what's happening? 61 is the number of... ah, here we go, there we go. You know how I knew that? Because if we go up here, to our classification problem, and we go len(y_preds) before we instantiate our regression problem, it's 61. So because we didn't set y_preds here, right, previously this was just model.predict(X_test) on its own, it was using y_preds from above. That's where I got caught out, from using the same variable names throughout the notebook, right?
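Here's roughly what that regression cell looks like once it's working, including the fix for the error above: assigning the predictions to y_preds so the metrics aren't comparing against the classifier's predictions from earlier. The 80/20 split and the decimal formatting are assumptions; the rest follows the video.

```python
# Split the regression data into training and test sets (80/20 split assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Instantiate and fit the regression model
model = RandomForestRegressor()
model.fit(X_train, y_train)

# Make predictions and assign them to y_preds — leaving this assignment out
# is what caused the "inconsistent numbers of samples" error, because the
# metrics below then reused the classifier's y_preds from earlier
y_preds = model.predict(X_test)

# Evaluate the regression model
print("Regression model metrics on the test set")
print(f"R^2: {r2_score(y_test, y_preds):.2f}")
print(f"MAE: {mean_absolute_error(y_test, y_preds):.2f}")
print(f"MSE: {mean_squared_error(y_test, y_preds):.2f}")
```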
Ideally we'd have different variable names for our classification and regression problems, but this is just for illustration purposes; the predictions variable is usually called y_preds or something of the like. And there we go: regression model metrics on the test set. Now, we've seen similar metrics before, so what can we do now?

Well, we've covered a whole bunch, right? And the reason being is because evaluating a machine learning model is paramount. It's one thing to train one, but there's nothing worse than training a machine learning model and optimizing it for the wrong evaluation metric. So keep the metrics and evaluation methods we've gone through in mind when training your future models. Make sure you keep them in mind, go through them, and have a little read here. This is probably the most important section that you'll read in the entire Scikit-Learn documentation.

But after you've done that, you'll naturally start to ask: how do we improve these numbers? How do we make them better? They're kind of stagnant. We've been using a random seed and we've been seeing the same numbers for accuracy, precision, recall and F1 over and over, and the same with R squared, MAE and MSE. So that's what we're going to cover in the next section. If we look back at our list of what we're covering, number five, we're up to improving a model.

All right, so take a little break, have a look at the Scikit-Learn documentation for "Metrics and scoring: quantifying the quality of predictions". You can find it by going to this URL here, or by searching "sklearn evaluate a model", but otherwise, get ready for the next section. We're going to see how to improve our models.