1 00:00:00,940 --> 00:00:06,250 So let's start diving into different ways we can improve our models using hyperparameter tuning. 2 00:00:06,310 --> 00:00:08,820 The first one here is by hand. 3 00:00:08,980 --> 00:00:22,220 So let's make a little heading, 5.1 Tuning hyperparameters by hand. And so far we've talked about dealing 4 00:00:22,220 --> 00:00:27,350 with training and test data sets. A model gets trained on the training set, it finds patterns, and then 5 00:00:27,350 --> 00:00:34,100 it gets evaluated on the test set, so it uses those patterns. But hyperparameter tuning introduces a 6 00:00:34,100 --> 00:00:37,260 third set, a validation set. 7 00:00:37,280 --> 00:00:41,840 So if we have a look, here is what it's going to look like when we tune hyperparameters by hand. 8 00:00:41,840 --> 00:00:46,220 So say we were starting with 100 patient records in the case of our heart disease problem. 9 00:00:46,220 --> 00:00:50,000 Now again, this may change depending on what problem you're working with. 10 00:00:50,000 --> 00:00:53,870 And I've just used the number 100 because it's an easy number to picture. 11 00:00:53,870 --> 00:00:57,750 So this is our starting data, and what we might do is split it. 12 00:00:57,800 --> 00:00:59,810 This is what I mean by three different sets. 13 00:00:59,810 --> 00:01:07,790 So we split: we might use 70 to 80 percent, so 70 patient records, as a training split, we might use 10 14 00:01:07,790 --> 00:01:13,810 to 15 percent as a validation split, and we might use 10 to 15 percent as a test split. 15 00:01:13,820 --> 00:01:18,100 Now of course you can adjust these numbers, these are just some common numbers you'll see used. 16 00:01:18,100 --> 00:01:24,260 And so naturally the model gets trained on these, as we've seen before, and usually, without the validation 17 00:01:24,260 --> 00:01:26,240 set, our model would get evaluated on these. 
18 00:01:26,240 --> 00:01:32,660 But now, because we have a validation set, this is where we're going to choose our model settings, a.k.a. 19 00:01:32,840 --> 00:01:37,420 the hyperparameters get tuned on the validation split. 20 00:01:37,730 --> 00:01:43,220 And then finally, as normal, the model gets evaluated on the test split. 21 00:01:43,250 --> 00:01:48,590 And now the analogy here is, if you remember right back at the start, the most important concept in machine 22 00:01:48,590 --> 00:01:50,250 learning is the three sets. 23 00:01:50,600 --> 00:01:51,920 So this was right back at the start, 24 00:01:51,920 --> 00:01:57,710 Machine Learning 101. The training set is analogous to, say you're at a university course and you were 25 00:01:57,710 --> 00:02:02,840 learning the course materials. The validation set is where you would test your knowledge a little bit 26 00:02:02,840 --> 00:02:08,060 and see what you need to adjust. So you would get the practice exam from your professor and you try it 27 00:02:08,060 --> 00:02:14,150 out and you go, oh wow, I'm doing terribly at questions 1, 2 and 4, maybe I need to adjust the way I approach 28 00:02:14,150 --> 00:02:16,610 this, and then you do the practice exam again. 29 00:02:16,610 --> 00:02:20,930 And once you've improved your skills on the practice exam you'd feel a little bit confident, and then 30 00:02:20,930 --> 00:02:27,040 you'd finally really evaluate yourself on the final exam, which is the test set. 31 00:02:27,080 --> 00:02:29,780 So how would we do this with code? 32 00:02:29,780 --> 00:02:34,180 Let's come back to our notebook, tuning hyperparameters by hand. 33 00:02:34,360 --> 00:02:36,520 Let's make three sets: 34 00:02:36,670 --> 00:02:42,830 training, validation and test. 
35 00:02:42,880 --> 00:02:49,480 So what we'll do is we'll just remind ourselves, using get_params, of our random 36 00:02:49,480 --> 00:02:56,310 forest classifier's baseline parameters. So these are all the baseline parameters, a.k.a. the settings 37 00:02:56,310 --> 00:03:01,680 on our model which we can adjust. And now, in fact, which ones are we going to adjust? 38 00:03:01,810 --> 00:03:07,030 Well, after reading the random forest documentation, and again, you can do this for any machine learning 39 00:03:07,030 --> 00:03:13,030 estimator or model with scikit-learn, we start to get an idea of how we could adjust each setting. 40 00:03:13,030 --> 00:03:19,270 And even with some of the models there'll be some notes on different hyperparameters scikit-learn suggests 41 00:03:19,270 --> 00:03:23,560 to change, that is, kind of like the ones you want to try changing first. 42 00:03:23,560 --> 00:03:26,450 And so after reading through that, after going through that documentation, 43 00:03:26,470 --> 00:03:31,480 in our case we're going to try and adjust the following. 44 00:03:32,380 --> 00:03:37,170 So we want max depth. Actually, we'll put these in backticks because we know that they're 45 00:03:37,450 --> 00:03:47,050 code: `max_depth`, `max_features`, `min_samples_leaf`. 46 00:03:47,370 --> 00:03:52,050 If none of these makes sense, remember the definition of all of these is in the documentation for any 47 00:03:52,050 --> 00:03:52,920 model. 48 00:03:52,950 --> 00:03:58,220 So if we come up here, `min_samples_leaf`: the minimum number of samples required to be at a leaf node, 49 00:03:58,230 --> 00:04:01,560 and you can read more in depth on what a leaf node is 50 00:04:01,560 --> 00:04:04,320 if you check out some resources on random forests in depth. 
51 00:04:04,320 --> 00:04:10,420 But for now we're just focusing on how we can adjust hyperparameters of a machine learning model. So, 52 00:04:10,500 --> 00:04:19,640 `min_samples_leaf`, and `min_samples_split`, and `n_estimators`. Wonderful. 53 00:04:19,840 --> 00:04:24,940 So these are the hyperparameters we're going to adjust, and now we'll use our same code as before, 54 00:04:25,600 --> 00:04:31,420 except this time we need to create a training, validation and test split, which just uses these splits 55 00:04:31,420 --> 00:04:31,630 here. 56 00:04:31,630 --> 00:04:36,430 So the training split we'll create with 70 percent of the data, the validation and test sets will each 57 00:04:36,430 --> 00:04:37,320 contain 15 percent. 58 00:04:37,360 --> 00:04:38,830 And we'll get some baseline results. 59 00:04:38,830 --> 00:04:43,510 So we've kind of already got them, but we'll do it again, get some baseline results, and then we'll see 60 00:04:43,510 --> 00:04:47,360 how we can tune the model's hyperparameters by hand. 61 00:04:47,580 --> 00:04:53,740 And since we're going to be evaluating a few models, I think it's important that we create an evaluation 62 00:04:53,740 --> 00:04:54,980 function. 63 00:04:55,150 --> 00:05:00,010 So this is what you might want to do, whatever model you're working on, is create functions 64 00:05:00,010 --> 00:05:03,010 if you know you want to do something more than once. 65 00:05:03,030 --> 00:05:05,240 So it saves us writing lots of code. 66 00:05:05,380 --> 00:05:11,470 But again, that's probably a bit different to what we've seen in the past, right? 67 00:05:11,470 --> 00:05:13,810 We know we like to write everything out by hand. 
68 00:05:13,850 --> 00:05:22,720 We'll just leave a little docstring: performs evaluation comparison on y_true 69 00:05:22,840 --> 00:05:29,290 labels versus y_pred labels. And now, as we saw in the evaluation section, the main thing evaluating a machine learning model 70 00:05:29,290 --> 00:05:33,580 does is it compares its predictions versus the true labels. 71 00:05:33,610 --> 00:05:39,850 So that's what this function is going to do. So because we're working with a classifier, we want accuracy: 72 00:05:40,390 --> 00:05:51,490 accuracy_score(y_true, y_preds). We also want precision. And now we might actually note here: on a 73 00:05:51,730 --> 00:05:57,610 classification model, because if we use this evaluate_preds on a regression model, we're putting in 74 00:05:57,880 --> 00:06:03,190 classification metrics, we're going to get errors, because a regression model predicts different things 75 00:06:03,190 --> 00:06:11,040 to what a classification model predicts. Precision, and then we want recall, of course equals recall_score. 76 00:06:11,050 --> 00:06:16,510 And now again, you could adjust this evaluate_preds function for what you need, but I'm just gonna include 77 00:06:16,990 --> 00:06:21,370 some of the most common metrics that we've covered for classification models, some of the things you 78 00:06:21,370 --> 00:06:27,460 want to pay attention to in basically all of your classification models. y_preds, and then we'll create 79 00:06:27,460 --> 00:06:34,060 a dictionary so it can return the metrics, so we can compare them with other predictions later. Round 80 00:06:34,150 --> 00:06:37,820 accuracy, we'll go to two decimal places. 81 00:06:37,820 --> 00:06:38,330 Yeah. 82 00:06:38,620 --> 00:06:49,660 Precision, round precision to two. Now, always remember, if this seems like a lot, 
83 00:06:50,230 --> 00:06:55,540 it's because, first of all, it kind of is a lot to take in in one hit, and the second thing is that I've 84 00:06:55,540 --> 00:06:57,940 had a fair bit of practice with this. 85 00:06:58,090 --> 00:07:01,660 So I inherently know what to use. 86 00:07:01,660 --> 00:07:06,550 Again, I'm always learning, but I've just had a bit of practice building these specific kinds of systems, 87 00:07:06,550 --> 00:07:09,390 and these patterns come up time and time again. 88 00:07:09,540 --> 00:07:10,480 Go print. 89 00:07:10,600 --> 00:07:21,220 We just want to give us a little bit of a printout of what's going on, a.k.a. accuracy times 100, and then we 90 00:07:21,220 --> 00:07:25,900 only want two decimal places, that will be enough. 91 00:07:25,900 --> 00:07:27,530 Then we'll go the same thing. 92 00:07:27,870 --> 00:07:34,500 For precision, this can be precision, we won't need to times it by 100. 93 00:07:34,500 --> 00:07:37,210 That can just stay at two decimal places. 94 00:07:37,260 --> 00:07:41,430 We need an f-string, and a wonderful print. 95 00:07:41,470 --> 00:07:48,400 Then we're going to go recall. It's a lot to type out to begin with, but because we'll be able to reuse 96 00:07:48,400 --> 00:07:48,580 it, 97 00:07:48,610 --> 00:07:51,180 it's gonna save us down the track. 98 00:07:51,410 --> 00:07:57,970 And then finally we want to print F1 score. 99 00:08:00,880 --> 00:08:04,690 We're making an f-string again. Finally, an excellent effort. 100 00:08:04,690 --> 00:08:05,260 There we go. 101 00:08:05,870 --> 00:08:07,030 Okay. 102 00:08:07,150 --> 00:08:11,220 And finally we want to return our metric dict. 103 00:08:11,410 --> 00:08:12,370 Now, what is happening here? 104 00:08:12,520 --> 00:08:18,040 Well, essentially this function just takes some true labels and some prediction labels from our classification 105 00:08:18,040 --> 00:08:20,090 models, we know this from the docstring here. 
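Put together, the evaluation function walked through above might look something like this. It's a minimal sketch: the metrics and rounding follow the transcript, but the exact variable names and print formatting are assumptions.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate_preds(y_true, y_preds):
    """
    Performs evaluation comparison on y_true labels vs. y_preds labels
    on a classification model.
    """
    accuracy = accuracy_score(y_true, y_preds)
    precision = precision_score(y_true, y_preds)
    recall = recall_score(y_true, y_preds)
    f1 = f1_score(y_true, y_preds)

    # Save the rounded metrics in a dictionary so we can compare runs later
    metric_dict = {"accuracy": round(accuracy, 2),
                   "precision": round(precision, 2),
                   "recall": round(recall, 2),
                   "f1": round(f1, 2)}

    # Print a readable summary as soon as the function runs
    print(f"Acc: {accuracy * 100:.2f}%")
    print(f"Precision: {precision:.2f}")
    print(f"Recall: {recall:.2f}")
    print(f"F1 score: {f1:.2f}")

    return metric_dict
```

Note the function only makes sense for classifiers: passing regression predictions in would trip up these classification metrics, as mentioned above.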
106 00:08:20,170 --> 00:08:26,680 It's gonna compute different evaluation metrics here, and then it's going to return a metric dictionary 107 00:08:26,770 --> 00:08:30,600 so we can save that for later, and then also print out some metrics. 108 00:08:30,610 --> 00:08:33,280 So we get a printout as soon as we run this function. 109 00:08:33,310 --> 00:08:36,560 So just an evaluation function for our classification models. 110 00:08:36,670 --> 00:08:41,670 Wonderful. And I'm gonna save the notebook here, as always. 111 00:08:41,700 --> 00:08:46,370 Now, how exactly would we create these train, validation and test splits? 112 00:08:46,370 --> 00:08:51,130 Well, our train test split function that we've seen before, remember this one, train_test_split? 113 00:08:51,620 --> 00:08:53,390 So this one only returns: 114 00:08:53,390 --> 00:08:59,210 split arrays or matrices into random train and test subsets. So this one only splits into train 115 00:08:59,210 --> 00:09:00,160 and test sets. 116 00:09:00,170 --> 00:09:05,650 So what we need to do is we need to manually split our data into train, validation and test sets. 117 00:09:05,810 --> 00:09:15,020 And so to do this, how can we do this? Because our data is in a DataFrame, heart_disease, 118 00:09:15,250 --> 00:09:20,400 we should just be able to do it using some good old math and indexing. 119 00:09:20,410 --> 00:09:26,860 Let's see. So from sklearn we'll import our model first, even though we've already instantiated 120 00:09:27,620 --> 00:09:28,990 a RandomForestClassifier. 121 00:09:32,910 --> 00:09:42,300 Let's set a NumPy random seed, and then we'll shuffle the data. 122 00:09:42,300 --> 00:09:43,720 Why are we shuffling the data here, 123 00:09:43,800 --> 00:09:50,670 if we're splitting into train, validation and test? Well, we want to mix it up, make sure it's not just 124 00:09:50,670 --> 00:09:51,490 the same, 125 00:09:51,510 --> 00:09:53,610 the records coming in the order that they come in. 
126 00:09:53,610 --> 00:09:58,350 If we're using slicing to create our train, validation and test splits, we want to make sure all of these 127 00:09:58,350 --> 00:09:59,340 records are jumbled up. 128 00:09:59,370 --> 00:10:00,420 So that's what we'll do. 129 00:10:00,420 --> 00:10:08,470 We can shuffle it using pandas' sample function, with frac equals 1, for 100 percent of the data. 130 00:10:08,520 --> 00:10:11,300 This is just going to take heart disease, sample it, 131 00:10:11,370 --> 00:10:17,850 this is randomly, and then reassign the heart disease DataFrame to its normal variable. 132 00:10:17,880 --> 00:10:24,390 So just take this, sample a bunch of rows at random, and then reassign it to this variable, a.k.a. shuffling 133 00:10:24,390 --> 00:10:26,640 the data. 134 00:10:26,640 --> 00:10:30,930 Now the data has been shuffled, we'll split it into X and y. 135 00:10:30,930 --> 00:10:37,900 Actually, you might save that to heart_disease_shuffled, that's a better idea. X equals, 136 00:10:37,930 --> 00:10:45,130 this time we need heart_disease_shuffled, and then we're going to drop the target column here. 137 00:10:45,250 --> 00:10:49,750 This video is dragging on a little bit, but this is an important concept, right? 138 00:10:50,410 --> 00:10:58,450 We're going to see here it'll all be worth it, because we'll be able to improve our models using hyperparameter 139 00:10:58,450 --> 00:11:04,410 tuning. Split the data into train, validation and test sets. 140 00:11:04,420 --> 00:11:05,350 Wonderful. 141 00:11:05,380 --> 00:11:06,440 So the train split. 142 00:11:06,460 --> 00:11:07,420 How are we going to do this? 143 00:11:07,420 --> 00:11:11,380 Well, we need to create a number that we can use for slicing. 144 00:11:11,380 --> 00:11:15,190 So we want 70 percent of the length of our data. 145 00:11:17,050 --> 00:11:18,630 So you see what's happening here? 146 00:11:18,650 --> 00:11:19,360 Train split. 147 00:11:19,360 --> 00:11:20,320 We need a number. 
148 00:11:20,350 --> 00:11:24,430 We need 70 percent, so zero point seven times the length of heart_disease_shuffled. 149 00:11:25,180 --> 00:11:34,940 And this is going to be 70 percent of the data. And then we need a valid split, which is going to be, do you 150 00:11:34,940 --> 00:11:36,150 remember? 151 00:11:36,140 --> 00:11:39,730 Remember what the split is? 15 percent, that's what we've agreed on. 152 00:11:39,740 --> 00:11:40,310 Wonderful. 153 00:11:40,300 --> 00:11:45,260 So we can do that in a similar way to the train split, because we want an index. 154 00:11:45,260 --> 00:11:46,940 We want the next 15 percent. 155 00:11:46,960 --> 00:11:58,330 So train_split plus zero point one five times the length of our heart_disease_shuffled DataFrame. 156 00:11:58,570 --> 00:12:01,750 This is 15 percent of the data. 157 00:12:01,750 --> 00:12:02,800 Wonderful. 158 00:12:02,800 --> 00:12:16,220 And so X_train, y_train is going to be the X data up to the train split and the y data up to the train 159 00:12:16,220 --> 00:12:17,270 split. 160 00:12:17,270 --> 00:12:17,990 Wonderful. 161 00:12:18,020 --> 00:12:30,000 And then X_valid, y_valid, sorry, is going to be the X data from the train split to the valid split, and 162 00:12:30,000 --> 00:12:40,770 the same thing for y: it's going to be the y data from the train split to the valid split. 163 00:12:44,120 --> 00:12:56,540 And then we're going to have X_test and y_test: X_test is going to be the X data from the valid split 164 00:12:56,630 --> 00:12:58,700 onwards, 165 00:12:58,700 --> 00:13:10,640 so the rest of the data, and then y from the valid split onwards. Whew, that was a lot. So far, let's just 166 00:13:10,640 --> 00:13:11,690 see what's going on, right? 167 00:13:11,690 --> 00:13:12,200 Let's. 168 00:13:12,260 --> 00:13:12,650 Let's go. 
169 00:13:12,650 --> 00:13:24,620 len, print it out: len of X_train, and then we want len of X_valid, and then we want len of X_test. 170 00:13:29,300 --> 00:13:31,480 So the training set has 70 percent of the data. 171 00:13:31,490 --> 00:13:31,850 Okay. 172 00:13:31,870 --> 00:13:33,440 212 rows. 173 00:13:33,650 --> 00:13:35,740 The validation set has 46 rows, 174 00:13:35,750 --> 00:13:40,460 15 percent of the data, and the test set has 46 rows. 175 00:13:40,670 --> 00:13:43,070 So that's 15 percent of the data. 176 00:13:43,070 --> 00:13:47,480 There's different amounts here, even though both are 15 percent, because we've used round and we have an odd number 177 00:13:47,480 --> 00:13:50,270 of samples in the heart_disease_shuffled data set. 178 00:13:50,310 --> 00:13:55,390 So there's gonna be a spillover of one somewhere, and that is perfectly fine. 179 00:13:55,430 --> 00:13:55,710 Okay. 180 00:13:55,720 --> 00:13:57,000 Now we have our splits. 181 00:13:57,080 --> 00:14:01,660 We can do clf, or instantiate a RandomForestClassifier. 182 00:14:01,660 --> 00:14:08,080 And now, because we've passed nothing here, this is going to instantiate it with the baseline parameters. 183 00:14:08,120 --> 00:14:13,890 So if we do this, clf.get_params. 184 00:14:14,470 --> 00:14:20,530 That's not an attribute, that's a function. So without passing anything to here, our RandomForestClassifier 185 00:14:20,820 --> 00:14:27,210 instantiates the random forest classifier with the baseline parameters, a.k.a. this. 186 00:14:27,280 --> 00:14:30,000 And so that's what we want, because we want to make some baseline predictions. 187 00:14:30,010 --> 00:14:35,290 So we go clf.fit, we're going to fit it on the training data as usual, and then we're going to 188 00:14:35,290 --> 00:14:36,910 make predictions. 189 00:14:37,030 --> 00:14:40,960 So y_preds equals clf.predict. 
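The shuffling and slicing steps above can be sketched as a runnable example. Note the assumptions: a small synthetic DataFrame stands in for the real heart disease CSV (which isn't shown here), so the row counts are 70/15/15 rather than the 212/46/46 from the video, and `.iloc` is used to make the positional slicing explicit.

```python
import numpy as np
import pandas as pd

np.random.seed(42)

# Synthetic stand-in for the heart disease DataFrame: 100 rows plus a "target" column
heart_disease = pd.DataFrame({"age": np.random.randint(29, 78, size=100),
                              "chol": np.random.randint(126, 565, size=100),
                              "target": np.random.randint(0, 2, size=100)})

# Shuffle the data: frac=1 samples 100% of the rows in a random order
heart_disease_shuffled = heart_disease.sample(frac=1)

# Split into X (features) and y (labels)
X = heart_disease_shuffled.drop("target", axis=1)
y = heart_disease_shuffled["target"]

# Index cut-offs: 70% train, next 15% validation, remainder test
train_split = round(0.7 * len(heart_disease_shuffled))
valid_split = round(train_split + 0.15 * len(heart_disease_shuffled))

# Positional slicing (iloc), so the shuffled integer index doesn't matter
X_train, y_train = X.iloc[:train_split], y.iloc[:train_split]
X_valid, y_valid = X.iloc[train_split:valid_split], y.iloc[train_split:valid_split]
X_test, y_test = X.iloc[valid_split:], y.iloc[valid_split:]

print(len(X_train), len(X_valid), len(X_test))  # 70 15 15
```

With an odd row count like the 303 in the real data, `round` causes the one-sample spillover mentioned above; with 100 rows the split lands exactly on 70/15/15.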
190 00:14:40,960 --> 00:14:48,280 Now we're gonna predict on the validation data, because if we come back here, we want to tune our model on 191 00:14:48,280 --> 00:14:49,380 the validation split. 192 00:14:49,390 --> 00:14:57,430 So that's what we're basing our metric on. We'll first create a baseline metric, which is by running 193 00:14:57,430 --> 00:14:59,970 our evaluation function on the validation split. 194 00:15:00,520 --> 00:15:06,970 Then we'll adjust the hyperparameters and try our model again on the validation split, and see how they 195 00:15:06,970 --> 00:15:10,590 compare. So let's go here and make predictions. 196 00:15:10,590 --> 00:15:21,260 We'll call these baseline predictions, and then we'll go evaluate the classifier on the validation set. 197 00:15:21,360 --> 00:15:26,610 So we go baseline metrics, and now this is where our evaluation function will come in handy. baseline_metrics equals 198 00:15:26,640 --> 00:15:36,920 evaluate_preds(y_valid, y_preds). So see how we pass it the y validation set and y_preds, and then we 199 00:15:36,920 --> 00:15:47,650 want to go baseline_metrics. Well, what have we got? Local variable accuracy referenced before assignment. This is what 200 00:15:47,650 --> 00:15:57,030 we've missed out on: this needs to be accuracy_score. Try this again. It's done terribly. Now, what is going 201 00:15:57,030 --> 00:15:57,630 on here? 202 00:15:57,660 --> 00:15:59,620 Accuracy, 0.22? 203 00:15:59,820 --> 00:16:01,230 Have we done this correctly? 204 00:16:01,230 --> 00:16:08,550 Oh, there we go, that's why: we've used our original X and y, and we need to use heart_disease_shuffled, 205 00:16:08,550 --> 00:16:14,160 because otherwise our model was predicting the original labels rather than our shuffled labels. 206 00:16:14,340 --> 00:16:18,620 So that's what we missed out on there. There we go. 
207 00:16:19,550 --> 00:16:24,380 So we're getting this warning here, that the default value of n_estimators will change from 10 in version 0 208 00:16:24,380 --> 00:16:27,290 point 20 to 100 in 0.22. 209 00:16:27,320 --> 00:16:33,050 So these are our baseline metrics. Now, what we've done is we've created a train, validation and test split, 210 00:16:33,490 --> 00:16:40,650 we've instantiated a model just as we had done before, we fit it on the training data here, and 211 00:16:40,650 --> 00:16:46,350 then we've evaluated the baseline parameters, baseline hyperparameters, on the validation set, and got 212 00:16:46,350 --> 00:16:47,800 these metrics. 213 00:16:47,970 --> 00:16:53,550 So if we were to try and improve our results, if we were to try and adjust our model's hyperparameters, 214 00:16:55,140 --> 00:17:02,250 these ones here on our random forest, by hand, because that's what this section is, tuning hyperparameters 215 00:17:02,280 --> 00:17:04,690 by hand, how would we do so? 
216 00:17:04,920 --> 00:17:10,320 Let's take this warning as an example of changing our hyperparameters. Let's change, and again, if you're 217 00:17:10,320 --> 00:17:16,350 using scikit-learn 0.22 or above, you won't get this warning, but since we are, we're getting this warning, 218 00:17:16,350 --> 00:17:23,610 we're going to try and adjust our hyperparameters. We'll change n_estimators, which is this one here. 219 00:17:24,860 --> 00:17:30,470 We'll change it from the baseline of 10 to 100, and see if we get a different score on the validation 220 00:17:30,470 --> 00:17:40,400 set. Create a random seed, np.random.seed(42). What we're going to do is create a second classifier with 221 00:17:41,060 --> 00:17:49,940 different hyperparameters, because remember, instantiating our RandomForestClassifier up here, passing 222 00:17:49,940 --> 00:17:55,910 it nothing, instantiates the classifier with the baseline hyperparameters it comes with right out of 223 00:17:55,910 --> 00:18:01,430 the box. So, like using your oven's predefined settings when it comes out of the box to cook your favorite 224 00:18:01,430 --> 00:18:07,730 dish, these scores aren't really helping us out here, we want a better model. We'll create clf_2, and 225 00:18:07,730 --> 00:18:14,420 we're going to adjust the hyperparameters, like you would adjust your oven to try and improve that favorite 226 00:18:14,420 --> 00:18:19,130 delicious roast chicken dish that you're making. Maybe you're preparing for a big occasion, 227 00:18:19,550 --> 00:18:22,060 your friends are coming over, and you've promised them a great dish. 228 00:18:22,070 --> 00:18:23,550 So you're trying to perfect it. 
229 00:18:23,690 --> 00:18:29,690 That's what we're doing with our machine learning model. So again, same thing, same data, fitted on the 230 00:18:29,690 --> 00:18:38,660 training data, but this time we've got clf_2, which is using n_estimators as 100 rather than 10. And 231 00:18:38,660 --> 00:18:43,580 now again, we could try different settings for all of these hyperparameters here, the ones we're going to try 232 00:18:43,580 --> 00:18:50,450 and adjust, but we might start with just one, to see an example of it. And then we'll go and make predictions. 233 00:18:54,160 --> 00:18:56,320 So y_preds_2, 234 00:18:56,440 --> 00:19:02,640 because we're using clf_2: clf_2.predict(X_valid). 235 00:19:02,650 --> 00:19:04,690 We're doing the same thing as up here. 236 00:19:04,810 --> 00:19:12,440 The baseline predictions were on the validation set, and now we make predictions with different hyperparameters 237 00:19:14,480 --> 00:19:15,500 on the same data. 238 00:19:15,500 --> 00:19:17,690 So different models, same data. 239 00:19:17,720 --> 00:19:28,720 Now we're going to evaluate the second classifier: clf_2_metrics equals, we're using our evaluation 240 00:19:28,720 --> 00:19:30,140 function again already, 241 00:19:30,310 --> 00:19:36,430 the one we created before, and we're saving a fair few lines of code by just calling it like that. 242 00:19:36,430 --> 00:19:36,760 All right. 243 00:19:37,140 --> 00:19:40,820 Now let's check it out. 244 00:19:41,340 --> 00:19:48,310 Okay, so now we can compare to using the baseline hyperparameters that our model came with. 245 00:19:48,490 --> 00:19:55,050 What can we say that's different here? What we've changed is n_estimators equals 100. 246 00:19:55,050 --> 00:19:57,720 So we've adjusted one dial on our model. 247 00:19:58,170 --> 00:20:04,650 So, like adjusting one dial on your oven, we can see a slight boost in accuracy on the same data. 
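The baseline-versus-clf_2 comparison might be sketched like this. Assumptions: synthetic data from make_classification stands in for the heart disease set, plain accuracy_score replaces the full evaluate_preds function, and random_state=42 is set on the models so the comparison is repeatable. Whether the accuracy actually rises depends on the data, so no improvement is promised here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

np.random.seed(42)

# Synthetic stand-in for the heart disease data (the real notebook uses the CSV)
X, y = make_classification(n_samples=300, n_features=13, random_state=42)

# Manual 70/15/15 cut-offs (make_classification data is already shuffled)
train_split = round(0.7 * len(X))
valid_split = round(train_split + 0.15 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_valid, y_valid = X[train_split:valid_split], y[train_split:valid_split]

# Baseline model: the old scikit-learn default of n_estimators=10
clf = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_train, y_train)
baseline_acc = accuracy_score(y_valid, clf.predict(X_valid))

# Second model: one dial turned, n_estimators=100
clf_2 = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
clf_2_acc = accuracy_score(y_valid, clf_2.predict(X_valid))

print(f"Baseline (n_estimators=10): {baseline_acc:.2f}")
print(f"clf_2 (n_estimators=100): {clf_2_acc:.2f}")
```

Same data, same split, different models: the only thing being compared is the one hyperparameter that changed.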
248 00:20:04,650 --> 00:20:13,910 So clf_2 has a slightly higher accuracy, it's got a higher precision but a lower recall, 249 00:20:14,240 --> 00:20:16,260 and the same F1 score. 250 00:20:16,760 --> 00:20:17,610 Mm hmm. 251 00:20:17,630 --> 00:20:18,530 Okay. 252 00:20:18,590 --> 00:20:23,120 That's giving us a little inkling that maybe if we kept going with different hyperparameters, if we kept 253 00:20:23,120 --> 00:20:31,100 trying to adjust them by hand, we would, hopefully, keep improving these metrics. 254 00:20:31,100 --> 00:20:34,270 So what do you think we would do next if we got here? 255 00:20:34,490 --> 00:20:36,710 Maybe we'd try to change the max depth. 256 00:20:36,740 --> 00:20:38,770 So what's the default max depth? 257 00:20:38,810 --> 00:20:40,670 We find it in here: max_depth is None. 258 00:20:40,670 --> 00:20:46,010 So maybe we'd look at the documentation and go through the different values that max_depth can take. 259 00:20:46,010 --> 00:20:48,330 So it can take integers or None, 260 00:20:48,350 --> 00:20:54,290 default is None, the maximum depth of the tree. And doing our research, reading documentation, we see some 261 00:20:54,290 --> 00:20:56,230 different values for max_depth. 262 00:20:56,390 --> 00:21:04,970 So then we'd go back to our model here, and maybe we create clf_3 equals RandomForestClassifier. 263 00:21:05,120 --> 00:21:13,760 This time we'll keep n_estimators the same, equals 100, and then we'll change max_depth from None 264 00:21:13,760 --> 00:21:17,350 to equal 10. 265 00:21:17,370 --> 00:21:18,410 Now, I haven't made this number up. 
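Following the same pattern, the third classifier described above might look like this. The max_depth value of 10 comes from the transcript; whether it actually helps depends on the data, so this just shows the settings being dialed in.

```python
from sklearn.ensemble import RandomForestClassifier

# Third classifier: keep n_estimators at 100, change max_depth from None to 10
clf_3 = RandomForestClassifier(n_estimators=100, max_depth=10)

# get_params() confirms the dials we've turned away from the defaults
params = clf_3.get_params()
print(params["n_estimators"], params["max_depth"])  # 100 10
```

From here it would be the same fit/predict/evaluate loop on the validation split as with clf and clf_2.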
266 00:21:18,410 --> 00:21:23,160 I've done some research, read the documentation, and figured out different ideas for max_depth. And then 267 00:21:23,160 --> 00:21:27,640 we might do the same thing as what we've done here: evaluate our third classifier. As you might have guessed 268 00:21:27,640 --> 00:21:31,980 here, as you might have thought, if we're going through and adjusting all of these by hand, adjusting all 269 00:21:31,980 --> 00:21:33,660 the numbers by hand, 270 00:21:33,660 --> 00:21:35,470 that's going to take a fair bit of work. 271 00:21:35,550 --> 00:21:35,910 Right? 272 00:21:35,910 --> 00:21:40,590 So just like perfecting your dish in real life, like if you're making your favorite dish and everyone's 273 00:21:40,590 --> 00:21:43,610 coming over for dinner, you're trying to adjust the settings on your oven, 274 00:21:43,800 --> 00:21:45,630 you may have to go through a bit of trial and error, right? 275 00:21:45,630 --> 00:21:50,160 You might cook it once and then it's not that good, and then by the tenth time you're starting to get really 276 00:21:50,160 --> 00:21:52,100 good, but that could take a lot longer. 277 00:21:52,110 --> 00:21:57,060 With what we're trying to do here, writing code, if we have all these different settings up here, trying 278 00:21:57,060 --> 00:22:03,630 to find the best settings could take far longer than what we have, and the rule 279 00:22:03,630 --> 00:22:05,280 in code is don't repeat yourself. 280 00:22:05,280 --> 00:22:07,270 So, as you might have guessed, 281 00:22:07,420 --> 00:22:11,870 scikit-learn has a method inbuilt that can do this for us, 282 00:22:11,880 --> 00:22:13,940 try a bunch of different settings for us. 283 00:22:14,040 --> 00:22:18,840 And that's RandomizedSearchCV, which we're going to have a look at in the next video. 284 00:22:18,840 --> 00:22:20,560 This one's already getting far too long. 
285 00:22:20,640 --> 00:22:26,910 But the main takeaway here is that when we're choosing a model's hyperparameters, you can think of the 286 00:22:26,910 --> 00:22:32,120 training set as like the course materials, where you're learning the foundations. 287 00:22:32,160 --> 00:22:37,110 That's the training set. And then on the practice exam, you're refining what you know. 288 00:22:37,110 --> 00:22:40,450 So you've already learned all the baseline patterns, like our model. 289 00:22:40,560 --> 00:22:45,150 And then with the validation set we're adjusting the settings, so we're refining what the model knows, 290 00:22:45,150 --> 00:22:50,760 how it learns different things, before we test ourselves on the final exam, 291 00:22:50,880 --> 00:22:56,610 a.k.a. before we evaluate our model on the test set. 292 00:22:56,700 --> 00:23:00,580 Now, don't worry if you're feeling a little bit overwhelmed. Go back through the code that we've written, 293 00:23:00,640 --> 00:23:01,840 check it out. 294 00:23:01,840 --> 00:23:07,280 You could try adjusting a few different hyperparameters yourself, but if not, not to worry, we're going 295 00:23:07,280 --> 00:23:13,810 to see in the next video how we can use a module in scikit-learn to try different hyperparameters for 296 00:23:13,810 --> 00:23:14,020 us.