1 00:00:00,940 --> 00:00:06,250 So let's start diving into different ways we can improve our models using hyperparameter tuning. 2 00:00:06,310 --> 00:00:08,820 The first one here is by hand. 3 00:00:08,980 --> 00:00:22,220 So let's make a little heading, 5.1 Tuning hyperparameters by hand. And so far we've talked about dealing 4 00:00:22,220 --> 00:00:27,350 with training and test data sets. A model gets trained on the training set, it finds patterns, and then 5 00:00:27,350 --> 00:00:34,100 it gets evaluated on the test set, so it uses those patterns. But hyperparameter tuning introduces a 6 00:00:34,100 --> 00:00:37,260 third set, a validation set. 7 00:00:37,280 --> 00:00:41,840 So if we have a look, here is what it's going to look like when we tune hyperparameters by hand. 8 00:00:41,840 --> 00:00:46,220 So say we were starting with 100 patient records in the case of our heart disease problem. 9 00:00:46,220 --> 00:00:50,000 Now again, this may change depending on what problem you're working with. 10 00:00:50,000 --> 00:00:53,870 And I've just used the number 100 because it's an easy number to picture. 11 00:00:53,870 --> 00:00:57,750 So this is our starting data, and what we might do is split it. 12 00:00:57,800 --> 00:00:59,810 This is what I mean by three different sets. 13 00:00:59,810 --> 00:01:07,790 So we split: we might use 70 to 80 percent, so 70 patient records, as a training split, we might use 10 14 00:01:07,790 --> 00:01:13,810 to 15 percent as a validation split, and we might use 10 to 15 percent as a test split. 15 00:01:13,820 --> 00:01:18,100 Now of course you can adjust these numbers, these are just some common numbers you'll see used. 16 00:01:18,100 --> 00:01:24,260 And so naturally the model gets trained on these, as we've seen before, and usually, without the validation 17 00:01:24,260 --> 00:01:26,240 set, our model would get evaluated on these. 
18 00:01:26,240 --> 00:01:32,660 But now, because we have a validation set, this is where we're going to choose our model settings, a.k.a. 19 00:01:32,840 --> 00:01:37,420 the hyperparameters get tuned on the validation split. 20 00:01:37,730 --> 00:01:43,220 And then finally, as normal, the model gets evaluated on the test split. 21 00:01:43,250 --> 00:01:48,590 And now the analogy here is, if you remember right back at the start, the most important concept in machine 22 00:01:48,590 --> 00:01:50,250 learning is the three sets. 23 00:01:50,600 --> 00:01:51,920 So this was right back at the start, 24 00:01:51,920 --> 00:01:57,710 Machine Learning 101. The training set is analogous to, say you're at a university course and you were 25 00:01:57,710 --> 00:02:02,840 learning the course materials. The validation set is where you would test your knowledge a little bit 26 00:02:02,840 --> 00:02:08,060 and see what you need to adjust. So you would get the practice exam from your professor and you try it 27 00:02:08,060 --> 00:02:14,150 out and you go, oh wow, I'm doing terribly at questions 1, 2 and 4, maybe I need to adjust the way I approach 28 00:02:14,150 --> 00:02:16,610 this, and then you do the practice exam again. 29 00:02:16,610 --> 00:02:20,930 And once you've improved your skills on the practice exam you'd feel a little bit confident, and then 30 00:02:20,930 --> 00:02:27,040 you'd finally really evaluate yourself on the final exam, which is the test set. 31 00:02:27,080 --> 00:02:29,780 So how would we do this with code? 32 00:02:29,780 --> 00:02:34,180 Let's come back to our notebook, tuning hyperparameters by hand. 33 00:02:34,360 --> 00:02:36,520 Let's make three sets: 34 00:02:36,670 --> 00:02:42,830 training, validation and test. 
35 00:02:42,880 --> 00:02:49,480 So what we'll do is we'll just remind ourselves, using get_params, of our random 36 00:02:49,480 --> 00:02:56,310 forest classifier's baseline parameters. So these are all the baseline parameters, a.k.a. the settings 37 00:02:56,310 --> 00:03:01,680 on our model which we can adjust. And now, in fact, which ones are we going to adjust? 38 00:03:01,810 --> 00:03:07,030 Well, after reading the random forest documentation, and again, you can do this for any machine learning 39 00:03:07,030 --> 00:03:13,030 estimator or model with scikit-learn, we start to get an idea of how we could adjust each setting. 40 00:03:13,030 --> 00:03:19,270 And even with some of the models there'll be some notes on different hyperparameters scikit-learn suggests 41 00:03:19,270 --> 00:03:23,560 to change, that is, kind of like the ones you want to try changing first. 42 00:03:23,560 --> 00:03:26,450 And so after reading through that, after going through that documentation, 43 00:03:26,470 --> 00:03:31,480 in our case we're going to try and adjust the following. 44 00:03:32,380 --> 00:03:37,170 So we want max depth. Actually, we'll put these in backticks because we know that they're 45 00:03:37,450 --> 00:03:47,050 code: `max_depth`, `max_features`, `min_samples_leaf`. 46 00:03:47,370 --> 00:03:52,050 If none of these makes sense, remember the definition of all of these is in the documentation for any 47 00:03:52,050 --> 00:03:52,920 model. 48 00:03:52,950 --> 00:03:58,220 So if we come up here, `min_samples_leaf`: the minimum number of samples required to be at a leaf node, 49 00:03:58,230 --> 00:04:01,560 and you can read more in depth on what a leaf node is 50 00:04:01,560 --> 00:04:04,320 if you check out some resources on random forests in depth. 
51 00:04:04,320 --> 00:04:10,420 But for now we're just focusing on how we can adjust hyperparameters of a machine learning model. So, 52 00:04:10,500 --> 00:04:19,640 `min_samples_leaf`, and `min_samples_split`, and `n_estimators`. Wonderful. 53 00:04:19,840 --> 00:04:24,940 So these are the hyperparameters we're going to adjust, and now we'll use our same code as before, 54 00:04:25,600 --> 00:04:31,420 except this time we need to create a training, validation and test split, which just uses these splits 55 00:04:31,420 --> 00:04:31,630 here. 56 00:04:31,630 --> 00:04:36,430 So the training split we'll create with 70 percent of the data, the validation and test sets will each 57 00:04:36,430 --> 00:04:37,320 contain 15 percent. 58 00:04:37,360 --> 00:04:38,830 And we'll get some baseline results. 59 00:04:38,830 --> 00:04:43,510 So we've kind of already got them, but we'll do it again, get some baseline results, and then we'll see 60 00:04:43,510 --> 00:04:47,360 how we can tune the model's hyperparameters by hand. 61 00:04:47,580 --> 00:04:53,740 And since we're going to be evaluating a few models, I think it's important that we create an evaluation 62 00:04:53,740 --> 00:04:54,980 function. 63 00:04:55,150 --> 00:05:00,010 So this is what you might want to do, whatever model you're working on, is create functions 64 00:05:00,010 --> 00:05:03,010 if you know you want to do something more than once. 65 00:05:03,030 --> 00:05:05,240 So it saves us writing lots of code. 66 00:05:05,380 --> 00:05:11,470 But again, that's probably a bit different to what we've seen in the past, right? 67 00:05:11,470 --> 00:05:13,810 We know we like to write everything out by hand. 
68 00:05:13,850 --> 00:05:22,720 We'll just leave a little docstring: performs evaluation comparison on y_true 69 00:05:22,840 --> 00:05:29,290 labels versus y_pred labels. And now, as we saw in the evaluation section, the main thing evaluating a machine learning model 70 00:05:29,290 --> 00:05:33,580 does is it compares its predictions versus the true labels. 71 00:05:33,610 --> 00:05:39,850 So that's what this function is going to do. So because we're working with a classifier, we want accuracy: 72 00:05:40,390 --> 00:05:51,490 accuracy_score(y_true, y_preds). We also want precision. And now we might actually note here: on a 73 00:05:51,730 --> 00:05:57,610 classification model, because if we use this evaluate_preds on a regression model, we're putting in 74 00:05:57,880 --> 00:06:03,190 classification metrics, we're going to get errors, because a regression model predicts different things 75 00:06:03,190 --> 00:06:11,040 to what a classification model predicts. Precision, and then we want recall, of course equals recall_score. 76 00:06:11,050 --> 00:06:16,510 And now again, you could adjust this evaluate_preds function for what you need, but I'm just gonna include 77 00:06:16,990 --> 00:06:21,370 some of the most common metrics that we've covered for classification models, some of the things you 78 00:06:21,370 --> 00:06:27,460 want to pay attention to in basically all of your classification models. y_preds, and then we'll create 79 00:06:27,460 --> 00:06:34,060 a dictionary so it can return the metrics, so we can compare them with other predictions later. Round 80 00:06:34,150 --> 00:06:37,820 accuracy, we'll go to two decimal places. 81 00:06:37,820 --> 00:06:38,330 Yeah. 82 00:06:38,620 --> 00:06:49,660 Precision, round precision to two. Now, always remember, if this seems like a lot, 
83 00:06:50,230 --> 00:06:55,540 it's because, first of all, it kind of is a lot to take in in one hit, and the second thing is that I've 84 00:06:55,540 --> 00:06:57,940 had a fair bit of practice with this. 85 00:06:58,090 --> 00:07:01,660 So I inherently know what to use. 86 00:07:01,660 --> 00:07:06,550 Again, I'm always learning, but I've just had a bit of practice building these specific kinds of systems, 87 00:07:06,550 --> 00:07:09,390 and these patterns come up time and time again. 88 00:07:09,540 --> 00:07:10,480 Go print. 89 00:07:10,600 --> 00:07:21,220 We just want to give us a little bit of a printout of what's going on, a.k.a. accuracy times 100, and then we 90 00:07:21,220 --> 00:07:25,900 only want two decimal places, that will be enough. 91 00:07:25,900 --> 00:07:27,530 Then we'll go the same thing. 92 00:07:27,870 --> 00:07:34,500 For precision, this can be precision, we won't need to times it by 100. 93 00:07:34,500 --> 00:07:37,210 That can just stay at two decimal places. 94 00:07:37,260 --> 00:07:41,430 We need an f-string, and a wonderful print. 95 00:07:41,470 --> 00:07:48,400 Then we're going to go recall. It's a lot to type out to begin with, but because we'll be able to reuse 96 00:07:48,400 --> 00:07:48,580 it, 97 00:07:48,610 --> 00:07:51,180 it's gonna save us down the track. 98 00:07:51,410 --> 00:07:57,970 And then finally we want to print F1 score. 99 00:08:00,880 --> 00:08:04,690 We're making an f-string again. Finally, an excellent effort. 100 00:08:04,690 --> 00:08:05,260 There we go. 101 00:08:05,870 --> 00:08:07,030 Okay. 102 00:08:07,150 --> 00:08:11,220 And finally we want to return our metric dict. 103 00:08:11,410 --> 00:08:12,370 Now, what is happening here? 104 00:08:12,520 --> 00:08:18,040 Well, essentially this function just takes some true labels and some prediction labels from our classification 105 00:08:18,040 --> 00:08:20,090 models, we know this from the docstring here. 
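Put together, the evaluation function walked through above might look something like this. It's a minimal sketch: the metrics and rounding follow the transcript, but the exact variable names and print formatting are assumptions.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate_preds(y_true, y_preds):
    """
    Performs evaluation comparison on y_true labels vs. y_preds labels
    on a classification model.
    """
    accuracy = accuracy_score(y_true, y_preds)
    precision = precision_score(y_true, y_preds)
    recall = recall_score(y_true, y_preds)
    f1 = f1_score(y_true, y_preds)

    # Save the rounded metrics in a dictionary so we can compare runs later
    metric_dict = {"accuracy": round(accuracy, 2),
                   "precision": round(precision, 2),
                   "recall": round(recall, 2),
                   "f1": round(f1, 2)}

    # Print a readable summary as soon as the function runs
    print(f"Acc: {accuracy * 100:.2f}%")
    print(f"Precision: {precision:.2f}")
    print(f"Recall: {recall:.2f}")
    print(f"F1 score: {f1:.2f}")

    return metric_dict
```

Note the function only makes sense for classifiers: passing regression predictions in would trip up these classification metrics, as mentioned above.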
106 00:08:20,170 --> 00:08:26,680 It's gonna compute different evaluation metrics here, and then it's going to return a metric dictionary 107 00:08:26,770 --> 00:08:30,600 so we can save that for later, and then also print out some metrics. 108 00:08:30,610 --> 00:08:33,280 So we get a printout as soon as we run this function. 109 00:08:33,310 --> 00:08:36,560 So just an evaluation function for our classification models. 110 00:08:36,670 --> 00:08:41,670 Wonderful. And I'm gonna save the notebook here, as always. 111 00:08:41,700 --> 00:08:46,370 Now, how exactly would we create these train, validation and test splits? 112 00:08:46,370 --> 00:08:51,130 Well, our train test split function that we've seen before, remember this one, train_test_split? 113 00:08:51,620 --> 00:08:53,390 So this one only returns: 114 00:08:53,390 --> 00:08:59,210 split arrays or matrices into random train and test subsets. So this one only splits into train 115 00:08:59,210 --> 00:09:00,160 and test sets. 116 00:09:00,170 --> 00:09:05,650 So what we need to do is we need to manually split our data into train, validation and test sets. 117 00:09:05,810 --> 00:09:15,020 And so to do this, how can we do this? Because our data is in a DataFrame, heart_disease, 118 00:09:15,250 --> 00:09:20,400 we should just be able to do it using some good old math and indexing. 119 00:09:20,410 --> 00:09:26,860 Let's see. So from sklearn we'll import our model first, even though we've already instantiated 120 00:09:27,620 --> 00:09:28,990 a RandomForestClassifier. 121 00:09:32,910 --> 00:09:42,300 Let's set a NumPy random seed, and then we'll shuffle the data. 122 00:09:42,300 --> 00:09:43,720 Why are we shuffling the data here, 123 00:09:43,800 --> 00:09:50,670 if we're splitting into train, validation and test? Well, we want to mix it up, make sure it's not just 124 00:09:50,670 --> 00:09:51,490 the same, 125 00:09:51,510 --> 00:09:53,610 the records coming in the order that they come in. 
126 00:09:53,610 --> 00:09:58,350 If we're using slicing to create our train, validation and test splits, we want to make sure all of these 127 00:09:58,350 --> 00:09:59,340 records are jumbled up. 128 00:09:59,370 --> 00:10:00,420 So that's what we'll do. 129 00:10:00,420 --> 00:10:08,470 We can shuffle it using pandas' sample function, with frac equals 1, for 100 percent of the data. 130 00:10:08,520 --> 00:10:11,300 This is just going to take heart disease, sample it, 131 00:10:11,370 --> 00:10:17,850 this is randomly, and then reassign the heart disease DataFrame to its normal variable. 132 00:10:17,880 --> 00:10:24,390 So just take this, sample a bunch of rows at random, and then reassign it to this variable, a.k.a. shuffling 133 00:10:24,390 --> 00:10:26,640 the data. 134 00:10:26,640 --> 00:10:30,930 Now the data has been shuffled, we'll split it into X and y. 135 00:10:30,930 --> 00:10:37,900 Actually, you might save that to heart_disease_shuffled, that's a better idea. X equals, 136 00:10:37,930 --> 00:10:45,130 this time we need heart_disease_shuffled, and then we're going to drop the target column here. 137 00:10:45,250 --> 00:10:49,750 This video is dragging on a little bit, but this is an important concept, right? 138 00:10:50,410 --> 00:10:58,450 We're going to see here it'll all be worth it, because we'll be able to improve our models using hyperparameter 139 00:10:58,450 --> 00:11:04,410 tuning. Split the data into train, validation and test sets. 140 00:11:04,420 --> 00:11:05,350 Wonderful. 141 00:11:05,380 --> 00:11:06,440 So the train split. 142 00:11:06,460 --> 00:11:07,420 How are we going to do this? 143 00:11:07,420 --> 00:11:11,380 Well, we need to create a number that we can use for slicing. 144 00:11:11,380 --> 00:11:15,190 So we want 70 percent of the length of our data. 145 00:11:17,050 --> 00:11:18,630 So you see what's happening here? 146 00:11:18,650 --> 00:11:19,360 Train split. 147 00:11:19,360 --> 00:11:20,320 We need a number. 
148 00:11:20,350 --> 00:11:24,430 We need 70 percent, so zero point seven times the length of heart_disease_shuffled. 149 00:11:25,180 --> 00:11:34,940 And this is going to be 70 percent of the data. And then we need a valid split, which is going to be, do you 150 00:11:34,940 --> 00:11:36,150 remember? 151 00:11:36,140 --> 00:11:39,730 Remember what the split is? 15 percent, that's what we've agreed on. 152 00:11:39,740 --> 00:11:40,310 Wonderful. 153 00:11:40,300 --> 00:11:45,260 So we can do that in a similar way to the train split, because we want an index. 154 00:11:45,260 --> 00:11:46,940 We want the next 15 percent. 155 00:11:46,960 --> 00:11:58,330 So train_split plus zero point one five times the length of our heart_disease_shuffled DataFrame. 156 00:11:58,570 --> 00:12:01,750 This is 15 percent of the data. 157 00:12:01,750 --> 00:12:02,800 Wonderful. 158 00:12:02,800 --> 00:12:16,220 And so X_train, y_train is going to be the X data up to the train split and the y data up to the train 159 00:12:16,220 --> 00:12:17,270 split. 160 00:12:17,270 --> 00:12:17,990 Wonderful. 161 00:12:18,020 --> 00:12:30,000 And then X_valid, y_valid, sorry, is going to be the X data from the train split to the valid split, and 162 00:12:30,000 --> 00:12:40,770 the same thing for y: it's going to be the y data from the train split to the valid split. 163 00:12:44,120 --> 00:12:56,540 And then we're going to have X_test and y_test: X_test is going to be the X data from the valid split 164 00:12:56,630 --> 00:12:58,700 onwards, 165 00:12:58,700 --> 00:13:10,640 so the rest of the data, and then y from the valid split onwards. Whew, that was a lot. So far, let's just 166 00:13:10,640 --> 00:13:11,690 see what's going on, right? 167 00:13:11,690 --> 00:13:12,200 Let's. 168 00:13:12,260 --> 00:13:12,650 Let's go. 
169 00:13:12,650 --> 00:13:24,620 len, print it out: len of X_train, and then we want len of X_valid, and then we want len of X_test. 170 00:13:29,300 --> 00:13:31,480 So the training set has 70 percent of the data. 171 00:13:31,490 --> 00:13:31,850 Okay. 172 00:13:31,870 --> 00:13:33,440 212 rows. 173 00:13:33,650 --> 00:13:35,740 The validation set has 46 rows, 174 00:13:35,750 --> 00:13:40,460 15 percent of the data, and the test set has 46 rows. 175 00:13:40,670 --> 00:13:43,070 So that's 15 percent of the data. 176 00:13:43,070 --> 00:13:47,480 There's different amounts here, even though both are 15 percent, because we've used round and we have an odd number 177 00:13:47,480 --> 00:13:50,270 of samples in the heart_disease_shuffled data set. 178 00:13:50,310 --> 00:13:55,390 So there's gonna be a spillover of one somewhere, and that is perfectly fine. 179 00:13:55,430 --> 00:13:55,710 Okay. 180 00:13:55,720 --> 00:13:57,000 Now we have our splits. 181 00:13:57,080 --> 00:14:01,660 We can do clf, or instantiate a RandomForestClassifier. 182 00:14:01,660 --> 00:14:08,080 And now, because we've passed nothing here, this is going to instantiate it with the baseline parameters. 183 00:14:08,120 --> 00:14:13,890 So if we do this, clf.get_params. 184 00:14:14,470 --> 00:14:20,530 That's not an attribute, that's a function. So without passing anything to here, our RandomForestClassifier 185 00:14:20,820 --> 00:14:27,210 instantiates the random forest classifier with the baseline parameters, a.k.a. this. 186 00:14:27,280 --> 00:14:30,000 And so that's what we want, because we want to make some baseline predictions. 187 00:14:30,010 --> 00:14:35,290 So we go clf.fit, we're going to fit it on the training data as usual, and then we're going to 188 00:14:35,290 --> 00:14:36,910 make predictions. 189 00:14:37,030 --> 00:14:40,960 So y_preds equals clf.predict. 
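The shuffling and slicing steps above can be sketched as a runnable example. Note the assumptions: a small synthetic DataFrame stands in for the real heart disease CSV (which isn't shown here), so the row counts are 70/15/15 rather than the 212/46/46 from the video, and `.iloc` is used to make the positional slicing explicit.

```python
import numpy as np
import pandas as pd

np.random.seed(42)

# Synthetic stand-in for the heart disease DataFrame: 100 rows plus a "target" column
heart_disease = pd.DataFrame({"age": np.random.randint(29, 78, size=100),
                              "chol": np.random.randint(126, 565, size=100),
                              "target": np.random.randint(0, 2, size=100)})

# Shuffle the data: frac=1 samples 100% of the rows in a random order
heart_disease_shuffled = heart_disease.sample(frac=1)

# Split into X (features) and y (labels)
X = heart_disease_shuffled.drop("target", axis=1)
y = heart_disease_shuffled["target"]

# Index cut-offs: 70% train, next 15% validation, remainder test
train_split = round(0.7 * len(heart_disease_shuffled))
valid_split = round(train_split + 0.15 * len(heart_disease_shuffled))

# Positional slicing (iloc), so the shuffled integer index doesn't matter
X_train, y_train = X.iloc[:train_split], y.iloc[:train_split]
X_valid, y_valid = X.iloc[train_split:valid_split], y.iloc[train_split:valid_split]
X_test, y_test = X.iloc[valid_split:], y.iloc[valid_split:]

print(len(X_train), len(X_valid), len(X_test))  # 70 15 15
```

With an odd row count like the 303 in the real data, `round` causes the one-sample spillover mentioned above; with 100 rows the split lands exactly on 70/15/15.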
190 00:14:40,960 --> 00:14:48,280 Now we're gonna predict on the validation data, because if we come back here, we want to tune our model on 191 00:14:48,280 --> 00:14:49,380 the validation split. 192 00:14:49,390 --> 00:14:57,430 So that's what we're basing our metric on. We'll first create a baseline metric, which is by running 193 00:14:57,430 --> 00:14:59,970 our evaluation function on the validation split. 194 00:15:00,520 --> 00:15:06,970 Then we'll adjust the hyperparameters and try our model again on the validation split, and see how they 195 00:15:06,970 --> 00:15:10,590 compare. So let's go here and make predictions. 196 00:15:10,590 --> 00:15:21,260 We'll call these baseline predictions, and then we'll go evaluate the classifier on the validation set. 197 00:15:21,360 --> 00:15:26,610 So we go baseline metrics, and now this is where our evaluation function will come in handy. baseline_metrics equals 198 00:15:26,640 --> 00:15:36,920 evaluate_preds(y_valid, y_preds). So see how we pass it the y validation set and y_preds, and then we 199 00:15:36,920 --> 00:15:47,650 want to go baseline_metrics. Well, what have we got? Local variable accuracy referenced before assignment. This is what 200 00:15:47,650 --> 00:15:57,030 we've missed out on: this needs to be accuracy_score. Try this again. It's done terribly. Now, what is going 201 00:15:57,030 --> 00:15:57,630 on here? 202 00:15:57,660 --> 00:15:59,620 Accuracy, 0.22? 203 00:15:59,820 --> 00:16:01,230 Have we done this correctly? 204 00:16:01,230 --> 00:16:08,550 Oh, there we go, that's why: we've used our original X and y, and we need to use heart_disease_shuffled, 205 00:16:08,550 --> 00:16:14,160 because otherwise our model was predicting the original labels rather than our shuffled labels. 206 00:16:14,340 --> 00:16:18,620 So that's what we missed out on there. There we go. 
207 00:16:19,550 --> 00:16:24,380 So we're getting this warning here, that the default value of n_estimators will change from 10 in version 0 208 00:16:24,380 --> 00:16:27,290 point 20 to 100 in 0.22. 209 00:16:27,320 --> 00:16:33,050 So these are our baseline metrics. Now, what we've done is we've created a train, validation and test split, 210 00:16:33,490 --> 00:16:40,650 we've instantiated a model just as we had done before, we fit it on the training data here, and 211 00:16:40,650 --> 00:16:46,350 then we've evaluated the baseline parameters, baseline hyperparameters, on the validation set, and got 212 00:16:46,350 --> 00:16:47,800 these metrics. 213 00:16:47,970 --> 00:16:53,550 So if we were to try and improve our results, if we were to try and adjust our model's hyperparameters, 214 00:16:55,140 --> 00:17:02,250 these ones here on our random forest, by hand, because that's what this section is, tuning hyperparameters 215 00:17:02,280 --> 00:17:04,690 by hand, how would we do so? 
216 00:17:04,920 --> 00:17:10,320 Let's take this warning as an example of changing our hyperparameters. Let's change, and again, if you're 217 00:17:10,320 --> 00:17:16,350 using scikit-learn 0.22 or above, you won't get this warning, but since we are, we're getting this warning, 218 00:17:16,350 --> 00:17:23,610 we're going to try and adjust our hyperparameters. We'll change n_estimators, which is this one here. 219 00:17:24,860 --> 00:17:30,470 We'll change it from the baseline of 10 to 100, and see if we get a different score on the validation 220 00:17:30,470 --> 00:17:40,400 set. Create a random seed, np.random.seed(42). What we're going to do is create a second classifier with 221 00:17:41,060 --> 00:17:49,940 different hyperparameters, because remember, instantiating our RandomForestClassifier up here, passing 222 00:17:49,940 --> 00:17:55,910 it nothing, instantiates the classifier with the baseline hyperparameters it comes with right out of 223 00:17:55,910 --> 00:18:01,430 the box. So, like using your oven's predefined settings when it comes out of the box to cook your favorite 224 00:18:01,430 --> 00:18:07,730 dish, these scores aren't really helping us out here, we want a better model. We'll create clf_2, and 225 00:18:07,730 --> 00:18:14,420 we're going to adjust the hyperparameters, like you would adjust your oven to try and improve that favorite 226 00:18:14,420 --> 00:18:19,130 delicious roast chicken dish that you're making. Maybe you're preparing for a big occasion, 227 00:18:19,550 --> 00:18:22,060 your friends are coming over, and you've promised them a great dish. 228 00:18:22,070 --> 00:18:23,550 So you're trying to perfect it. 
229 00:18:23,690 --> 00:18:29,690 That's what we're doing with our machine learning model. So again, same thing, same data, fitted on the 230 00:18:29,690 --> 00:18:38,660 training data, but this time we've got clf_2, which is using n_estimators as 100 rather than 10. And 231 00:18:38,660 --> 00:18:43,580 now again, we could try different settings for all of these hyperparameters here, the ones we're going to try 232 00:18:43,580 --> 00:18:50,450 and adjust, but we might start with just one, to see an example of it. And then we'll go and make predictions. 233 00:18:54,160 --> 00:18:56,320 So y_preds_2, 234 00:18:56,440 --> 00:19:02,640 because we're using clf_2: clf_2.predict(X_valid). 235 00:19:02,650 --> 00:19:04,690 We're doing the same thing as up here. 236 00:19:04,810 --> 00:19:12,440 The baseline predictions were on the validation set, and now we make predictions with different hyperparameters 237 00:19:14,480 --> 00:19:15,500 on the same data. 238 00:19:15,500 --> 00:19:17,690 So different models, same data. 239 00:19:17,720 --> 00:19:28,720 Now we're going to evaluate the second classifier: clf_2_metrics equals, we're using our evaluation 240 00:19:28,720 --> 00:19:30,140 function again already, 241 00:19:30,310 --> 00:19:36,430 the one we created before, and we're saving a fair few lines of code by just calling it like that. 242 00:19:36,430 --> 00:19:36,760 All right. 243 00:19:37,140 --> 00:19:40,820 Now let's check it out. 244 00:19:41,340 --> 00:19:48,310 Okay, so now we can compare to using the baseline hyperparameters that our model came with. 245 00:19:48,490 --> 00:19:55,050 What can we say that's different here? What we've changed is n_estimators equals 100. 246 00:19:55,050 --> 00:19:57,720 So we've adjusted one dial on our model. 247 00:19:58,170 --> 00:20:04,650 So, like adjusting one dial on your oven, we can see a slight boost in accuracy on the same data. 
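The baseline-versus-clf_2 comparison might be sketched like this. Assumptions: synthetic data from make_classification stands in for the heart disease set, plain accuracy_score replaces the full evaluate_preds function, and random_state=42 is set on the models so the comparison is repeatable. Whether the accuracy actually rises depends on the data, so no improvement is promised here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

np.random.seed(42)

# Synthetic stand-in for the heart disease data (the real notebook uses the CSV)
X, y = make_classification(n_samples=300, n_features=13, random_state=42)

# Manual 70/15/15 cut-offs (make_classification data is already shuffled)
train_split = round(0.7 * len(X))
valid_split = round(train_split + 0.15 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_valid, y_valid = X[train_split:valid_split], y[train_split:valid_split]

# Baseline model: the old scikit-learn default of n_estimators=10
clf = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_train, y_train)
baseline_acc = accuracy_score(y_valid, clf.predict(X_valid))

# Second model: one dial turned, n_estimators=100
clf_2 = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
clf_2_acc = accuracy_score(y_valid, clf_2.predict(X_valid))

print(f"Baseline (n_estimators=10): {baseline_acc:.2f}")
print(f"clf_2 (n_estimators=100): {clf_2_acc:.2f}")
```

Same data, same split, different models: the only thing being compared is the one hyperparameter that changed.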
248 00:20:04,650 --> 00:20:13,910 So clf_2 has a slightly higher accuracy, it's got a higher precision but a lower recall, 249 00:20:14,240 --> 00:20:16,260 and the same F1 score. 250 00:20:16,760 --> 00:20:17,610 Mm hmm. 251 00:20:17,630 --> 00:20:18,530 Okay. 252 00:20:18,590 --> 00:20:23,120 That's giving us a little inkling that maybe if we kept going with different hyperparameters, if we kept 253 00:20:23,120 --> 00:20:31,100 trying to adjust them by hand, we would, hopefully, keep improving these metrics. 254 00:20:31,100 --> 00:20:34,270 So what do you think we would do next if we got here? 255 00:20:34,490 --> 00:20:36,710 Maybe we'd try to change the max depth. 256 00:20:36,740 --> 00:20:38,770 So what's the default max depth? 257 00:20:38,810 --> 00:20:40,670 We find it in here: max_depth is None. 258 00:20:40,670 --> 00:20:46,010 So maybe we'd look at the documentation and go through the different values that max_depth can take. 259 00:20:46,010 --> 00:20:48,330 So it can take integers or None, 260 00:20:48,350 --> 00:20:54,290 default is None, the maximum depth of the tree. And doing our research, reading documentation, we see some 261 00:20:54,290 --> 00:20:56,230 different values for max_depth. 262 00:20:56,390 --> 00:21:04,970 So then we'd go back to our model here, and maybe we create clf_3 equals RandomForestClassifier. 263 00:21:05,120 --> 00:21:13,760 This time we'll keep n_estimators the same, equals 100, and then we'll change max_depth from None 264 00:21:13,760 --> 00:21:17,350 to equal 10. 265 00:21:17,370 --> 00:21:18,410 Now, I haven't made this number up. 
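Following the same pattern, the third classifier described above might look like this. The max_depth value of 10 comes from the transcript; whether it actually helps depends on the data, so this just shows the settings being dialed in.

```python
from sklearn.ensemble import RandomForestClassifier

# Third classifier: keep n_estimators at 100, change max_depth from None to 10
clf_3 = RandomForestClassifier(n_estimators=100, max_depth=10)

# get_params() confirms the dials we've turned away from the defaults
params = clf_3.get_params()
print(params["n_estimators"], params["max_depth"])  # 100 10
```

From here it would be the same fit/predict/evaluate loop on the validation split as with clf and clf_2.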
266 00:21:18,410 --> 00:21:23,160 I've done some research, read the documentation, and figured out different ideas for max_depth. And then 267 00:21:23,160 --> 00:21:27,640 we might do the same thing as what we've done here: evaluate our third classifier. As you might have guessed 268 00:21:27,640 --> 00:21:31,980 here, as you might have thought, if we're going through and adjusting all of these by hand, adjusting all 269 00:21:31,980 --> 00:21:33,660 the numbers by hand, 270 00:21:33,660 --> 00:21:35,470 that's going to take a fair bit of work. 271 00:21:35,550 --> 00:21:35,910 Right? 272 00:21:35,910 --> 00:21:40,590 So just like perfecting your dish in real life, like if you're making your favorite dish and everyone's 273 00:21:40,590 --> 00:21:43,610 coming over for dinner, you're trying to adjust the settings on your oven, 274 00:21:43,800 --> 00:21:45,630 you may have to go through a bit of trial and error, right? 275 00:21:45,630 --> 00:21:50,160 You might cook it once and then it's not that good, and then by the tenth time you're starting to get really 276 00:21:50,160 --> 00:21:52,100 good, but that could take a lot longer. 277 00:21:52,110 --> 00:21:57,060 With what we're trying to do here, writing code, if we have all these different settings up here, trying 278 00:21:57,060 --> 00:22:03,630 to find the best settings could take far longer than what we have, and the rule 279 00:22:03,630 --> 00:22:05,280 in code is don't repeat yourself. 280 00:22:05,280 --> 00:22:07,270 So, as you might have guessed, 281 00:22:07,420 --> 00:22:11,870 scikit-learn has a method inbuilt that can do this for us, 282 00:22:11,880 --> 00:22:13,940 try a bunch of different settings for us. 283 00:22:14,040 --> 00:22:18,840 And that's RandomizedSearchCV, which we're going to have a look at in the next video. 284 00:22:18,840 --> 00:22:20,560 This one's already getting far too long. 
285 00:22:20,640 --> 00:22:26,910 But the main takeaway here is that when we're choosing a model's hyperparameters, you can think of the 286 00:22:26,910 --> 00:22:32,120 training set as like the course materials, where you're learning the foundations. 287 00:22:32,160 --> 00:22:37,110 That's the training set. And then on the practice exam, you're refining what you know. 288 00:22:37,110 --> 00:22:40,450 So you've already learned all the baseline patterns, like our model. 289 00:22:40,560 --> 00:22:45,150 And then with the validation set we're adjusting the settings, so we're refining what the model knows, 290 00:22:45,150 --> 00:22:50,760 how it learns different things, before we test ourselves on the final exam, 291 00:22:50,880 --> 00:22:56,610 a.k.a. before we evaluate our model on the test set. 292 00:22:56,700 --> 00:23:00,580 Now, don't worry if you're feeling a little bit overwhelmed. Go back through the code that we've written, 293 00:23:00,640 --> 00:23:01,840 check it out. 294 00:23:01,840 --> 00:23:07,280 You could try adjusting a few different hyperparameters yourself, but if not, not to worry, we're going 295 00:23:07,280 --> 00:23:13,810 to see in the next video how we can use a module in scikit-learn to try different hyperparameters for 296 00:23:13,810 --> 00:23:14,020 us.