All righty. So we found a logistic regression model that performs pretty well, but we went and showed it to our boss, and she said we had to find a whole bunch of other things: hyperparameter tuning, feature importance, a confusion matrix, cross-validation. These things could sound made up to someone who isn't a budding data scientist or machine learning engineer like yourself, but being the budding machine learning engineer and data scientist that you are, you know data scientists make up words all the time. And the good thing is that these aren't made up. These are things we can actually find.

So now we've got a baseline model, and we know a model's first predictions aren't always what we should base our next steps off. What should we do? If we remind ourselves of the classification metrics and regression metrics: we're not worried about regression, right, because we're working on a classification problem. We've used the default metric, accuracy. Then there's precision, there's recall, there's F1, which are all part of the things the boss asked us for, and we've got the confusion matrix. Okay, we should probably make one of those, since we're working on a classification problem. Then we've got the classification report, which has a bunch of information in it like precision, recall, F1 score, support, accuracy, macro average. Far out. And then we've got hyperparameter tuning, which we've covered before. So we've got a lot of steps.

First we should put down what we're going to tackle, because we're in the experimental phase; that's what we're going to be working towards. We're going to take the baseline models we've got, experiment with them further, and see if we can improve them: does logistic regression still perform best on accuracy, or could a hyperparameter-tuned version of random forest beat it out, or maybe KNN could improve? Who knows. Let's write it down. Let's look at the following: hyperparameter tuning, feature importance, confusion matrix, cross-validation, precision, recall, F1 score, classification report, ROC curve, area under the curve (AUC). As we saw in the scikit-learn section, these are some of the things you should pay attention to when you're working on a classification model.
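For reference, most of what the boss asked for already lives in scikit-learn. Here's a rough sketch of where those tools come from, assuming clf is one of our fitted baseline classifiers and X, y, X_test and y_test are the data and splits we made earlier in the notebook:

```python
from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score
from sklearn.model_selection import cross_val_score

# Predictions from a fitted classifier (clf, X_test, y_test assumed from earlier)
y_preds = clf.predict(X_test)

print(confusion_matrix(y_test, y_preds))       # confusion matrix
print(classification_report(y_test, y_preds))  # precision, recall, F1 score, support

# Area under the ROC curve, using predicted probabilities for the positive class
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))

# 5-fold cross-validated accuracy on the full dataset
print(cross_val_score(clf, X, y, cv=5))
```

We'll build up to each of these properly over the rest of the section; this is just a map of the terrain.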
These two here are actually part of almost any machine learning model you'll be working on, but these are specific to classification. We've seen the difference between regression and classification metrics, but since we're focusing on a classification problem, whether or not someone has heart disease based on their health parameters, we're going to be focused on looking at these. So let's jump into it.

First things first: why don't we try hyperparameter tuning and cross-validation? And remember, what's hyperparameter tuning? Well, to cook your favourite dish you know to set the oven to 180 degrees and turn the grill on. But when your roommate cooks their favourite dish, they set the oven to 200 degrees and use the fan-forced mode. Same oven, different settings, different outcomes. It's the same for machine learning algorithms: you can use the same algorithm but change the settings, a.k.a. the hyperparameters, and get different results. But just like turning the oven up too high, if you tweak the settings too much you can make a model work well, or so well that it overfits. So what we're really doing here is looking for the Goldilocks model: one which does well on our dataset but also performs well on unseen examples.

Should we tune a KNeighborsClassifier? Mm hmm. Yeah, we might do that. Even though we've said goodbye to the KNN classifier, we might just see how you would tune it. And if you're wondering where I'd start if I were trying to approach this, this is what I'd start with: searching "how to tune a K neighbors classifier model". We get results like "In Depth: Parameter Tuning for KNN", "Model selection: tuning and evaluation" and "K-nearest neighbors", and if we were to read through these we'd probably find some of the information we're going to cover here. So let's start it out.

Let's make a little heading here: "Hyperparameter tuning". Let's tune KNN. We'll create a list of train scores, an empty list called train_scores, and then a list of test scores, another empty list called test_scores, because what we want to do is compare different versions of the same model, that is, the same model with different settings, and compare their scores on the two different datasets. Going from what we looked at before, we'd read into those resources; again, you can do this with almost any machine learning model, or basically any machine learning model.
This is part of the research, part of the experimentation step of many machine learning problems: searching up things like this, reading what you can find, and bringing it back to a scenario like the one we're working on here. But let's just pretend we've gone through that.

So we're going to create a list of different values for n_neighbors. Now, why n_neighbors? Because, as I said before, when we did our reading we found that if we search "sklearn KNN", we see in KNeighborsClassifier that n_neighbors is one of the parameters, and we can see this by reading through the scikit-learn documentation as well: the number of neighbors to use by default for kneighbors queries. In our research on how to tune a KNN model we found that we can adjust the number of neighbors, and if you want to find out more about this you can check out the documentation and read it up here.

So that's what we're going to do: create a list of different values. The default here is 5, so maybe what we might do is try different values from 1 to 20 or something like that. So let's see here: neighbors = range(1, 21). Wonderful. And then we're going to set up a KNN instance, so we'll go knn = ... what is it? You know this one: KNeighborsClassifier(). Beautiful.
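As a quick standalone aside before we write the loop, here's a minimal sketch of checking an estimator's current settings and changing them; the value 11 below is just illustrative:

```python
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier()               # default settings
print(knn.get_params()["n_neighbors"])     # 5 — the default number of neighbors

knn.set_params(n_neighbors=11)             # same algorithm, different setting
print(knn.get_params()["n_neighbors"])     # 11
```

Nothing is fitted yet; set_params() only changes the configuration the model will use the next time we call fit().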
And then we need to loop through the different n_neighbors values, which is the range we made here: for i in neighbors. Thinking out loud here, we'll go knn.set_params(n_neighbors=i); remember, you can adjust the parameters of a machine learning model in scikit-learn using .set_params(), and we're setting it to i because we're looping through that range. Wonderful. And then we're going to fit the algorithm. What we're trying to do here is improve on the baseline score, which used the default of 5, so we're going to try 20 different versions, 1 to 20, and see if any of them do better. (Oh, I accidentally hit shift-enter; trigger happy, my fingers just default to shift-enter whenever they want.) Now we want to update the training scores list, so we'll go train_scores.append(knn.score(X_train, y_train)). Wonderful. And then we want to update the test scores list, so test_scores.append(knn.score(...)), and you can probably guess this is going to be on the test set. Beautiful, bonus points if you did.

Now what this is going to do is loop through this range of 1 to 20, create 20 different KNN models and append their scores to the lists. So let's check out those lists: train_scores, shift-enter, wonderful, and then test_scores (after fixing a typo). Now we've got 20 different scores for each, but these are probably best visualised, so let's see how we'd do that.

Let's go plt.plot(neighbors, train_scores) and give it label="Train score", of course. Then we'll add another plot, plt.plot(neighbors, test_scores, label="Test score"). Wonderful. Let's just see what this looks like. The xlabel will be "Number of neighbors", because that's what's on the x-axis; I nearly spelled neighbours the Australian way with a "u" there, well, maybe it's not just Australian, maybe it's just a different version, but in scikit-learn neighbors is spelled with no "u". So that's the x, and then the ylabel is going to be the scores, so "Model score", and then we want plt.legend(), and we'll see what comes up.

Actually, we want a little tidbit here: an f-string printing the maximum KNN score on the test data, max(test_scores); it's accuracy, so we'll multiply by 100 and format it with .2f, close the parentheses and close the string. That should work. Wonderful, okay. So we can see here that we're really paying attention to this test score, and the highest value looks to be at around about 11 for K-nearest neighbors. To see where it actually is on the graph, let's adjust the x-ticks: plt.xticks(np.arange(1, 21, 1)). What this is going to do is produce a range from 1 to 21 with a step of 1, which is exactly the same as what we're using for neighbors = range(1, 21). So let's do that and re-run the plot.
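Put together, the whole tuning-and-plotting cell might look something like this; it's a sketch that assumes X_train, X_test, y_train and y_test are the splits we created earlier in the notebook:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

train_scores = []
test_scores = []

# Different values of n_neighbors to try (the default is 5)
neighbors = range(1, 21)
knn = KNeighborsClassifier()

for i in neighbors:
    knn.set_params(n_neighbors=i)                      # change the hyperparameter
    knn.fit(X_train, y_train)                          # fit on the training data
    train_scores.append(knn.score(X_train, y_train))   # accuracy on the training set
    test_scores.append(knn.score(X_test, y_test))      # accuracy on the test set

# Visualise how the score changes with the number of neighbors
plt.plot(neighbors, train_scores, label="Train score")
plt.plot(neighbors, test_scores, label="Test score")
plt.xticks(np.arange(1, 21, 1))
plt.xlabel("Number of neighbors")
plt.ylabel("Model score")
plt.legend()

print(f"Maximum KNN score on the test data: {max(test_scores) * 100:.2f}%")
```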
With the x-ticks adjusted (they're just the little labels along the bottom), our earlier guess is confirmed: an n_neighbors value of 11 yields the best score on our test dataset. If we come back up here, the default, remember, is 5. So we've just done a little bit of hyperparameter tuning, and we were able to improve our KNN classifier's results on the test dataset by changing the n_neighbors parameter from the default of 5 to 11, giving us 75.41% versus our initial result back up here of 68%. Mm hmm.

Well, even after improving it to 75 or whatever it was, it's still far below our logistic regression and random forest models. So I think this has put the nail in the coffin for our KNN model, because even with hyperparameter tuning it still hasn't reached the scores we got with logistic regression or random forest. Because of this, we're going to discard KNN for now. This is part of our experiment: a big part of experimenting, machine learning step 6, is going through different machine learning models and figuring out which work and which don't. What we're trying to do here is, as quickly as possible, work through different little experiments like the one we've just done, tuning KNN, and see which model performs best on our data.

So what might we try next? We've tuned KNN by hand; we wrote out this little for loop here, and that was a bit tedious, right? If you had to do that for every single machine learning model you might run into some problems. It was okay here because we had one parameter to tune, but if you had more parameters, writing these for loops is going to be very inefficient. So what we might do next is see how we can tune logistic regression and random forest classifier using RandomizedSearchCV, where CV stands for cross-validation. Instead of us having to manually try different hyperparameters by hand, RandomizedSearchCV, which we've seen in the scikit-learn section, is going to try a number of different combinations of hyperparameters for us, evaluate which ones are best, and then save them for us. That's what we're going to have a look at in the next video.
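As a rough preview of where that's headed, a RandomizedSearchCV setup for logistic regression might look something like the sketch below; the hyperparameter grid is only an illustrative guess, not necessarily the one we'll end up using, and X_train, y_train (and X_test, y_test for the final score) are again assumed from our earlier split:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Illustrative grid of hyperparameters to sample from (values are assumptions)
log_reg_grid = {"C": np.logspace(-4, 4, 20),
                "solver": ["liblinear"]}

rs_log_reg = RandomizedSearchCV(LogisticRegression(),
                                param_distributions=log_reg_grid,
                                cv=5,        # 5-fold cross-validation
                                n_iter=20,   # number of hyperparameter combinations to try
                                verbose=True)

rs_log_reg.fit(X_train, y_train)
print(rs_log_reg.best_params_)            # the best combination found
print(rs_log_reg.score(X_test, y_test))   # evaluate the tuned model on the test set
```

The same pattern applies to RandomForestClassifier with its own grid.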