Welcome back. Last video we saw how to tune one of our machine learning models, specifically the K-nearest neighbors classifier, by hand. Now, this is hyperparameter tuning, so what I've done is just adjust this little title here to say "hyperparameter tuning by hand".

But as you can imagine, if you had more than one parameter or hyperparameter to adjust (in our case we only tuned n_neighbors, which goes into the KNeighborsClassifier, so we only tuned this one parameter), wanting to tune all of these would get a bit tedious. We'd be writing for loops for days. So what we're going to do here is... and remember, we've also written off our K-nearest neighbors model in the search for better results. Because our logistic regression and random forest classifier are performing better, we're going to cut the K-nearest neighbors model out of our experimentation.

So now we're going to do hyperparameter tuning with RandomizedSearchCV. This is what we saw in the scikit-learn section, and if we search "RandomizedSearchCV sklearn", we'd find it here. Beautiful. So there's the documentation; you can read through that in your own time, but we're going to see it implemented.

And again, if you're wondering how you could find the hyperparameters to tune a certain machine learning model, well, here's what I would do for logistic regression: search "how to tune a logistic regression machine learning model in Python", something like that. If we switch that up: "logistic regression model training with scikit-learn", beautiful; "logistic regression using Python scikit-learn"; "hyperparameter optimization of machine learning models"; "tuning parameters for logistic regression". If we were to go through these, research, do our own experiments, figure things out and ask more questions, we'd probably find some pretty good answers.

But for now we're going to pretend we've gone through those steps and figured out that we can use RandomizedSearchCV and that we can tune a bunch of different parameters. Let's do that. We're going to tune our logistic regression model, which is that, and our random forest classifier, using RandomizedSearchCV. And again, if you need information about which parameters you can use for logistic regression... I've even spelled that wrong.
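To make the setup concrete, here's a minimal sketch of the imports this approach relies on (the variable names are my own for illustration, not necessarily the ones in the notebook):

    # RandomizedSearchCV plus the two models we're still experimenting with
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import RandomizedSearchCV

    # The two candidate models left after dropping K-nearest neighbors
    models = {"LogisticRegression": LogisticRegression(),
              "RandomForestClassifier": RandomForestClassifier()}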
Typical Daniel. For logistic regression and random forest classifier, we can go up here, just as we did for K-nearest neighbors, and search "logistic regression". If we look in here, this is going to give us a list of different parameters. You might read the documentation and, trust me, I did this: the first time, I read the documentation and went, what do all of these terms mean? Elastic net, L2, penalties, bool... actually, bool I know what it means, true or false, right? But all these things didn't mean much to me. It was only once I started to do some research, go through, and figure out what all these different parameters actually meant that I started to understand it. And the same goes for the random forest classifier. You could do the same thing: put that into the scikit-learn documentation, or change the search to "how to tune a random forest machine learning model in Python".

These are steps that people like machine learning engineers and data scientists take every single day to figure out how to improve their models. No one in the beginning knows how to do this off by heart. Even after doing it many times, I still have to look these things up.

So we go here, let's see how we would do it. The first thing to know about RandomizedSearchCV: if you're wondering what CV stands for, it stands for cross-validation, which we saw in the scikit-learn section. More specifically, instead of doing a normal train/test split like we've done before, creating one training split and one test split (a.k.a. 80 percent in the training split and 20 percent in the test split), cross-validation creates... this should really be k, but I've done five because it looks nice here. This can be k-fold cross-validation, and 5 is the default in the latest version of scikit-learn, but you can adjust it. What it's going to do is create five different versions of the training data and five different versions of the test data, then evaluate different parameters (or, because we're doing a hyperparameter search, different hyperparameters) on all of these versions of the training and test data, and work out which set of parameters, or hyperparameters, is best across those five different splits rather than just one single split. So if we go here, let's do it.
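As a rough sketch of what 5-fold cross-validation looks like in code (using a small made-up dataset here, rather than our actual data):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Toy data standing in for our real features and labels
    X, y = make_classification(n_samples=300, random_state=42)

    # cv=5 trains and scores the model on 5 different train/test splits
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
    print(scores)         # one score per split
    print(scores.mean())  # the cross-validated score we'd report

Each of the five scores comes from a different split, so the mean is a fairer estimate of how the model performs than any single split would be.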
So what we need to do is create a hyperparameter grid for logistic regression. Reading the logistic regression documentation, as well as searching up here for logistic regression, we figure out that there are a few hyperparameters we can tune, such as the value for C. If we look in here and go down to C: inverse of regularization strength; must be a positive float. Again, you could read this the first time in the documentation and go, what do any of these words mean? That's why it requires research to check it out. And then there's another one called solver.

So we'll go here and have a look at what a parameter grid looks like: C. And again, there are more hyperparameters than the two we're using here. We're only using C and solver; you might find in your research that you could adjust the penalty, the fit intercept, a whole bunch of hyperparameters, but the overall concept of adjusting hyperparameters is what we're focused on.

So after our research, we're going to set this: we know that a good set of values is np.logspace(-4, 4, 50). And if you're wondering what logspace is (I'm kind of throwing these things out relatively quickly), what logspace does, if we go here: returns numbers spaced evenly on a log scale. So there's the start, the stop, and the number of values. This is going to be between negative four and four, and 50 of them. And if you're wondering what a log scale is, these are the ways you can look it up. Let's go: "what is a log scale"; "log space reduction and complexity"; "can you explain in simple words what is log space reduction". We could read those; again, a few complex words there, but after a little bit of effort you'll be able to figure it out.

So we go here. We can only really use one solver here, because after the research and finding which parameters we should adjust, we find that C is probably the most valuable one for logistic regression. So we're only going to use one value for solver. So really what we're testing here is a grid of numbers on a log scale between negative 4 and 4. So there we go. Oh, sorry, because it's a log space, it's 1 times 10 to the power of negative 4 up to 1 times 10 to the power of 4. So a pretty big space there; these numbers are pretty well separated.
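Written out, the grid we've just described might look something like this (the solver value is an assumption; any single supported solver, such as "liblinear", works for illustration):

    import numpy as np

    # Hyperparameter grid for LogisticRegression
    log_reg_grid = {"C": np.logspace(-4, 4, 50),  # 50 values from 1e-4 up to 1e4
                    "solver": ["liblinear"]}      # assumed: one solver choice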
So let's get rid of that, and now we'll create a hyperparameter grid for the random forest classifier. Wonderful. Through our research, we find that some of the best parameters to tune for a random forest include the number of estimators; if we're using a random forest, n_estimators is how many trees we have in our forest. So, np.arange. And you might be wondering why I'm using ranges here as I go through this. Well, the reason is in the documentation. I know in a previous video we haven't explicitly used a range, we've used a list, so just for a refresher on what arange does, let's check it out: it's basically just going to create a range of numbers spaced 50 apart (because of the 50 here) between 10 and 1000.

So if we look up RandomizedSearchCV (I think we already have it, maybe here, there we go, it's somewhere here): "it is highly recommended to use continuous distributions for continuous parameters". So that's why we create a range of different values; this is where we read that from. Rather than just an explicit list, we're creating a range, a continuous distribution, for our hyperparameters for RandomizedSearchCV. But that's just the documentation. You could still just use a list; the documentation written by the people behind scikit-learn simply recommends a range of values.

So I'm going to go max_depth here, which is another hyperparameter we found through our own research that we can tune. Remember, for any machine learning model, it's a quick search of going, hey, I'm using this model. I've figured out my problem, I've followed the scikit-learn machine learning map... do I have the map here? Yeah, we do. I've found the map, and I've decided I'm going to use a random forest classifier. How can I do hyperparameter tuning? Oh, well, that's where it comes to searching something like this, right?

So we go here, and we find another one is min_samples_split, and we're going to go with a range. Actually, I said that we're going to use distributions, but for max_depth we haven't used a distribution, we've used an explicit list. We'll use another range here. And then we'll also use min_samples_leaf: np.arange(1, 20, 2). Beautiful.
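Putting that together, the random forest grid might look like this (the max_depth list and the min_samples_split range are assumed values for illustration; n_estimators and min_samples_leaf follow the ranges mentioned above):

    import numpy as np

    # Hyperparameter grid for RandomForestClassifier
    rf_grid = {"n_estimators": np.arange(10, 1000, 50),   # trees in the forest, 10 to 1000 in steps of 50
               "max_depth": [None, 3, 5, 10],             # assumed explicit list
               "min_samples_split": np.arange(2, 20, 2),  # assumed range
               "min_samples_leaf": np.arange(1, 20, 2)}   # 1 to 20 in steps of 2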
So now we have two hyperparameter grids for the two models that we're going to try to hyperparameter tune using RandomizedSearchCV.

What we'll probably do is end this video here, then come back and use these two grids along with RandomizedSearchCV to try to improve the results of our logistic regression model and our random forest classifier beyond what we initially got, beyond these initial values. Remember, those values were obtained without using cross-validation, because we only used a single train and test split. If we go into the keynote on cross-validation: this is what we used to get our original scores, a single train and test split, whereas in cross-validation we're going to be using multiple train and test splits. Remember, in the scikit-learn section we discussed that if you're going to provide a metric for how a model is performing, it's probably best to use a cross-validation metric, especially with classification problems.

So let's revisit that in the next video: tuning our random forest and logistic regression with RandomizedSearchCV.
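As a preview of where we're heading, a minimal sketch of that next step might look like this (X_train and y_train are assumed to come from our earlier data split, and n_iter=20 is an assumed choice of how many combinations to sample):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import RandomizedSearchCV

    log_reg_grid = {"C": np.logspace(-4, 4, 50),
                    "solver": ["liblinear"]}  # assumed solver, as above

    # Randomly sample n_iter hyperparameter combinations from the grid,
    # scoring each one with 5-fold cross-validation
    rs_log_reg = RandomizedSearchCV(LogisticRegression(),
                                    param_distributions=log_reg_grid,
                                    cv=5,
                                    n_iter=20,    # assumed number of combinations
                                    verbose=True)
    rs_log_reg.fit(X_train, y_train)  # X_train/y_train: our earlier train split
    print(rs_log_reg.best_params_)    # the best combination found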