Hello all. Before going ahead in this session, let's have a quick recap of what we did in the previous session.

In the previous session we learned the basics of cross-validation and what it actually means. We saw that if we play with the random_state parameter, we definitely get fluctuations in our accuracy. That is why we cross-validate our model using the cv parameter, and once we have the fold scores we simply compute their mean. That mean is the accuracy we report.

In this session we are going to learn about two cross-validation based approaches, namely RandomizedSearchCV and GridSearchCV. So what exactly are these? Before understanding them, let's get the basic idea behind hyperparameter optimization, or what we can call model hyperparameter tuning.

Let's say you have implemented some machine learning algorithm on your data set, whether it is a regression use case or a classification use case. Say the algorithm I have implemented on my data is a random forest. For this we have a class in the sklearn module: in the case of regression it is RandomForestRegressor, whereas in the case of classification it is RandomForestClassifier.

So what happens first? If we simply train our regression or classification model as it is, it will get trained using the default parameters of that RandomForestRegressor or RandomForestClassifier.
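For reference, here is a minimal sketch tying last session's recap to this point: cross-validating a RandomForestClassifier that is left entirely on its defaults. The dataset and the fold count are illustrative assumptions, not part of the lecture.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

clf = RandomForestClassifier(random_state=42)  # nothing set besides the seed: default hyperparameters
scores = cross_val_score(clf, X, y, cv=5)      # the cv parameter from the previous session
print(scores.mean())                           # the averaged fold accuracy we report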
But it is not compulsory that whatever data set you have will be solved best using these default parameters. In the case of RandomForestRegressor, and likewise RandomForestClassifier, you have parameters such as the number of decision trees, which is nothing but n_estimators; you have something known as max_features; and you have max_depth, the maximum depth of each decision tree that I am going to consider in my random forest. These are basically the hyperparameters.

Each of them has a default value: n_estimators defaults to one hundred, max_features defaults to 'auto' (which for a classifier means the square root of the number of features), and max_depth defaults to None, so each tree grows fully, or to whatever value is set inside the sklearn module. But it is not necessary that these values will fit best for whatever data set you have.

So what do we have to do in such a case? To optimize this classifier, or in other words to achieve the best value of all these hyperparameters, we have something known as GridSearchCV and something known as RandomizedSearchCV. So what exactly are they? Let's understand the difference between the two and how they actually work.
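Before that, a quick, purely illustrative sketch of how you can inspect those default values on the classifier; get_params() lists every tunable knob.

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier()
defaults = clf.get_params()        # dictionary of every hyperparameter and its default
print(defaults["n_estimators"])    # 100 decision trees
print(defaults["max_features"])    # 'sqrt' in recent scikit-learn versions (formerly 'auto')
print(defaults["max_depth"])       # None: each tree grows until its leaves are pure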
So what do we do over here? Take the classification use case, where these are the parameters defined by the RandomForestClassifier inside sklearn. I am going to define a dictionary, and in this dictionary I would say: for the number of decision trees, n_estimators, you can consider let's say one hundred, two hundred, three hundred, and so on up to, say, one thousand. Similarly you have the max_features parameter, so here I am going to list let's say 'auto', 'sqrt', and something known as 'log2'. Similarly I have the max_depth parameter, and you can list candidate values for it as well. In the case of classification you have all these parameters.

Once you create this dictionary, you have to pass it to GridSearchCV, and here is exactly what you have to pass. First, the algorithm: my algorithm is the RandomForestClassifier, so I pass an object of that estimator. Then I pass this dictionary as the param_grid parameter. And then I pass the cross-validation setting, the cv parameter that we learned about earlier. There are many more parameters you can play with, such as n_jobs and several others, but mainly it is these three: the first is your classifier, the second is this dictionary, and the third is the cv parameter.

So what will this GridSearchCV do? It will try every permutation and combination of the parameters you have defined, and whichever combination gives the highest accuracy wins. Let's say it is n_estimators of one hundred, max_features of 'log2', and max_depth of four, and let's say this combination gives some 90 percent accuracy. Then we will take this combination, initialize our algorithm with it, and fit our model.
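Here is a hedged sketch of that whole flow in code. The exact value lists, the dataset, and the fold count are illustrative assumptions; the three arguments are the estimator, param_grid, and cv just described.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

param_grid = {
    "n_estimators": [100, 200, 300],     # number of decision trees
    "max_features": ["sqrt", "log2"],    # features considered at each split ('auto' in older versions)
    "max_depth": [None, 4, 8],           # maximum depth of each tree
}

grid = GridSearchCV(
    estimator=RandomForestClassifier(),  # the algorithm being tuned
    param_grid=param_grid,               # the dictionary defined above
    cv=5,                                # the cross-validation from the previous session
)
grid.fit(X, y)                           # tries every combination in the grid

print(grid.best_params_)                 # the winning combination
print(grid.best_score_)                  # its mean cross-validated accuracy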
In other words, we are going to train with the best model, the one built from the best, optimal parameters we achieved over here. That is exactly what this GridSearchCV will do: it returns my best model and my best parameters. And once we have that best model with those best parameters, it will definitely give us the best accuracy. That is the advantage of using GridSearchCV on top of plain cross-validation.

Now I guess you are all clear about what GridSearchCV is, so you might be asking, what exactly is this RandomizedSearchCV? Here in GridSearchCV, what happens is that we try every permutation and combination of these parameters. But in RandomizedSearchCV, I am going to say: just pick some random combinations. Let's say it picks three hundred from here, 'log2' from here, and two from there, and just checks the accuracy. Whatever accuracies it gets are stored in some data structure, let's say a list, and out of those it returns me my best accuracy. That is what this RandomizedSearchCV will do.

So if you are wondering which one of these is better: if I talk in terms of computation power, RandomizedSearchCV definitely wins, because it only picks random combinations and does not have to go through every permutation and combination. What if I have, let's say, ten parameters, and the list of values stored against each key has, let's say, twenty values? You can see how complex it becomes if I try every combination over here.
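To make that concrete, here is a sketch of the randomized alternative; the value lists, n_iter, and the dataset are again illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

param_distributions = {
    "n_estimators": [100, 200, 300, 500, 1000],
    "max_features": ["sqrt", "log2"],
    "max_depth": [None, 2, 4, 8, 16],
}

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(),
    param_distributions=param_distributions,
    n_iter=10,              # only 10 random combinations are evaluated, not the full grid
    cv=5,
    random_state=42,        # makes the random sampling reproducible
)
search.fit(X, y)

print(search.best_params_)  # the best of the sampled combinations
print(search.best_score_)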
So to get rid of this issue, what I can do is pick up some random parameter combinations and then simply determine the best accuracy among them. Really, both of the approaches are good, but RandomizedSearchCV, I can say, is better than GridSearchCV in terms of computation power: GridSearchCV takes a lot of time at execution, whereas RandomizedSearchCV takes less time compared to this grid search.

Similarly, with respect to other algorithms, you have multiple parameters. If I talk about logistic regression, we have something known as the penalty parameter, with which I am going to say what type of regularization technique I am going to use. There are lots of parameters like this in each and every machine learning algorithm. Similarly we have SVM, where we choose what type of kernel we are going to use. In the case of decision trees, we choose what approach we are going to use to build the tree, whether the concept of entropy, that is information gain, or the concept of Gini impurity, and similarly what the maximum depth of the decision tree is. These are all your hyperparameters that you can play with (typical search spaces for these estimators are sketched below). And once you get your best values, just train the model with them. Once you train this model with those best parameters, it means you have your best model, and you will definitely get your best accuracy.

This applies to literally every algorithm, whether it is classification or regression. That's all for this session. Hopefully you loved this session very much. Thank you, have a nice day. Keep learning, keep growing, keep practicing.
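To close out the examples, here is a hedged sketch of search spaces for the other estimators mentioned above; every value list is an illustrative assumption, and each dictionary can be handed to GridSearchCV or RandomizedSearchCV exactly as shown earlier.

# candidate hyperparameters for the other algorithms discussed in this session
logistic_regression_grid = {
    "penalty": ["l1", "l2"],              # type of regularization technique
    "C": [0.01, 0.1, 1.0, 10.0],          # inverse regularization strength
    "solver": ["liblinear"],              # a solver that supports both penalties above
}

svm_grid = {
    "kernel": ["linear", "rbf", "poly"],  # which kernel the SVM uses
    "C": [0.1, 1.0, 10.0],
}

decision_tree_grid = {
    "criterion": ["gini", "entropy"],     # Gini impurity vs. information gain (entropy)
    "max_depth": [None, 3, 5, 10],        # maximum depth of the decision tree
}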