Hold on, before going ahead in this session, let's have a glance at what we have done in all of our previous sessions, from the initial ones till now. Basically, from raw data to cleaning, lots of analysis, lots of techniques for feature encoding, outlier removal, and feature selection as well. We have automated all the processes for our model, and then we checked what exactly the accuracy of the different algorithms is. We have played with all these different kinds of algorithms, like linear regression, decision trees, random forest, and so on, and you can play with multiple regression and whatever you want. So in this session we have an assignment with a problem statement in which I have to hyper-tune my model: why there is a need for performing hyperparameter tuning, and why you have to apply it to your data and to your machine learning algorithm. So let's see: if I press Shift+Tab over here, you will see, with respect to this estimator, these are exactly all the default parameters selected for your decision tree. But there is no guarantee that these defaults are my best parameters. So what we are going to do is basically use a hyperparameter tuning approach.
For this we have RandomizedSearchCV and GridSearchCV, different cross-validation approaches. What they will do is basically return us the best parameters for our model, so that my training will happen in the best way and it will return the best score. That's what all these cross-validation based hyperparameter tuning approaches do. So basically there are two approaches that you can go ahead with. The first one is the randomized search approach, RandomizedSearchCV; the second one is grid search. You'll see there are also some more advanced approaches, like genetic algorithms, but they are quite advanced. So we are going to deal with grid search or randomized search, and it's all up to you; basically, we are going to deal with randomized search. So the very first thing I'm going to do is import it. I'm going to say from sklearn.model_selection I'm going to import my RandomizedSearchCV. So if I initialize it, you will see over here: I'm just going to copy, paste, and press Shift+Tab, and you will see all these different parameters.
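The import step described above can be sketched as follows; printing the constructor signature with `inspect` is a stand-in for the notebook's Shift+Tab pop-up (the parameter names shown come from scikit-learn's API, not this transcript):

```python
# A minimal sketch of the import step; in a Jupyter notebook, Shift+Tab on
# RandomizedSearchCV shows the same signature that inspect prints here.
import inspect
from sklearn.model_selection import RandomizedSearchCV

print(inspect.signature(RandomizedSearchCV.__init__))
```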
What exactly is the estimator? It is nothing but the object of your machine learning algorithm. Then I have param_distributions, in which I have to pass whatever parameters of my machine learning algorithm I want to tune, and I have to pass those parameters in the form of a dictionary. Then I have n_iter, the number of iterations that I want, and the scoring parameter; I have to play with all these different parameters as well. So what I'm going to do over here: let's say I take the random forest regressor, and if you press Shift+Tab, you will see all these different parameters in the case of random forest. This one is exactly my n_estimators; then what is your criterion, what is the max_depth of the tree, what is min_samples_split, what are your min_samples_leaf, and all these different parameters. So you have to play with all these parameters to hyper-tune the model. Now that we have all these different parameters, let's say what I'm going to do: I'm basically going to create a random_grid as a dictionary.
Here I'm basically going to define all the parameters of random forest in the form of a dictionary. The very first parameter is exactly my n_estimators, and n_estimators is nothing but the number of trees in the forest. So I will set the value of this key from here: I'm going to say for x in np.linspace. If you press Shift+Tab on np.linspace, you will see you give where you have to start, at what point you have to stop, and the number of items that you want in your array or list, whatever you ask for. So here I'm going to say my start is nothing but 100, I'm going to say I have to stop at, let's say, 1200, and for how many total estimator values I want, I'm going to say I just need, let's say, six. After that, I have to convert each value into an integer so that I can use it. This is exactly nothing but a piece of list-comprehension code. If you're not much comfortable with this list comprehension code, you can follow my Basics of Python course, in which I have covered all these basics of Python in approximately an hour. So now, what do we have to do next?
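The list-comprehension step can be sketched like this (100, 1200 and six values are the session's illustrative numbers, not tuned recommendations):

```python
import numpy as np

# np.linspace(start, stop, num) returns `num` evenly spaced values between
# start and stop (inclusive); int(x) casts each float to a whole tree count.
n_estimators = [int(x) for x in np.linspace(start=100, stop=1200, num=6)]
print(n_estimators)  # → [100, 320, 540, 760, 980, 1200]
```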
Here I have to mention my 'n_estimators' key, and it is exactly that list that you have to mention over here. So here I would say n_estimators equals that list. After that, we also have to set what my max_features are. max_features is all about the number of features to consider at every split of the decision tree. So for 'max_features' I am going to fill in a list: the very first value is nothing but 'auto', and the second one is nothing but 'sqrt'. After that, we have to deal with our max_depth, which is exactly the maximum number of levels in our decision tree. So here I am going to say 'max_depth', and it is nothing but, again, just a list. Let's say I'm just going to copy this entire linspace code and do some modifications over here; just copy and paste, and this time, let's say, I have to start from 5 and go till 30, and I'd say I just want four values. Then you have to assign this value to your 'max_depth' key.
After that, I still need some more parameters. The last one that we need is exactly min_samples_split. What this parameter is all about: the minimum number of samples required to split a node. Here I'm going to say you can consider these random values: 5, 10, 15 and 100. I'm considering these values from my own experience of working in the machine learning domain. So after that, we have to just execute it, and we also have to execute this dictionary, which is exactly your random_grid. Now, if I print this dictionary, random_grid, this is exactly the dictionary that you have to pass to RandomizedSearchCV in your param_distributions parameter. So then I have to initialize my RandomizedSearchCV, and here, in the estimator parameter, you have to pass the object of your random forest first. So let's say I'm going to create an object of random forest first. Here I'm going to say from sklearn.ensemble, just press Tab, I have to import my RandomForestRegressor. So I'm just going to import this, and just execute it. Now, what you have to do is simply initialize this.
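Putting the four keys together, the random_grid dictionary and the regressor import look roughly like this (the value ranges are the session's illustrative picks; note that the string 'auto' for max_features has been removed in recent scikit-learn releases, so newer versions reject it at fit time):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# The parameter grid assembled in the session. Keys must match the
# RandomForestRegressor argument names exactly.
random_grid = {
    "n_estimators": [int(x) for x in np.linspace(start=100, stop=1200, num=6)],
    "max_features": ["auto", "sqrt"],  # 'auto' is gone in newer sklearn; prefer 1.0 or 'sqrt'
    "max_depth": [int(x) for x in np.linspace(start=5, stop=30, num=4)],
    "min_samples_split": [5, 10, 15, 100],
}

regressor_rf = RandomForestRegressor()  # the estimator object to be tuned
print(random_grid)
```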
I'm going to say it is mine, let's say regressor_rf, or whatever you want to name it; it's all up to you, just execute it. Now, in the estimator parameter you have to pass this object, so I would say estimator equals regressor_rf, and in the very second parameter, param_distributions, you have to set this random_grid. So here I am going to set my random_grid parameter, and let's see: then I will set my cv, my cross-validation, equal to three; by default it is five. After that there is my verbose parameter, which I'll set to, say, 2; verbose is basically there to show you whatever activity is happening across your cells once you execute. That's what verbose does. And after that, I'm going to set my n_jobs parameter as, let's say, minus one. Whenever you pass this minus one, it means it will use all the cores; it means it will use all the resources of your system. Let's say I want to store it in some variable; here I would say that one is rf_random, so just execute it. Now what we have to do is simply fit our data. So I'm going to say rf_random.fit. What do we have to fit? We have to fit basically X_train and, definitely, y_train, and just execute it. It will take some couple of seconds.
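End to end, the search setup and fit can be sketched as below. X_train and y_train come from the earlier sessions and aren't shown here, so a tiny synthetic stand-in is used, and the grid is shrunk so the sketch runs in seconds ('auto' for max_features is omitted because recent scikit-learn rejects it):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Toy stand-in for the course's training data.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 4))
y_train = X_train @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=80)

small_grid = {"n_estimators": [50, 100], "max_depth": [5, 10], "min_samples_split": [2, 5]}

rf_random = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=0),
    param_distributions=small_grid,  # the dictionary of candidate values
    n_iter=4,      # how many random combinations to try
    cv=3,          # 3-fold cross-validation (the default is 5)
    verbose=2,     # print a progress line for every fold that is fitted
    n_jobs=-1,     # -1 = use every CPU core available
    random_state=0,
)
rf_random.fit(X_train, y_train)
print(rf_random.best_params_)
```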
You will see all these things come across; that's because you have set your verbose equal to 2. It will take a couple of seconds, depending upon what processor you are using, depending upon the specifications of your system. Now you will see all your stuff gets executed, and it has taken that much time on my system with my own specifications; in your case, it will definitely depend upon what system you have. Now you will see these are all the parameters returned by my cross-validation, returned by my randomized search cross-validation. After that, what we are going to do: let's say I'm going to check what my best parameters are. So here I'm going to write rf_random.best_params_; just execute it, and you'll see these are my best parameters selected by my cross-validation. So now, the very first thing I'm going to do is a prediction. I'm going to say rf_random.predict, because you have to predict on your X_test data, and I'm going to store this prediction in, let's say, prediction. Let's say I'm going to execute it, and this time you can check what exactly the distribution is of your actual data minus whatever prediction you have done.
You will see this is the type of distribution that I have achieved using my randomized search. And if you want to check what exactly the accuracy is after doing all this hyperparameter tuning, for this I'm going to say metrics.r2_score, and here you have to pass what the actual data is and what exactly the prediction is. Just execute it, and you will see that now, in the case of random forest, you have somewhere approximately 83 percent accuracy. But before, when you used random forest without tuning, you can observe you had somewhere close to 80 percent accuracy. That is the power of hyper-tuning your model. Whenever you are going to work on real-world projects, you always have to hyper-tune your model; you always have to cross-validate your model. That's the power of your model: it will definitely increase your accuracy. Now let's say I have to save it; let's say I have to dump this best model that I have created over here. The very first thing I'm going to do is open some file in which I can dump this. For this, I'm going to say where I have to open it; I'm just going to copy this part, paste it over there, and here I'm going to say that this time my model name is, let's say, rf_random.pkl.
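The evaluation step can be sketched with placeholder arrays (in the session these are y_test and the output of rf_random.predict(X_test), and seaborn's distplot is used for the residual histogram):

```python
import numpy as np
from sklearn import metrics

# Placeholder actual / predicted values standing in for the course data.
y_test = np.array([3.0, -0.5, 2.0, 7.0])
prediction = np.array([2.5, 0.0, 2.0, 8.0])

residuals = y_test - prediction  # in the notebook: sns.distplot(y_test - prediction)
r2 = metrics.r2_score(y_test, prediction)  # 1.0 would be a perfect fit
print(r2)
```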
And then I have to say in what mode I have to open this file; so here I would say write binary. Just execute it. And now what we have to do is use pickle.dump, and here I have to say I have to dump this rf_random into my file. So just execute it, and now you will see over here, here is exactly your model that is being created right now. Let's say we have to do prediction using that model, using that model that I have created over here. So what we have to do first is load this model that we have dumped. So now I'm going to say, very first, what I have to open. Let's say I'm just going to copy this code, go up, and just paste it over here; let's say I have to load my previous model that I have created, which is exactly my model.pkl. And here I have to say to open it in read binary mode, because I have to read that file. After that, what I have to do: I'm going to save it in the file variable, and after that I basically have to load my model. Here I would say pickle.load, and I'm going to store it, let's say, in model. And after that, I would say I have to load this model.
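The dump-and-load round trip can be sketched like this; 'rf_random.pkl' is just the illustrative filename, and a tiny fitted regressor stands in for the tuned model:

```python
import pickle
from sklearn.ensemble import RandomForestRegressor

# Stand-in for the tuned model from the search above.
rf = RandomForestRegressor(n_estimators=10, random_state=0)
rf.fit([[0], [1], [2], [3]], [0, 1, 2, 3])

with open("rf_random.pkl", "wb") as f:  # 'wb' = write in binary mode
    pickle.dump(rf, f)                  # serialize the model to disk

with open("rf_random.pkl", "rb") as f:  # 'rb' = read in binary mode
    model = pickle.load(f)              # deserialize it back

print(type(model).__name__)  # → RandomForestRegressor
```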
So once you execute it, you will see this is that model of the random forest class that you imported earlier. Similarly, you can load this cross-validated model that you have dumped over here. Now let's say I have to do prediction using this model. So I'm going to say model.predict; you have to just use the predict function, and decide on what data you have to predict. So what am I going to do? Very first, you have to store this model somewhere; so let's say I'm going to store it again, let's say as forest, for this random forest; it's all up to you. And after that, using this forest, you have to call the predict function, and here you have to mention, let's say, X_test. And if you execute it, you'll see over here you have all your predictions with respect to this X_test data. That's it. Now I have to store these predictions somewhere, as, let's say, predictions_dump. So just execute it; all the predictions are exactly over here. Let's say I have to check the accuracy as well. So here I would say metrics.r2_score.
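Predicting with the reloaded model and re-checking R² can be sketched as below (an in-memory dumps/loads round trip replaces the file handling, and toy data stands in for the course's X_test and y_test):

```python
import pickle
import numpy as np
from sklearn import metrics
from sklearn.ensemble import RandomForestRegressor

# Fit a small stand-in model on toy data, round-trip it through pickle,
# then predict and score exactly as the session does with its loaded model.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel()
forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

model = pickle.loads(pickle.dumps(forest))  # dump + load in one step

X_test, y_test = X[5:15], y[5:15]
predictions_dump = model.predict(X_test)
print(metrics.r2_score(y_test, predictions_dump))
```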
And here you have to mention what exactly your actual data is and, after that, what exactly your predictions are, which is predictions_dump. So just execute it, and you will see this is the previous accuracy that you had achieved earlier, and similarly you can perform all of this for your cross-validated model as well. So in such scenarios, you will definitely get somewhere around the approximately 80 percent accuracy that you have achieved over here. That's what I'm trying to show you. So let's say you have some new data: you have to just pass that new data over here, and you will get your accuracy. So that's all about this project. Hopefully you'll love this project very much, and try to explore it on your own as much as you can. So thank you, guys. Have a nice day. Keep learning, keep growing, keep motivating.