1 00:00:00,900 --> 00:00:09,310 In the last video, we used data augmentation techniques to increase our validation accuracy to around 82 2 00:00:09,310 --> 00:00:10,860 to 83 percent. 3 00:00:12,940 --> 00:00:21,320 In this video, we will use the VGG16 model architecture to further increase our validation accuracy 4 00:00:21,520 --> 00:00:22,660 to above 90 percent. 5 00:00:26,620 --> 00:00:31,460 VGG16 was the runner-up of the 2014 6 00:00:32,220 --> 00:00:34,390 ILSVRC competition. 7 00:00:36,280 --> 00:00:43,290 The problem statement of that competition was to categorize millions of pictures into a thousand 8 00:00:43,330 --> 00:00:44,620 different categories. 9 00:00:46,640 --> 00:00:50,880 The pictures were of animals, humans, et cetera. 10 00:00:52,110 --> 00:00:57,300 And the categories were of different animal species and many others. 11 00:00:59,280 --> 00:01:01,330 So the problem we are trying to solve, 12 00:01:01,890 --> 00:01:06,920 you can consider it as a subset of the 2014 13 00:01:07,610 --> 00:01:10,230 ILSVRC competition data. 14 00:01:13,190 --> 00:01:15,170 As we discussed in our earlier lecture, 15 00:01:16,160 --> 00:01:23,840 we can use the convolutional part of these pretrained model architectures for similar problems. 16 00:01:27,110 --> 00:01:35,700 These models consist of two parts: a convolutional base and then a fully connected neural network base. 17 00:01:36,410 --> 00:01:42,800 The convolutional base is used to identify features from the images, and then 18 00:01:43,980 --> 00:01:48,630 the fully connected neural base is used to classify those features. 19 00:01:51,980 --> 00:02:00,920 So for any similar kind of problem, we can easily use a pretrained convolutional base to extract features 20 00:02:00,920 --> 00:02:02,000 from our images. 21 00:02:03,580 --> 00:02:10,960 And then we can add a few layers of a fully connected neural network to classify the result of this conv 22 00:02:11,220 --> 00:02:11,650 base.
23 00:02:13,520 --> 00:02:15,470 In this video, the idea is the same. 24 00:02:16,190 --> 00:02:20,600 We will use the conv base of the VGG16 model. 25 00:02:21,900 --> 00:02:29,100 And then we will add one fully connected hidden layer and one output layer to classify the features 26 00:02:29,460 --> 00:02:32,790 extracted from our data by this pretrained conv base. 27 00:02:35,860 --> 00:02:38,130 So let's start. First, 28 00:02:38,230 --> 00:02:44,740 we will be creating two objects, a train generator and a validation generator. Note that we have already used 29 00:02:45,340 --> 00:02:49,240 the same generators in previous videos also. 30 00:02:50,560 --> 00:02:52,330 So we are using the same settings. 31 00:02:52,900 --> 00:02:57,930 We are using the rescaling of 1/255 to convert our RGB 32 00:02:57,930 --> 00:02:58,690 pixel values 33 00:02:59,640 --> 00:03:04,630 from zero to 255 to zero to one. 34 00:03:05,430 --> 00:03:13,720 Then we have a rotation range, width shift, height shift, shear range, zoom range and horizontal flip to create 35 00:03:13,720 --> 00:03:15,210 dummy augmented data. 36 00:03:16,890 --> 00:03:21,300 And our target size for images is 150 by 150. 37 00:03:21,390 --> 00:03:23,670 And we are using a batch size of twenty. 38 00:03:24,870 --> 00:03:30,120 We have already discussed this in detail in our previous videos, 39 00:03:30,180 --> 00:03:36,810 so we are not going to discuss this here. Just like in our previous case, 40 00:03:37,500 --> 00:03:39,420 we have around 2000 images 41 00:03:40,370 --> 00:03:43,040 for training and a thousand images for validation. 42 00:03:45,740 --> 00:03:49,820 Now, the second step is to create the architecture for our model. 43 00:03:51,510 --> 00:03:59,340 Now, our idea is to first use the conv base of VGG16 and then use two dense layers. 44 00:04:03,630 --> 00:04:10,140 So to use the conv base of VGG16, you can directly import it from Keras.
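As a reference, the two generators described above can be sketched as follows. This is a minimal sketch assuming the tensorflow.keras API; the directory paths and the exact augmentation values (rotation range, shift, shear and zoom factors) are illustrative assumptions, since the video does not spell them out. The flows are wrapped in a helper so the snippet runs without the dataset being present.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training generator: rescale 0-255 pixel values to 0-1 and add random
# augmentations to create dummy augmented data.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,        # illustrative values, not from the video
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)

# Validation data is only rescaled, never augmented.
validation_datagen = ImageDataGenerator(rescale=1.0 / 255)

def make_generators(train_dir, validation_dir):
    """Build the two directory flows; the paths are placeholders."""
    train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(150, 150),  # resize every image to 150x150
        batch_size=20,
        class_mode="binary",     # two classes
    )
    validation_generator = validation_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode="binary",
    )
    return train_generator, validation_generator
```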
45 00:04:10,830 --> 00:04:15,360 There is no need to manually build all the conv layers in that base. 46 00:04:15,960 --> 00:04:18,570 So to import VGG16, 47 00:04:19,020 --> 00:04:23,340 you can just write from tensorflow.keras.applications import 48 00:04:23,370 --> 00:04:24,290 VGG16. 49 00:04:24,900 --> 00:04:28,260 And then give these three different parameters. 50 00:04:32,330 --> 00:04:36,920 So we are creating our conv base object and we are using 51 00:04:36,980 --> 00:04:37,530 VGG16. 52 00:04:38,630 --> 00:04:41,180 And these are the three parameters that we are passing. 53 00:04:42,200 --> 00:04:45,020 First, we need to provide weights. 54 00:04:46,700 --> 00:04:52,270 So in any convolutional neural network, first we provide randomized weights, 55 00:04:52,910 --> 00:04:57,200 and then our convolutional network tries to optimize those weights. 56 00:04:59,220 --> 00:04:59,790 Since 57 00:05:00,810 --> 00:05:08,880 VGG16 was used in that competition, we can use the final weights of that model. 58 00:05:12,260 --> 00:05:18,100 So to use those weights, we have to write weights equal to 'imagenet'. Image- 59 00:05:18,200 --> 00:05:19,520 Net is the competition, 60 00:05:20,870 --> 00:05:22,430 the ILSVRC competition. 61 00:05:25,610 --> 00:05:30,260 So to use pretrained weights, we just set weights equal to 'imagenet'. 62 00:05:31,040 --> 00:05:34,040 And then, there were two parts of the 63 00:05:34,130 --> 00:05:41,830 VGG16 model: first the conv base and then the fully connected neural network base. 64 00:05:42,590 --> 00:05:49,730 We only want the conv base from that model, since conv bases are reusable. 65 00:05:50,180 --> 00:05:55,850 They are mainly used to extract features and not to categorize the images. 66 00:05:56,390 --> 00:06:02,210 So we will be using only the conv base, and we only need to import the conv base.
67 00:06:02,860 --> 00:06:07,610 That's why we are using include_top equal to False. 68 00:06:09,810 --> 00:06:16,320 If we want to import the whole model, along with the fully connected dense layers, then you have 69 00:06:16,320 --> 00:06:17,610 to change it to True. 70 00:06:19,710 --> 00:06:26,530 But in our case, since we are only importing the convolutional base, we are providing False here. 71 00:06:29,750 --> 00:06:33,110 Then the next parameter is to give the input shape. 72 00:06:34,910 --> 00:06:39,410 The input shape of our images is 150 by 150 by three. 73 00:06:39,590 --> 00:06:43,040 That's why we are providing this tuple here. 74 00:06:44,090 --> 00:06:45,110 Let's run this. 75 00:06:46,260 --> 00:06:50,970 So we have imported our conv base from the VGG16 model. 76 00:06:54,200 --> 00:06:58,850 Now, to look at it, you can just write conv_base dot summary. 77 00:07:02,030 --> 00:07:11,670 If you run this, you will get details of all the layers of this VGG16 pretrained conv base. 78 00:07:15,110 --> 00:07:17,530 Now, as we discussed in our earlier lecture, 79 00:07:18,530 --> 00:07:26,990 VGG16 has five convolutional blocks. So here you can see the first convolutional block, then 80 00:07:26,990 --> 00:07:30,230 the second convolutional block, and in each block 81 00:07:30,470 --> 00:07:32,420 there are multiple layers. 82 00:07:32,690 --> 00:07:34,460 So in the first and second blocks, 83 00:07:34,490 --> 00:07:36,000 there are two conv layers 84 00:07:36,410 --> 00:07:41,620 and then a max pooling layer. In the third, fourth and fifth blocks, 85 00:07:42,290 --> 00:07:47,070 there are three conv layers and then a max pooling layer. 86 00:07:52,310 --> 00:08:01,760 So in a way, by importing VGG16, we avoided creating all these layers, and we have already imported 87 00:08:01,760 --> 00:08:04,580 the final weights of that model.
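In code, the import and the three parameters discussed above look roughly like this. This is a sketch assuming the tensorflow.keras packaging of Keras; note that the first call downloads the pretrained ImageNet weights.

```python
from tensorflow.keras.applications import VGG16

# Build only the convolutional base, initialised with the weights learned
# on the ImageNet (ILSVRC) data rather than with random values.
conv_base = VGG16(
    weights="imagenet",         # pretrained ILSVRC weights
    include_top=False,          # drop the fully connected classifier layers
    input_shape=(150, 150, 3),  # 150x150 RGB images
)

conv_base.summary()  # lists the conv and pooling layers of the five blocks
```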
88 00:08:05,120 --> 00:08:09,950 So there is no need to randomly provide weights and optimize those weights. 89 00:08:11,000 --> 00:08:15,890 We already have the final model weights with us in this conv base. 90 00:08:18,580 --> 00:08:25,840 Now, the next step is to add a fully connected dense layer and an output layer on top of this conv base. 91 00:08:27,820 --> 00:08:31,480 Now, this is similar to creating any CNN model. 92 00:08:33,520 --> 00:08:35,460 We just have to create our model first. 93 00:08:36,190 --> 00:08:38,420 We are using models dot Sequential. 94 00:08:39,310 --> 00:08:46,200 And then, just like you add any other layer, you can add the conv base that we have imported. 95 00:08:47,580 --> 00:08:50,330 So we will write model dot add, 96 00:08:50,850 --> 00:08:55,750 and here you can just write the variable in which we have stored 97 00:08:55,830 --> 00:08:56,960 this VGG16 conv base. 98 00:08:57,690 --> 00:08:59,780 So that variable was conv_base. 99 00:09:00,600 --> 00:09:03,620 So first we can add this conv base. 100 00:09:04,500 --> 00:09:06,880 Next, we have to use a flatten layer, 101 00:09:08,010 --> 00:09:10,320 and then include a fully connected dense layer 102 00:09:10,500 --> 00:09:11,490 and then an output layer. 103 00:09:11,970 --> 00:09:17,010 So first, we are adding the flatten layer, then our dense layer with 256 neurons, 104 00:09:17,850 --> 00:09:20,850 and then an output layer with a single neuron. 105 00:09:22,530 --> 00:09:24,840 The activation is ReLU in the dense layer, 106 00:09:25,010 --> 00:09:26,530 and the activation is sigmoid 107 00:09:27,150 --> 00:09:28,150 in the output layer. 108 00:09:31,230 --> 00:09:35,790 You can run this and then you can look at the model summary. 109 00:09:37,990 --> 00:09:43,820 So if you see, this is our model summary; our first layer is VGG16. 110 00:09:44,950 --> 00:09:51,400 We have around 14.7 million trainable parameters in this VGG16 layer.
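Putting the layers together, the model described above can be sketched like this. With a 150x150 input, the conv base outputs 4x4x512 feature maps, so the flattened vector has 8192 values; that is where the roughly two million dense-layer parameters come from.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# The pretrained conv base, as imported earlier in the video.
conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))

model = models.Sequential()
model.add(conv_base)                              # feature extractor
model.add(layers.Flatten())                       # 4x4x512 maps -> 8192 values
model.add(layers.Dense(256, activation="relu"))   # hidden layer: 8192*256 + 256 params
model.add(layers.Dense(1, activation="sigmoid"))  # single neuron for two classes

model.summary()  # shows the VGG16 block, flatten, dense and output layers
```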
111 00:09:52,060 --> 00:09:57,810 Then we have a flatten layer; then we have a dense layer with around two million trainable parameters. 112 00:09:58,570 --> 00:10:06,030 And then finally, an output layer with 257 trainable parameters. The total trainable parameters 113 00:10:06,370 --> 00:10:08,950 in our model are around 16 million. 114 00:10:10,000 --> 00:10:15,290 Now, as I told you earlier, we are using the weights of the final 115 00:10:15,310 --> 00:10:16,490 trained VGG16 model. 116 00:10:18,190 --> 00:10:20,710 So the weights are already optimized 117 00:10:20,980 --> 00:10:22,780 in this VGG16 layer. 118 00:10:23,800 --> 00:10:30,040 Now, if you don't want to train those weights, you can just freeze that layer. 119 00:10:30,550 --> 00:10:35,640 To freeze it, you can use conv_base dot trainable equal to False. 120 00:10:36,460 --> 00:10:42,190 In that case, the trainable parameters here will turn to zero, 121 00:10:43,090 --> 00:10:49,090 and our model will not try to optimize the weights of this layer. In that way, 122 00:10:49,210 --> 00:10:57,040 we can significantly reduce the number of trainable parameters in our model and significantly improve our 123 00:10:57,130 --> 00:10:57,790 execution time. 124 00:11:00,400 --> 00:11:07,450 So if you run this conv_base dot trainable equal to False, our number of trainable parameters will reduce 125 00:11:07,450 --> 00:11:10,870 from 16 million to just 2.1 million. 126 00:11:13,300 --> 00:11:19,290 But here we are not running this, and we are training all the sixteen million parameters. 127 00:11:20,050 --> 00:11:25,630 But in case you want to save time, you can run this conv_base dot trainable equal to False. 128 00:11:31,700 --> 00:11:34,280 Now, the next step is to compile our model. 129 00:11:36,630 --> 00:11:40,310 We will be using the loss function of binary crossentropy,
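Freezing the base, as described, is a single assignment. A small sketch: the comments assume the standard VGG16 conv base, whose 13 conv layers contribute a kernel and a bias tensor each, so 26 trainable weight tensors before freezing.

```python
from tensorflow.keras.applications import VGG16

conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))

print(len(conv_base.trainable_weights))  # 26: one kernel and one bias per conv layer
conv_base.trainable = False              # freeze every layer of the base
print(len(conv_base.trainable_weights))  # 0: nothing in the base will be updated
```

Freeze the base before compiling the model; if you freeze it after compiling, compile again so the change takes effect.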
130 00:11:41,100 --> 00:11:50,010 since we have two classes. Then we are using RMSprop as our optimizer, with a learning rate of two 131 00:11:50,070 --> 00:11:51,810 into ten raised to minus five. 132 00:11:53,460 --> 00:12:01,260 We are using a somewhat smaller learning rate here just because we want to fine-tune our already trained 133 00:12:01,260 --> 00:12:01,570 model. 134 00:12:03,360 --> 00:12:08,230 The weights of these convolutional layers are already optimized, 135 00:12:08,370 --> 00:12:13,560 and we just want to optimize them in little steps according to our problem. 136 00:12:14,490 --> 00:12:20,370 So since we are fine-tuning it, and we are not training it from randomly assigned weights, 137 00:12:20,730 --> 00:12:23,160 we can use a smaller learning rate. 138 00:12:24,090 --> 00:12:27,900 That's why we are using two into ten raised to minus five. 139 00:12:28,770 --> 00:12:29,470 And the metric 140 00:12:29,520 --> 00:12:32,040 we want to calculate is accuracy. 141 00:12:33,950 --> 00:12:38,730 Training this model will take somewhere between eight to ten hours, 142 00:12:39,450 --> 00:12:45,600 so it is better to use callbacks to save your model after each epoch. 143 00:12:48,010 --> 00:12:55,400 We are creating our checkpoint callback, and we are saving our model after each epoch. 144 00:12:58,000 --> 00:13:03,450 You can also use the save_best_only parameter here if you don't want to save 145 00:13:03,940 --> 00:13:04,810 different models. 146 00:13:06,150 --> 00:13:13,150 And if you set save_best_only equal to True, it will save the model with the best validation 147 00:13:13,160 --> 00:13:13,470 accuracy. 148 00:13:16,590 --> 00:13:19,460 Now, the next step is to fit the training data.
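The compile step and the checkpoint callback can be sketched as below. The model here is a tiny stand-in so the snippet is self-contained, and the checkpoint filenames are assumptions, not the ones used in the video. Note that older Keras versions spell the learning-rate argument `lr` instead of `learning_rate`.

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.callbacks import ModelCheckpoint

# Tiny stand-in for the VGG16-based model built earlier in the video.
model = models.Sequential([layers.Dense(1, activation="sigmoid", input_shape=(8,))])

model.compile(
    loss="binary_crossentropy",                        # two classes
    optimizer=optimizers.RMSprop(learning_rate=2e-5),  # small rate: we only fine-tune
    metrics=["acc"],
)

# Save the model after every epoch; {epoch:02d} fills in the epoch number.
checkpoint = ModelCheckpoint("model_epoch_{epoch:02d}.h5")

# To keep only the single best model instead, monitor validation accuracy:
best_only = ModelCheckpoint("best_model.h5", save_best_only=True,
                            monitor="val_acc")
```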
149 00:13:21,740 --> 00:13:29,270 This step is similar to the last time: we will use fit_generator, and then the train generator with its steps 150 00:13:29,280 --> 00:13:35,940 per epoch, and we also give the validation data and validation steps for the validation data. 151 00:13:37,110 --> 00:13:42,860 And here we are also providing the callback to save our model after each epoch. 152 00:13:43,950 --> 00:13:46,410 So I have already executed this, 153 00:13:47,250 --> 00:13:49,410 and these are the results. 154 00:13:51,960 --> 00:13:58,080 So if you see, the validation accuracies are in the range of 92 to 97. 155 00:13:59,430 --> 00:14:06,900 So at the end of the epochs, we are getting a training accuracy of around 98 percent, 156 00:14:07,110 --> 00:14:10,470 and a validation accuracy of 98 percent as well. 157 00:14:14,010 --> 00:14:18,660 You can see each epoch is taking around 15 minutes to train. 158 00:14:19,710 --> 00:14:24,440 So just remember, this may take up to eight to ten hours to train 159 00:14:24,460 --> 00:14:24,960 your model. 160 00:14:28,900 --> 00:14:35,020 Now, let's look at how the accuracies and losses are changing with each epoch. 161 00:14:38,800 --> 00:14:41,750 The orange line here is for training accuracy, 162 00:14:42,070 --> 00:14:44,560 and the red line here is for validation accuracy. 163 00:14:45,010 --> 00:14:52,060 And similarly, we have the validation loss in green and the training loss in blue. 164 00:14:55,330 --> 00:15:06,100 You can see that the validation accuracy is oscillating between 97 and 98, and there is no further improvement 165 00:15:06,130 --> 00:15:10,080 in accuracy as we move from lower epochs to higher. 166 00:15:10,910 --> 00:15:14,740 So we can say that we have achieved convergence in our model, 167 00:15:14,950 --> 00:15:22,150 and it is not possible to further improve this validation accuracy by increasing the number of epochs.
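The fit call described above has roughly this shape, sketched as a function so the argument names are explicit. fit_generator is the older Keras API used in the video; recent TensorFlow versions accept generators directly in model.fit.

```python
def train_model(model, train_generator, validation_generator, checkpoint,
                epochs=30):
    """Sketch of the training call; the epoch count is an assumption."""
    return model.fit_generator(
        train_generator,
        steps_per_epoch=100,               # 2000 training images / batch of 20
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=50,               # 1000 validation images / batch of 20
        callbacks=[checkpoint],            # save the model after each epoch
    )
```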
168 00:15:24,190 --> 00:15:31,000 So if you compare the validation accuracy with our last CNN model: in the last CNN model, we were 169 00:15:31,000 --> 00:15:33,610 getting a maximum accuracy of 84 percent. 170 00:15:34,720 --> 00:15:44,290 But in this one, by using the VGG16 pretrained model, we are achieving up to 97 to 98 percent validation 171 00:15:44,290 --> 00:15:44,830 accuracy. 172 00:15:46,810 --> 00:15:52,340 And it is very easy to train our model using these pretrained models. 173 00:15:55,480 --> 00:15:59,880 So there is no need to create your own conv bases. 174 00:16:01,750 --> 00:16:06,520 You can just use any one of the pretrained conv bases 175 00:16:06,670 --> 00:16:12,610 if the problem statement is somewhat similar to the ImageNet problem statement. 176 00:16:17,510 --> 00:16:21,470 Now, I am also saving this history variable and the plots to a file. 177 00:16:21,880 --> 00:16:23,900 There is no need to do this 178 00:16:23,900 --> 00:16:27,620 step here. 179 00:16:28,280 --> 00:16:33,800 Till now, we were only calculating the accuracies on our validation sets. 180 00:16:35,770 --> 00:16:42,490 But now it's time to use our test set to see how this model performs on our test data. 181 00:16:45,340 --> 00:16:50,320 Now we have to follow the same steps to evaluate our model's performance. 182 00:16:51,880 --> 00:16:55,810 Again, we will be using a test generator. 183 00:16:58,950 --> 00:17:01,320 So we are creating another generator. 184 00:17:01,810 --> 00:17:03,540 We are calling it test_generator. 185 00:17:06,540 --> 00:17:08,940 We are using the object called test_datagen. 186 00:17:09,420 --> 00:17:12,550 This is the same kind of object we used for validation as well. 187 00:17:13,350 --> 00:17:20,100 So in this object, we are just rescaling our data from zero to 255 to zero to one. 188 00:17:20,910 --> 00:17:23,300 And then we are using flow_from_directory.
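The test generator and the evaluation step can be sketched as follows. The directory path is a placeholder, and evaluate_generator is the older Keras API; newer TensorFlow versions pass the generator to model.evaluate instead.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Same kind of object as for validation: rescale only, no augmentation.
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

def evaluate_on_test(model, test_dir="data/test"):
    """Evaluate on the held-out test images; the path is a placeholder."""
    test_generator = test_datagen.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode="binary",
    )
    # 1000 test images / batch size 20 -> 50 steps covers every image once.
    test_loss, test_acc = model.evaluate_generator(test_generator, steps=50)
    return test_loss, test_acc
```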
189 00:17:23,370 --> 00:17:27,620 And here we are providing the test directory instead of the validation directory. 190 00:17:28,140 --> 00:17:29,060 So we are passing 191 00:17:29,070 --> 00:17:29,850 the test directory. 192 00:17:31,450 --> 00:17:39,860 Now, normally, if we have data in the form of a data frame, we use flow_from_dataframe. 193 00:17:40,360 --> 00:17:47,550 But since we have our data flowing from a directory, that's why we have to use 194 00:17:47,580 --> 00:17:47,700 flow 195 00:17:47,890 --> 00:17:49,130 _from_directory. 196 00:17:51,640 --> 00:17:52,940 So this is similar to 197 00:17:53,200 --> 00:17:55,420 what we were using for the train generator. 198 00:17:57,010 --> 00:17:59,710 Similarly, for evaluation, we are using evaluate 199 00:17:59,710 --> 00:18:00,410 _generator. 200 00:18:01,150 --> 00:18:06,790 And here also we have to provide the test generator object and the number of steps. 201 00:18:07,870 --> 00:18:09,550 We have a batch size of 20, 202 00:18:10,060 --> 00:18:13,780 and we have a test data size of around a thousand images. 203 00:18:14,260 --> 00:18:16,750 That's why we need 50 steps. 204 00:18:18,640 --> 00:18:21,220 A thousand divided by 20 equals 50. 205 00:18:21,880 --> 00:18:25,990 So we will be able to cover all our test images in 50 steps. 206 00:18:26,500 --> 00:18:33,190 So if you run this, just like the evaluate method, you will get two values: first is the loss value, 207 00:18:33,280 --> 00:18:35,080 and second is the accuracy value. 208 00:18:35,890 --> 00:18:39,830 And here you can see that the accuracy we are getting is nearly 209 00:18:40,020 --> 00:18:41,080 ninety-seven percent. 210 00:18:44,270 --> 00:18:45,130 So just to review: 211 00:18:45,680 --> 00:18:50,810 we started with a simple convolutional model. 212 00:18:51,830 --> 00:18:55,750 At that time, we were getting an accuracy of around 74 percent.
213 00:18:57,350 --> 00:19:03,820 Then we used data augmentation techniques to create dummy data and avoid overfitting. 214 00:19:04,610 --> 00:19:09,710 In that case, we were getting an accuracy of around 83 to 84 percent. 215 00:19:12,190 --> 00:19:15,610 And in this case, we used a pretrained 216 00:19:15,770 --> 00:19:17,200 VGG16 model 217 00:19:19,440 --> 00:19:26,670 for our problem, and in this case, we are getting an accuracy of 97 percent. 218 00:19:27,810 --> 00:19:36,720 So we have increased our validation accuracy from 73 percent to 98 percent during this project. 219 00:19:38,520 --> 00:19:39,870 That's all for this project. 220 00:19:40,350 --> 00:19:40,800 Thank you.