1 00:00:00,900 --> 00:00:09,310 In the last video, we used data augmentation techniques to increase our validation accuracy to around 82 2 00:00:09,310 --> 00:00:10,860 to 83 percent. 3 00:00:12,940 --> 00:00:21,320 In this video, we will use the VGG16 model architecture to further increase our validation accuracy 4 00:00:21,520 --> 00:00:22,660 to above 90 percent. 5 00:00:26,620 --> 00:00:31,460 VGG16 was the runner-up of the 2014 6 00:00:32,220 --> 00:00:34,390 ILSVRC competition. 7 00:00:36,280 --> 00:00:43,290 The problem statement of that competition was to categorize millions of pictures into a thousand 8 00:00:43,330 --> 00:00:44,620 different categories. 9 00:00:46,640 --> 00:00:50,880 The pictures were of animals, humans, et cetera. 10 00:00:52,110 --> 00:00:57,300 And the categories were of different animal species and many others. 11 00:00:59,280 --> 00:01:01,330 So the problem we are trying to solve, 12 00:01:01,890 --> 00:01:06,920 you can consider it as a subset of the 2014 13 00:01:07,610 --> 00:01:10,230 ILSVRC competition data. 14 00:01:13,190 --> 00:01:15,170 As we discussed in our earlier lecture, 15 00:01:16,160 --> 00:01:23,840 we can use the convolutional part of these pretrained model architectures for similar problems. 16 00:01:27,110 --> 00:01:35,700 These models consist of two parts: a convolutional base and then a fully connected neural network base. 17 00:01:36,410 --> 00:01:42,800 The convolutional base is used to identify features from the images, and then 18 00:01:43,980 --> 00:01:48,630 the fully connected neural base is used to classify those features. 19 00:01:51,980 --> 00:02:00,920 So for any similar kind of problem, we can easily use a pretrained convolutional base to extract features 20 00:02:00,920 --> 00:02:02,000 from our images. 21 00:02:03,580 --> 00:02:10,960 And then we can add a few layers of a fully connected neural network to classify the result of this conv 22 00:02:11,220 --> 00:02:11,650 base.
23 00:02:13,520 --> 00:02:15,470 In this video, the idea is the same. 24 00:02:16,190 --> 00:02:20,600 We will use the conv base of the VGG16 model. 25 00:02:21,900 --> 00:02:29,100 And then we will add one fully connected hidden layer and one output layer to classify the features 26 00:02:29,460 --> 00:02:32,790 extracted from our data by this pretrained conv base. 27 00:02:35,860 --> 00:02:38,130 So let's start. First, 28 00:02:38,230 --> 00:02:44,740 we will be creating two objects, a train generator and a validation generator. Note that we have already used 29 00:02:45,340 --> 00:02:49,240 the same generators in previous videos also. 30 00:02:50,560 --> 00:02:52,330 So we are using the same settings. 31 00:02:52,900 --> 00:02:57,930 We are using the rescaling of 1/255 to convert our RGB 32 00:02:57,930 --> 00:02:58,690 pixel values 33 00:02:59,640 --> 00:03:04,630 from zero to 255 to zero to one. 34 00:03:05,430 --> 00:03:13,720 Then we have a rotation range, width shift, height shift, shear range, zoom range and horizontal flip to create 35 00:03:13,720 --> 00:03:15,210 dummy augmented data. 36 00:03:16,890 --> 00:03:21,300 And our target size for images is 150 by 150. 37 00:03:21,390 --> 00:03:23,670 And we are using a batch size of twenty. 38 00:03:24,870 --> 00:03:30,120 We have already discussed this in detail in our previous videos, 39 00:03:30,180 --> 00:03:36,810 so we are not going to discuss this here. Just like in our previous case, 40 00:03:37,500 --> 00:03:39,420 we have around 2000 images 41 00:03:40,370 --> 00:03:43,040 for training and a thousand images for validation. 42 00:03:45,740 --> 00:03:49,820 Now, the second step is to create the architecture for our model. 43 00:03:51,510 --> 00:03:59,340 Now, our idea is to first use the conv base of VGG16 and then use two dense layers. 44 00:04:03,630 --> 00:04:10,140 So to use the conv base of VGG16, you can directly import it from Keras.
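As a reference, the two generators described above can be sketched as follows. This is a minimal sketch assuming the tensorflow.keras API; the directory paths and the exact augmentation values (rotation range, shift, shear and zoom factors) are illustrative assumptions, since the video does not spell them out. The flows are wrapped in a helper so the snippet runs without the dataset being present.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training generator: rescale 0-255 pixel values to 0-1 and add random
# augmentations to create dummy augmented data.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,        # illustrative values, not from the video
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)

# Validation data is only rescaled, never augmented.
validation_datagen = ImageDataGenerator(rescale=1.0 / 255)

def make_generators(train_dir, validation_dir):
    """Build the two directory flows; the paths are placeholders."""
    train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(150, 150),  # resize every image to 150x150
        batch_size=20,
        class_mode="binary",     # two classes
    )
    validation_generator = validation_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode="binary",
    )
    return train_generator, validation_generator
```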
45 00:04:10,830 --> 00:04:15,360 There is no need to manually build all the conv layers in that base. 46 00:04:15,960 --> 00:04:18,570 So to import VGG16, 47 00:04:19,020 --> 00:04:23,340 you can just write from tensorflow.keras.applications import 48 00:04:23,370 --> 00:04:24,290 VGG16. 49 00:04:24,900 --> 00:04:28,260 And then give these three different parameters. 50 00:04:32,330 --> 00:04:36,920 So we are creating our conv base object and we are using 51 00:04:36,980 --> 00:04:37,530 VGG16. 52 00:04:38,630 --> 00:04:41,180 And these are the three parameters that we are passing. 53 00:04:42,200 --> 00:04:45,020 First, we need to provide weights. 54 00:04:46,700 --> 00:04:52,270 So in any convolutional neural network, first we provide randomized weights, 55 00:04:52,910 --> 00:04:57,200 and then our convolutional network tries to optimize those weights. 56 00:04:59,220 --> 00:04:59,790 Since 57 00:05:00,810 --> 00:05:08,880 VGG16 was used in that competition, we can use the final weights of that model. 58 00:05:12,260 --> 00:05:18,100 So to use those weights, we have to write weights equal to 'imagenet'. Image- 59 00:05:18,200 --> 00:05:19,520 Net is the competition, 60 00:05:20,870 --> 00:05:22,430 the ILSVRC competition. 61 00:05:25,610 --> 00:05:30,260 So to use pretrained weights, we just set weights equal to 'imagenet'. 62 00:05:31,040 --> 00:05:34,040 And then, there were two parts of the 63 00:05:34,130 --> 00:05:41,830 VGG16 model: first the conv base and then the fully connected neural network base. 64 00:05:42,590 --> 00:05:49,730 We only want the conv base from that model, since conv bases are reusable. 65 00:05:50,180 --> 00:05:55,850 They are mainly used to extract features and not to categorize the images. 66 00:05:56,390 --> 00:06:02,210 So we will be using only the conv base, and we only need to import the conv base.
67 00:06:02,860 --> 00:06:07,610 That's why we are using include_top equal to False. 68 00:06:09,810 --> 00:06:16,320 If we want to import the whole model, along with the fully connected dense layers, then you have 69 00:06:16,320 --> 00:06:17,610 to change it to True. 70 00:06:19,710 --> 00:06:26,530 But in our case, since we are only importing the convolutional base, we are providing False here. 71 00:06:29,750 --> 00:06:33,110 Then the next parameter is to give the input shape. 72 00:06:34,910 --> 00:06:39,410 The input shape of our images is 150 by 150 by three. 73 00:06:39,590 --> 00:06:43,040 That's why we are providing this tuple here. 74 00:06:44,090 --> 00:06:45,110 Let's run this. 75 00:06:46,260 --> 00:06:50,970 So we have imported our conv base from the VGG16 model. 76 00:06:54,200 --> 00:06:58,850 Now, to look at it, you can just write conv_base dot summary. 77 00:07:02,030 --> 00:07:11,670 If you run this, you will get details of all the layers of this VGG16 pretrained conv base. 78 00:07:15,110 --> 00:07:17,530 Now, as we discussed in our earlier lecture, 79 00:07:18,530 --> 00:07:26,990 VGG16 has five convolutional blocks. So here you can see the first convolutional block, then 80 00:07:26,990 --> 00:07:30,230 the second convolutional block, and in each block 81 00:07:30,470 --> 00:07:32,420 there are multiple layers. 82 00:07:32,690 --> 00:07:34,460 So in the first and second blocks, 83 00:07:34,490 --> 00:07:36,000 there are two conv layers 84 00:07:36,410 --> 00:07:41,620 and then a max pooling layer. In the third, fourth and fifth blocks, 85 00:07:42,290 --> 00:07:47,070 there are three conv layers and then a max pooling layer. 86 00:07:52,310 --> 00:08:01,760 So in a way, by importing VGG16, we avoided creating all these layers, and we have already imported 87 00:08:01,760 --> 00:08:04,580 the final weights of that model.
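In code, the import and the three parameters discussed above look roughly like this. This is a sketch assuming the tensorflow.keras packaging of Keras; note that the first call downloads the pretrained ImageNet weights.

```python
from tensorflow.keras.applications import VGG16

# Build only the convolutional base, initialised with the weights learned
# on the ImageNet (ILSVRC) data rather than with random values.
conv_base = VGG16(
    weights="imagenet",         # pretrained ILSVRC weights
    include_top=False,          # drop the fully connected classifier layers
    input_shape=(150, 150, 3),  # 150x150 RGB images
)

conv_base.summary()  # lists the conv and pooling layers of the five blocks
```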
88 00:08:05,120 --> 00:08:09,950 So there is no need to randomly provide weights and optimize those weights. 89 00:08:11,000 --> 00:08:15,890 We already have the final model weights with us in this conv base. 90 00:08:18,580 --> 00:08:25,840 Now, the next step is to add a fully connected dense layer and an output layer on top of this conv base. 91 00:08:27,820 --> 00:08:31,480 Now, this is similar to creating any CNN model. 92 00:08:33,520 --> 00:08:35,460 We just have to create our model first. 93 00:08:36,190 --> 00:08:38,420 We are using models dot Sequential. 94 00:08:39,310 --> 00:08:46,200 And then, just like you add any other layer, you can add the conv base that we have imported. 95 00:08:47,580 --> 00:08:50,330 So we will write model dot add, 96 00:08:50,850 --> 00:08:55,750 and here you can just write the variable in which we have stored 97 00:08:55,830 --> 00:08:56,960 this VGG16 conv base. 98 00:08:57,690 --> 00:08:59,780 So that variable was conv_base. 99 00:09:00,600 --> 00:09:03,620 So first we can add this conv base. 100 00:09:04,500 --> 00:09:06,880 Next, we have to use a flatten layer, 101 00:09:08,010 --> 00:09:10,320 and then include a fully connected dense layer 102 00:09:10,500 --> 00:09:11,490 and then an output layer. 103 00:09:11,970 --> 00:09:17,010 So first, we are adding the flatten layer, then our dense layer with 256 neurons, 104 00:09:17,850 --> 00:09:20,850 and then an output layer with a single neuron. 105 00:09:22,530 --> 00:09:24,840 The activation is ReLU in the dense layer, 106 00:09:25,010 --> 00:09:26,530 and the activation is sigmoid 107 00:09:27,150 --> 00:09:28,150 in the output layer. 108 00:09:31,230 --> 00:09:35,790 You can run this and then you can look at the model summary. 109 00:09:37,990 --> 00:09:43,820 So if you see, this is our model summary; our first layer is VGG16. 110 00:09:44,950 --> 00:09:51,400 We have around 14.7 million trainable parameters in this VGG16 layer.
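Putting the layers together, the model described above can be sketched like this. With a 150x150 input, the conv base outputs 4x4x512 feature maps, so the flattened vector has 8192 values; that is where the roughly two million dense-layer parameters come from.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# The pretrained conv base, as imported earlier in the video.
conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))

model = models.Sequential()
model.add(conv_base)                              # feature extractor
model.add(layers.Flatten())                       # 4x4x512 maps -> 8192 values
model.add(layers.Dense(256, activation="relu"))   # hidden layer: 8192*256 + 256 params
model.add(layers.Dense(1, activation="sigmoid"))  # single neuron for two classes

model.summary()  # shows the VGG16 block, flatten, dense and output layers
```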
111 00:09:52,060 --> 00:09:57,810 Then we have a flatten layer; then we have a dense layer with around two million trainable parameters. 112 00:09:58,570 --> 00:10:06,030 And then finally, an output layer with 257 trainable parameters. The total trainable parameters 113 00:10:06,370 --> 00:10:08,950 in our model are around 16 million. 114 00:10:10,000 --> 00:10:15,290 Now, as I told you earlier, we are using the weights of the final 115 00:10:15,310 --> 00:10:16,490 trained VGG16 model. 116 00:10:18,190 --> 00:10:20,710 So the weights are already optimized 117 00:10:20,980 --> 00:10:22,780 in this VGG16 layer. 118 00:10:23,800 --> 00:10:30,040 Now, if you don't want to train those weights, you can just freeze that layer. 119 00:10:30,550 --> 00:10:35,640 To freeze it, you can use conv_base dot trainable equal to False. 120 00:10:36,460 --> 00:10:42,190 In that case, the trainable parameters here will turn to zero, 121 00:10:43,090 --> 00:10:49,090 and our model will not try to optimize the weights of this layer. In that way, 122 00:10:49,210 --> 00:10:57,040 we can significantly reduce the number of trainable parameters in our model and significantly improve our 123 00:10:57,130 --> 00:10:57,790 execution time. 124 00:11:00,400 --> 00:11:07,450 So if you run this conv_base dot trainable equal to False, our number of trainable parameters will reduce 125 00:11:07,450 --> 00:11:10,870 from 16 million to just 2.1 million. 126 00:11:13,300 --> 00:11:19,290 But here we are not running this, and we are training all the sixteen million parameters. 127 00:11:20,050 --> 00:11:25,630 But in case you want to save time, you can run this conv_base dot trainable equal to False. 128 00:11:31,700 --> 00:11:34,280 Now, the next step is to compile our model. 129 00:11:36,630 --> 00:11:40,310 We will be using the loss function of binary crossentropy,
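Freezing the base, as described, is a single assignment. A small sketch: the comments assume the standard VGG16 conv base, whose 13 conv layers contribute a kernel and a bias tensor each, so 26 trainable weight tensors before freezing.

```python
from tensorflow.keras.applications import VGG16

conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))

print(len(conv_base.trainable_weights))  # 26: one kernel and one bias per conv layer
conv_base.trainable = False              # freeze every layer of the base
print(len(conv_base.trainable_weights))  # 0: nothing in the base will be updated
```

Freeze the base before compiling the model; if you freeze it after compiling, compile again so the change takes effect.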
130 00:11:41,100 --> 00:11:50,010 since we have two classes. Then we are using RMSprop as our optimizer, with a learning rate of two 131 00:11:50,070 --> 00:11:51,810 into ten raised to minus five. 132 00:11:53,460 --> 00:12:01,260 We are using a somewhat smaller learning rate here just because we want to fine-tune our already trained 133 00:12:01,260 --> 00:12:01,570 model. 134 00:12:03,360 --> 00:12:08,230 The weights of these convolutional layers are already optimized, 135 00:12:08,370 --> 00:12:13,560 and we just want to optimize them in little steps according to our problem. 136 00:12:14,490 --> 00:12:20,370 So since we are fine-tuning it, and we are not training it from randomly assigned weights, 137 00:12:20,730 --> 00:12:23,160 we can use a smaller learning rate. 138 00:12:24,090 --> 00:12:27,900 That's why we are using two into ten raised to minus five. 139 00:12:28,770 --> 00:12:29,470 And the metric 140 00:12:29,520 --> 00:12:32,040 we want to calculate is accuracy. 141 00:12:33,950 --> 00:12:38,730 Training this model will take somewhere between eight to ten hours, 142 00:12:39,450 --> 00:12:45,600 so it is better to use callbacks to save your model after each epoch. 143 00:12:48,010 --> 00:12:55,400 We are creating our checkpoint callback, and we are saving our model after each epoch. 144 00:12:58,000 --> 00:13:03,450 You can also use the save_best_only parameter here if you don't want to save 145 00:13:03,940 --> 00:13:04,810 different models. 146 00:13:06,150 --> 00:13:13,150 And if you set save_best_only equal to True, it will save the model with the best validation 147 00:13:13,160 --> 00:13:13,470 accuracy. 148 00:13:16,590 --> 00:13:19,460 Now, the next step is to fit the training data.
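The compile step and the checkpoint callback can be sketched as below. The model here is a tiny stand-in so the snippet is self-contained, and the checkpoint filenames are assumptions, not the ones used in the video. Note that older Keras versions spell the learning-rate argument `lr` instead of `learning_rate`.

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.callbacks import ModelCheckpoint

# Tiny stand-in for the VGG16-based model built earlier in the video.
model = models.Sequential([layers.Dense(1, activation="sigmoid", input_shape=(8,))])

model.compile(
    loss="binary_crossentropy",                        # two classes
    optimizer=optimizers.RMSprop(learning_rate=2e-5),  # small rate: we only fine-tune
    metrics=["acc"],
)

# Save the model after every epoch; {epoch:02d} fills in the epoch number.
checkpoint = ModelCheckpoint("model_epoch_{epoch:02d}.h5")

# To keep only the single best model instead, monitor validation accuracy:
best_only = ModelCheckpoint("best_model.h5", save_best_only=True,
                            monitor="val_acc")
```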
149 00:13:21,740 --> 00:13:29,270 This step is similar to the last time: we will use fit_generator, and then the train generator with its steps 150 00:13:29,280 --> 00:13:35,940 per epoch, and we also give the validation data and validation steps for the validation data. 151 00:13:37,110 --> 00:13:42,860 And here we are also providing the callback to save our model after each epoch. 152 00:13:43,950 --> 00:13:46,410 So I have already executed this, 153 00:13:47,250 --> 00:13:49,410 and these are the results. 154 00:13:51,960 --> 00:13:58,080 So if you see, the validation accuracies are in the range of 92 to 97. 155 00:13:59,430 --> 00:14:06,900 So at the end of the epochs, we are getting a training accuracy of around 98 percent, 156 00:14:07,110 --> 00:14:10,470 and a validation accuracy of 98 percent as well. 157 00:14:14,010 --> 00:14:18,660 You can see each epoch is taking around 15 minutes to train. 158 00:14:19,710 --> 00:14:24,440 So just remember, this may take up to eight to ten hours to train 159 00:14:24,460 --> 00:14:24,960 your model. 160 00:14:28,900 --> 00:14:35,020 Now, let's look at how the accuracies and losses are changing with each epoch. 161 00:14:38,800 --> 00:14:41,750 The orange line here is for training accuracy, 162 00:14:42,070 --> 00:14:44,560 and the red line here is for validation accuracy. 163 00:14:45,010 --> 00:14:52,060 And similarly, we have the validation loss in green and the training loss in blue. 164 00:14:55,330 --> 00:15:06,100 You can see that the validation accuracy is oscillating between 97 and 98, and there is no further improvement 165 00:15:06,130 --> 00:15:10,080 in accuracy as we move from lower epochs to higher. 166 00:15:10,910 --> 00:15:14,740 So we can say that we have achieved convergence in our model, 167 00:15:14,950 --> 00:15:22,150 and it is not possible to further improve this validation accuracy by increasing the number of epochs.
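The fit call described above has roughly this shape, sketched as a function so the argument names are explicit. fit_generator is the older Keras API used in the video; recent TensorFlow versions accept generators directly in model.fit.

```python
def train_model(model, train_generator, validation_generator, checkpoint,
                epochs=30):
    """Sketch of the training call; the epoch count is an assumption."""
    return model.fit_generator(
        train_generator,
        steps_per_epoch=100,               # 2000 training images / batch of 20
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=50,               # 1000 validation images / batch of 20
        callbacks=[checkpoint],            # save the model after each epoch
    )
```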
168 00:15:24,190 --> 00:15:31,000 So if you compare the validation accuracy with our last CNN model: in the last CNN model, we were 169 00:15:31,000 --> 00:15:33,610 getting a maximum accuracy of 84 percent. 170 00:15:34,720 --> 00:15:44,290 But in this one, by using the VGG16 pretrained model, we are achieving up to 97 to 98 percent validation 171 00:15:44,290 --> 00:15:44,830 accuracy. 172 00:15:46,810 --> 00:15:52,340 And it is very easy to train our model using these pretrained models. 173 00:15:55,480 --> 00:15:59,880 So there is no need to create your own conv bases. 174 00:16:01,750 --> 00:16:06,520 You can just use any one of the pretrained conv bases 175 00:16:06,670 --> 00:16:12,610 if the problem statement is somewhat similar to the ImageNet problem statement. 176 00:16:17,510 --> 00:16:21,470 Now, I am also saving this history variable and the plots to a file. 177 00:16:21,880 --> 00:16:23,900 There is no need to do this 178 00:16:23,900 --> 00:16:27,620 step here. 179 00:16:28,280 --> 00:16:33,800 Till now, we were only calculating the accuracies on our validation sets. 180 00:16:35,770 --> 00:16:42,490 But now it's time to use our test set to see how this model performs on our test data. 181 00:16:45,340 --> 00:16:50,320 Now we have to follow the same steps to evaluate our model's performance. 182 00:16:51,880 --> 00:16:55,810 Again, we will be using a test generator. 183 00:16:58,950 --> 00:17:01,320 So we are creating another generator. 184 00:17:01,810 --> 00:17:03,540 We are calling it test_generator. 185 00:17:06,540 --> 00:17:08,940 We are using the object called test_datagen. 186 00:17:09,420 --> 00:17:12,550 This is the same kind of object we used for validation as well. 187 00:17:13,350 --> 00:17:20,100 So in this object, we are just rescaling our data from zero to 255 to zero to one. 188 00:17:20,910 --> 00:17:23,300 And then we are using flow_from_directory.
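The test generator and the evaluation step can be sketched as follows. The directory path is a placeholder, and evaluate_generator is the older Keras API; newer TensorFlow versions pass the generator to model.evaluate instead.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Same kind of object as for validation: rescale only, no augmentation.
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

def evaluate_on_test(model, test_dir="data/test"):
    """Evaluate on the held-out test images; the path is a placeholder."""
    test_generator = test_datagen.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode="binary",
    )
    # 1000 test images / batch size 20 -> 50 steps covers every image once.
    test_loss, test_acc = model.evaluate_generator(test_generator, steps=50)
    return test_loss, test_acc
```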
189 00:17:23,370 --> 00:17:27,620 And here we are providing the test directory instead of the validation directory. 190 00:17:28,140 --> 00:17:29,060 So we are passing 191 00:17:29,070 --> 00:17:29,850 the test directory. 192 00:17:31,450 --> 00:17:39,860 Now, normally, if we have data in the form of a data frame, we use flow_from_dataframe. 193 00:17:40,360 --> 00:17:47,550 But since we have our data flowing from a directory, that's why we have to use 194 00:17:47,580 --> 00:17:47,700 flow 195 00:17:47,890 --> 00:17:49,130 _from_directory. 196 00:17:51,640 --> 00:17:52,940 So this is similar to 197 00:17:53,200 --> 00:17:55,420 what we were using for the train generator. 198 00:17:57,010 --> 00:17:59,710 Similarly, for evaluation, we are using evaluate 199 00:17:59,710 --> 00:18:00,410 _generator. 200 00:18:01,150 --> 00:18:06,790 And here also we have to provide the test generator object and the number of steps. 201 00:18:07,870 --> 00:18:09,550 We have a batch size of 20, 202 00:18:10,060 --> 00:18:13,780 and we have a test data size of around a thousand images. 203 00:18:14,260 --> 00:18:16,750 That's why we need 50 steps. 204 00:18:18,640 --> 00:18:21,220 A thousand divided by 20 equals 50. 205 00:18:21,880 --> 00:18:25,990 So we will be able to cover all our test images in 50 steps. 206 00:18:26,500 --> 00:18:33,190 So if you run this, just like the evaluate method, you will get two values: first is the loss value, 207 00:18:33,280 --> 00:18:35,080 and second is the accuracy value. 208 00:18:35,890 --> 00:18:39,830 And here you can see that the accuracy we are getting is nearly 209 00:18:40,020 --> 00:18:41,080 ninety-seven percent. 210 00:18:44,270 --> 00:18:45,130 So just to review: 211 00:18:45,680 --> 00:18:50,810 we started with a simple convolutional model. 212 00:18:51,830 --> 00:18:55,750 At that time, we were getting an accuracy of around 74 percent.
213 00:18:57,350 --> 00:19:03,820 Then we used data augmentation techniques to create dummy data and avoid overfitting. 214 00:19:04,610 --> 00:19:09,710 In that case, we were getting an accuracy of around 83 to 84 percent. 215 00:19:12,190 --> 00:19:15,610 And in this case, we used a pretrained 216 00:19:15,770 --> 00:19:17,200 VGG16 model 217 00:19:19,440 --> 00:19:26,670 for our problem, and in this case, we are getting an accuracy of 97 percent. 218 00:19:27,810 --> 00:19:36,720 So we have increased our validation accuracy from 73 percent to 98 percent during this project. 219 00:19:38,520 --> 00:19:39,870 That's all for this project. 220 00:19:40,350 --> 00:19:40,800 Thank you.