1
00:00:00,510 --> 00:00:07,650
In this lecture, we are going to see the impact of the pooling layer on the number of parameters we have

2
00:00:07,650 --> 00:00:12,090
to train, and on the execution time of our CNN model.

3
00:00:13,440 --> 00:00:19,890
First, we will run the CNN model that we discussed in our last lecture, with the pooling layer.

4
00:00:20,900 --> 00:00:29,850
Then we will remove the pooling layer from that model to see its impact on the execution time.

5
00:00:32,740 --> 00:00:37,360
So this is the architecture of the model that we developed in the last lecture.

6
00:00:38,890 --> 00:00:42,880
First we have the input layer, then the conv layer, then the pooling layer.

7
00:00:43,690 --> 00:00:46,990
Then we have two dense layers and then the output layer.

8
00:00:49,280 --> 00:00:55,090
We are going to remove this pooling layer to notice its impact on the execution time.

9
00:00:57,850 --> 00:00:59,650
So the code will remain the same.

10
00:01:01,570 --> 00:01:03,040
First we have the conv layer.

11
00:01:03,520 --> 00:01:04,540
Then the pooling layer.

12
00:01:04,720 --> 00:01:05,770
Then the flatten layer.

13
00:01:06,010 --> 00:01:06,850
Then the dense layers.

14
00:01:07,720 --> 00:01:10,790
We are calling this model model_a.

15
00:01:11,500 --> 00:01:16,270
This is the same as the model we developed in our last lecture.

16
00:01:16,960 --> 00:01:22,390
And then we have a second model, in which we are not taking this pooling layer.

17
00:01:23,080 --> 00:01:31,000
So first we have the conv layer, then the flatten layer, then two dense layers and one output layer.

18
00:01:32,380 --> 00:01:35,790
We are calling this model model_b.

19
00:01:37,520 --> 00:01:43,320
And later on, we will compare the performance of model A versus the performance of model B.

20
00:01:45,490 --> 00:01:47,170
Let's just run this.

21
00:01:49,750 --> 00:01:58,200
We can also look at the summary to get an idea of how many parameters our model is optimizing.
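[The two architectures described above can be sketched in Keras roughly as follows. The lecture does not show its exact layer sizes, so the input shape (32x32x3), filter count (32), dense widths (128 and 64), and 10 output classes here are illustrative assumptions, not the lecture's actual numbers.]

```python
from tensorflow.keras import layers, models

# Model A: conv -> pooling -> flatten -> two dense layers -> output.
# All layer sizes below are assumed for illustration.
model_a = models.Sequential([
    layers.Input(shape=(32, 32, 3)),       # assumed input shape
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),           # the pooling layer under discussion
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Model B: identical, except the pooling layer is removed.
model_b = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

model_a.summary()
model_b.summary()
```

[With a 2x2 pool, the flattened feature map is a quarter of the size, so the first dense layer, which dominates the parameter count, shrinks by roughly 4x.]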
22
00:02:13,950 --> 00:02:24,030
So here you can see, for the first dense layer, we have around 1.6 million parameters to train.

23
00:02:25,640 --> 00:02:30,780
Now let's compare it with model B's values.

24
00:02:38,740 --> 00:02:46,510
You can see, in our second model, where we do not have any pooling layer, the number of trainable parameters

25
00:02:47,200 --> 00:02:49,060
is around 6.5 million.

26
00:02:50,500 --> 00:02:58,810
So you can say that there are four times more trainable parameters in the model without the pooling

27
00:02:58,810 --> 00:02:59,080
layer.

28
00:03:00,790 --> 00:03:07,750
And we know that the execution time is directly dependent on the number of parameters that we are going

29
00:03:07,750 --> 00:03:08,230
to train.

30
00:03:09,700 --> 00:03:15,850
So obviously, we can expect that model B will take a lot more time than model A.

31
00:03:18,010 --> 00:03:26,590
Let's just compile both of these models, and then we will run model A for three epochs.

32
00:03:26,740 --> 00:03:28,990
Then we will run model B for three epochs.

33
00:03:29,530 --> 00:03:33,130
And after that, we'll compare the execution time for both models.

34
00:03:33,490 --> 00:03:35,170
Let's first run model A.

35
00:03:46,350 --> 00:03:47,900
Now we have trained our model.

36
00:03:49,080 --> 00:03:54,870
And as you can see, for each epoch, the execution time is around 31 seconds.

37
00:03:55,740 --> 00:04:00,800
And after the completion of the third epoch, we are getting a validation accuracy of 82 percent.

38
00:04:01,230 --> 00:04:04,470
And the same is the accuracy for the training data as well.

39
00:04:06,210 --> 00:04:11,400
Now, let's run the model without the pooling layer.

40
00:04:13,080 --> 00:04:13,710
Let's run this.

41
00:04:34,670 --> 00:04:37,520
So now we have trained our second model as well.
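[The roughly four-to-one parameter ratio observed above follows directly from how a dense layer's parameters are counted: one weight per input-unit pair, plus one bias per unit. A small sketch, using a hypothetical feature-map size rather than the lecture's exact one:]

```python
def dense_params(n_inputs, n_units):
    # one weight per (input, unit) pair, plus one bias per unit
    return n_inputs * n_units + n_units

# Hypothetical conv output: 30x30 feature maps with 32 channels,
# feeding a 128-unit dense layer (illustrative values only).
h, w, channels, units = 30, 30, 32, 128

no_pool = dense_params(h * w * channels, units)                  # flatten directly
with_pool = dense_params((h // 2) * (w // 2) * channels, units)  # after 2x2 pooling

print(no_pool / with_pool)  # just under 4: pooling quarters the flattened size
```

[A 2x2 pool halves each spatial dimension, so the flattened input to the first dense layer shrinks by a factor of four, and so, almost exactly, do that layer's parameters.]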
42
00:04:38,600 --> 00:04:46,460
And as you can see, the execution time for each epoch is around 62 to 63 seconds,

43
00:04:47,270 --> 00:04:53,480
whereas for model A, the execution time for each epoch was around 30 to 31 seconds.

44
00:04:54,740 --> 00:04:59,320
So the execution time is almost double for our model B,

45
00:04:59,990 --> 00:05:06,070
that is, the model without the pooling layer, as compared to the model with the pooling layer.

46
00:05:07,760 --> 00:05:16,490
You can also look at the accuracy score: on the training set, the accuracy of our model B is higher, and on

47
00:05:16,490 --> 00:05:17,530
the validation set,

48
00:05:17,720 --> 00:05:21,680
the accuracy is almost the same for both the models.

49
00:05:23,690 --> 00:05:32,240
This is because when we are pooling four pixels into one pixel, there is some information loss, and

50
00:05:32,420 --> 00:05:36,410
that information loss is resulting in the lower accuracy.

51
00:05:37,160 --> 00:05:44,900
So if you are using a pooling layer, your execution time will be less, and the accuracy will also be a

52
00:05:44,900 --> 00:05:48,470
little less as compared to a model without a pooling layer.

53
00:05:52,150 --> 00:06:00,370
In this case, we have only used one convolutional layer, but in a real-life scenario, you may have

54
00:06:00,370 --> 00:06:03,030
to use multiple convolutional layers.

55
00:06:03,490 --> 00:06:09,520
And in such cases, the use of a pooling layer becomes much more important.

56
00:06:11,080 --> 00:06:19,810
So using a pooling layer, you can significantly reduce your execution time without impacting the accuracy much.
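[The information loss mentioned above, where four pixels are collapsed into one, can be seen in a tiny hand-rolled 2x2 max-pooling sketch (plain Python, no framework; the input values are made up for illustration):]

```python
def max_pool_2x2(img):
    # collapse each non-overlapping 2x2 block into its maximum value
    h, w = len(img), len(img[0])
    return [[max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]

img = [[1, 3, 2, 0],
       [5, 4, 1, 1],
       [0, 2, 9, 6],
       [7, 8, 3, 2]]

pooled = max_pool_2x2(img)
print(pooled)  # [[5, 2], [8, 9]] -- three of every four values are discarded
```

[Sixteen input values become four: the downstream layers see a quarter as many activations (hence the 4x parameter saving), but the discarded values are the information loss that can cost a little accuracy.]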