The last foundational concept we need to understand before we start building our CNN model in this section is that of a pooling layer, and since you know how convolutional layers work, pooling layers are going to be easy to understand. We use pooling layers in our network to reduce the computational load, the memory usage, and the number of parameters to be estimated.

Just like in a convolutional layer, each neuron in a pooling layer also has a small rectangular receptive field. We have to define the size of this rectangular receptive field, the stride, and the padding type, just like before. However, pooling neurons have no weights. All they do is aggregate the input using an aggregation function such as max or mean.

In this image, the layer on top is a pooling layer. You can see that each neuron is looking at a two-by-two set of neurons in the lower layer. The first neuron is looking at these four cells, which have a red boundary. The next is looking at these four, which have a dotted blue boundary. This means that the stride here is two. By default, the stride in a pooling layer is the same as the width of the receptive field.

Now, if we use the max function, or max pooling as it is called, only the maximum input value out of the four values in this receptive field makes it to the next layer; the other three inputs are dropped. For example, if the four outputs of these four cells are 1, 5, 3, and 2, then out of these four, 5 is the largest value, so this neuron in the top layer will have 5 as its output.

So it is very simple: no weights, no filters to be trained. It just finds the maximum value out of the four values that it sees, and it outputs that.

If you look at the GIF at the bottom, this is the feature map that our max pooling layer is looking at. For the first square of four neurons, the largest value is 6, so we enter 6 here, in the first cell of the max pooling layer. Similar to max pooling, there is average pooling. In average pooling, we find the mean of the four values. So if we are doing average pooling, it will be the average of these four values: 6, 6, 4, and 5, which comes to 5.25. In the next stride, we look at the next four cells, find their max and their average value, and store those in the next neuron.
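To make this concrete, here is a minimal sketch of both operations, assuming tf.keras as the framework (the course's actual code is not shown in this clip). It reproduces the two worked examples above: the max of the window 6, 6, 4, 5 is 6, and its mean is 5.25.

import numpy as np
import tensorflow as tf

# A toy 4x4 feature map; the top-left 2x2 window holds the lecture's
# example values 6, 6, 4, 5.
feature_map = np.array([[6., 6., 1., 2.],
                        [4., 5., 3., 0.],
                        [1., 2., 0., 3.],
                        [2., 1., 4., 5.]], dtype=np.float32)

# Keras pooling layers expect input of shape (batch, height, width, channels).
x = feature_map.reshape(1, 4, 4, 1)

max_pool = tf.keras.layers.MaxPooling2D(pool_size=2, strides=2)
avg_pool = tf.keras.layers.AveragePooling2D(pool_size=2, strides=2)

print(max_pool(x)[0, :, :, 0].numpy())  # top-left cell is 6.0  (max of 6, 6, 4, 5)
print(avg_pool(x)[0, :, :, 0].numpy())  # top-left cell is 5.25 (mean of 6, 6, 4, 5)

Note that leaving strides unset gives the same result, since the stride defaults to the pool size, exactly as described above.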
Also notice that since we are using a stride of two here, the pooling layer has half the width and half the height of the previous layer. You can now imagine how this will reduce the computations and the memory usage. Instead of a pooling layer, if we had the next convolutional layer straight away, and that layer had a receptive field with six as height and eight as width, then six into eight is forty-eight input neurons, so each neuron in the next layer would have forty-eight parameters to be trained. But if we have this pooling layer in between, then each neuron gets only three into four, that is, twelve input neurons. So only twelve parameters per neuron are to be trained; instead of forty-eight, we get twelve to train. So the amount of computation goes down significantly.

In this example we saw that we can do both max pooling and mean pooling, but commonly max pooling works better than the alternative options, because it highlights the main features instead of averaging them all. So in our model, most often we will be using max pooling only. Note that there is a cost to using max pooling: it is a tradeoff. We give away some information from the previous layer to reduce the computational load on our system. I will highlight this impact on computation when we write the code.
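As a rough preview of that impact, here is a hypothetical tf.keras comparison (not the course's model) of trainable-parameter counts with and without a 2x2 max pooling layer. Halving the feature map's height and width shrinks what the downstream layer sees by a factor of four, just like the forty-eight-versus-twelve arithmetic above.

import numpy as np
import tensorflow as tf

def param_count(use_pooling):
    # A tiny stand-in network: conv layer, optional 2x2 max pooling,
    # then a dense layer whose weight count depends on the feature-map size.
    layers = [tf.keras.layers.Conv2D(8, 3, activation="relu")]
    if use_pooling:
        layers.append(tf.keras.layers.MaxPooling2D(pool_size=2))
    layers += [tf.keras.layers.Flatten(), tf.keras.layers.Dense(10)]
    model = tf.keras.Sequential(layers)
    model(np.zeros((1, 28, 28, 1), dtype=np.float32))  # build the model once
    return model.count_params()

print(param_count(use_pooling=False))  # 54170 trainable parameters
print(param_count(use_pooling=True))   # 13610 trainable parameters

The pooling layer itself adds zero parameters, which is the whole point: the savings come entirely from shrinking the input that the next layer has to look at.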