The last foundational concept we need to understand before we start building our CNN model in this course is that of a pooling layer.

And since you know how convolutional layers work, the pooling layer is going to be easy to understand.

We use pooling layers in our network to reduce the computational load, the memory usage, and the number of parameters to be estimated.

Just like in a convolutional layer, each neuron in a pooling layer also has a small, rectangular receptive field. We have to define the size of this rectangular receptive field, the stride, and the padding, just like before.

However, pooling neurons have no weights. All they do is aggregate the input using an aggregation function such as max or mean.

In this image, the layer on top is that of a pooling layer. You can see that each neuron is looking at a two-by-two set of neurons on the lower layer. The first neuron is looking at these four cells, which have a dotted red boundary. The next is looking at these four, which have a dotted blue boundary. This means that the stride here is two. By default, the stride in a pooling layer is the same as the width of the receptive field.

Now, if we use the max function, or max pooling, as it is called, only the maximum input value of all the four values in this receptive field makes it to the next level.
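A max pooling layer really is this simple, so here is a minimal NumPy sketch of the idea. The function name and the small example array are just for illustration, not from the lecture's code:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 (by default, the stride equals
    the receptive-field width). There are no weights to train."""
    h, w = feature_map.shape
    out = np.zeros((h // 2, w // 2))
    for i in range(0, h // 2 * 2, 2):
        for j in range(0, w // 2 * 2, 2):
            # Only the largest of the four inputs makes it to the next level.
            out[i // 2, j // 2] = feature_map[i:i + 2, j:j + 2].max()
    return out

# The four example cells: 1, 5, 3, 2. Only the 5 survives.
patch = np.array([[1, 5],
                  [3, 2]])
print(max_pool_2x2(patch))  # [[5.]]
```

Note that the loop is just for clarity; in a real model you would use a library layer (for example `MaxPooling2D` in Keras or `nn.MaxPool2d` in PyTorch) rather than hand-rolling this.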
The other three inputs are dropped.

For example, if these are the outputs of these four cells, one, five, three, two, then of all these four, five is the largest value. So this neuron in the top layer will have five as its output.

So it is very simple. No weights, no filters to be trained. It just finds the maximum value out of the four values that it sees, and it outputs that.

Now look at the image at the bottom. This is the feature map that our max pooling layer is looking at. For the first square of four neurons, the largest value is six. So we enter six here, in the first cell of the max pooling layer.

Similar to max pooling, there is average pooling. In average pooling, we find the mean of the values. So if we are doing average pooling, it will be the average of these four values: six, six, four, and five. That averages to five point two five.

In the next stride, we look at the next receptive field and we find out its max and its average value. Those are stored in the next neuron.

Now, also notice that since we are using a stride of two here, the pooling layer has half the width and half the height of the previous layer.

You can now imagine how this will reduce the computations and memory usage.
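Max and average pooling differ only in the aggregation function, so both can be sketched with one helper that takes the reducer as an argument. This is a sketch for illustration; the `pool_2x2` name and the two-by-two block of values (six, six, four, five, from the walkthrough) are only an example:

```python
import numpy as np

def pool_2x2(feature_map, reducer):
    """2x2 pooling with stride 2. Pass np.max for max pooling,
    np.mean for average pooling. Output is half the width and
    half the height of the input."""
    h, w = feature_map.shape
    out = np.zeros((h // 2, w // 2))
    for i in range(0, h // 2 * 2, 2):
        for j in range(0, w // 2 * 2, 2):
            out[i // 2, j // 2] = reducer(feature_map[i:i + 2, j:j + 2])
    return out

# The first square of four values from the feature map: 6, 6, 4, 5.
block = np.array([[6.0, 6.0],
                  [4.0, 5.0]])
print(pool_2x2(block, np.max))   # [[6.]]
print(pool_2x2(block, np.mean))  # [[5.25]]
```

Max pooling keeps the six; average pooling gives (6 + 6 + 4 + 5) / 4 = 5.25, exactly as in the example.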
Now suppose that, instead of a pooling layer, we had the next convolutional layer straight away, and its neurons looked at this whole region, six as the width and eight as the height. So six into eight: forty-eight input neurons. Each neuron in the next layer would then have forty-eight parameters to be trained.

But if we have this pooling layer on top, then each neuron gets only three into four, that is, twelve input neurons. So only 12 parameters per neuron are to be trained.

So instead of 48, we get just 12. The amount of computation goes down significantly.

So in this example, we saw that we can do both max pooling and mean pooling. But commonly, max pooling works better than the alternative options, because it highlights the main features instead of averaging them out. So in our model, most often we'll be using max pooling only.

That's it. This is the concept behind max pooling. It is a trade-off: we give away some information that we previously had, in order to reduce the computational load on our system.

I will highlight this impact on computation when we write the code.
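The parameter-count arithmetic can be checked in a couple of lines. The six-by-eight region comes from the lecture's example; everything else here is just illustrative naming:

```python
# Per-neuron trainable parameter count for the next layer,
# with and without a 2x2 pooling layer (stride 2) in between.
width, height = 6, 8

# Without pooling: the neuron sees the full 6-by-8 region.
without_pooling = width * height              # 6 into 8 = 48 inputs

# With pooling: each dimension is halved, so the neuron sees 3-by-4.
with_pooling = (width // 2) * (height // 2)   # 3 into 4 = 12 inputs

print(without_pooling, with_pooling)  # 48 12
```

That is a fourfold reduction per neuron, and the saving compounds across every neuron in the layer.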