The last foundational concept we need to understand before we start building our CNN model in this course is that of a pooling layer.

And since you know how convolutional layers work, the pooling layer is going to be easy to understand.

We use pooling layers in our network to reduce the computational load, the memory usage, and the number of parameters to be estimated.

Just like in a convolutional layer, each neuron in a pooling layer also has a small, rectangular receptive field. We have to define the size of this rectangular receptive field, the stride, and the padding, just like before.

However, pooling neurons have no weights. All they do is aggregate the input using an aggregation function such as max or mean.

In this image, the layer on top is that of a pooling layer. You can see that each neuron is looking at a two-by-two set of neurons on the lower layer. The first neuron is looking at these four cells, which have a dotted red boundary. The next is looking at these four, which have a dotted blue boundary. This means that the stride here is two. By default, the stride in a pooling layer is the same as the width of the receptive field.

Now, if we use the max function, or max pooling, as it is called, only the maximum input value of all the four values in this receptive field makes it to the next level.
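A max pooling layer really is this simple, so here is a minimal NumPy sketch of the idea. The function name and the small example array are just for illustration, not from the lecture's code:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 (by default, the stride equals
    the receptive-field width). There are no weights to train."""
    h, w = feature_map.shape
    out = np.zeros((h // 2, w // 2))
    for i in range(0, h // 2 * 2, 2):
        for j in range(0, w // 2 * 2, 2):
            # Only the largest of the four inputs makes it to the next level.
            out[i // 2, j // 2] = feature_map[i:i + 2, j:j + 2].max()
    return out

# The four example cells: 1, 5, 3, 2. Only the 5 survives.
patch = np.array([[1, 5],
                  [3, 2]])
print(max_pool_2x2(patch))  # [[5.]]
```

Note that the loop is just for clarity; in a real model you would use a library layer (for example `MaxPooling2D` in Keras or `nn.MaxPool2d` in PyTorch) rather than hand-rolling this.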
The other three inputs are dropped.

For example, if these are the outputs of these four cells, one, five, three, two, then of all these four, five is the largest value. So this neuron in the top layer will have five as its output.

So it is very simple. No weights, no filters to be trained. It just finds the maximum value out of the four values that it sees, and it outputs that.

Now look at the image at the bottom. This is the feature map that our max pooling layer is looking at. For the first square of four neurons, the largest value is six. So we enter six here, in the first cell of the max pooling layer.

Similar to max pooling, there is average pooling. In average pooling, we find the mean of the values. So if we are doing average pooling, it will be the average of these four values: six, six, four, and five. That averages to five point two five.

In the next stride, we look at the next receptive field and we find out its max and its average value. Those are stored in the next neuron.

Now, also notice that since we are using a stride of two here, the pooling layer has half the width and half the height of the previous layer.

You can now imagine how this will reduce the computations and memory usage.
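Max and average pooling differ only in the aggregation function, so both can be sketched with one helper that takes the reducer as an argument. This is a sketch for illustration; the `pool_2x2` name and the two-by-two block of values (six, six, four, five, from the walkthrough) are only an example:

```python
import numpy as np

def pool_2x2(feature_map, reducer):
    """2x2 pooling with stride 2. Pass np.max for max pooling,
    np.mean for average pooling. Output is half the width and
    half the height of the input."""
    h, w = feature_map.shape
    out = np.zeros((h // 2, w // 2))
    for i in range(0, h // 2 * 2, 2):
        for j in range(0, w // 2 * 2, 2):
            out[i // 2, j // 2] = reducer(feature_map[i:i + 2, j:j + 2])
    return out

# The first square of four values from the feature map: 6, 6, 4, 5.
block = np.array([[6.0, 6.0],
                  [4.0, 5.0]])
print(pool_2x2(block, np.max))   # [[6.]]
print(pool_2x2(block, np.mean))  # [[5.25]]
```

Max pooling keeps the six; average pooling gives (6 + 6 + 4 + 5) / 4 = 5.25, exactly as in the example.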
Now suppose that, instead of a pooling layer, we had the next convolutional layer straight away, and its neurons looked at this whole region, six as the width and eight as the height. So six into eight: forty-eight input neurons. Each neuron in the next layer would then have forty-eight parameters to be trained.

But if we have this pooling layer on top, then each neuron gets only three into four, that is, twelve input neurons. So only 12 parameters per neuron are to be trained.

So instead of 48, we get just 12. The amount of computation goes down significantly.

So in this example, we saw that we can do both max pooling and mean pooling. But commonly, max pooling works better than the alternative options, because it highlights the main features instead of averaging them out. So in our model, most often we'll be using max pooling only.

That's it. This is the concept behind max pooling. It is a trade-off: we give away some information that we previously had, in order to reduce the computational load on our system.

I will highlight this impact on computation when we write the code.
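The parameter-count arithmetic can be checked in a couple of lines. The six-by-eight region comes from the lecture's example; everything else here is just illustrative naming:

```python
# Per-neuron trainable parameter count for the next layer,
# with and without a 2x2 pooling layer (stride 2) in between.
width, height = 6, 8

# Without pooling: the neuron sees the full 6-by-8 region.
without_pooling = width * height              # 6 into 8 = 48 inputs

# With pooling: each dimension is halved, so the neuron sees 3-by-4.
with_pooling = (width // 2) * (height // 2)   # 3 into 4 = 12 inputs

print(without_pooling, with_pooling)  # 48 12
```

That is a fourfold reduction per neuron, and the saving compounds across every neuron in the layer.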