1
00:00:00,533 --> 00:00:02,866
Hello and welcome back to the course
on Deep Learning.

2
00:00:02,866 --> 00:00:04,500
Today we're talking about max pooling.

3
00:00:04,500 --> 00:00:07,366
And we've got some very exciting slides
coming up ahead.

4
00:00:07,366 --> 00:00:10,633
And even a special surprise
at the very end of the tutorial.

5
00:00:10,866 --> 00:00:12,300
So let's get started.

6
00:00:12,300 --> 00:00:15,633
The first question is what is pooling
and why do we need it?

7
00:00:15,900 --> 00:00:18,500
Well, to answer that question,
let's have a look at these images.

8
00:00:18,500 --> 00:00:20,666
On these three images we've got a cheetah.

9
00:00:20,666 --> 00:00:23,566
In fact, it is the exact same cheetah.
On the first image,

10
00:00:23,566 --> 00:00:27,633
the image is positioned properly and
the cheetah is looking straight at you.

11
00:00:27,966 --> 00:00:32,266
On the second image it's a bit rotated
and the third image is a bit squashed.

12
00:00:32,633 --> 00:00:37,000
And the thing here is that we want
the neural network to be able

13
00:00:37,000 --> 00:00:41,000
to recognize the cheetah
in every single one of these images.

14
00:00:41,333 --> 00:00:43,166
In fact, this is just one cheetah.

15
00:00:43,166 --> 00:00:45,000
What if we have lots of different
cheetahs?

16
00:00:45,000 --> 00:00:48,300
Here's a cheetah, here's a cheetah,
here's another cheetah.

17
00:00:48,666 --> 00:00:51,566
Here's a cheetah, here's a cheetah,
and here's a cheetah.

18
00:00:51,566 --> 00:00:54,700
And we want the neural network
to recognize all of these cheetahs

19
00:00:54,700 --> 00:00:56,133
as cheetahs.

20
00:00:56,133 --> 00:01:01,700
And how can it do that if they're all
looking in different directions?

21
00:01:01,700 --> 00:01:04,033
They're all in different parts
of the image.

22
00:01:04,033 --> 00:01:06,966
Their faces are positioned
in different parts of the image.

23
00:01:06,966 --> 00:01:09,500
Somebody is on the right-hand side,
somebody is in the left corner,

24
00:01:09,500 --> 00:01:10,933
somebody is in the middle.

25
00:01:10,933 --> 00:01:12,500
They're all a bit different.

26
00:01:12,500 --> 00:01:14,166
The texture's a little bit different.

27
00:01:14,166 --> 00:01:16,100
The lighting is a bit different.

28
00:01:16,100 --> 00:01:17,333
There's lots of little differences.

29
00:01:17,333 --> 00:01:22,200
And so if the neural network looks for
exactly a certain feature, for instance,

30
00:01:22,200 --> 00:01:25,400
a distinctive feature of the cheetah is

31
00:01:25,400 --> 00:01:29,800
the tears that are on its face,
going from the eyes, or

32
00:01:30,100 --> 00:01:32,766
the shadows that look like tears,

33
00:01:32,766 --> 00:01:35,900
the texture or the pattern
that is going from its eyes down.

34
00:01:36,166 --> 00:01:37,766
It's on the sides of its nose.

35
00:01:37,766 --> 00:01:38,400
It looks like tears.

36
00:01:38,400 --> 00:01:40,766
That's a distinctive feature
of the cheetah.

37
00:01:40,766 --> 00:01:46,133
But if it's looking for that feature,
which it learned from certain cheetahs,

38
00:01:46,800 --> 00:01:50,066
in an exact location or an exact shape

39
00:01:50,066 --> 00:01:53,066
or form or texture,
it'll never find these other cheetahs.

40
00:01:53,300 --> 00:01:57,400
So we have to make sure
that our neural network

41
00:01:57,966 --> 00:02:02,500
has a property called spatial invariance,
meaning that it doesn't care

42
00:02:02,700 --> 00:02:06,600
where the features are located.

43
00:02:06,633 --> 00:02:09,700
Not so much as in
which part of the image,

44
00:02:09,700 --> 00:02:13,100
because we've kind of taken
that into consideration

45
00:02:13,100 --> 00:02:16,366
with our map,
with our convolution layer.

46
00:02:16,633 --> 00:02:21,200
But it doesn't have to care
if the features are a bit tilted,

47
00:02:21,200 --> 00:02:23,866
if the features are a bit different
in texture,

48
00:02:23,866 --> 00:02:27,100
if the features are a bit closer,
if features are a bit further

49
00:02:27,100 --> 00:02:30,133
apart relative to each other.

50
00:02:30,133 --> 00:02:34,400
So if the feature itself is a bit
distorted, our neural network

51
00:02:34,400 --> 00:02:39,633
has to have some level of flexibility
to be able to still find that feature.

52
00:02:39,900 --> 00:02:42,566
And that is what pooling is all about.

53
00:02:42,566 --> 00:02:45,000
So let's have a look at how pooling works.

54
00:02:45,000 --> 00:02:46,066
Here's our feature map.

55
00:02:46,066 --> 00:02:50,466
So we've already done our convolution
and we've completed that part.

56
00:02:50,466 --> 00:02:52,500
And now we're working
with the convolution layer.

57
00:02:52,500 --> 00:02:54,600
Now we're going to apply pooling.
So how does it work?

58
00:02:54,600 --> 00:02:56,600
We're going to be applying max pooling.

59
00:02:56,600 --> 00:02:57,900
There are several different types

60
00:02:57,900 --> 00:03:00,900
of pooling:
mean pooling, max pooling, sum pooling.

61
00:03:00,900 --> 00:03:03,400
And we'll comment on those towards
the end of this tutorial.

62
00:03:03,400 --> 00:03:05,000
But for now
we're just applying max pooling.

63
00:03:05,000 --> 00:03:09,600
So we take a box of two
by two pixels like that.

64
00:03:09,933 --> 00:03:12,266
And again
it doesn't have to be two by two.

65
00:03:12,266 --> 00:03:13,466
You can choose any size of box.

66
00:03:13,466 --> 00:03:16,033
And again we'll comment on that
towards the end of the tutorial.

67
00:03:16,033 --> 00:03:19,033
And you place it in the top left
hand corner

68
00:03:19,166 --> 00:03:21,833
and you find the maximum value
in that box.

69
00:03:21,833 --> 00:03:26,000
And then you record only that value
and you disregard the other three.

70
00:03:26,100 --> 00:03:27,800
So in your box you have four values.

71
00:03:27,800 --> 00:03:29,000
You just disregard three.

72
00:03:29,000 --> 00:03:31,666
You only keep one, the maximum,
which is one in this case.

73
00:03:31,666 --> 00:03:34,566
Then you move your box to the right
by a stride.

74
00:03:34,566 --> 00:03:36,033
You select the stride once again.

75
00:03:36,033 --> 00:03:41,000
So here we select a stride of two,
and that's what you normally select.

76
00:03:41,000 --> 00:03:42,833
You can select a stride of one,
you can select it

77
00:03:42,833 --> 00:03:44,333
so there are overlapping boxes.

78
00:03:44,333 --> 00:03:47,866
You can select any kind of stride
that you like even three if you want.

79
00:03:48,666 --> 00:03:52,166
But we're selecting a stride of two here
and that's what is commonly used.

80
00:03:52,333 --> 00:03:53,833
And then you repeat
the process.

81
00:03:53,833 --> 00:03:55,766
You record the maximum here.

82
00:03:55,766 --> 00:03:58,933
If you cross over, it doesn't matter,
you just continue

83
00:03:58,933 --> 00:03:59,933
doing what you're doing.

84
00:03:59,933 --> 00:04:02,800
So, you still record the max over here.

85
00:04:02,800 --> 00:04:03,900
Zero.

86
00:04:03,900 --> 00:04:05,566
Here the maximum is four.

87
00:04:05,566 --> 00:04:07,200
Here the maximum is two. Here

88
00:04:07,200 --> 00:04:10,533
the maximum is one. Then zero,
two, and then one.
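
To make the walkthrough above concrete, here is a minimal sketch of max pooling in plain Python. The 5x5 feature map values are an assumption chosen for illustration (the slide itself is not part of the transcript), and the function name `max_pool_2d` is hypothetical. It uses a 2x2 box and a stride of two, and when the box crosses over the edge it simply takes the maximum of whatever values remain inside.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Max-pool a 2D list of numbers; boxes that overhang the edge are clipped."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, rows, stride):
        pooled_row = []
        for left in range(0, cols, stride):
            # Collect the values covered by the box, staying inside the map.
            window = [
                feature_map[r][c]
                for r in range(top, min(top + size, rows))
                for c in range(left, min(left + size, cols))
            ]
            # Keep only the maximum and disregard the rest.
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical feature map, values assumed for illustration only.
feature_map = [
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 0, 1, 2, 1],
    [1, 4, 2, 1, 0],
    [0, 0, 1, 2, 1],
]

pooled = max_pool_2d(feature_map)
for row in pooled:
    print(row)
```

Passing `stride=1` to the same function gives the overlapping boxes mentioned earlier.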

89
00:04:11,233 --> 00:04:13,900
So as you can see, a few things happened.

90
00:04:13,900 --> 00:04:17,800
First of all, we still were able
to preserve the features.

91
00:04:17,800 --> 00:04:18,400
Right.

92
00:04:18,400 --> 00:04:23,166
The maximum numbers, they represent
because we know how the convolution

93
00:04:23,166 --> 00:04:23,666
layer works.

94
00:04:23,666 --> 00:04:27,333
We know that the maximum, or the
large numbers in your feature map,

95
00:04:27,333 --> 00:04:31,200
they represent where you actually found
the closest similarity to a feature.

96
00:04:31,500 --> 00:04:34,400
But by then pooling these features,

97
00:04:34,400 --> 00:04:38,166
we are first of all
getting rid of 75% of the information

98
00:04:38,166 --> 00:04:42,166
that is not the feature,
which is not

99
00:04:42,500 --> 00:04:45,500
the important things
that we're looking out for,

100
00:04:45,533 --> 00:04:48,933
because we are disregarding
three pixels out of four.

101
00:04:49,633 --> 00:04:51,366
So we're only keeping 25%.

102
00:04:51,366 --> 00:04:54,300
And then also because

103
00:04:54,300 --> 00:04:57,566
we are taking the maximum of the

104
00:04:58,300 --> 00:05:00,600
pixels that way
or the values that we have,

105
00:05:00,600 --> 00:05:04,066
we are therefore accounting
for any distortion.

106
00:05:04,066 --> 00:05:08,500
So for instance,
two images in which, for example,

107
00:05:08,500 --> 00:05:11,566
the cheetah's tears on the eyes are

108
00:05:12,066 --> 00:05:15,466
in one image, they're a bit to the left,
or a bit rotated to the left.

109
00:05:15,466 --> 00:05:18,466
In another one, they're
how they're supposed to be, or

110
00:05:18,566 --> 00:05:22,233
like, if we take one as the basis
and in another one they're a bit

111
00:05:22,233 --> 00:05:26,466
rotated to the left, the pooled feature
will be exactly the same.

112
00:05:26,466 --> 00:05:30,300
So you can see here, if we are talking
about the cheetah's tears,

113
00:05:30,400 --> 00:05:34,066
then let's say this is the four
and this is where it was here.

114
00:05:34,133 --> 00:05:35,966
Then if it was a bit rotated.

115
00:05:35,966 --> 00:05:38,233
So for instance
the four ended up over here.

116
00:05:38,233 --> 00:05:40,400
Then when we're doing the pooling

117
00:05:40,400 --> 00:05:43,000
we're still going
to get the same pooled feature map.
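
As a rough check of this intuition, the sketch below pools three hypothetical 4x4 feature maps that contain a single strong response (the four). All values and the helper name `max_pool_2d` are assumptions for illustration. Moving the four by one pixel inside the same 2x2 box leaves the pooled map unchanged, while moving it across a box boundary does change it, so the invariance only covers small shifts.

```python
def max_pool_2d(fm, size=2, stride=2):
    """Max pooling over a 2D list; boxes are clipped at the border."""
    rows, cols = len(fm), len(fm[0])
    return [
        [
            max(
                fm[r][c]
                for r in range(top, min(top + size, rows))
                for c in range(left, min(left + size, cols))
            )
            for left in range(0, cols, stride)
        ]
        for top in range(0, rows, stride)
    ]


# The strong response sits in the bottom-left 2x2 box.
original = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 4, 0, 0],
    [0, 0, 0, 0],
]
# Shifted one pixel to the left: still inside the same box.
shifted_within = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [4, 0, 0, 0],
    [0, 0, 0, 0],
]
# Shifted one pixel to the right: now inside the neighbouring box.
shifted_across = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 4, 0],
    [0, 0, 0, 0],
]

print(max_pool_2d(original) == max_pool_2d(shifted_within))
print(max_pool_2d(original) == max_pool_2d(shifted_across))
```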

118
00:05:43,000 --> 00:05:46,000
And that's kind of
the principle behind it.

119
00:05:46,400 --> 00:05:48,733
It's a very rough explanation.

120
00:05:48,733 --> 00:05:51,600
Again, intuitive explanation,
but that's the point of pooling

121
00:05:51,600 --> 00:05:54,766
that we're still able
to preserve the features.

122
00:05:54,966 --> 00:05:58,666
And moreover, we account for
their possible

123
00:05:58,666 --> 00:06:01,966
spatial or textural
or other kind of distortions.

124
00:06:02,300 --> 00:06:05,700
And in addition to all of that,
we are reducing the size.

125
00:06:05,700 --> 00:06:07,266
So there's another benefit.

126
00:06:07,266 --> 00:06:09,800
So,
we're preserving the features.

127
00:06:09,800 --> 00:06:12,000
We're introducing spatial invariance.

128
00:06:12,000 --> 00:06:15,800
We're reducing the size by 75%,

129
00:06:16,066 --> 00:06:19,266
which is huge, which is really going
to help us in terms of processing.
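
The 75% figure can be checked with a little arithmetic. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the 28x28 input size is just an example, not something taken from the slides. With a 2x2 box and a stride of two, an even-sized map keeps exactly a quarter of its values. An odd-sized map where the box is allowed to cross over the edge keeps slightly more.

```python
import math


def pooled_length(n, size=2, stride=2, clip_edges=True):
    """Number of box positions along one axis of length n (assumes n >= size)."""
    if clip_edges:
        # A box starts at every multiple of the stride inside the input,
        # and is clipped if it overhangs the border.
        return math.ceil(n / stride)
    # Only boxes that fit entirely inside the input are used.
    return (n - size) // stride + 1


def kept_fraction(height, width, **kwargs):
    """Fraction of values that remain after pooling."""
    out_h = pooled_length(height, **kwargs)
    out_w = pooled_length(width, **kwargs)
    return (out_h * out_w) / (height * width)


print(pooled_length(28), kept_fraction(28, 28))
print(pooled_length(5), kept_fraction(5, 5))
```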

130
00:06:19,633 --> 00:06:23,166
And moreover,
another benefit of pooling is

131
00:06:23,166 --> 00:06:25,033
we are reducing the number of parameters.

132
00:06:25,033 --> 00:06:27,733
So we're reducing again by 75%.

133
00:06:27,733 --> 00:06:28,866
We're reducing the number of parameters

134
00:06:28,866 --> 00:06:32,133
that are going to go into our final layers
of the neural network.
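
One way to read this claim: the pooling step itself has no trainable parameters, but a fully connected layer that receives the flattened feature maps needs one weight per input value per unit. The sketch below is only an illustration; the layer sizes (6 feature maps of 24x24, 128 units) and the function name are assumptions, not values from the course.

```python
def dense_weights(num_maps, height, width, units):
    """Weights in a fully connected layer fed by flattened feature maps (biases ignored)."""
    return num_maps * height * width * units


# Hypothetical sizes: 6 feature maps of 24x24, followed by a 128-unit layer.
without_pooling = dense_weights(num_maps=6, height=24, width=24, units=128)
# The same maps after 2x2 pooling with a stride of two are 12x12.
with_pooling = dense_weights(num_maps=6, height=12, width=12, units=128)

print(without_pooling, with_pooling, with_pooling / without_pooling)
```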

135
00:06:32,533 --> 00:06:35,133
And therefore we're preventing
overfitting.

136
00:06:35,133 --> 00:06:41,100
It is a very important benefit of pooling
that we're removing information.

137
00:06:41,100 --> 00:06:42,500
And that is a good thing.

138
00:06:42,500 --> 00:06:45,500
That is a good thing because that way,

139
00:06:45,500 --> 00:06:48,533
our model won't be able to overfit

140
00:06:48,533 --> 00:06:52,500
onto that information, especially
because that information is not real.

141
00:06:52,500 --> 00:06:55,600
And remember, like at the very start
we were talking about even for humans,

142
00:06:55,666 --> 00:06:59,100
us as humans, it's important
to see exactly the features

143
00:06:59,100 --> 00:07:02,133
rather than all this other noise
that is coming into our eyes.

144
00:07:02,700 --> 00:07:04,433
Well, same thing for neural networks.

145
00:07:04,433 --> 00:07:07,500
By disregarding the unnecessary,

146
00:07:07,633 --> 00:07:11,866
unimportant information, we're helping
to prevent overfitting.

147
00:07:12,333 --> 00:07:14,500
So there we go.
That is what pooling is about.

148
00:07:14,500 --> 00:07:19,600
And the question here is, of course,
why max pooling?

149
00:07:19,600 --> 00:07:21,600
Right. There's
lots of different types of pooling.

150
00:07:21,600 --> 00:07:23,600
And, you know, why a stride of two?

151
00:07:23,600 --> 00:07:25,533
Why a size of two by two pixels?

152
00:07:25,533 --> 00:07:26,633
Lots of all these things.

153
00:07:26,633 --> 00:07:30,566
And on that note,
I'd like to introduce you to this,

154
00:07:30,733 --> 00:07:34,500
a lovely research paper
called Evaluation of Pooling Operations

155
00:07:34,500 --> 00:07:37,500
in Convolutional Architectures
for Object Recognition

156
00:07:37,633 --> 00:07:40,733
by Dominik Scherer
from the University of Bonn.

157
00:07:41,000 --> 00:07:42,033
There's the link.

158
00:07:42,033 --> 00:07:45,733
And the beauty about this paper
is that it's very,

159
00:07:45,733 --> 00:07:47,466
very simple, very straightforward.

160
00:07:47,466 --> 00:07:49,866
So if you've never read a research paper

161
00:07:49,866 --> 00:07:53,733
before but you'd like to give it a go,
this is a great place to start.

162
00:07:53,733 --> 00:07:56,733
It's very short,
only ten pages, very easy to read.

163
00:07:56,933 --> 00:08:00,700
And plus the extra benefit is that now
that we've discussed convolution

164
00:08:00,700 --> 00:08:03,700
and pooling,
you will be totally comfortable

165
00:08:03,700 --> 00:08:05,866
with everything
that they're talking about in this paper.

166
00:08:05,866 --> 00:08:09,333
And this is a great way
to actually reinforce your knowledge.

167
00:08:09,333 --> 00:08:11,733
So I highly recommend
checking this paper out.

168
00:08:11,733 --> 00:08:13,833
It will take 20 minutes to read.

169
00:08:13,833 --> 00:08:17,500
And you can even skip part two,
which is called related work

170
00:08:17,500 --> 00:08:20,900
if it feels a bit farfetched
or alienating, just don't read that part.

171
00:08:21,233 --> 00:08:23,733
Go straight from part
one to part three.

172
00:08:23,733 --> 00:08:26,400
And the one thing that you do need to know
about this paper.

173
00:08:26,400 --> 00:08:29,500
They talk about a concept
called subsampling,

174
00:08:30,400 --> 00:08:33,133
well, subsampling
is basically average pooling.

175
00:08:33,133 --> 00:08:37,333
So remember how here we were taking,
we were taking the maximum.

176
00:08:37,333 --> 00:08:39,833
So, you know, in each
square we're taking the maximum value.

177
00:08:39,833 --> 00:08:42,966
There's a concept called mean pooling
or sum pooling.

178
00:08:42,966 --> 00:08:45,100
Sum pooling is where you just sum
these values up.

179
00:08:45,100 --> 00:08:46,733
Average pooling or mean pooling.

180
00:08:46,733 --> 00:08:49,733
You take the average value
out of all of these.
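
The three variants differ only in what is computed inside each box. Here is a minimal sketch in plain Python; the 4x4 values and the helper name `pool_2d` are assumptions for illustration, with a 2x2 box and a stride of two as in the rest of the tutorial.

```python
def pool_2d(fm, reducer, size=2, stride=2):
    """Apply `reducer` to every box of a 2D list; boxes are clipped at the border."""
    rows, cols = len(fm), len(fm[0])
    return [
        [
            reducer([
                fm[r][c]
                for r in range(top, min(top + size, rows))
                for c in range(left, min(left + size, cols))
            ])
            for left in range(0, cols, stride)
        ]
        for top in range(0, rows, stride)
    ]


# Hypothetical feature map, values assumed for illustration only.
feature_map = [
    [0, 1, 1, 1],
    [1, 0, 2, 1],
    [1, 4, 2, 1],
    [0, 0, 1, 2],
]

max_pooled = pool_2d(feature_map, max)                            # keep the largest value
sum_pooled = pool_2d(feature_map, sum)                            # add the values up
mean_pooled = pool_2d(feature_map, lambda w: sum(w) / len(w))     # average the values

print(max_pooled)
print(sum_pooled)
print(mean_pooled)
```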

181
00:08:49,866 --> 00:08:53,766
And subsampling is kind of like
a generalization of mean pooling.

182
00:08:53,766 --> 00:08:57,166
It's
a more generalized approach

183
00:08:57,166 --> 00:09:00,766
to taking the average
of these values.

184
00:09:00,766 --> 00:09:02,400
And you can read a bit more about it
in the paper.

185
00:09:02,400 --> 00:09:06,233
But otherwise, just think of it as average
pooling when you're reading that paper.

186
00:09:06,800 --> 00:09:09,800
And so that's where you can get
some additional information on this topic.

187
00:09:09,833 --> 00:09:12,266
And now let's recap
where we've got to.

188
00:09:12,266 --> 00:09:14,700
So there's our input image.

189
00:09:14,700 --> 00:09:18,800
Then we applied the convolution operation
and we got the convolution layer.

190
00:09:18,900 --> 00:09:21,900
And now to each of those feature maps

191
00:09:21,900 --> 00:09:24,133
that we get
we've applied the pooling layer.

192
00:09:24,133 --> 00:09:28,433
So we've done these
two steps: convolution and pooling.

193
00:09:28,733 --> 00:09:31,766
And now we're going to do something
very fun, something exciting.

194
00:09:32,033 --> 00:09:34,333
We're going to experiment with this.

195
00:09:34,333 --> 00:09:38,266
So this is a screenshot
I took from a tool

196
00:09:38,633 --> 00:09:42,600
created by Adam Harley from,

197
00:09:42,600 --> 00:09:46,266
well, back when he was at Ryerson
University, Computer Science,

198
00:09:46,266 --> 00:09:50,900
and now he's at Carnegie Mellon,
I think, doing his PhD. Great tool.

199
00:09:50,900 --> 00:09:52,400
So let's open up.

200
00:09:52,400 --> 00:09:54,066
Let's have a look so you can find it.

201
00:09:54,066 --> 00:09:55,700
You can't actually find it through Google.

202
00:09:55,700 --> 00:09:57,400
You have to know the URL.

203
00:09:57,400 --> 00:10:00,833
It's just hard to find this on
Google because there's no text here.

204
00:10:01,333 --> 00:10:06,500
See, it's
just this URL, scs.ryerson.ca,

205
00:10:06,500 --> 00:10:09,666
and then this stuff at the end.
And basically this

206
00:10:10,366 --> 00:10:12,600
is exactly what we're doing
but visualized.

207
00:10:12,600 --> 00:10:14,300
So here you need to draw a number.

208
00:10:14,300 --> 00:10:16,533
So let's say I draw number four.

209
00:10:16,533 --> 00:10:21,266
And this tool
will put the number four here.

210
00:10:21,266 --> 00:10:24,066
That's your image in our first step.

211
00:10:24,066 --> 00:10:27,000
Then this is the convolution step right.

212
00:10:27,000 --> 00:10:28,133
And this is the pooling step.

213
00:10:28,133 --> 00:10:30,300
And also pooling by the way
is also called downsampling.

214
00:10:30,300 --> 00:10:33,300
So pooling and downsampling
are the same things.

215
00:10:33,866 --> 00:10:35,700
So you can see it's applied convolution.

216
00:10:35,700 --> 00:10:37,366
Then it's applied pooling.

217
00:10:37,366 --> 00:10:39,033
And you can see how exactly it works.

218
00:10:39,033 --> 00:10:42,366
So you can see what kind of convolutions
it has applied, or

219
00:10:42,533 --> 00:10:44,800
what kind of filters
it applied, what they look like.

220
00:10:44,800 --> 00:10:47,100
You can see what features
it's looking out for.

221
00:10:47,100 --> 00:10:49,266
And then it's applying pooling.

222
00:10:49,266 --> 00:10:50,566
So it's reducing the size.

223
00:10:50,566 --> 00:10:53,300
And you can see here
that this is important right.

224
00:10:53,300 --> 00:10:58,700
So you can see that
this is the convolved image

225
00:10:58,700 --> 00:11:00,033
and this is the pooled image.

226
00:11:00,033 --> 00:11:01,700
And you can still see the same features.

227
00:11:01,700 --> 00:11:04,200
It's just less information,
but same features, right?

228
00:11:04,200 --> 00:11:05,733
The features are preserved.

229
00:11:05,733 --> 00:11:07,566
That's the important part.

230
00:11:07,566 --> 00:11:12,066
And moreover, you know, if our four
was, kind of like, rotated

231
00:11:12,066 --> 00:11:16,700
a bit to the side, it would still be able
to pick up very similar pooled layers.

232
00:11:16,900 --> 00:11:18,500
And then after that it's got more layers.

233
00:11:18,500 --> 00:11:19,700
We haven't talked about that yet.

234
00:11:19,700 --> 00:11:23,066
So then it's got another
convolution

235
00:11:23,066 --> 00:11:26,066
layer here, which we actually won't have.

236
00:11:26,266 --> 00:11:29,700
And then it has another pooling layer,
but it's basically just repeating that

237
00:11:29,700 --> 00:11:30,866
same process.

238
00:11:30,866 --> 00:11:32,000
And then after that,

239
00:11:32,000 --> 00:11:34,833
which is what we're going to be talking about
further down in the course,

240
00:11:34,833 --> 00:11:37,833
it's got the fully connected layers
and so on.

241
00:11:37,966 --> 00:11:39,800
But you can
definitely play around with that.

242
00:11:39,800 --> 00:11:43,433
So if I delete that
and, like, draw a seven,

243
00:11:44,533 --> 00:11:47,800
you will see that it actually tells you
the guesses.

244
00:11:47,800 --> 00:11:49,433
It guesses that this is a seven.

245
00:11:49,433 --> 00:11:52,500
And the second guess,
the second likelihood, is a three.

246
00:11:52,866 --> 00:11:56,366
So you can draw it some challenging
things and see if it can pick them up.

247
00:11:56,366 --> 00:11:59,466
So let's say
if I draw something that looks like a zero

248
00:11:59,466 --> 00:12:01,900
but it's not a finished zero,
will it pick it up?

249
00:12:01,900 --> 00:12:03,666
No, this time it didn't pick it up.

250
00:12:03,666 --> 00:12:06,033
Looks like a nine to it.

251
00:12:06,033 --> 00:12:08,400
What if I, kind of like, finish it like that?

252
00:12:08,400 --> 00:12:11,400
So now it thinks it's a zero or a nine

253
00:12:11,500 --> 00:12:14,400
and you can see over there
what's lighting up the zero or the nine.

254
00:12:14,400 --> 00:12:16,500
But we'll talk about that part
further down.

255
00:12:16,500 --> 00:12:17,266
Let's do one more.

256
00:12:17,266 --> 00:12:19,766
Let's say, like, an eight.

257
00:12:19,766 --> 00:12:22,800
I think it's pretty hard for this now.

258
00:12:22,800 --> 00:12:23,700
Picked up an eight.

259
00:12:23,700 --> 00:12:27,433
So you can see that goes into an eight
and then like after that

260
00:12:27,433 --> 00:12:28,800
it stops being recognizable.

261
00:12:28,800 --> 00:12:31,800
It stops making sense to us humans, right?

262
00:12:32,000 --> 00:12:34,433
These features that it's working with.

263
00:12:34,433 --> 00:12:38,300
But at the same time, it is correctly
recognizing that it's an eight.

264
00:12:38,833 --> 00:12:40,466
Yeah. So definitely play around with that.

265
00:12:40,466 --> 00:12:43,200
You can draw a smiley
face, see what happens then.

266
00:12:44,166 --> 00:12:47,433
Looks like a three to this tool

267
00:12:47,466 --> 00:12:50,700
because the tool is obviously trained up
only on digits from 0 to 9.

268
00:12:50,966 --> 00:12:53,100
So it has to recognize something.

269
00:12:53,100 --> 00:12:54,433
Out of those.

270
00:12:54,433 --> 00:12:56,866
And it recognizes a three.

271
00:12:56,866 --> 00:13:00,633
It's like in life when you see
something like a type of fruit

272
00:13:00,633 --> 00:13:05,666
that you've never seen before,
like a custard apple or something.

273
00:13:05,966 --> 00:13:09,933
And you think that it's,
like, a pear

274
00:13:10,433 --> 00:13:12,266
because you've never actually seen
one before.

275
00:13:12,266 --> 00:13:13,900
You don't know what to classify it as.

276
00:13:13,900 --> 00:13:14,600
Same thing here.

277
00:13:14,600 --> 00:13:17,600
So it hasn't actually trained
on the smiley faces.

278
00:13:17,733 --> 00:13:20,366
And that's
why it thinks it's a three.

279
00:13:20,366 --> 00:13:22,566
So there you go. It's
a very powerful tool.

280
00:13:22,566 --> 00:13:25,133
It'll be helpful
for you to play around with.

281
00:13:25,133 --> 00:13:30,066
Actually, when you put your mouse
over a pixel, it shows you

282
00:13:30,666 --> 00:13:34,633
where the
feature detector was to pick up that pixel,

283
00:13:34,633 --> 00:13:37,366
so you can see where
this pixel is coming from.

284
00:13:37,366 --> 00:13:41,333
And also
you can see how the filter was

285
00:13:41,333 --> 00:13:44,400
kind of like going through the image,
exactly how we talked about in the course.

286
00:13:44,400 --> 00:13:45,466
And here you can see,

287
00:13:45,466 --> 00:13:48,766
you can see the pooling,
you can see that the pooling is done with

288
00:13:49,300 --> 00:13:52,533
the pooling is done with a,

289
00:13:53,300 --> 00:13:56,733
little square of size two by two.

290
00:13:57,066 --> 00:13:59,866
And you can see that
it's a stride of two as well.

291
00:13:59,866 --> 00:14:03,400
Just as we discussed in
today's tutorial.

292
00:14:03,800 --> 00:14:04,800
So there you go.

293
00:14:04,800 --> 00:14:05,933
Have a play around with that.

294
00:14:05,933 --> 00:14:09,100
And I hope you enjoyed today's session.

295
00:14:09,100 --> 00:14:10,433
I look forward to seeing you next time.

296
00:14:10,433 --> 00:14:12,400
And until then, enjoy deep learning.