1 00:00:00,533 --> 00:00:02,866 Hello and welcome back to the course on Deep Learning. 2 00:00:02,866 --> 00:00:04,500 Today we're talking about max pooling. 3 00:00:04,500 --> 00:00:07,366 And we've got some very exciting slides coming up ahead. 4 00:00:07,366 --> 00:00:10,633 And even a special surprise at the very end of the tutorial. 5 00:00:10,866 --> 00:00:12,300 So let's get started. 6 00:00:12,300 --> 00:00:15,633 The first question is: what is pooling and why do we need it? 7 00:00:15,900 --> 00:00:18,500 Well, to answer that question, let's have a look at these images. 8 00:00:18,500 --> 00:00:20,666 On these three images we've got a cheetah. 9 00:00:20,666 --> 00:00:23,566 In fact, it is the exact same cheetah. On the first image, 10 00:00:23,566 --> 00:00:27,633 the image is positioned properly and the cheetah is looking straight at you. 11 00:00:27,966 --> 00:00:32,266 On the second image it's a bit rotated, and the third image is a bit squashed. 12 00:00:32,633 --> 00:00:37,000 And the thing here is that we want the neural network to be able 13 00:00:37,000 --> 00:00:41,000 to recognize the cheetah in every single one of these images. 14 00:00:41,333 --> 00:00:43,166 In fact, this is just one cheetah. 15 00:00:43,166 --> 00:00:45,000 What if we have lots of different cheetahs? 16 00:00:45,000 --> 00:00:48,300 Here's a cheetah, here's a cheetah, here's another cheetah. 17 00:00:48,666 --> 00:00:51,566 Here's a cheetah, here's a cheetah, and here's a cheetah. 18 00:00:51,566 --> 00:00:54,700 And we want the neural network to recognize all of these cheetahs 19 00:00:54,700 --> 00:00:56,133 as cheetahs. 20 00:00:56,133 --> 00:01:01,700 And how can it do that if they're all looking in different directions? 21 00:01:01,700 --> 00:01:04,033 They're all in different parts of the image. 22 00:01:04,033 --> 00:01:06,966 Their faces are positioned in different parts of the image.
23 00:01:06,966 --> 00:01:09,500 Somebody on the right hand side, somebody in the left corner, 24 00:01:09,500 --> 00:01:10,933 somebody is in the middle. 25 00:01:10,933 --> 00:01:12,500 They're all a bit different. 26 00:01:12,500 --> 00:01:14,166 The texture's a little bit different. 27 00:01:14,166 --> 00:01:16,100 The lighting is a bit different. 28 00:01:16,100 --> 00:01:17,333 There's lots of little differences. 29 00:01:17,333 --> 00:01:22,200 And so if the neural network looks for exactly a certain feature, for instance, 30 00:01:22,200 --> 00:01:25,400 a distinctive feature of the cheetah is 31 00:01:25,400 --> 00:01:29,800 the tears that are, on its face going from the eyes or the, 32 00:01:30,100 --> 00:01:32,766 the shadows that look like tears, 33 00:01:32,766 --> 00:01:35,900 the texture or the pattern that is going from its eyes down. 34 00:01:36,166 --> 00:01:37,766 It's, on the size of its nose. 35 00:01:37,766 --> 00:01:38,400 It looks like tears. 36 00:01:38,400 --> 00:01:40,766 That's a distinctive feature of this, cheetah. 37 00:01:40,766 --> 00:01:46,133 But if it's looking for that feature, which it learned from, certain cheetahs, 38 00:01:46,800 --> 00:01:50,066 in an exact location or an exact shape 39 00:01:50,066 --> 00:01:53,066 or form or texture, it'll never find these other cheetahs. 40 00:01:53,300 --> 00:01:57,400 So we have to make sure that our neural network, 41 00:01:57,966 --> 00:02:02,500 has a property called spatial invariance, meaning that it doesn't care 42 00:02:02,700 --> 00:02:06,600 where the, features are located. 43 00:02:06,633 --> 00:02:09,700 Not not so much as in which part of the image, 44 00:02:09,700 --> 00:02:13,100 because we we've kind of taken that into consideration 45 00:02:13,100 --> 00:02:16,366 with our map, with our, with our convolution layer. 
46 00:02:16,633 --> 00:02:21,200 But it doesn't have to care if the features are a bit tilted, 47 00:02:21,200 --> 00:02:23,866 if the features are a bit different in texture, 48 00:02:23,866 --> 00:02:27,100 if the features are a bit closer, if features are a bit further 49 00:02:27,100 --> 00:02:30,133 apart relative to each other. 50 00:02:30,133 --> 00:02:34,400 So if the feature itself is a bit distorted, our neural network 51 00:02:34,400 --> 00:02:39,633 has to have some level of flexibility to be able to still find that feature. 52 00:02:39,900 --> 00:02:42,566 And that is what pooling is all about. 53 00:02:42,566 --> 00:02:45,000 So let's have a look at how pooling works. 54 00:02:45,000 --> 00:02:46,066 Here's our feature map. 55 00:02:46,066 --> 00:02:50,466 So we've already done our convolution and we've completed that part. 56 00:02:50,466 --> 00:02:52,500 And now we're working with the convolution layer. 57 00:02:52,500 --> 00:02:54,600 Now we're going to apply pooling. So how does it work? 58 00:02:54,600 --> 00:02:56,600 We're going to be applying max pooling. 59 00:02:56,600 --> 00:02:57,900 There are several different types 60 00:02:57,900 --> 00:03:00,900 of pooling: mean pooling, max pooling, sum pooling. 61 00:03:00,900 --> 00:03:03,400 And we'll comment on those towards the end of this tutorial. 62 00:03:03,400 --> 00:03:05,000 But for now we're just applying max pooling. 63 00:03:05,000 --> 00:03:09,600 So we take a box of two by two pixels like that. 64 00:03:09,933 --> 00:03:12,266 And again, it doesn't have to be two by two. 65 00:03:12,266 --> 00:03:13,466 You can choose any size of box. 66 00:03:13,466 --> 00:03:16,033 And again, we'll comment on that towards the end of the tutorial. 67 00:03:16,033 --> 00:03:19,033 And you place it in the top left hand corner 68 00:03:19,166 --> 00:03:21,833 and you find the maximum value in that box.
69 00:03:21,833 --> 00:03:26,000 And then you record only that value and you disregard the other three. 70 00:03:26,100 --> 00:03:27,800 So in your box you have four values. 71 00:03:27,800 --> 00:03:29,000 You just disregard three. 72 00:03:29,000 --> 00:03:31,666 You only keep one, the maximum, which is one in this case. 73 00:03:31,666 --> 00:03:34,566 Then you move your box to the right by a stride. 74 00:03:34,566 --> 00:03:36,033 You select the stride once again. 75 00:03:36,033 --> 00:03:41,000 So here we select a stride of two, and that's what you normally select. 76 00:03:41,000 --> 00:03:42,833 You can select a stride of one, 77 00:03:42,833 --> 00:03:44,333 so there are overlapping boxes. 78 00:03:44,333 --> 00:03:47,866 You can select any kind of stride that you like, even three if you want. 79 00:03:48,666 --> 00:03:52,166 But we're selecting a stride of two here and that's what is commonly used. 80 00:03:52,333 --> 00:03:53,833 And then you repeat the process. 81 00:03:53,833 --> 00:03:55,766 You record the maximum here. 82 00:03:55,766 --> 00:03:58,933 If you cross over the edge, it doesn't matter, you just continue 83 00:03:58,933 --> 00:03:59,933 doing what you're doing. 84 00:03:59,933 --> 00:04:02,800 So you still record the max over here. 85 00:04:02,800 --> 00:04:03,900 Zero. 86 00:04:03,900 --> 00:04:05,566 Here the maximum is four. 87 00:04:05,566 --> 00:04:07,200 Here the maximum is two. Here 88 00:04:07,200 --> 00:04:10,533 the maximum is one. Then zero on the last row, two, and then one. 89 00:04:11,233 --> 00:04:13,900 So as you can see, a few things happened. 90 00:04:13,900 --> 00:04:17,800 First of all, we still were able to preserve the features. 91 00:04:17,800 --> 00:04:18,400 Right. 92 00:04:18,400 --> 00:04:23,166 The maximum numbers represent the features, because we know how the convolution 93 00:04:23,166 --> 00:04:23,666 layer works.
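The walkthrough above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own code: the function name `max_pool2d` and the 5x5 `feature_map` values are assumptions, chosen so that the pooled result matches the maxima read out in the narration (1, 1, 0 / 4, 2, 1 / 0, 2, 1). Windows that cross over the edge are simply clipped to the values that exist.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Max pooling over a 2D list; windows hanging over the edge are clipped."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, rows, stride):
        pooled_row = []
        for left in range(0, cols, stride):
            # Collect the values inside the current box, staying within bounds.
            window = [
                feature_map[r][c]
                for r in range(top, min(top + size, rows))
                for c in range(left, min(left + size, cols))
            ]
            # Keep only the maximum and disregard the rest.
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 5x5 feature map (illustrative values, not taken from the slide).
feature_map = [
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 0, 1, 2, 1],
    [1, 4, 2, 1, 0],
    [0, 0, 1, 2, 1],
]

print(max_pool2d(feature_map))  # [[1, 1, 0], [4, 2, 1], [0, 2, 1]]
```

Passing `stride=1` would give overlapping boxes, and a different `size` gives a larger or smaller box, as mentioned in the narration.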
94 00:04:23,666 --> 00:04:27,333 We know that the maximum, or the large numbers in your feature map, 95 00:04:27,333 --> 00:04:31,200 they represent where you actually found the closest similarity to a feature. 96 00:04:31,500 --> 00:04:34,400 But by then pooling these features, 97 00:04:34,400 --> 00:04:38,166 we are first of all getting rid of 75% of the information 98 00:04:38,166 --> 00:04:42,166 that is not the feature, which is not 99 00:04:42,500 --> 00:04:45,500 the important things that we're looking out for, 100 00:04:45,533 --> 00:04:48,933 because we are disregarding three pixels out of four. 101 00:04:49,633 --> 00:04:51,366 So we're only keeping 25%. 102 00:04:51,366 --> 00:04:54,300 And then also, because 103 00:04:54,300 --> 00:04:57,566 we are taking the maximum of the 104 00:04:58,300 --> 00:05:00,600 pixels, or the values that we have, 105 00:05:00,600 --> 00:05:04,066 we are therefore accounting for any distortion. 106 00:05:04,066 --> 00:05:08,500 So for instance, take two images in which, for example, 107 00:05:08,500 --> 00:05:11,566 the cheetah's tears on the eyes differ: 108 00:05:12,066 --> 00:05:15,466 in one image, they're a bit to the left, or a bit rotated to the left, 109 00:05:15,466 --> 00:05:18,466 and in another one, they're how they're supposed to be. Or, 110 00:05:18,566 --> 00:05:22,233 if we take one as the basis, and in another one they're a bit 111 00:05:22,233 --> 00:05:26,466 rotated to the left, the pooled feature will be exactly the same. 112 00:05:26,466 --> 00:05:30,300 So you can see here, if we are talking about the cheetah's tears, 113 00:05:30,400 --> 00:05:34,066 then let's say it is this four, and this is where it was. 114 00:05:34,133 --> 00:05:35,966 Then say it was a bit rotated. 115 00:05:35,966 --> 00:05:38,233 So for instance, the four ended up over here.
116 00:05:38,233 --> 00:05:40,400 Then when we're doing the pooling 117 00:05:40,400 --> 00:05:43,000 we're still going to get the same pooled feature map. 118 00:05:43,000 --> 00:05:46,000 And that's kind of the the principle behind it. 119 00:05:46,400 --> 00:05:48,733 It's a very, rough explanation. 120 00:05:48,733 --> 00:05:51,600 Again, intuitive explanation, but that's the point of pooling 121 00:05:51,600 --> 00:05:54,766 that we're still being able to preserve the features. 122 00:05:54,966 --> 00:05:58,666 And moreover, accounts for, their possible 123 00:05:58,666 --> 00:06:01,966 spatial or textural or other kind of distortions. 124 00:06:02,300 --> 00:06:05,700 And in addition to all of that, we are reducing the size. 125 00:06:05,700 --> 00:06:07,266 So there's another benefit. 126 00:06:07,266 --> 00:06:09,800 So we've got we're preserving the features. 127 00:06:09,800 --> 00:06:12,000 We're introducing spatial invariance. 128 00:06:12,000 --> 00:06:15,800 We're reducing the size by 75%, 129 00:06:16,066 --> 00:06:19,266 which is huge, which is really going to help us in terms of processing. 130 00:06:19,633 --> 00:06:23,166 And moreover, another benefit of pooling is 131 00:06:23,166 --> 00:06:25,033 we are reducing the number of parameters. 132 00:06:25,033 --> 00:06:27,733 So we're reducing again by 75%. 133 00:06:27,733 --> 00:06:28,866 We're reducing the number of parameters 134 00:06:28,866 --> 00:06:32,133 that are going to go into our final layers of the neural network. 135 00:06:32,533 --> 00:06:35,133 And therefore we're preventing overfitting. 136 00:06:35,133 --> 00:06:41,100 It is a very important benefit of pooling that we're removing information. 137 00:06:41,100 --> 00:06:42,500 And that is a good thing. 
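A small sketch can make these two points concrete: the pooled map stays the same when the strong value moves within its box, and a 2x2 box with a stride of two keeps one value out of four. Everything here is an assumption for illustration: the helper `max_pool_2x2`, the two 4x4 maps, and the `dense_units` figure used to show how fewer pooled values mean fewer weights in a following fully connected layer.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; assumes even height and width."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]), 2)
        ]
        for r in range(0, len(feature_map), 2)
    ]


# The strong response (4) sits at row 1, column 1 ...
original = [
    [0, 0, 0, 0],
    [0, 4, 0, 1],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]
# ... and here it has moved to row 0, column 0, still inside the same box.
shifted = [
    [4, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]

print(max_pool_2x2(original))  # [[4, 1], [2, 0]]
print(max_pool_2x2(shifted))   # [[4, 1], [2, 0]]

# Size reduction: 16 values go in, 4 come out.
values_before = len(original) * len(original[0])
pooled = max_pool_2x2(original)
values_after = len(pooled) * len(pooled[0])
reduction = 1 - values_after / values_before
print(reduction)  # 0.75

# Hypothetical fully connected layer with 10 units fed by these values.
dense_units = 10
weights_before = values_before * dense_units
weights_after = values_after * dense_units
print(weights_before, weights_after)  # 160 40
```

Note that the pooled map only stays identical while the value remains inside the same box. If the 4 moved into a neighbouring box, the pooled map would change, so the invariance is to small shifts only.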
138 00:06:42,500 --> 00:06:45,500 That is a good thing because that way, 139 00:06:45,500 --> 00:06:48,533 our model won't be able to overfit 140 00:06:48,533 --> 00:06:52,500 onto that information, especially because that information is not relevant. 141 00:06:52,500 --> 00:06:55,600 And remember, at the very start we were talking about how even for humans, 142 00:06:55,666 --> 00:06:59,100 us as humans, it's important to see exactly the features 143 00:06:59,100 --> 00:07:02,133 rather than all this other noise that is coming into our eyes. 144 00:07:02,700 --> 00:07:04,433 Well, same thing for neural networks. 145 00:07:04,433 --> 00:07:07,500 By disregarding the unnecessary, 146 00:07:07,633 --> 00:07:11,866 unimportant information, we're helping to prevent overfitting. 147 00:07:12,333 --> 00:07:14,500 So there we go. That is what pooling is about. 148 00:07:14,500 --> 00:07:19,600 And the question here is, of course, why max pooling? 149 00:07:19,600 --> 00:07:21,600 Right. There's lots of different types of pooling. 150 00:07:21,600 --> 00:07:23,600 And, you know, why a stride of two? 151 00:07:23,600 --> 00:07:25,533 Why a size of two by two pixels? 152 00:07:25,533 --> 00:07:26,633 Lots of these things. 153 00:07:26,633 --> 00:07:30,566 And on that note, I'd like to introduce you to this 154 00:07:30,733 --> 00:07:34,500 lovely research paper called Evaluation of Pooling Operations 155 00:07:34,500 --> 00:07:37,500 in Convolutional Architectures for Object Recognition 156 00:07:37,633 --> 00:07:40,733 by Dominik Scherer from the University of Bonn. 157 00:07:41,000 --> 00:07:42,033 There's the link. 158 00:07:42,033 --> 00:07:45,733 And the beauty about this paper is that it's very, 159 00:07:45,733 --> 00:07:47,466 very simple, very straightforward. 160 00:07:47,466 --> 00:07:49,866 So if you've never read a research paper 161 00:07:49,866 --> 00:07:53,733 before but you'd like to give it a go, this is a great place to start.
162 00:07:53,733 --> 00:07:56,733 It's very short, only ten pages, very easy to read. 163 00:07:56,933 --> 00:08:00,700 And plus, the extra benefit is that now that we've discussed convolution 164 00:08:00,700 --> 00:08:03,700 and pooling, you will be totally comfortable 165 00:08:03,700 --> 00:08:05,866 with everything that they're talking about in this paper. 166 00:08:05,866 --> 00:08:09,333 And this is a great way to actually reinforce your knowledge. 167 00:08:09,333 --> 00:08:11,733 So I highly recommend checking this paper out. 168 00:08:11,733 --> 00:08:13,833 It will take 20 minutes to read it. 169 00:08:13,833 --> 00:08:17,500 And you can even skip part two, which is called Related Work. 170 00:08:17,500 --> 00:08:20,900 If it feels a bit far-fetched or alienating, just don't read that part. 171 00:08:21,233 --> 00:08:23,733 Go straight from part one to part three. 172 00:08:23,733 --> 00:08:26,400 And the one thing that you do need to know about this paper: 173 00:08:26,400 --> 00:08:29,500 they talk about a concept called subsampling. 174 00:08:30,400 --> 00:08:33,133 Well, subsampling is basically average pooling. 175 00:08:33,133 --> 00:08:37,333 So remember how here we were taking the maximum? 176 00:08:37,333 --> 00:08:39,833 So in each square we're taking the maximum value. 177 00:08:39,833 --> 00:08:42,966 There's also a concept called mean pooling, or sum pooling. 178 00:08:42,966 --> 00:08:45,100 Sum pooling is you just sum these values up. 179 00:08:45,100 --> 00:08:46,733 Average pooling, or mean pooling: 180 00:08:46,733 --> 00:08:49,733 you take the average value out of all of these. 181 00:08:49,866 --> 00:08:53,766 And subsampling is kind of like a generalization of mean pooling. 182 00:08:53,766 --> 00:08:57,166 It's a more generalized approach 183 00:08:57,166 --> 00:09:00,766 to taking the average of these values. 184 00:09:00,766 --> 00:09:02,400 And you can read a bit more about it in the paper.
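To compare the pooling types just mentioned, here is a minimal sketch that runs max, sum, and average pooling over the same boxes. The helper names `pool2d` and `average` and the 4x4 `feature_map` are assumptions for illustration. The paper's subsampling operation is not reproduced here; following the narration, plain average pooling stands in for it.

```python
def pool2d(feature_map, reduce_fn, size=2, stride=2):
    """Apply reduce_fn to every size x size box; assumes the boxes tile the map."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            reduce_fn([
                feature_map[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size)
            ])
            for left in range(0, cols - size + 1, stride)
        ]
        for top in range(0, rows - size + 1, stride)
    ]


def average(values):
    """Arithmetic mean of the values in one box."""
    return sum(values) / len(values)


# Hypothetical 4x4 feature map (illustrative values).
feature_map = [
    [1, 3, 0, 2],
    [5, 7, 4, 2],
    [0, 0, 8, 8],
    [2, 2, 8, 8],
]

print(pool2d(feature_map, max))      # [[7, 4], [2, 8]]
print(pool2d(feature_map, sum))      # [[16, 8], [4, 32]]
print(pool2d(feature_map, average))  # [[4.0, 2.0], [1.0, 8.0]]
```

Only the reduction applied to each box changes between the three variants; the box size and stride work the same way in all of them.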
185 00:09:02,400 --> 00:09:06,233 But otherwise, just think of it as average pooling when you're reading that paper. 186 00:09:06,800 --> 00:09:09,800 And so that's where you can get some additional information on this topic. 187 00:09:09,833 --> 00:09:12,266 And now kind of let's recap where have we got into it. 188 00:09:12,266 --> 00:09:14,700 So there's our input image. 189 00:09:14,700 --> 00:09:18,800 Then we applied the convolution operation and we got the convolution layer. 190 00:09:18,900 --> 00:09:21,900 And now to each of those feature maps 191 00:09:21,900 --> 00:09:24,133 that we get we've applied the pooling layer. 192 00:09:24,133 --> 00:09:28,433 So we've got, we've done these two steps convolution and pooling. 193 00:09:28,733 --> 00:09:31,766 And now we're going to do something very fun, something exciting. 194 00:09:32,033 --> 00:09:34,333 We're going to, experiment with this. 195 00:09:34,333 --> 00:09:38,266 So this is a screenshot I took from a, tool 196 00:09:38,633 --> 00:09:42,600 created by Adam Harley from, 197 00:09:42,600 --> 00:09:46,266 well, back when he was at Ryerson University of Computer Science, 198 00:09:46,266 --> 00:09:50,900 and now he's at Carnegie Mellon, I think, doing his PhD and great tool. 199 00:09:50,900 --> 00:09:52,400 So let's open up. 200 00:09:52,400 --> 00:09:54,066 Let's have a look so you can find it. 201 00:09:54,066 --> 00:09:55,700 You can't actually find it through Google. 202 00:09:55,700 --> 00:09:57,400 You have to know the URL. 203 00:09:57,400 --> 00:10:00,833 It's it's it's just hard to find this on Google because there's no text here. 204 00:10:01,333 --> 00:10:06,500 See we're just this URL x dot Ryerson okay. 205 00:10:06,500 --> 00:10:09,666 And then this stuff on then and basically this 206 00:10:10,366 --> 00:10:12,600 is exactly what we're doing but visualized. 207 00:10:12,600 --> 00:10:14,300 So here you need to draw a number. 
208 00:10:14,300 --> 00:10:16,533 So let's say I draw the number four. 209 00:10:16,533 --> 00:10:21,266 And this tool will put the number four here. 210 00:10:21,266 --> 00:10:24,066 That's your image in our first step. 211 00:10:24,066 --> 00:10:27,000 Then this is the convolution step, right? 212 00:10:27,000 --> 00:10:28,133 And this is the pooling step. 213 00:10:28,133 --> 00:10:30,300 And pooling, by the way, is also called downsampling. 214 00:10:30,300 --> 00:10:33,300 So pooling and downsampling are the same thing. 215 00:10:33,866 --> 00:10:35,700 So you can see it's applied convolution. 216 00:10:35,700 --> 00:10:37,366 Then it's applied pooling. 217 00:10:37,366 --> 00:10:39,033 And you can see exactly how it works. 218 00:10:39,033 --> 00:10:42,366 So you can see what kind of convolutions it has applied, or 219 00:10:42,533 --> 00:10:44,800 what kind of filters it applied, what they look like. 220 00:10:44,800 --> 00:10:47,100 You can see what features it's looking out for. 221 00:10:47,100 --> 00:10:49,266 And then it's applying pooling. 222 00:10:49,266 --> 00:10:50,566 So it's reducing the size. 223 00:10:50,566 --> 00:10:53,300 And you can see here, and this is important, 224 00:10:53,300 --> 00:10:58,700 you can see that this is the convolved image 225 00:10:58,700 --> 00:11:00,033 and this is the pooled image. 226 00:11:00,033 --> 00:11:01,700 And you can still see the same features. 227 00:11:01,700 --> 00:11:04,200 It's just less information, but the same features, right? 228 00:11:04,200 --> 00:11:05,733 The features are preserved. 229 00:11:05,733 --> 00:11:07,566 That's the important part. 230 00:11:07,566 --> 00:11:12,066 And moreover, if our four was kind of rotated 231 00:11:12,066 --> 00:11:16,700 a bit to the side, it would still be able to pick up very similar pooled layers. 232 00:11:16,900 --> 00:11:18,500 And then after that it's got more layers.
233 00:11:18,500 --> 00:11:19,700 We haven't talked about that yet. 234 00:11:19,700 --> 00:11:23,066 So then it's got another convolution 235 00:11:23,066 --> 00:11:26,066 layer here, which we actually won't have. 236 00:11:26,266 --> 00:11:29,700 And then it has another pooling layer, but it's basically just repeating that 237 00:11:29,700 --> 00:11:30,866 same process. 238 00:11:30,866 --> 00:11:32,000 And then after that, 239 00:11:32,000 --> 00:11:34,833 this is what we're going to be talking about further down in the course, 240 00:11:34,833 --> 00:11:37,833 it's got the fully connected layers and so on. 241 00:11:37,966 --> 00:11:39,800 But you can definitely play around with that. 242 00:11:39,800 --> 00:11:43,433 So if I delete that and draw a seven, 243 00:11:44,533 --> 00:11:47,800 you will see that it actually tells you the guesses. 244 00:11:47,800 --> 00:11:49,433 It guesses that this is a seven. 245 00:11:49,433 --> 00:11:52,500 And the second guess, the second likelihood, is a three. 246 00:11:52,866 --> 00:11:56,366 So you can draw it some challenging things and see if it can pick them up. 247 00:11:56,366 --> 00:11:59,466 So let's say if I draw something that looks like a zero 248 00:11:59,466 --> 00:12:01,900 but it's not a finished zero, will it pick it up? 249 00:12:01,900 --> 00:12:03,666 No, this time it didn't pick it up. 250 00:12:03,666 --> 00:12:06,033 It looks like a nine to it. 251 00:12:06,033 --> 00:12:08,400 What if I kind of finish it like that? 252 00:12:08,400 --> 00:12:11,400 So now it thinks it's a zero or a nine 253 00:12:11,500 --> 00:12:14,400 and you can see over there what's lighting up: the zero or the nine. 254 00:12:14,400 --> 00:12:16,500 But we'll talk about that part further on. 255 00:12:16,500 --> 00:12:17,266 Let's do one more. 256 00:12:17,266 --> 00:12:19,766 Let's say, like, an eight. 257 00:12:19,766 --> 00:12:22,800 I think it's pretty hard for this one.
258 00:12:22,800 --> 00:12:23,700 It picked up an eight. 259 00:12:23,700 --> 00:12:27,433 So you can see that it goes into an eight, and then after that 260 00:12:27,433 --> 00:12:28,800 it stops being recognizable. 261 00:12:28,800 --> 00:12:31,800 It stops making sense to us humans, right, 262 00:12:32,000 --> 00:12:34,433 these features that it's working with. 263 00:12:34,433 --> 00:12:38,300 But at the same time, it is correctly recognizing that it's an eight. 264 00:12:38,833 --> 00:12:40,466 Yeah. So definitely play around with that. 265 00:12:40,466 --> 00:12:43,200 You can draw a smiley face, see what happens then. 266 00:12:44,166 --> 00:12:47,433 It looks like a three to this tool 267 00:12:47,466 --> 00:12:50,700 because the tool is obviously trained up only on digits from 0 to 9. 268 00:12:50,966 --> 00:12:53,100 So it has to recognize something 269 00:12:53,100 --> 00:12:54,433 as one of those. 270 00:12:54,433 --> 00:12:56,866 And it recognizes a three. 271 00:12:56,866 --> 00:13:00,633 It's like in life when you see something like a type of fruit 272 00:13:00,633 --> 00:13:05,666 that you've never seen before, like a custard apple or something, 273 00:13:05,966 --> 00:13:09,933 and you think that it's a pear 274 00:13:10,433 --> 00:13:12,266 because you've never actually seen one before. 275 00:13:12,266 --> 00:13:13,900 You don't know what to classify it as. 276 00:13:13,900 --> 00:13:14,600 Same thing here. 277 00:13:14,600 --> 00:13:17,600 So it hasn't actually been trained on smiley faces. 278 00:13:17,733 --> 00:13:20,366 And that's why it thinks it's a three. 279 00:13:20,366 --> 00:13:22,566 So there you go. It's a very powerful tool. 280 00:13:22,566 --> 00:13:25,133 It'll be helpful for you to play around with.
281 00:13:25,133 --> 00:13:30,066 Actually, when you put your mouse over a pixel, it shows you 282 00:13:30,666 --> 00:13:34,633 where the feature detector was to pick up that pixel, 283 00:13:34,633 --> 00:13:37,366 so you can see where this pixel is coming from. 284 00:13:37,366 --> 00:13:41,333 And also you can see how the filter was 285 00:13:41,333 --> 00:13:44,400 kind of going through the image, exactly how we talked about in the course. 286 00:13:44,400 --> 00:13:45,466 And here you can see, 287 00:13:45,466 --> 00:13:48,766 you can see the pooling, 288 00:13:49,300 --> 00:13:52,533 the pooling is done with a 289 00:13:53,300 --> 00:13:56,733 little square of size two by two. 290 00:13:57,066 --> 00:13:59,866 And you can see that it's a stride of two as well, 291 00:13:59,866 --> 00:14:03,400 just as we discussed in today's tutorial. 292 00:14:03,800 --> 00:14:04,800 So there you go. 293 00:14:04,800 --> 00:14:05,933 Have a play around with that. 294 00:14:05,933 --> 00:14:09,100 And I hope you enjoyed today's session. 295 00:14:09,100 --> 00:14:10,433 I look forward to seeing you next time. 296 00:14:10,433 --> 00:14:12,400 And until then, enjoy deep learning.