1 00:00:00,366 --> 00:00:02,700 Hello and welcome back to the course on Deep Learning. 2 00:00:02,700 --> 00:00:06,800 Today we're kicking off Convolutional Neural Networks. It's going to be exciting. 3 00:00:06,800 --> 00:00:08,466 Let's dive straight into it. 4 00:00:08,466 --> 00:00:10,733 We're going to start off with an image. 5 00:00:10,733 --> 00:00:13,466 What do you see when you look at this image? 6 00:00:13,466 --> 00:00:17,366 Do you see a person looking at you or do you see a person looking to the right? 7 00:00:18,033 --> 00:00:21,566 You can see that your brain is struggling, 8 00:00:21,566 --> 00:00:25,700 is struggling to adjust. If you look at the right side of the image, 9 00:00:25,733 --> 00:00:27,333 just look at the right border of the image, 10 00:00:27,333 --> 00:00:30,200 you'll see a person looking to the right. If you look at the left 11 00:00:30,200 --> 00:00:33,200 border of the image, you'll see a person looking at you. 12 00:00:33,600 --> 00:00:36,666 And this just proves that 13 00:00:37,133 --> 00:00:42,066 what our brain is looking for when we see things is features. 14 00:00:42,066 --> 00:00:44,500 Depending on the features that it sees, depending 15 00:00:44,500 --> 00:00:48,333 on the features that you process, you categorize things in certain ways. 16 00:00:48,566 --> 00:00:51,566 So when you look at the right side of the image, 17 00:00:51,666 --> 00:00:53,933 you see certain features of a person looking to the right 18 00:00:53,933 --> 00:00:56,933 because they're closer to your center of focus, 19 00:00:57,100 --> 00:01:00,500 and therefore your brain classifies that as a person looking to the right. 20 00:01:00,800 --> 00:01:04,200 When you look at the left side of the image, you see more features 21 00:01:04,200 --> 00:01:09,000 of a person looking at you, and therefore your brain classifies it as such. 22 00:01:09,400 --> 00:01:11,100 So let's have a look at another one. 
23 00:01:11,100 --> 00:01:12,733 This is a very famous image. 24 00:01:12,733 --> 00:01:15,600 You probably have already seen it, but what do you see here? 25 00:01:16,666 --> 00:01:19,666 So some people will say that they see a 26 00:01:19,766 --> 00:01:23,366 young lady wearing a dress, looking away. 27 00:01:23,700 --> 00:01:26,833 Some people will say they see an old lady 28 00:01:27,100 --> 00:01:30,100 wearing a scarf on her head, looking down. 29 00:01:30,100 --> 00:01:33,433 So I'm going to point these features out and you'll see that it'll become 30 00:01:33,566 --> 00:01:34,200 very obvious. 31 00:01:34,200 --> 00:01:37,400 So this is the face of the young lady looking away. 32 00:01:37,400 --> 00:01:40,300 She's looking into the distance. That's her coat. 33 00:01:40,300 --> 00:01:41,166 That's her hair. 34 00:01:41,166 --> 00:01:43,500 That's the little feather in her hair. 35 00:01:43,500 --> 00:01:48,900 And on the other hand, this is the head of the old lady looking down. 36 00:01:48,900 --> 00:01:49,766 That's her nose. 37 00:01:49,766 --> 00:01:52,200 That's her mouth, that's her chin. 38 00:01:52,200 --> 00:01:53,433 That's the scarf on her head. 39 00:01:53,433 --> 00:01:55,600 And she's looking down. 40 00:01:55,600 --> 00:01:59,266 So, as you can see, two in one, and depending on which features 41 00:01:59,266 --> 00:02:02,266 your brain picks up, it will switch between 42 00:02:02,433 --> 00:02:05,866 classifying the image as one or the other. 43 00:02:06,733 --> 00:02:09,666 The oldest one of these illusions 44 00:02:09,666 --> 00:02:13,500 recorded in printed work is this one. 45 00:02:13,800 --> 00:02:15,133 It's the duck or the rabbit. 46 00:02:15,133 --> 00:02:16,866 So is this a duck or is this a rabbit? 47 00:02:16,866 --> 00:02:18,266 Another example. 
48 00:02:18,266 --> 00:02:22,400 And now I'm going to show you an image. Just for a second, 49 00:02:22,433 --> 00:02:25,533 just look at it and see what emotions 50 00:02:25,533 --> 00:02:28,533 or what kind of visual experience you go through. 51 00:02:28,966 --> 00:02:31,000 So what do you see? 52 00:02:31,000 --> 00:02:35,433 Do you feel, not dizzy, but a little bit dazzled? 53 00:02:35,566 --> 00:02:38,700 Like your brain is trying to understand what it is. 54 00:02:39,000 --> 00:02:40,133 It's trying to, 55 00:02:40,133 --> 00:02:44,200 it's jumping between her eyes, up and down. 56 00:02:45,166 --> 00:02:46,166 This is a classic 57 00:02:46,166 --> 00:02:49,833 example of when there are certain features 58 00:02:49,833 --> 00:02:53,400 where it could be this, it could be that, but your brain cannot decide, 59 00:02:54,000 --> 00:02:58,566 because both seem plausible. 60 00:02:58,566 --> 00:03:03,066 So basically, all of these examples illustrate to us how the brain works: 61 00:03:03,266 --> 00:03:08,266 it processes certain features in an image, or in whatever you see in real life, 62 00:03:08,633 --> 00:03:10,733 and it classifies that as such. 63 00:03:10,733 --> 00:03:15,200 And you've probably been in situations when you look over your shoulder quickly 64 00:03:15,200 --> 00:03:18,833 and you see something. You think it's, I don't know, 65 00:03:19,966 --> 00:03:23,833 a ball, but it turns out to be a cat, or you think it's a car, 66 00:03:23,833 --> 00:03:25,433 but it turns out to be a shadow, and things like that. 67 00:03:25,433 --> 00:03:28,233 That's because you don't have enough time to process those features, 68 00:03:28,233 --> 00:03:31,100 or you don't have enough features to classify things as such. 
69 00:03:31,100 --> 00:03:34,433 And for me, 70 00:03:34,466 --> 00:03:37,466 this is very interesting, because what we're going to be doing with 71 00:03:37,700 --> 00:03:40,700 neural networks, with convolutional neural networks, is very similar. 72 00:03:40,700 --> 00:03:44,466 And you'll find that the way that computers are going to be processing 73 00:03:44,466 --> 00:03:48,100 images is going to be extremely similar to the way we are processing images. 74 00:03:48,100 --> 00:03:52,200 So it's very valuable to understand and just kind of remember these things, 75 00:03:52,200 --> 00:03:53,500 that this is how we do it. 76 00:03:53,500 --> 00:03:56,500 And I'm going to take this lady off your screens because 77 00:03:56,533 --> 00:03:58,466 she's probably already freaking you out by now. 78 00:03:58,466 --> 00:04:00,866 So here's something different. 79 00:04:00,866 --> 00:04:02,000 Here's an experiment. 80 00:04:02,000 --> 00:04:06,900 An experiment done on computers, on convolutional neural networks. 81 00:04:06,900 --> 00:04:10,500 So we're slowly moving now from humans to computers. 82 00:04:11,233 --> 00:04:14,233 And this slide is from a talk by Geoffrey Hinton. 83 00:04:14,433 --> 00:04:17,233 Here you have, 84 00:04:17,233 --> 00:04:19,866 basically, a description of an experiment that he had done 85 00:04:19,866 --> 00:04:23,600 on some convolutional neural networks that he had trained up. 86 00:04:24,300 --> 00:04:27,333 So here you see three images, and we're going to go through them 87 00:04:27,333 --> 00:04:30,000 left to right and see how you would classify them, 88 00:04:30,000 --> 00:04:31,700 and then see how the computer classified them. 89 00:04:31,700 --> 00:04:34,033 So on the left, what do you think this is? 90 00:04:35,333 --> 00:04:37,600 You probably said cheetah, and you would be right. 91 00:04:37,600 --> 00:04:38,766 And this is what the computer said. 
92 00:04:38,766 --> 00:04:42,600 So right off the bat, we're going to learn how to read these images, 93 00:04:42,600 --> 00:04:47,033 because if you're going to go deep into convolutional 94 00:04:47,033 --> 00:04:50,466 neural networks, no pun intended, if you're going to 95 00:04:51,066 --> 00:04:53,900 start learning more and more about them and using them, you'll see a lot of these. 96 00:04:53,900 --> 00:04:57,000 And I've actually seen people read them incorrectly. 97 00:04:57,000 --> 00:05:01,333 So here at the top, cheetah is what it actually is. 98 00:05:01,333 --> 00:05:03,900 So that's the actual correct label 99 00:05:03,900 --> 00:05:04,800 of the image. 100 00:05:04,800 --> 00:05:09,533 That's what the label of the image is, regardless of any processing and 101 00:05:09,833 --> 00:05:11,133 computer vision. 102 00:05:11,133 --> 00:05:13,800 And then here are the guesses, 103 00:05:13,800 --> 00:05:16,800 the top four, or sometimes five, guesses of the 104 00:05:16,933 --> 00:05:20,533 algorithm, each given with its probability. 105 00:05:20,533 --> 00:05:23,900 So the computer, or the neural network, said 106 00:05:23,900 --> 00:05:27,133 cheetah, leopard, snow leopard or Egyptian cat: it can be one of the four. 107 00:05:27,400 --> 00:05:29,033 And cheetah has the highest vote. 108 00:05:29,033 --> 00:05:32,766 And throughout this part of the course, you will understand what these votes mean 109 00:05:32,766 --> 00:05:34,666 and how they are derived. 110 00:05:34,666 --> 00:05:36,400 But for now, it's pretty intuitive, right? 111 00:05:36,400 --> 00:05:38,233 So it's a cheetah in reality, 112 00:05:38,233 --> 00:05:40,566 and the neural network guessed right. 113 00:05:40,566 --> 00:05:43,433 It said with a high probability, about 95 to 99%, 114 00:05:43,433 --> 00:05:44,033 it's a cheetah. 115 00:05:45,300 --> 00:05:47,366 Then the second one: what do you think? 
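The way of reading these charts described above (an actual label at the top, then the network's top few guesses ranked by probability) can be sketched in a few lines of Python. This is only an illustration: the probability values, the extra "bullet train" entry and the helper name `top_k` are assumptions made up for the example, not numbers or code from the slide.

```python
# Hypothetical probabilities for one image; the numbers are invented for
# illustration and are not read off the slide.
true_label = "cheetah"
probabilities = {
    "cheetah": 0.96,
    "leopard": 0.02,
    "snow leopard": 0.01,
    "Egyptian cat": 0.006,
    "bullet train": 0.004,
}


def top_k(probs, k):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


guesses = top_k(probabilities, 4)
top_guess = guesses[0][0]                  # the network's first choice
correct_first = top_guess == true_label    # was the first guess right?
correct_in_top = any(label == true_label for label, _ in guesses)

# Print the chart: one row per guess, marking the actual label.
for label, p in guesses:
    note = "  <- actual label" if label == true_label else ""
    print(f"{label:<14}{p:6.1%}{note}")
```

A chart where the actual label is marked on the second row rather than the first would correspond to `correct_first` being false while `correct_in_top` is still true.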
116 00:05:47,366 --> 00:05:50,766 That is a bullet train, 117 00:05:51,133 --> 00:05:54,633 and the neural network was able to distinguish between 118 00:05:54,633 --> 00:05:57,933 bullet train, passenger car, subway train, electric locomotive. 119 00:05:57,933 --> 00:06:00,366 Those are the top choices. Of course, it had many more options. 120 00:06:00,366 --> 00:06:03,600 These neural networks learn to distinguish between 121 00:06:04,200 --> 00:06:08,666 not just four categories but dozens, even thousands, of categories at the same time. 122 00:06:08,666 --> 00:06:10,800 So those are the four options that it picked. 123 00:06:10,800 --> 00:06:12,700 And so that's bullet train, and it is a bullet train. 124 00:06:12,700 --> 00:06:15,700 So what do you think the last one is? 125 00:06:16,266 --> 00:06:18,466 There are a couple of options there. 126 00:06:18,466 --> 00:06:21,433 It's not very clear what it is. It could be a frying pan. 127 00:06:21,433 --> 00:06:22,733 It could be a magnifying glass. 128 00:06:22,733 --> 00:06:26,966 It could even be a pair of scissors, 129 00:06:26,966 --> 00:06:29,166 some might say. Well, the neural network said 130 00:06:29,166 --> 00:06:32,333 it was a pair of scissors, but you can see how you can go wrong here. 131 00:06:32,433 --> 00:06:35,366 First of all, it's not a very clear image. 132 00:06:35,366 --> 00:06:38,233 And also you can see that the 133 00:06:38,233 --> 00:06:41,700 probabilities are not as clear here. 134 00:06:41,700 --> 00:06:46,200 So the neural network was a bit confused, a bit indecisive, just as we are. 135 00:06:46,200 --> 00:06:49,600 So it said scissors with the highest probability, but then it had hand glass, 136 00:06:49,600 --> 00:06:53,700 which it actually was, not far behind in second place, 137 00:06:53,700 --> 00:06:55,766 and then frying pan and stethoscope. 
138 00:06:55,766 --> 00:06:58,500 So basically, here you can see that 139 00:06:58,500 --> 00:07:01,533 scissors was its first guess, but the correct option was number two. 140 00:07:01,533 --> 00:07:03,133 And that's why it's highlighted in red. 141 00:07:03,133 --> 00:07:03,900 So there we go. 142 00:07:03,900 --> 00:07:06,933 That's what neural networks are already capable of. 143 00:07:06,933 --> 00:07:08,800 And this is actually quite an old slide. 144 00:07:08,800 --> 00:07:10,500 This was several years ago. 145 00:07:10,500 --> 00:07:11,733 Now they're even better. 146 00:07:11,733 --> 00:07:13,333 And you will see that from 147 00:07:13,333 --> 00:07:16,500 the practical application that you will be coding together with Hadelin. 148 00:07:16,800 --> 00:07:20,100 But now let's try to understand a bit better what ConvNets, or convolutional 149 00:07:20,100 --> 00:07:23,533 neural networks, actually are and why they are gaining so much popularity. 150 00:07:23,800 --> 00:07:25,666 And they actually are gaining popularity. 151 00:07:25,666 --> 00:07:30,833 You can see here a Google Trends comparison I did just yesterday. 152 00:07:31,666 --> 00:07:35,600 Here you can see that convolutional neural networks 153 00:07:35,600 --> 00:07:39,333 are even overtaking artificial neural networks. 154 00:07:39,333 --> 00:07:43,100 So, a massive increase. 155 00:07:43,100 --> 00:07:47,700 And this is just going to keep going that way, because it is a very important field. 156 00:07:47,966 --> 00:07:52,466 That is where all the things happen, such as self-driving cars. 157 00:07:52,466 --> 00:07:54,066 How do they recognize 158 00:07:54,066 --> 00:07:57,833 people on the road? How do they recognize stop signs and things like that? 159 00:07:57,833 --> 00:08:01,033 How is Facebook 160 00:08:01,066 --> 00:08:04,833 able to tag images or people in images? 
161 00:08:04,833 --> 00:08:08,666 Remember, previously, years ago, 162 00:08:08,700 --> 00:08:12,600 you had to tag people yourself. Then it would recognize faces, 163 00:08:12,600 --> 00:08:14,166 but you still had to add the names. 164 00:08:14,166 --> 00:08:18,000 And now it just recognizes the faces and adds the names at the same time. 165 00:08:18,433 --> 00:08:23,400 Well, that is what convolutional neural networks are capable of. 166 00:08:23,666 --> 00:08:29,433 And speaking of Facebook, if Geoffrey Hinton is the godfather of 167 00:08:30,266 --> 00:08:33,566 artificial neural networks and deep learning, then Yann 168 00:08:33,566 --> 00:08:38,733 LeCun is the grandfather of convolutional neural networks. 169 00:08:39,000 --> 00:08:42,033 Yann LeCun is a student of Geoffrey Hinton's. 170 00:08:42,466 --> 00:08:45,466 And, in fact, here you can see them together. 171 00:08:45,566 --> 00:08:48,566 Geoffrey Hinton now is 172 00:08:48,566 --> 00:08:51,266 pioneering deep learning at Google. 173 00:08:51,266 --> 00:08:53,333 Yann LeCun is the director of Facebook 174 00:08:53,333 --> 00:08:56,500 Artificial Intelligence Research and also a professor at NYU. 175 00:08:56,833 --> 00:09:00,000 So, I love this part of the course. 176 00:09:00,000 --> 00:09:03,766 Slowly we're building up these names, or this 177 00:09:04,266 --> 00:09:08,966 kind of picture of the profiles of the people who are driving this field. 178 00:09:09,266 --> 00:09:14,300 And in the next couple of parts we'll get to know a few more, 179 00:09:14,300 --> 00:09:17,366 and we'll have this whole mafia, as they call themselves, 180 00:09:17,366 --> 00:09:21,000 or as Yann LeCun calls them, the mafia or conspiracy of deep learning. 181 00:09:21,000 --> 00:09:24,000 And you'll learn a bit more about how this whole field develops. 182 00:09:24,400 --> 00:09:24,600 Yeah. 
183 00:09:24,600 --> 00:09:27,300 These are just some great, great people. 184 00:09:27,300 --> 00:09:31,766 And so Yann LeCun, back in the 80s and 90s, made 185 00:09:32,000 --> 00:09:36,166 significant contributions to the field of convolutional neural networks. 186 00:09:36,166 --> 00:09:40,900 And, as we will see throughout this course, he has been able 187 00:09:40,900 --> 00:09:46,100 to develop, or help the world develop, something extremely powerful. 188 00:09:46,466 --> 00:09:50,800 So, moving on to how convolutional neural networks work. 189 00:09:51,366 --> 00:09:53,233 You have an input. It's very simple. 190 00:09:53,233 --> 00:09:54,200 It's very straightforward. 191 00:09:54,200 --> 00:09:58,000 So you have an input image, it goes through the convolutional neural network 192 00:09:58,233 --> 00:09:59,700 and you have an output label. 193 00:09:59,700 --> 00:10:02,700 So it classifies that image as something, 194 00:10:03,233 --> 00:10:06,300 like a cheetah or a bullet train or something else. 195 00:10:06,600 --> 00:10:10,200 Now, going into a bit more detail. 196 00:10:10,433 --> 00:10:14,400 For instance, after a neural network has been trained up 197 00:10:14,900 --> 00:10:18,166 on certain images, on certain 198 00:10:18,166 --> 00:10:22,833 classified images, images that have been categorized beforehand, 199 00:10:23,100 --> 00:10:26,100 after that, you can give it an image. Let's say a neural network 200 00:10:26,100 --> 00:10:30,066 has been trained up to recognize facial expressions and emotions. 201 00:10:30,366 --> 00:10:34,966 You can give it the face of a smiling person, not just a face 202 00:10:35,133 --> 00:10:39,166 like this drawing of a face, but the actual face of a person smiling, 203 00:10:39,266 --> 00:10:41,466 and it'll tell you that that person is happy. 204 00:10:41,466 --> 00:10:44,733 And you can give it the face of a person that's frowning. 
205 00:10:44,733 --> 00:10:47,166 It will tell you that the person is sad. 206 00:10:47,166 --> 00:10:48,466 It can recognize these emotions. 207 00:10:48,466 --> 00:10:48,966 And as you can see, 208 00:10:48,966 --> 00:10:53,200 that's already very powerful in terms of so many different applications. 209 00:10:53,200 --> 00:10:57,500 Just from this one example, you can think of some right away. 210 00:10:57,500 --> 00:11:00,433 And in both cases, it'll give you a probability. 211 00:11:00,433 --> 00:11:04,866 So it won't say with 100% certainty that the person's happy or sad; 212 00:11:04,866 --> 00:11:11,700 it'll be 99 or 98, or maybe 80%, when it's unclear what's going on. 213 00:11:11,700 --> 00:11:14,700 Just like us, right? Sometimes we can mistake 214 00:11:15,033 --> 00:11:16,500 things for what they're not. 215 00:11:16,500 --> 00:11:17,933 Or sometimes, 216 00:11:17,933 --> 00:11:22,200 sometimes it's just not clear if the person is smiling or frowning, 217 00:11:22,200 --> 00:11:25,200 or if it's a dog or a cat, or if it's 218 00:11:25,600 --> 00:11:28,166 a train or a bullet train. 219 00:11:28,166 --> 00:11:28,433 Right. 220 00:11:28,433 --> 00:11:30,933 Sometimes we haven't seen enough features, 221 00:11:30,933 --> 00:11:34,866 and it all comes down to features, because that's how we process 222 00:11:34,866 --> 00:11:38,800 visual information, as we saw at the start of this tutorial. 223 00:11:39,100 --> 00:11:40,966 But how is a neural network 224 00:11:40,966 --> 00:11:44,033 able to recognize these features? 225 00:11:44,033 --> 00:11:47,500 Well, it all starts at a very basic level. 226 00:11:48,000 --> 00:11:50,733 Let's say you have two images: 227 00:11:50,733 --> 00:11:53,733 one is a black and white image of two by two pixels, 228 00:11:53,900 --> 00:11:56,366 and one is a colored image of two by two pixels. 
229 00:11:56,366 --> 00:11:59,433 Well, neural networks leverage the fact that 230 00:11:59,900 --> 00:12:04,600 the black and white image is a two-dimensional array. 231 00:12:04,600 --> 00:12:05,700 So the way we see it 232 00:12:05,700 --> 00:12:09,600 right now on the left is just the visual representation, right? 233 00:12:09,600 --> 00:12:11,100 So it's some kind of picture. 234 00:12:11,100 --> 00:12:13,933 And for simplicity's sake, it's just a two by two picture. 235 00:12:13,933 --> 00:12:16,866 But in computer terms it's actually a two-dimensional array, 236 00:12:16,866 --> 00:12:21,866 with every single one of those pixels having a value between 0 and 255. 237 00:12:22,200 --> 00:12:27,566 So that's 8 bits of information: 2 to the power of 8 is 256, 238 00:12:27,566 --> 00:12:30,266 and therefore the values are from 0 to 255. 239 00:12:30,266 --> 00:12:32,100 And that's the intensity of the color, 240 00:12:32,100 --> 00:12:33,433 in this case the color white. 241 00:12:33,433 --> 00:12:38,533 So 0 will be a completely black pixel, 255 will be a completely white pixel, 242 00:12:38,533 --> 00:12:44,300 and between them you have the grayscale range of possible options for this pixel. 243 00:12:44,466 --> 00:12:49,900 And based on that information, computers are able to then work with the image. 244 00:12:49,900 --> 00:12:53,033 And that's kind of the starting point: any image 245 00:12:53,033 --> 00:12:56,266 actually has a digital representation, has a digital form, 246 00:12:56,433 --> 00:12:59,300 and that is basically just ones and zeros 247 00:12:59,300 --> 00:13:03,133 that form a number from 0 to 255 for every single pixel. 248 00:13:03,133 --> 00:13:04,233 And that's what the computer works with. 249 00:13:04,233 --> 00:13:05,833 It doesn't actually work with, 250 00:13:05,833 --> 00:13:08,000 you know, the colors or anything. It works with the ones and zeros. 
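To make the grayscale description concrete, here is a minimal sketch in plain Python of a two by two black and white image stored as a two-dimensional array. The four pixel values are assumptions chosen for illustration; only the 8-bit range of 0 to 255 comes from the explanation above.

```python
# 8 bits per pixel gives 2**8 = 256 distinct intensities, numbered 0..255.
BITS_PER_PIXEL = 8
LEVELS = 2 ** BITS_PER_PIXEL   # 256
MAX_VALUE = LEVELS - 1         # 255, because counting starts at 0

# A 2x2 grayscale image as a two-dimensional array (a list of rows).
# The pixel values are made up for this example.
gray_image = [
    [0, 255],    # completely black pixel, completely white pixel
    [128, 64],   # mid gray, darker gray
]

height = len(gray_image)
width = len(gray_image[0])

# What the computer actually stores: the ones and zeros behind each value.
bit_patterns = [[format(value, "08b") for value in row] for row in gray_image]

print(f"{height}x{width} image, values 0..{MAX_VALUE}")
for row in bit_patterns:
    print(row)
```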
251 00:13:08,000 --> 00:13:12,200 At the end of the day, that's kind of the foundation of it all. 252 00:13:12,766 --> 00:13:16,900 And a color image is actually a three-dimensional array. 253 00:13:17,066 --> 00:13:21,633 You've got a blue layer, a green layer and a red layer, 254 00:13:21,900 --> 00:13:24,900 and that stands for RGB: red, green, blue. 255 00:13:25,266 --> 00:13:29,700 And each one of those colors has its own intensity. 256 00:13:29,700 --> 00:13:32,700 So basically a pixel has 257 00:13:32,800 --> 00:13:36,700 three values assigned to it. 258 00:13:36,833 --> 00:13:40,400 Each one of them is between 0 and 255, 259 00:13:40,933 --> 00:13:45,666 and therefore you can find out 260 00:13:46,200 --> 00:13:50,233 what color exactly this pixel is by combining those three values. 261 00:13:50,233 --> 00:13:53,233 And again, computers are going to be working with that. 262 00:13:53,366 --> 00:13:55,700 So that's the foundation of it all. 263 00:13:55,700 --> 00:13:56,633 That's the red channel, 264 00:13:56,633 --> 00:13:58,966 the green channel, the blue channel. 265 00:13:58,966 --> 00:14:01,966 And finally, let's have a look at, 266 00:14:02,466 --> 00:14:05,733 for instance, a very trivial example of 267 00:14:06,300 --> 00:14:09,533 a smiling face in computer terms. 268 00:14:09,533 --> 00:14:14,933 If we just really simplify things, instead of having values from 0 to 255, 269 00:14:15,533 --> 00:14:17,066 instead of having those values, 270 00:14:17,066 --> 00:14:20,700 just so that we can understand things better and really grasp the concepts, 271 00:14:20,900 --> 00:14:26,700 we're going to say 0 is white and 1 is black, right? 272 00:14:26,700 --> 00:14:30,400 So we're just going to simplify things to the extreme. 273 00:14:30,766 --> 00:14:33,766 And you will see that that image can be represented like that. 
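The color image and the simplified smiling face can be sketched the same way, again in plain Python. The pixel colors, the 7 by 7 grid and the helper name `channel` are assumptions for illustration; the slide's own picture may differ. Only the three-channel RGB layout and the convention that 0 is white and 1 is black come from the explanation above.

```python
# A 2x2 color image: rows x columns x 3 channels (red, green, blue).
# Each channel value lies between 0 and 255. The colors are made up.
color_image = [
    [[255, 0, 0], [0, 255, 0]],        # red pixel, green pixel
    [[0, 0, 255], [255, 255, 255]],    # blue pixel, white pixel
]


def channel(image, index):
    """Extract one color channel as a 2D array (0 = red, 1 = green, 2 = blue)."""
    return [[pixel[index] for pixel in row] for row in image]


red_channel = channel(color_image, 0)
green_channel = channel(color_image, 1)
blue_channel = channel(color_image, 2)

# The extreme simplification: 0 is white, 1 is black.
# This 7x7 smiling face is a hypothetical stand-in for the one on the slide.
smiley = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# Render black pixels as '#' and white pixels as '.' to see the picture.
rendered = ["".join("#" if value else "." for value in row) for row in smiley]
print("\n".join(rendered))
```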
274 00:14:33,833 --> 00:14:35,800 So the reason why we've brought this up 275 00:14:35,800 --> 00:14:38,800 is because all of our intuition tutorials 276 00:14:38,800 --> 00:14:40,700 are going to be structured on images like this, 277 00:14:40,700 --> 00:14:43,700 which are very simple. But at the same time, 278 00:14:43,700 --> 00:14:47,200 all those concepts can translate back to the 0 to 255 279 00:14:47,200 --> 00:14:50,266 range of values, and everything applies the same way there. 280 00:14:50,566 --> 00:14:52,300 And the steps that we're going to be going through with 281 00:14:52,300 --> 00:14:54,800 these images are: step number one, convolution; 282 00:14:54,800 --> 00:14:56,700 step number two, max pooling; 283 00:14:56,700 --> 00:14:58,366 step number three, flattening; 284 00:14:58,366 --> 00:15:00,433 and step number four, full connection. 285 00:15:00,433 --> 00:15:03,300 And I can imagine that probably none of these words 286 00:15:03,300 --> 00:15:05,600 mean much to you at the moment. 287 00:15:05,600 --> 00:15:09,833 But by the end of this section of the course, you will understand 288 00:15:09,833 --> 00:15:13,833 them in great detail and exactly what they're doing. 289 00:15:13,833 --> 00:15:15,900 So we'll get started in the next tutorial. 290 00:15:15,900 --> 00:15:21,866 For now, the additional reading that you might want to look into is Yann LeCun's 291 00:15:22,033 --> 00:15:27,600 original paper that gave rise to convolutional neural networks. 292 00:15:28,133 --> 00:15:31,166 It's called Gradient-Based Learning Applied to Document Recognition. 293 00:15:31,633 --> 00:15:34,400 You may have seen this image before floating around the internet. 294 00:15:34,400 --> 00:15:35,700 It is from that paper. 
295 00:15:35,700 --> 00:15:40,000 So if you want to go back to the very beginnings of how 296 00:15:40,000 --> 00:15:43,533 it all happened, where it all came from, this is the paper to look into. 297 00:15:44,233 --> 00:15:46,266 And I look forward to seeing you in the next tutorial. 298 00:15:46,266 --> 00:15:48,233 Until then, enjoy deep learning.
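As a rough preview of the four steps named in this tutorial (convolution, max pooling, flattening, full connection), here is a toy sketch in plain Python on a small 0/1 image. It is not the course's implementation: the image, the 2x2 feature detector, the weights and all function names are assumptions chosen for illustration, and a real network would learn its detectors and weights and add pieces this sketch leaves out. Each step is covered properly in the tutorials that follow.

```python
def convolve(image, kernel):
    """Step 1: slide the kernel over the image (stride 1, no padding) and sum
    the element-wise products. As in most deep learning libraries, the kernel
    is not flipped, so strictly speaking this is cross-correlation."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Step 2: keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


def flatten(feature_map):
    """Step 3: unroll the 2D map into a single list of values."""
    return [value for row in feature_map for value in row]


def fully_connected(inputs, weights, bias):
    """Step 4: one output neuron, a weighted sum of all inputs plus a bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


# Hypothetical 5x5 image (0 = white, 1 = black) and 2x2 feature detector.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
]
kernel = [
    [1, 0],
    [0, 1],
]

feature_map = convolve(image, kernel)   # 4x4
pooled = max_pool(feature_map)          # 2x2
flat = flatten(pooled)                  # 4 values
score = fully_connected(flat, [0.5, -0.5, 1.0, 0.25], bias=0.0)
print(feature_map, pooled, flat, score)
```

The shapes shrink at each stage (5x5 image, 4x4 feature map, 2x2 pooled map, 4 flattened values, 1 score), which is the main thing to take from the sketch at this point.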