1 00:00:00,300 --> 00:00:01,166 Hello my friends. 2 00:00:01,166 --> 00:00:04,933 Welcome back to this implementation and mostly welcome to part three: 3 00:00:04,933 --> 00:00:07,800 training the CNN. In the previous tutorial, 4 00:00:07,800 --> 00:00:11,700 we built the brain, which contains the eyes of the AI, 5 00:00:11,700 --> 00:00:14,066 you know, thanks to the convolutional layers. 6 00:00:14,066 --> 00:00:17,100 And now we're going to make the brain smart with the 7 00:00:17,100 --> 00:00:21,100 training of the CNN on all our training set images. 8 00:00:21,333 --> 00:00:24,200 And at the same time, as you'll see, we're going to evaluate 9 00:00:24,200 --> 00:00:27,366 that same model on the test set over the epochs. 10 00:00:27,366 --> 00:00:30,600 You know, we're going to train our CNN over 25 epochs. 11 00:00:30,800 --> 00:00:31,866 And at each epoch 12 00:00:31,866 --> 00:00:36,833 we'll actually see how our model is performing on our test set images. 13 00:00:37,166 --> 00:00:40,866 So this is a different kind of training from what we did before, because we always 14 00:00:40,866 --> 00:00:43,866 used to separate the training and the evaluation. 15 00:00:44,133 --> 00:00:46,200 But here this will happen at the same time. 16 00:00:46,200 --> 00:00:50,300 That's because we're doing a specific application, which is computer vision. 17 00:00:50,900 --> 00:00:51,433 All right. 18 00:00:51,433 --> 00:00:52,266 Are you ready? 19 00:00:52,266 --> 00:00:53,200 Let's do this. 20 00:00:53,200 --> 00:00:57,000 Let's start with the first step: compiling the CNN. 21 00:00:57,566 --> 00:00:57,866 All right. 22 00:00:57,866 --> 00:01:00,800 So in a new code cell we're going to compile the CNN, 23 00:01:00,800 --> 00:01:04,966 meaning we're going to connect it to an optimizer, a loss function, 24 00:01:04,966 --> 00:01:07,500 and some metrics. 25 00:01:07,500 --> 00:01:11,700 And well, you know, here once again we're doing binary classification.
26 00:01:11,700 --> 00:01:15,433 And so very simply we're going to compile our CNN 27 00:01:15,700 --> 00:01:19,266 exactly the same way as how we compiled our ANN 28 00:01:19,266 --> 00:01:21,000 in the previous section, 29 00:01:21,000 --> 00:01:24,066 because indeed we're going to choose once again an Adam optimizer 30 00:01:24,066 --> 00:01:27,700 to, you know, perform stochastic gradient descent to update the weights 31 00:01:27,966 --> 00:01:32,833 in order to reduce the loss error between the predictions and the targets. 32 00:01:33,300 --> 00:01:35,733 Then we're going to choose the same loss, 33 00:01:35,733 --> 00:01:38,933 you know, the binary cross-entropy loss once again, 34 00:01:38,933 --> 00:01:42,366 because we're doing exactly the same task, binary classification. 35 00:01:42,833 --> 00:01:44,533 And then same for the metrics. 36 00:01:44,533 --> 00:01:47,666 We're going to choose the accuracy metric because that's, you know, 37 00:01:47,666 --> 00:01:50,800 the most relevant way to measure the performance 38 00:01:50,800 --> 00:01:54,433 of a classification model, which is exactly the case of our CNN. 39 00:01:55,000 --> 00:01:56,733 And therefore, to compile the CNN, 40 00:01:56,733 --> 00:01:58,633 well, it's going to be a piece of cake, 41 00:01:58,633 --> 00:02:01,700 because indeed we're going to do exactly the same as before. 42 00:02:01,700 --> 00:02:04,700 So we're going to start by taking our CNN, 43 00:02:04,800 --> 00:02:08,533 from which we're going to call the compile method, 44 00:02:09,000 --> 00:02:13,433 which will take as inputs, well, first the optimizer, 45 00:02:13,766 --> 00:02:17,466 which we'll choose to be the Adam optimizer. 46 00:02:18,000 --> 00:02:23,000 Then the loss function, which we'll choose to be the binary 47 00:02:24,000 --> 00:02:26,566 cross-entropy. 48 00:02:26,566 --> 00:02:27,466 All right. 49 00:02:27,466 --> 00:02:30,566 And finally, the final argument, the metrics.
50 00:02:30,933 --> 00:02:32,133 We will choose only one. 51 00:02:32,133 --> 00:02:35,133 But remember we have the choice to take several of them. 52 00:02:35,166 --> 00:02:36,666 But just accuracy is fine. 53 00:02:36,666 --> 00:02:40,800 And therefore in this pair of square brackets we're going to input, in quotes, 54 00:02:40,800 --> 00:02:42,800 well, accuracy. 55 00:02:42,800 --> 00:02:43,800 And done. 56 00:02:43,800 --> 00:02:47,833 This successfully compiles the CNN with the optimizer, 57 00:02:47,833 --> 00:02:49,600 the loss function and the metric. 58 00:02:49,600 --> 00:02:51,633 So exactly the same as before. 59 00:02:51,633 --> 00:02:55,200 However, now to train the CNN on the training set 60 00:02:55,200 --> 00:02:58,433 and evaluate it at the same time on the test set, 61 00:02:58,800 --> 00:03:01,500 well, it will not be exactly the same as before, but 62 00:03:01,500 --> 00:03:03,700 once again, very, very similar. 63 00:03:03,700 --> 00:03:04,633 Let's check this out. 64 00:03:04,633 --> 00:03:06,166 Let's create a new code cell. 65 00:03:06,166 --> 00:03:09,600 And now you can actually guess the first two steps. 66 00:03:09,600 --> 00:03:10,900 They're always the same. 67 00:03:10,900 --> 00:03:14,733 The first step is to take, of course, our CNN, from which, 68 00:03:14,733 --> 00:03:16,266 and that's the second step, 69 00:03:16,266 --> 00:03:18,600 what method do we need to call here? 70 00:03:18,600 --> 00:03:21,000 Well, once again, this never changes. 71 00:03:21,000 --> 00:03:24,800 This is of course the fit method, the fit method, 72 00:03:24,800 --> 00:03:28,566 which as always will train the CNN on the training set. 73 00:03:29,133 --> 00:03:29,666 Okay. 74 00:03:29,666 --> 00:03:32,866 So this time, what are going to be the inputs? 75 00:03:33,300 --> 00:03:35,766 Well, the first input is always the same.
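As a side note on the compile step described above, here is a minimal pure-Python sketch of what the binary cross-entropy loss and the accuracy metric compute. The helper names, the sample labels and probabilities, and the 0.5 decision threshold are all assumptions made for illustration; this is not the Keras implementation itself.

```python
import math


def binary_cross_entropy(targets, predictions, eps=1e-7):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    total = 0.0
    for y, p in zip(targets, predictions):
        # Clip the probability so that log(0) can never occur.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)


def accuracy(targets, predictions, threshold=0.5):
    """Fraction of predictions that fall on the correct side of the threshold."""
    correct = sum(
        1 for y, p in zip(targets, predictions) if (p > threshold) == (y == 1)
    )
    return correct / len(targets)


# Hypothetical labels and sigmoid outputs for four images.
targets = [1, 0, 1, 0]
predictions = [0.9, 0.2, 0.4, 0.1]

loss = binary_cross_entropy(targets, predictions)
acc = accuracy(targets, predictions)
print(f"loss={loss:.4f} accuracy={acc:.2f}")
```

The optimizer's job during training is to push the loss down; the accuracy is only reported, which is why it is passed as a metric rather than as the loss.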
76 00:03:35,766 --> 00:03:36,400 It's going to be 77 00:03:36,400 --> 00:03:39,733 of course the set, you know, the data set on which you're going to train 78 00:03:39,866 --> 00:03:43,600 your model, here the CNN, and that's of course the training set. 79 00:03:43,600 --> 00:03:46,633 And the name of the parameter for that is simply x, 80 00:03:46,900 --> 00:03:51,400 and therefore we'll specify x to be, you know, our training set, 81 00:03:51,900 --> 00:03:56,933 that exact same training set which we created in part one. 82 00:03:57,133 --> 00:04:02,766 You know, right here, that training set, that training set to which we applied 83 00:04:02,766 --> 00:04:07,500 this ImageDataGenerator tool to indeed perform image augmentation. 84 00:04:08,000 --> 00:04:08,400 All right. 85 00:04:08,400 --> 00:04:09,000 That training set. 86 00:04:09,000 --> 00:04:13,400 That's the first parameter of this fit method. 87 00:04:13,833 --> 00:04:16,500 Okay, then the next parameter. 88 00:04:16,500 --> 00:04:16,766 All right. 89 00:04:16,766 --> 00:04:20,200 So the next parameter is, this time, you know, the difference, 90 00:04:20,200 --> 00:04:22,466 the difference with what we did before. 91 00:04:22,466 --> 00:04:26,100 So it has to do of course with the fact that we're not only training the CNN 92 00:04:26,100 --> 00:04:30,733 on the training set, but also evaluating it at the same time on the test set. 93 00:04:31,000 --> 00:04:34,033 And that second parameter corresponds exactly to this. 94 00:04:34,200 --> 00:04:37,800 We have to specify here the validation data. 95 00:04:37,933 --> 00:04:39,266 That's the name of the parameter. 96 00:04:39,266 --> 00:04:44,433 But that's of course the set on which we want to evaluate our CNN. 97 00:04:44,433 --> 00:04:47,433 And that's of course the test set; that will be the value of the parameter. 98 00:04:47,700 --> 00:04:50,433 But the name of that parameter is, as I just said, 99 00:04:50,433 --> 00:04:54,000 validation underscore data.
100 00:04:54,000 --> 00:04:58,733 And this is equal, of course, to the test set. 101 00:04:58,733 --> 00:05:03,300 Once again, that exact same test set which we created here in part 102 00:05:03,300 --> 00:05:07,166 one when preprocessing the test set, this one to which of course 103 00:05:07,300 --> 00:05:10,833 no transformation was applied, only feature scaling. 104 00:05:11,400 --> 00:05:13,900 Okay. So test set, good. 105 00:05:13,900 --> 00:05:17,533 And now we have one final argument, which is of course, 106 00:05:17,533 --> 00:05:19,466 you can totally guess what it is, right? 107 00:05:19,466 --> 00:05:23,766 This is the inevitable argument when training a deep neural network. 108 00:05:24,300 --> 00:05:29,100 I'm talking of course about the epochs parameter, which is the number of epochs, 109 00:05:29,633 --> 00:05:34,233 and well, you know, to justify to you which number I chose, 110 00:05:34,433 --> 00:05:37,433 well, I actually started with ten epochs. 111 00:05:37,500 --> 00:05:40,900 And I noticed that the accuracy was not converging. 112 00:05:41,166 --> 00:05:43,500 So then I tried with 15 epochs, 113 00:05:43,500 --> 00:05:46,000 because you're going to see that one epoch is actually pretty slow. 114 00:05:46,000 --> 00:05:47,366 You know, it's actually a bit long. 115 00:05:47,366 --> 00:05:51,033 I mean, much longer than the epochs in the training of our previous neural 116 00:05:51,033 --> 00:05:51,766 network, you know? 117 00:05:51,766 --> 00:05:53,800 And so I started with ten. 118 00:05:53,800 --> 00:05:57,200 It was not enough, then 15, still not enough, you know, still not converging. 119 00:05:57,300 --> 00:06:00,300 And then 25, and 25 was perfect. 120 00:06:00,700 --> 00:06:03,700 I had an accuracy that pretty much converged 121 00:06:04,133 --> 00:06:07,066 not only on the training set, but also on the test set, you'll see. 122 00:06:07,066 --> 00:06:10,066 So for the number of epochs here, we're going to choose 25.
123 00:06:10,233 --> 00:06:12,066 Feel free to increase it if you have time. 124 00:06:12,066 --> 00:06:15,066 You know, if you want to let your computer run for an hour or 125 00:06:15,133 --> 00:06:17,700 more. With 25 epochs here, it will be fine. 126 00:06:17,700 --> 00:06:19,966 It will just take 10 to 15 minutes. 127 00:06:19,966 --> 00:06:20,933 So that will be all good. 128 00:06:20,933 --> 00:06:23,666 We'll get our results pretty fast. 129 00:06:23,666 --> 00:06:25,666 All right. And well, actually that's it. 130 00:06:25,666 --> 00:06:28,666 That's all we need to train our CNN 131 00:06:28,666 --> 00:06:32,700 on the training set while evaluating it on the test set. 132 00:06:33,133 --> 00:06:34,033 So perfect. 133 00:06:34,033 --> 00:06:35,666 We smashed part three. 134 00:06:35,666 --> 00:06:39,133 And we can now already move on to part four, where 135 00:06:39,133 --> 00:06:43,500 we're going to make our single prediction, which, I remind you, will consist 136 00:06:43,500 --> 00:06:49,500 of deploying our model on the two images of the single prediction 137 00:06:49,500 --> 00:06:53,700 folder, this one for which, of course, our model will have to recognize 138 00:06:53,700 --> 00:06:57,966 that there is a dog, and this one where our model will have to recognize 139 00:06:58,000 --> 00:06:59,133 there is a cat. 140 00:06:59,133 --> 00:07:00,866 So let's hope that it is right, 141 00:07:00,866 --> 00:07:03,266 but we won't have the predictions in the next tutorial. 142 00:07:03,266 --> 00:07:05,566 We will have them in the one after that. 143 00:07:05,566 --> 00:07:08,700 Because remember, we're going to run our implementation 144 00:07:08,700 --> 00:07:11,900 from Jupyter Notebook because we can't do it in Google Colab. 145 00:07:11,900 --> 00:07:14,466 The data set is too big. All right. 146 00:07:14,466 --> 00:07:18,800 So well, as soon as you're ready for part four, let's do this together. 147 00:07:18,966 --> 00:07:21,000 And until then, enjoy machine learning.
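To recap the two code cells walked through in this part, here is a minimal sketch of the compile and fit calls, assuming TensorFlow 2.x with its bundled Keras. Several things here are stand-ins and not the tutorial's own code: the random NumPy arrays replace the generator-based training and test sets from part one, the 64x64 input size is assumed, the tiny network is not the architecture from part two, and `epochs=2` is used only so the sketch runs quickly (the video uses 25).

```python
import numpy as np
import tensorflow as tf

# Stand-in data (assumption): random 64x64 RGB "images" with 0/1 labels.
rng = np.random.default_rng(0)
x_train = rng.random((32, 64, 64, 3)).astype("float32")
y_train = rng.integers(0, 2, size=(32, 1)).astype("float32")
x_test = rng.random((16, 64, 64, 3)).astype("float32")
y_test = rng.integers(0, 2, size=(16, 1)).astype("float32")

# A deliberately tiny CNN, only here so that compile and fit have a model.
cnn = tf.keras.models.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(filters=8, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPool2D(pool_size=2, strides=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=1, activation="sigmoid"),
])

# Step 1: connect the CNN to an optimizer, a loss function and a metric.
cnn.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Step 2: train on the training data and evaluate on the test data at every epoch.
history = cnn.fit(
    x=x_train,
    y=y_train,
    validation_data=(x_test, y_test),
    epochs=2,
    verbose=0,
)

# One training accuracy and one validation accuracy are recorded per epoch.
for epoch, (acc, val_acc) in enumerate(
    zip(history.history["accuracy"], history.history["val_accuracy"]), start=1
):
    print(f"epoch {epoch}: accuracy={acc:.3f} val_accuracy={val_acc:.3f}")
```

With the generator-based sets from part one, no separate `y` is passed because the generators yield the labels along with the images. Assuming they were stored in variables named `training_set` and `test_set`, the call would look like `cnn.fit(x=training_set, validation_data=test_set, epochs=25)`.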