I hope you're ready, because now it's time to train a model on the full data. We verified that everything is working by going through and checking our model's predictions, evaluating them, and going, yep, our model is definitely learning something. But since we've only trained it on 1,000 images, let's see if we can train a full-scale model, like our first full-blown deep learning model, training on 10,000-plus images. And this is where the power of the functions that we created earlier is really going to come into play.

So let's make a little heading for ourselves: training a model. Actually, let's call it "Training a big dog model", and we might put a little marker here, "on the full data". And you get it, it's a big dog model because we're training on lots of images of dogs. I love this project.

Now, to code what we need, remember right up the top? If you don't, that's okay, we're going to have a look at it now. We have X and y, len(X) and len(y), and this is all of our training data. These are the file names and these are the labels. So we come up here: X is all of the file names in train, so if we have a look at the first maybe 10 of X, these are all our training image file names. We split these before into X_train, which is only 800 in length. There we go. So if we have a look at len(X_train), only 800 images. So at the moment our model has only trained on 800 images. And if we want to have a look at y, y is the labels associated with each file path, in the form of boolean arrays (there's a rough sketch of this setup just below).

So what do we need to do to train a full model? If we come back to our keynote, what's our workflow? Get data ready, turn it into tensors. Well, we've got a function for that, do you remember create_data_batches? If you don't, that's alright, we've created a function for that. And we've got "pick a model to suit your problem", so TensorFlow Hub, we've got that, we can just use create_model, and then fit the model to the data and make a prediction. Let's give it a crack. So let's go: "Create a data batch with the full dataset". This is where functions really help out later on in your notebook, I told you it was worth it: full_data = create_data_batches. Look at this.
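As a quick reference, here's a minimal sketch of how X and y were set up earlier in the notebook, assuming the Kaggle labels.csv with its id and breed columns; the Google Drive paths below are illustrative and may differ from your own setup.

```python
import pandas as pd
import numpy as np

# Sketch of the earlier data setup (paths are illustrative).
labels_csv = pd.read_csv("drive/My Drive/Dog Vision/labels.csv")

# X: one file path per training image.
X = ["drive/My Drive/Dog Vision/train/" + fname + ".jpg" for fname in labels_csv["id"]]

# y: one boolean array per image, True only in the column of that image's breed.
unique_breeds = np.unique(labels_csv["breed"])
y = [label == unique_breeds for label in labels_csv["breed"]]

print(len(X), len(y))  # the full dataset, 10,000+ training examples
# X_train/y_train were the 1,000-example subset split off earlier for the
# quick experiments (800 for training, 200 for validation).
```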
This is our function that we've written above: "Creates batches of data out of image (X) and label (y) pairs". Beautiful. "Shuffles the data if it's training data but doesn't..." well, that's a bit of a hindrance, we can't see the rest of that line. "Also accepts test data as input (no labels)", so that's alright, although what we're passing here is training data, not test data. So let's create a full data batch. Now if we have a look at full_data, it's a BatchDataset, beautiful, with images and labels. Wonderful.

And if you're wondering what create_data_batches actually does, let's go right back up. If we have a look at our index here, what did we do under "Visualizing data batches"... "Turning our data into batches". So here's our big function, create_data_batches. It takes X and y, and here's what's happening. Because this is a training dataset, it prints "creating training data batches", turns X and y into tensors, shuffles them, then maps our process_image function as well as our get_image_label function over the data, and then it turns the data into batches with a batch size of 32. But we've already seen a video on that, so if you need a refresher, go back and check out the video where we created this function, or see the rough sketch of it (and of create_model) just after this passage.

So let's come back to where we were. Now we've got our data batch, what's the next step? Well, we need a model. So let's do "Create a model for the full model": full_model = create_model(). Hmm, what did our create_model function do? No docstring. Mm hmm. Well, let's run that anyway and go check it out. So it prints that it's building a model with the tfhub.dev ImageNet MobileNetV2 model. So we've got a model. Okay, we've got a batch dataset and we've got a model.

Now, if you want to check out what create_model does... where is it... "building a model". If we come down here, this is our create_model function. It doesn't have a docstring, maybe that's something you could add. So: create a function which builds a Keras model. What it's going to do is go through the steps that it took before to create a Keras Sequential model, using the model we downloaded from TensorFlow Hub and a dense layer on top to make sure our outputs are the same size as the labels that we have.
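For reference, here's a rough sketch of the two helpers being described, create_data_batches and create_model. It assumes the process_image and get_image_label functions written earlier in the notebook, and the exact TensorFlow Hub URL and shape constants are assumptions based on my own setup, so treat this as a sketch rather than the definitive implementation.

```python
import tensorflow as tf
import tensorflow_hub as hub

BATCH_SIZE = 32
IMG_SIZE = 224
INPUT_SHAPE = [None, IMG_SIZE, IMG_SIZE, 3]   # batch, height, width, colour channels
OUTPUT_SHAPE = 120                            # number of unique dog breeds in the labels
MODEL_URL = "https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4"  # assumed version

def create_data_batches(X, y=None, batch_size=BATCH_SIZE, valid_data=False, test_data=False):
  """Creates batches of data out of image (X) and label (y) pairs.
  Shuffles the data if it's training data but doesn't shuffle validation data.
  Also accepts test data as input (no labels)."""
  if test_data:
    print("Creating test data batches...")
    data = tf.data.Dataset.from_tensor_slices(tf.constant(X))           # file paths only
    return data.map(process_image).batch(batch_size)
  elif valid_data:
    print("Creating validation data batches...")
    data = tf.data.Dataset.from_tensor_slices((tf.constant(X), tf.constant(y)))
    return data.map(get_image_label).batch(batch_size)
  else:
    print("Creating training data batches...")
    data = tf.data.Dataset.from_tensor_slices((tf.constant(X), tf.constant(y)))
    data = data.shuffle(buffer_size=len(X))                             # shuffle paths and labels together
    return data.map(get_image_label).batch(batch_size)                  # preprocess, then batch into 32s

def create_model(input_shape=INPUT_SHAPE, output_shape=OUTPUT_SHAPE, model_url=MODEL_URL):
  # Keras Sequential model: TF Hub MobileNetV2 layer plus a dense output layer,
  # one output per dog breed.
  model = tf.keras.Sequential([
    hub.KerasLayer(model_url),
    tf.keras.layers.Dense(units=output_shape, activation="softmax")
  ])
  model.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
                optimizer=tf.keras.optimizers.Adam(),   # our friend Adam at the bottom of the hill
                metrics=["accuracy"])
  model.build(input_shape)                              # tell Keras the shape of the inputs
  return model
```

Calling create_data_batches(X, y) with the full X and y is exactly what the full_data line above does.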
It's going to compile the model with a loss function and an optimizer, which is our friend Adam, the guy at the bottom of the hill telling us how to descend down the hill at the international hill descent championships, and the metric is accuracy, that's what we're measuring. And then we're building the model, feeding it the input shape, which is the size of our images. Wonderful. Let's come back down.

So how might we train this? What's next, what do we do next? Well, we created some callbacks before, so we might have to make some specific callbacks for our model. So let's do that. Let's go "Create full model callbacks": full_model_tensorboard = create_tensorboard_callback(). There we go, there's our function from before to create a TensorBoard callback. Remember, the TensorBoard callback helps us to track the performance of our model and compare it to others. And since there's no validation set when training on all the data, our early stopping callback can't monitor validation accuracy. So for our early stopping callback we need full_model_early_stopping = tf.keras.callbacks.EarlyStopping, there we go, and we're going to get it to monitor accuracy while our model is training on the full data. When its accuracy stops going up for three epochs, we're going to get it to stop training so it doesn't overfit, or at least we're going to try to prevent it from overfitting too much. So shift and enter, wonderful.

Looks like we might have a recipe to start training our model. So: "Fit the full model to the full data". Here's where we can go full_model.fit, so we're just calling the model we created, .fit, x equals the full data. Now there's no validation set, remember, because we're training on the full data. epochs is going to be 100, or it could just be NUM_EPOCHS, because we've set NUM_EPOCHS before, and we're going to have callbacks, which are the two callbacks we've just created. This is just doing the exact same steps as what we've done training a smaller model, but this time with more data. Wonderful. There's a sketch of these cells just below.

And now I've put a little note here to give you a little heads-up, and I'm going to speak it out so you know as well: running the cell below will take a little while, maybe up to 30 minutes, or maybe a little bit longer, for the first epoch, because the GPU we're using in the runtime has to load all of the images into memory.
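Putting those last few cells together, here's a minimal sketch of the callback setup and the fit call. It assumes the create_model helper sketched above and the full_data batch from earlier; the log directory path is illustrative.

```python
import os
from datetime import datetime
import tensorflow as tf

NUM_EPOCHS = 100  # set earlier in the notebook; early stopping usually ends training sooner

def create_tensorboard_callback():
  # Log to a timestamped directory so each training run shows up separately in TensorBoard.
  logdir = os.path.join("drive/My Drive/Dog Vision/logs",
                        datetime.now().strftime("%Y%m%d-%H%M%S"))
  return tf.keras.callbacks.TensorBoard(logdir)

full_model = create_model()

# Callbacks for the full-data run: TensorBoard tracking plus early stopping on
# training accuracy (there's no validation set, so val_accuracy isn't available).
full_model_tensorboard = create_tensorboard_callback()
full_model_early_stopping = tf.keras.callbacks.EarlyStopping(monitor="accuracy",
                                                             patience=3)

# Fit on the full 10,000+ image batch dataset.
full_model.fit(x=full_data,
               epochs=NUM_EPOCHS,
               callbacks=[full_model_tensorboard, full_model_early_stopping])
```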
So the first epoch... remember, with our smaller model on 1,000 images the first epoch took a couple of minutes, maybe three or four minutes. As you could imagine, because this one's on the full dataset, training with 10,000-plus images, it's going to take a little bit longer to load all of the data for the first epoch. So before we run this, as long as all the code's correct... actually, it might error out. That's why, rather than train on the full data from the beginning, we've made sure everything works. Because training a full machine learning model, a full deep learning model, can take a fairly long time, hours even. We're pretty lucky here, this is going to train within an hour or so because we're using a GPU, but you could imagine if you're working at a bigger company you might leave something training for a week. So that's why you want to make sure things are working when you set up experiments. Imagine training a deep learning model for a whole week and then it turns out it wasn't working. So that's why we did all of the steps above, to make sure that everything's working, so that before we train a massive model we can be sure, okay, things are in place. Let's kick off a full training round.

So, are you ready? We're about to train our first full-blown deep learning model on 10,000-plus images. Dog Vision is truly becoming a reality. Here we go, with me: three, two, one, shift and enter, fingers crossed, hopefully this works. We should see a little loading bar come up, and because we've set the number of epochs to 100, it's going to keep training for up to 100 epochs. But if it stops improving, we've got our full model early stopping callback that's going to stop it training early, and then we can check the results using TensorBoard after it's finished. So we'll just wait for this to kick off, and then what I'll probably do is pause this video, because it's going to take about half an hour. I'm going to take a little break, and maybe you could too: have a little walk around, chill out. Oh, maybe it's going to take a little longer than half an hour... the time per epoch should go down after the first one, unless my testing is incorrect. But we'll wait a few seconds, and then I'll come back to this video once the model has stopped, so once the full model early stopping callback has kicked in. Okay, so I'm going to let this run for as long as it needs to run. I'll come back once my model has stopped itself
thanks to the early stopping callback. So I will see you in a couple of seconds, but for me it's going to be a couple of hours in the future. Maybe it's a couple of hours in your future as well.

And we're back. All righty. So let's have a look at our fully trained model. As you can see, my first epoch took a fairly long time. And this is... actually, let's take a moment here. If you've finished this, if you've gone through this, you need to pat yourself on the back, because this is like a rite of passage: training your first end-to-end deep learning neural network. And it's taking a long time. One of the things you'll have to figure out when you become a data scientist or a machine learning engineer is ways to pass the time whilst your model trains. So I went for a walk with my dogs, because this took... how many seconds is that? 4,730 divided by 60 is about 78 minutes, so just over an hour to do the first epoch. But as I said, once the images are loaded into the GPU's memory, check out how fast it goes after that.

And what you probably want to do as this cell is running... I should've told you this before, but you can take note for the future: as this cell is running, you want to have a save_model cell running after it. That way, once your model stops training thanks to our early stopping callback up here, it's going to automatically call our save_model function and save our full model to our models directory (there's a rough sketch of a helper like this just after this passage). Now let's come up here and have a look into models. Oh, I've got a few in there, so let's see, it's going to be the one with the... there we go: full image set, MobileNetV2, 2nd of February, and I started at about 5 p.m., so this timestamp is actually wrong for me. I'm not sure, maybe datetime is getting the wrong time zone for me. But there we go, that's saved here. We've got our path, so now we can have a look.

Our model actually ended up doing pretty well. Look at that, that's the training accuracy, but I'm a little bit skeptical here. You know why? Because we aren't validating this. This is just a model training on the full training data. So let's come here: it's training on all the images in this train folder, all of X and y.
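For reference, here's a minimal sketch of what a save_model helper along the lines described might look like; the models directory path and the suffix string are illustrative, and the HDF5 format is an assumption based on how Keras models are commonly saved, so your own save_model may differ.

```python
import os
from datetime import datetime

def save_model(model, suffix=None):
  # Save a trained model to a timestamped path in the models directory
  # (the directory below is illustrative).
  modeldir = os.path.join("drive/My Drive/Dog Vision/models",
                          datetime.now().strftime("%Y%m%d-%H%M%S"))
  model_path = modeldir + "-" + suffix + ".h5"   # HDF5 format
  print(f"Saving model to: {model_path}...")
  model.save(model_path)
  return model_path

# Put this in the cell right after full_model.fit(...) so the model is saved
# automatically as soon as early stopping ends training.
full_model_path = save_model(full_model, suffix="full-image-set-mobilenetv2-Adam")
```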
So we can't really test on a validation set, but what we can do is import the test dataset that we originally downloaded from Kaggle, here we go, the test folder, and we can import those images, turn them into a data batch, and then use our model to make predictions and submit them to Kaggle. How epic is this? So, our model is saved. Let's test out our load_model function: loaded_full_model = load_model (there's a rough sketch of load_model at the end of this section). We're just going to copy this part here... we'll go "Load in the full model", and if we copy this line here... we don't want to put it there, we want to put it in here as a string. Let's check that out.

Wonderful, that's going through, that's what we're after. So again we're getting those warnings that the TensorFlow documentation says can be ignored. All righty, now what's next? Well, now that we've got a fully trained model on all of the 10,000-plus images, I think we should see how to make some predictions on the test dataset. So let's do that in the next video.
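Before moving on, here's a minimal sketch of a load_model helper along the lines used above; the custom_objects entry is needed because the saved model contains a TensorFlow Hub layer, and the file path shown is illustrative.

```python
import tensorflow as tf
import tensorflow_hub as hub

def load_model(model_path):
  # Load a saved Keras model; KerasLayer has to be passed as a custom object
  # because the model wraps a TF Hub module.
  print(f"Loading saved model from: {model_path}")
  model = tf.keras.models.load_model(model_path,
                                     custom_objects={"KerasLayer": hub.KerasLayer})
  return model

# The exact file name is illustrative; use the path printed by save_model above.
loaded_full_model = load_model("drive/My Drive/Dog Vision/models/"
                               "20200202-1700-full-image-set-mobilenetv2-Adam.h5")
```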