I hope you're ready, because now it's time to train a model on the full data. We verified that everything is working by going through and checking our model's predictions, evaluating them, and going, yep, our model is definitely learning something. But since we've only trained it on 1,000 images, let's see if we can train a full-scale model, like our first full-blown deep learning model, training on 10,000-plus images. And this is where the power of the functions that we created earlier is really going to come into play.

So let's make a little heading for ourselves: training a model. Actually, let's call it "Training a big dog model", and we might put a little marker here, "on the full data". And you get it, it's a big dog model because we're training on lots of images of dogs. I love this project.

Now, to code what we need, remember right up the top? If you don't, that's okay, we're going to have a look at it now. We have X and y, len(X) and len(y), and this is all of our training data. These are the file names and these are the labels. So we come up here: X is all of the file names in train, so if we have a look at the first maybe 10 of X, these are all our training image file names. We split these before into X_train, which is only 800 in length. There we go. So if we have a look at len(X_train), only 800 images. So at the moment our model has only trained on 800 images. And if we want to have a look at y, y is the labels associated with each file path, in the form of boolean arrays (there's a rough sketch of this setup just below).

So what do we need to do to train a full model? If we come back to our keynote, what's our workflow? Get data ready, turn it into tensors. Well, we've got a function for that, do you remember create_data_batches? If you don't, that's alright, we've created a function for that. And we've got "pick a model to suit your problem", so TensorFlow Hub, we've got that, we can just use create_model, and then fit the model to the data and make a prediction. Let's give it a crack. So let's go: "Create a data batch with the full dataset". This is where functions really help out later on in your notebook, I told you it was worth it: full_data = create_data_batches. Look at this.
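As a quick reference, here's a minimal sketch of how X and y were set up earlier in the notebook, assuming the Kaggle labels.csv with its id and breed columns; the Google Drive paths below are illustrative and may differ from your own setup.

```python
import pandas as pd
import numpy as np

# Sketch of the earlier data setup (paths are illustrative).
labels_csv = pd.read_csv("drive/My Drive/Dog Vision/labels.csv")

# X: one file path per training image.
X = ["drive/My Drive/Dog Vision/train/" + fname + ".jpg" for fname in labels_csv["id"]]

# y: one boolean array per image, True only in the column of that image's breed.
unique_breeds = np.unique(labels_csv["breed"])
y = [label == unique_breeds for label in labels_csv["breed"]]

print(len(X), len(y))  # the full dataset, 10,000+ training examples
# X_train/y_train were the 1,000-example subset split off earlier for the
# quick experiments (800 for training, 200 for validation).
```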
This is our function that we've written above: "Creates batches of data out of image (X) and label (y) pairs". Beautiful. "Shuffles the data if it's training data but doesn't..." well, that's a bit of a hindrance, we can't see the rest of that line. "Also accepts test data as input (no labels)", so that's alright, although what we're passing here is training data, not test data. So let's create a full data batch. Now if we have a look at full_data, it's a BatchDataset, beautiful, with images and labels. Wonderful.

And if you're wondering what create_data_batches actually does, let's go right back up. If we have a look at our index here, what did we do under "Visualizing data batches"... "Turning our data into batches". So here's our big function, create_data_batches. It takes X and y, and here's what's happening. Because this is a training dataset, it prints "creating training data batches", turns X and y into tensors, shuffles them, then maps our process_image function as well as our get_image_label function over the data, and then it turns the data into batches with a batch size of 32. But we've already seen a video on that, so if you need a refresher, go back and check out the video where we created this function, or see the rough sketch of it (and of create_model) just after this passage.

So let's come back to where we were. Now we've got our data batch, what's the next step? Well, we need a model. So let's do "Create a model for the full model": full_model = create_model(). Hmm, what did our create_model function do? No docstring. Mm hmm. Well, let's run that anyway and go check it out. So it prints that it's building a model with the tfhub.dev ImageNet MobileNetV2 model. So we've got a model. Okay, we've got a batch dataset and we've got a model.

Now, if you want to check out what create_model does... where is it... "building a model". If we come down here, this is our create_model function. It doesn't have a docstring, maybe that's something you could add. So: create a function which builds a Keras model. What it's going to do is go through the steps that it took before to create a Keras Sequential model, using the model we downloaded from TensorFlow Hub and a dense layer on top to make sure our outputs are the same size as the labels that we have.
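For reference, here's a rough sketch of the two helpers being described, create_data_batches and create_model. It assumes the process_image and get_image_label functions written earlier in the notebook, and the exact TensorFlow Hub URL and shape constants are assumptions based on my own setup, so treat this as a sketch rather than the definitive implementation.

```python
import tensorflow as tf
import tensorflow_hub as hub

BATCH_SIZE = 32
IMG_SIZE = 224
INPUT_SHAPE = [None, IMG_SIZE, IMG_SIZE, 3]   # batch, height, width, colour channels
OUTPUT_SHAPE = 120                            # number of unique dog breeds in the labels
MODEL_URL = "https://tfhub.dev/google/imagenet/mobilenet_v2_130_224/classification/4"  # assumed version

def create_data_batches(X, y=None, batch_size=BATCH_SIZE, valid_data=False, test_data=False):
  """Creates batches of data out of image (X) and label (y) pairs.
  Shuffles the data if it's training data but doesn't shuffle validation data.
  Also accepts test data as input (no labels)."""
  if test_data:
    print("Creating test data batches...")
    data = tf.data.Dataset.from_tensor_slices(tf.constant(X))           # file paths only
    return data.map(process_image).batch(batch_size)
  elif valid_data:
    print("Creating validation data batches...")
    data = tf.data.Dataset.from_tensor_slices((tf.constant(X), tf.constant(y)))
    return data.map(get_image_label).batch(batch_size)
  else:
    print("Creating training data batches...")
    data = tf.data.Dataset.from_tensor_slices((tf.constant(X), tf.constant(y)))
    data = data.shuffle(buffer_size=len(X))                             # shuffle paths and labels together
    return data.map(get_image_label).batch(batch_size)                  # preprocess, then batch into 32s

def create_model(input_shape=INPUT_SHAPE, output_shape=OUTPUT_SHAPE, model_url=MODEL_URL):
  # Keras Sequential model: TF Hub MobileNetV2 layer plus a dense output layer,
  # one output per dog breed.
  model = tf.keras.Sequential([
    hub.KerasLayer(model_url),
    tf.keras.layers.Dense(units=output_shape, activation="softmax")
  ])
  model.compile(loss=tf.keras.losses.CategoricalCrossentropy(),
                optimizer=tf.keras.optimizers.Adam(),   # our friend Adam at the bottom of the hill
                metrics=["accuracy"])
  model.build(input_shape)                              # tell Keras the shape of the inputs
  return model
```

Calling create_data_batches(X, y) with the full X and y is exactly what the full_data line above does.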
It's going to compile the model with a loss function and an optimizer, which is our friend Adam, the guy at the bottom of the hill telling us how to descend down the hill at the international hill descent championships, and the metric is accuracy, that's what we're measuring. And then we're building the model, feeding it the input shape, which is the size of our images. Wonderful. Let's come back down.

So how might we train this? What's next, what do we do next? Well, we created some callbacks before, so we might have to make some specific callbacks for our model. So let's do that. Let's go "Create full model callbacks": full_model_tensorboard = create_tensorboard_callback(). There we go, there's our function from before to create a TensorBoard callback. Remember, the TensorBoard callback helps us to track the performance of our model and compare it to others. And since there's no validation set when training on all the data, our early stopping callback can't monitor validation accuracy. So for our early stopping callback we need full_model_early_stopping = tf.keras.callbacks.EarlyStopping, there we go, and we're going to get it to monitor accuracy while our model is training on the full data. When its accuracy stops going up for three epochs, we're going to get it to stop training so it doesn't overfit, or at least we're going to try to prevent it from overfitting too much. So shift and enter, wonderful.

Looks like we might have a recipe to start training our model. So: "Fit the full model to the full data". Here's where we can go full_model.fit, so we're just calling the model we created, .fit, x equals the full data. Now there's no validation set, remember, because we're training on the full data. epochs is going to be 100, or it could just be NUM_EPOCHS, because we've set NUM_EPOCHS before, and we're going to have callbacks, which are the two callbacks we've just created. This is just doing the exact same steps as what we've done training a smaller model, but this time with more data. Wonderful. There's a sketch of these cells just below.

And now I've put a little note here to give you a little heads-up, and I'm going to speak it out so you know as well: running the cell below will take a little while, maybe up to 30 minutes, or maybe a little bit longer, for the first epoch, because the GPU we're using in the runtime has to load all of the images into memory.
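Putting those last few cells together, here's a minimal sketch of the callback setup and the fit call. It assumes the create_model helper sketched above and the full_data batch from earlier; the log directory path is illustrative.

```python
import os
from datetime import datetime
import tensorflow as tf

NUM_EPOCHS = 100  # set earlier in the notebook; early stopping usually ends training sooner

def create_tensorboard_callback():
  # Log to a timestamped directory so each training run shows up separately in TensorBoard.
  logdir = os.path.join("drive/My Drive/Dog Vision/logs",
                        datetime.now().strftime("%Y%m%d-%H%M%S"))
  return tf.keras.callbacks.TensorBoard(logdir)

full_model = create_model()

# Callbacks for the full-data run: TensorBoard tracking plus early stopping on
# training accuracy (there's no validation set, so val_accuracy isn't available).
full_model_tensorboard = create_tensorboard_callback()
full_model_early_stopping = tf.keras.callbacks.EarlyStopping(monitor="accuracy",
                                                             patience=3)

# Fit on the full 10,000+ image batch dataset.
full_model.fit(x=full_data,
               epochs=NUM_EPOCHS,
               callbacks=[full_model_tensorboard, full_model_early_stopping])
```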
So the first epoch... remember, with our smaller model on 1,000 images the first epoch took a couple of minutes, maybe three or four minutes. As you could imagine, because this one's on the full dataset, training with 10,000-plus images, it's going to take a little bit longer to load all of the data for the first epoch. So before we run this, as long as all the code's correct... actually, it might error out. That's why, rather than train on the full data from the beginning, we've made sure everything works. Because training a full machine learning model, a full deep learning model, can take a fairly long time, hours even. We're pretty lucky here, this is going to train within an hour or so because we're using a GPU, but you could imagine if you're working at a bigger company you might leave something training for a week. So that's why you want to make sure things are working when you set up experiments. Imagine training a deep learning model for a whole week and then it turns out it wasn't working. So that's why we did all of the steps above, to make sure that everything's working, so that before we train a massive model we can be sure, okay, things are in place. Let's kick off a full training round.

So, are you ready? We're about to train our first full-blown deep learning model on 10,000-plus images. Dog Vision is truly becoming a reality. Here we go, with me: three, two, one, shift and enter, fingers crossed, hopefully this works. We should see a little loading bar come up, and because we've set the number of epochs to 100, it's going to keep training for up to 100 epochs. But if it stops improving, we've got our full model early stopping callback that's going to stop it training early, and then we can check the results using TensorBoard after it's finished. So we'll just wait for this to kick off, and then what I'll probably do is pause this video, because it's going to take about half an hour. I'm going to take a little break, and maybe you could too: have a little walk around, chill out. Oh, maybe it's going to take a little longer than half an hour... the time per epoch should go down after the first one, unless my testing is incorrect. But we'll wait a few seconds, and then I'll come back to this video once the model has stopped, so once the full model early stopping callback has kicked in. Okay, so I'm going to let this run for as long as it needs to run. I'll come back once my model has stopped itself
thanks to the early stopping callback. So I will see you in a couple of seconds, but for me it's going to be a couple of hours in the future. Maybe it's a couple of hours in your future as well.

And we're back. All righty. So let's have a look at our fully trained model. As you can see, my first epoch took a fairly long time. And this is... actually, let's take a moment here. If you've finished this, if you've gone through this, you need to pat yourself on the back, because this is like a rite of passage: training your first end-to-end deep learning neural network. And it's taking a long time. One of the things you'll have to figure out when you become a data scientist or a machine learning engineer is ways to pass the time whilst your model trains. So I went for a walk with my dogs, because this took... how many seconds is that? 4,730 divided by 60 is about 78 minutes, so just over an hour to do the first epoch. But as I said, once the images are loaded into the GPU's memory, check out how fast it goes after that.

And what you probably want to do as this cell is running... I should've told you this before, but you can take note for the future: as this cell is running, you want to have a save_model cell running after it. That way, once your model stops training thanks to our early stopping callback up here, it's going to automatically call our save_model function and save our full model to our models directory (there's a rough sketch of a helper like this just after this passage). Now let's come up here and have a look into models. Oh, I've got a few in there, so let's see, it's going to be the one with the... there we go: full image set, MobileNetV2, 2nd of February, and I started at about 5 p.m., so this timestamp is actually wrong for me. I'm not sure, maybe datetime is getting the wrong time zone for me. But there we go, that's saved here. We've got our path, so now we can have a look.

Our model actually ended up doing pretty well. Look at that, that's the training accuracy, but I'm a little bit skeptical here. You know why? Because we aren't validating this. This is just a model training on the full training data. So let's come here: it's training on all the images in this train folder, all of X and y.
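For reference, here's a minimal sketch of what a save_model helper along the lines described might look like; the models directory path and the suffix string are illustrative, and the HDF5 format is an assumption based on how Keras models are commonly saved, so your own save_model may differ.

```python
import os
from datetime import datetime

def save_model(model, suffix=None):
  # Save a trained model to a timestamped path in the models directory
  # (the directory below is illustrative).
  modeldir = os.path.join("drive/My Drive/Dog Vision/models",
                          datetime.now().strftime("%Y%m%d-%H%M%S"))
  model_path = modeldir + "-" + suffix + ".h5"   # HDF5 format
  print(f"Saving model to: {model_path}...")
  model.save(model_path)
  return model_path

# Put this in the cell right after full_model.fit(...) so the model is saved
# automatically as soon as early stopping ends training.
full_model_path = save_model(full_model, suffix="full-image-set-mobilenetv2-Adam")
```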
So we can't really test on a validation set, but what we can do is import the test dataset that we originally downloaded from Kaggle, here we go, the test folder, and we can import those images, turn them into a data batch, and then use our model to make predictions and submit them to Kaggle. How epic is this? So, our model is saved. Let's test out our load_model function: loaded_full_model = load_model (there's a rough sketch of load_model at the end of this section). We're just going to copy this part here... we'll go "Load in the full model", and if we copy this line here... we don't want to put it there, we want to put it in here as a string. Let's check that out.

Wonderful, that's going through, that's what we're after. So again we're getting those warnings that the TensorFlow documentation says can be ignored. All righty, now what's next? Well, now that we've got a fully trained model on all of the 10,000-plus images, I think we should see how to make some predictions on the test dataset. So let's do that in the next video.
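Before moving on, here's a minimal sketch of a load_model helper along the lines used above; the custom_objects entry is needed because the saved model contains a TensorFlow Hub layer, and the file path shown is illustrative.

```python
import tensorflow as tf
import tensorflow_hub as hub

def load_model(model_path):
  # Load a saved Keras model; KerasLayer has to be passed as a custom object
  # because the model wraps a TF Hub module.
  print(f"Loading saved model from: {model_path}")
  model = tf.keras.models.load_model(model_path,
                                     custom_objects={"KerasLayer": hub.KerasLayer})
  return model

# The exact file name is illustrative; use the path printed by save_model above.
loaded_full_model = load_model("drive/My Drive/Dog Vision/models/"
                               "20200202-1700-full-image-set-mobilenetv2-Adam.h5")
```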