In the last video we broke down what's going on here. What we're doing is essentially just passing Keras's Sequential model an input, a.k.a. our image of a dog, and then MobileNetV2, a.k.a. the model URL that we're using, is going to find the patterns inside it. And then we're going, hey, we don't want the same output shape as MobileNetV2, we'd rather convert it to our own output shape, which is the number of unique labels that we have.

So now let's talk about what's going on in compile. And by the way, if you do want to dive deeper on any of this, I encourage you to search it up and try it out yourself. We're going to see the outputs of this later on, so we're going to actually see the code running. But whenever you look something up and try to figure it out, even if you don't understand it the first time, trying to figure things out for yourself is a way to really cement your knowledge. So if there's anything here you want to look up and learn more about, don't be afraid to ask questions, and don't be afraid to look it up and check out what's going on behind the scenes for yourself.
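To make that shape conversion concrete, here's a minimal, framework-free sketch of the last step: taking a feature vector (standing in for MobileNetV2's output, which is really 1000+ values) and converting it to one probability per label. The feature values, random weights, and label count are made-up numbers for illustration; in the real model this is a trainable Dense layer.

```python
import math
import random

def softmax(logits):
    # Exponentiate and normalise so the outputs sum to 1 (one probability per label)
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def output_layer(features, num_labels, seed=42):
    # A dense layer: one weighted sum of the features per label.
    # Random weights stand in for the values the model would learn during training.
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in features] for _ in range(num_labels)]
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return softmax(logits)

features = [0.2, 0.9, 0.1, 0.5]               # pretend MobileNetV2 feature output
probs = output_layer(features, num_labels=3)  # 3 stands in for our unique labels
print(probs)
```

Whatever size MobileNetV2's output is, this step squashes it down to exactly one probability per unique label, which is the output shape we actually want.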
Now, what is happening with model.compile? This is something else you'll see whenever we're building a model with Keras, and I think this one is best explained with a story. Let's say you're at the International Hill Descending Championships. That's right, you're one of the world's best at going down hills. You start standing on top of a hill, and your goal is to get to the bottom, but the catch is that you're blindfolded. Luckily, your friend Adam is standing at the bottom of the hill, shouting instructions at you on how to get down. And at the bottom of the hill there's a judge evaluating how well you're doing. They know where you need to end up, so they can compare how you're doing to where you're supposed to be, and their comparison is how you get scored, a.k.a. your accuracy at getting down the hill.

You might be wondering, Daniel, why am I at the International Hill Descending Championships? Well, let me break it down, because this is where model.compile comes into play. Let's tie the story into the terminology, because it can seem very confusing when you first begin. The loss is the height of the hill. Our model's goal is to minimize the loss, getting it to zero, a.k.a. getting to the bottom of the hill, which would mean the model is learning perfectly. Loss is a measure of how well the model is learning.
So as the model goes through the training set, comparing each image to its label, the loss is a measure of how well the model is guessing. The higher the loss, the worse the predictions are, so the worse the model is learning patterns; the lower the loss, the better the model is learning patterns. Just like you descending the hill: at the International Hill Descending Championships, the higher you are on the hill, the worse you're doing, so your goal is to get to the bottom.

And now, if we come back, let's discuss what the optimizer is in our hill story. Your friend Adam at the bottom of the hill, who's telling you how to get down, is the optimizer. He's the one telling you how to navigate the hill, a.k.a. lower the loss function (how high you are on the hill), because he can see what's going on. He can see your movements, so he's basing his instructions on what you've done so far. And your friend's name is Adam because the Adam optimizer (yes, it's actually called the Adam optimizer) is a great general optimizer which performs well on most models. If we have a look, we can search for the Adam optimizer and find articles like a gentle introduction to Adam optimization, and if we search for machine learning model optimizers, that's going to show us a few more types of optimization algorithms.
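The hill story can be sketched in a few lines of plain Python. This uses basic gradient descent (a simpler relative of Adam, so the example stays dependency-free) to walk down a one-dimensional "hill", the loss curve loss(x) = x²; the starting point and learning rate are made-up numbers for illustration.

```python
def loss(x):
    # Height of the hill: zero at the bottom (x = 0), larger the further away you are
    return x ** 2

def gradient(x):
    # Slope of the hill at x (derivative of x**2) -- the "instruction" shouted to you
    return 2 * x

x = 5.0              # start near the top of the hill, blindfolded
learning_rate = 0.1  # how big a step to take on each instruction
history = [loss(x)]
for _ in range(50):
    x -= learning_rate * gradient(x)  # step downhill, against the slope
    history.append(loss(x))

print(history[0], history[-1])  # the loss shrinks towards zero
```

Adam follows the same idea, but it also adapts the step size for each parameter based on the gradients it has seen so far, which is part of why it works well on most problems out of the box.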
So this is what's going on here. The loss function is how high you are on the hill, and you're trying to minimize that because you're at the International Hill Descending Championships. The optimizer is your friend Adam at the bottom. You could also use another optimizer, such as RMSprop or stochastic gradient descent, but generally, to begin with, Adam is pretty good on most problems. So Adam's telling you how to get down the hill, because remember, you're blindfolded: you're a model going through the training data for the first time and trying to learn patterns, or trying to get down to the bottom of the hill for the first time blindfolded.

And then finally, the metrics. This is the onlooker at the bottom of the hill, the judge evaluating your performance and telling you how you're going at the championship. In our case, it's giving us the accuracy of how well our model is predicting the correct image label.

So that's a fair bit to go through, but these are three parts of most deep learning models, basically all of them: you're going to have some sort of loss function (how well your model is guessing), an optimizer (a function that helps your model improve its guesses), and then a metric, which is a way of evaluating those guesses after it's learned.
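The judge's scorecard, the accuracy metric, can be sketched just as simply: it's the fraction of predictions that match their true labels. The label names here are made up for illustration.

```python
def accuracy(predictions, labels):
    # Fraction of predictions that exactly match their true label
    correct = sum(1 for p, l in zip(predictions, labels) if p == l)
    return correct / len(labels)

labels      = ["labrador", "poodle", "beagle", "poodle"]
predictions = ["labrador", "poodle", "poodle", "poodle"]
print(accuracy(predictions, labels))  # 3 of 4 correct -> 0.75
```

Note how the metric differs from the loss: the loss scores how confident the guesses were (and drives learning), while accuracy just reports how many guesses ended up right.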
So this is a step we're going to have to take when we're building any Keras deep learning model: we define the model as some sort of layers, and then we define how the model is going to learn. So hold onto that little story. If you have any questions about the story, or if you want to figure out which loss function to use, search up something like "what loss function should I use". "How to Choose Loss Functions When Training Deep Learning Models", beautiful, that's a great resource, or it might be; I'll leave a resource there on how you can choose a loss function. But mostly it's going to depend on what problem you're dealing with. So if you're doing binary classification, a.k.a. predicting whether something is one thing or another, such as images of cats or dogs, you would want to change your activation function to sigmoid and your loss function to binary cross-entropy, so we'd change this to binary cross-entropy. But because we're doing multi-class classification, we keep it at categorical cross-entropy, because if we come here: multi-class classification, activation softmax, loss categorical cross-entropy. Now, this is a lot to take on, so don't worry if you don't get it to begin with. But remember the whole story. Remember what the loss is.
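For a bit of intuition on those two loss functions, here's a minimal sketch of each in plain Python (the probabilities are made-up numbers for illustration). Binary cross-entropy scores a single "one thing or another" probability, while categorical cross-entropy scores one probability per class against a one-hot label; both shrink towards zero as predictions get more confidently correct.

```python
import math

def binary_cross_entropy(y_true, y_pred):
    # y_true is 0 or 1 (e.g. cat vs dog); y_pred is the predicted probability of class 1
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

def categorical_cross_entropy(y_true, y_pred):
    # y_true is a one-hot label; y_pred is one probability per class (softmax output)
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

# A confident correct guess scores a much lower loss than an unsure one
print(binary_cross_entropy(1, 0.95))  # confident and correct: small loss
print(binary_cross_entropy(1, 0.55))  # barely correct: larger loss
print(categorical_cross_entropy([0, 1, 0], [0.05, 0.90, 0.05]))  # small loss
```

That pairing is why the activation and the loss change together: sigmoid produces one probability for binary cross-entropy, softmax produces a probability per class for categorical cross-entropy.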
Remember what the optimizer is: just your friend Adam telling you how to walk down the hill. And the metrics: you could use a bunch of different metrics here. If we search for tf.keras metrics, that's going to show you some metrics you can use. Accuracy is the default one for classification, but we've got area under the curve (AUC), categorical accuracy, mean, precision, recall, a whole bunch of different options. I'll be sure to link those as well.

So with that, we've gone through what compiling the model does, and then finally we can finish off with this one: build. I think you can imagine what's happening here; it's just another little way to say, hey, this is the input shape we're going to pass to our model. We're using a Keras layer from TensorFlow Hub, and if we come back here, it says that if we want to use this layer, we set up a Sequential model and then build the model with our input shape. And it's this shape because that is the size of images that MobileNetV2 was trained on.

All right, that has been enough talking. We've broken down what our create_model function does. If you have any questions, be sure to leave them in the Q&A or in the Discord chat. But let's have a look at what's going on; well, probably, actually, in the next video we'll just quickly debrief what's going on in summary.
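Two of the metrics mentioned above, precision and recall, can be sketched just as simply, treating one class (say "labrador") as the positive class; the labels here are made up for illustration.

```python
def precision(predictions, labels, positive):
    # Of everything we predicted as the positive class, how much really was?
    predicted_pos = [l for p, l in zip(predictions, labels) if p == positive]
    return sum(1 for l in predicted_pos if l == positive) / len(predicted_pos)

def recall(predictions, labels, positive):
    # Of everything that really was the positive class, how much did we catch?
    actual_pos = [p for p, l in zip(predictions, labels) if l == positive]
    return sum(1 for p in actual_pos if p == positive) / len(actual_pos)

labels      = ["labrador", "poodle", "labrador", "beagle"]
predictions = ["labrador", "labrador", "poodle", "beagle"]
print(precision(predictions, labels, "labrador"))  # 1 of 2 predicted labradors correct -> 0.5
print(recall(predictions, labels, "labrador"))     # 1 of 2 real labradors caught -> 0.5
```

Accuracy is fine to start with, but precision and recall become important when the classes are imbalanced, for example if one dog breed is far rarer than the others.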
But the next thing we want to do is create some callbacks for our model. Callbacks are functions that implement a few little helpful things our model can do while it's training, while it's learning patterns in the data. So I'll see you in the next video.