All righty then! Now that we've figured out different ways to evaluate our machine learning models, your likely next question is: how can we improve upon these metrics? And that's what we're going to cover in this next section, Section 5, which is improving a model. Let's make a heading here, we'll go "5. Improving a model".

Now, the thing to remember here is that, more often than not, the first predictions you make with a model are not its last, right? These first predictions are often referred to as, let's write this down, first predictions = baseline predictions, and the first model is often referred to as the baseline model.

What you'll try to do as you go on is, once you've made some first predictions and evaluated them, and once you've built a first model, like a baseline model, you'll try to improve upon that. So, a.k.a., improve your predictions and improve your model. Now, how can we go about that? Well, there's two main ways.

The first one is from a data perspective. You'll ask questions like: could we collect more data? Because more data means a machine learning model has more of a chance to learn patterns within that data. There's a saying in machine learning: generally, the more data, the better. It's similar to how, if you were to practice something a bit more than usual, you might get a little bit better at that thing. Maybe not exorbitantly better, but certainly, if you do 10 sessions of practice, you'll probably be a little bit better than if you do one session of practice. The same thing goes in machine learning: if there are 10,000 examples of something rather than 1,000 examples, chances are, if there are patterns in that data, the machine learning model will pick them up.

The second question is: could we improve our data? Take, for example, our car sales problem, where we're using the make, the color, the odometer and the number of doors to try and predict the sale price of a car. If you were given only that information, it would be kind of hard to predict the sale price of that car, right? So what you would search for here is more features about each car, so you would have more information about each sample. Rather than just more samples, you'd go for more depth of information within each sample.
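To make that concrete, here's a minimal sketch of what improving the data might look like for the car sales problem. The column names and values are hypothetical, purely for illustration:

```python
import pandas as pd

# Hypothetical car sales samples (made-up values for illustration).
car_sales = pd.DataFrame({
    "Make": ["Toyota", "Honda"],
    "Color": ["White", "Red"],
    "Odometer (KM)": [150043, 87899],
    "Doors": [4, 4],
    "Price": [4000, 5000]})

# Improving the data: the same samples, but with more
# information (more features) about each one.
car_sales["Year"] = [2010, 2014]
car_sales["Fuel type"] = ["Petrol", "Petrol"]
```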
And then what we're going to be focusing on is improving from a model perspective. This asks questions like: is there a better model we could use? So if we look at the scikit-learn machine learning map, we've seen an example of this before. When we followed it through, we started off with a LinearSVC, figured out those results weren't too good back in our classification problem, and so then we went up and tried an ensemble classifier. That's what we mean by "is there a better model we could use?".

So if you've started out with a simple model, could we use a more complex one? And what I mean by this is, when someone says "simple model", the first models you come across on the map are relatively simple compared to the ones you'll end up with. The reason you start out with a simpler model is because, generally, these models are faster to train on whatever compute power you have. An ensemble classifier such as a random forest, and ensemble models in general, are considered more complex because you're training a bunch of smaller models at once. And so this is why you'll hear the terms simple model versus complex model: LinearSVC is often referred to as a simpler model, whereas ensemble classifiers, because they harness the power of lots of different models, are referred to as more complex.

So that's kind of what we've already tried, right? We started off with a simpler model and we moved to something more complex with the random forest. And then, if we're already trying something more complex, like an ensemble method, a.k.a. our random forest classifier, the question becomes: could we improve the current model? In this case, the model we're using performs well straight out of the box, which is what our random forest classifier does, so can we improve the hyperparameters of this model to make it even better?

A little note here: the patterns a machine learning model finds in data are often referred to as parameters. The difference between parameters and hyperparameters is that a machine learning model seeks to find patterns in data on its own. So a machine learning model will find parameters in data on its own, whereas hyperparameters are settings on a model that you can adjust.
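As a quick aside, here's a minimal sketch of that distinction in code, assuming a random forest on some dummy classification data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=100, random_state=42)

# Hyperparameters: settings WE choose before training.
clf = RandomForestClassifier(n_estimators=100, max_depth=5)

# Parameters: patterns the model finds in the data on its own during fit().
clf.fit(X, y)
clf.feature_importances_  # learned from the data, not set by us
```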
So we'll put this down: parameters = model finds these patterns in data; hyperparameters = settings on a model you can adjust (to potentially improve its ability to find patterns). We'll put that last part in brackets because this is a bit of a trial and error process.

Now, we've covered a fair bit of ground, not just in this one cell, before we've even seen any code, but this is kind of setting up the framework for what we're going to cover in the next couple of videos. So let's put that there. Oh, this is a little bit confusing here, so we might put a little heading: hyperparameters vs. parameters.

Now, how do you find a model's hyperparameters? Well, that's a good question. Let's check it out. We'll go from sklearn.ensemble import RandomForestClassifier, then we'll instantiate a random forest classifier: clf = RandomForestClassifier(). Wonderful. Once we have a model instantiated, you can find its hyperparameters by calling get_params() on it. Wonderful. These are all the different hyperparameters that we can adjust on our random forest classifier.

Now, if you're looking at these and thinking "wow, there's a lot here, this is very confusing", I'll show you where you can find out more about each one. Let's search for "sklearn random forest classifier". Wonderful. So if we go through this documentation, what we're going to see here is "Parameters". Now, this can be where it gets confusing, because I've used the word hyperparameters, but scikit-learn kind of defines them as parameters. The reason they're called parameters here is because, in Python terms, each of these is a parameter you can pass to adjust the model. But in reality, these are hyperparameters. So this is where this list comes from. If we wanted to read through all of these, we could come to this scikit-learn RandomForestClassifier documentation and find out what exactly they are. Then, if we come down here, scikit-learn also includes a few notes about the random forest classifier and little tips on how we can adjust the hyperparameters. Now, the same goes for any other model that you might choose.
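In code, the step we just walked through looks like this (a minimal sketch; get_params() works the same way on any scikit-learn estimator):

```python
from sklearn.ensemble import RandomForestClassifier

# Instantiate a random forest classifier with its default settings.
clf = RandomForestClassifier()

# get_params() returns a dict of the model's hyperparameters
# and their current (default) values.
clf.get_params()
```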
If you click on any of these, you'll come across something similar to this documentation, which will tell you the model's hyperparameters that you can adjust.

Okay, now you still might be thinking, "I'm still not getting the concept of hyperparameters", and so we come to this little diagram, which may help your understanding: improving a model via hyperparameter tuning. If we imagine this food source here as our data, what we want a machine learning model to do is figure out the patterns in it that it can use to make our favorite dish. If you're cooking your favorite dish (this is you cooking here, not a machine learning model), what you might find is that if you put the oven on for one hour at 180 degrees, your favorite dish, the roast chicken dish, doesn't come out exactly how you want it. But after a little bit of practice, after a little bit of trial and error, you find that if you cook it for one hour at 200 degrees, your dish comes out exactly how you want it. Perfect.

Now, the analogy to understand here is that, in our case, adjusting hyperparameters is like adjusting the temperature on the oven. Our initial machine learning model, straight out of the box, is maybe this oven here, cooking for one hour at 180 degrees. It may find some patterns in how to combine these ingredients together to get our favorite dish, but it might not be just exactly how we want it. So what we do is we try a few different hyperparameters, and eventually we figure out we want to change the temperature to 200 degrees, and that brings out our perfect dish, just the way we like it.

The same thing goes with our machine learning models. If we just used our random forest classifier as it came out of the box, with its default hyperparameters (these ones here), it may find patterns in our data pretty well, which it is, it's getting about 85 percent accuracy. But what we might be able to do is adjust these, much like the setting on our oven, so that it finds the patterns in the data a little bit better than it does straight out of the box.

Now, again, there's a lot to take in, but don't worry, it took me a while to understand this concept as well. So over the next few videos we're going to look at three ways to adjust hyperparameters. The first one is by hand, so we'll try and tune them by hand. The second one is randomly, with RandomizedSearchCV, which comes with scikit-learn.
And the third one is exhaustively, with GridSearchCV (there's a quick preview of all three in the sketch below). Okay, so have a little bit of a read over this, just read through here. If you don't fully grasp it yet, that's completely fine, but just remember this is what we're going to cover. This section is about improving a model, focused on hyperparameter tuning, and hyperparameters are settings on a model you can adjust to potentially improve its ability to find patterns in data. Much like if you were cooking your favorite dish, you might adjust the temperature of your oven from its baseline setting of 180 degrees to get the perfect outcome of your favorite dish. So keep that in mind, and we'll see you in the next video.
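As a preview of where we're headed, here's a minimal sketch of the three approaches, using a made-up hyperparameter grid on dummy data (we'll cover each one properly over the next few videos):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = make_classification(n_samples=200, random_state=42)

# 1. By hand: pick hyperparameter values yourself, fit, evaluate, repeat.
clf = RandomForestClassifier(n_estimators=200, max_depth=10)
clf.fit(X, y)

# A hypothetical set of hyperparameter values to search over.
grid = {"n_estimators": [10, 100, 200],
        "max_depth": [None, 5, 10]}

# 2. Randomly: RandomizedSearchCV tries n_iter random combinations from the grid.
rs_clf = RandomizedSearchCV(RandomForestClassifier(), param_distributions=grid,
                            n_iter=5, cv=5)
rs_clf.fit(X, y)

# 3. Exhaustively: GridSearchCV tries every combination in the grid.
gs_clf = GridSearchCV(RandomForestClassifier(), param_grid=grid, cv=5)
gs_clf.fit(X, y)

# Each search keeps track of the best combination it found.
print(rs_clf.best_params_)
print(gs_clf.best_params_)
```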