Alrighty, let's finish up this section on evaluating machine learning models. We can do so by tackling the third and final way to evaluate a machine learning model: metric functions. In essence, we've kind of already covered this, because all the metrics we've previously seen have their own function in Scikit-Learn. Let's see what I mean by this. So, heading 4.3: using different evaluation metrics as Scikit-Learn functions. Wonderful.

So, for classification we had accuracy, we had precision, we had recall, we had F1, and then for regression we had R squared, mean absolute error and mean squared error. All right, so let's do what we always do and write the code first, and then we'll talk.

So, from sklearn.metrics import... I'm going to do a full example, as we always do, because that's what we like to do, we like to be complete with what we're working on: accuracy_score, precision_score, recall_score, and you might be able to figure out what the last one is, f1_score, I typed it in a bit too quick. From sklearn we'll also import our model, because what we might do is create a section here, like "Classification evaluation functions". Now again, this section here is just another way to do what we've done before, but it's good to practice, it's always good to practice. So from sklearn.ensemble we import RandomForestClassifier, and then we want from sklearn.model_selection import train_test_split, np.random.seed(42), lovely, X equals heart_disease.drop("target", axis=1), we can almost write this in our sleep now, y equals heart_disease["target"].

And if you can't, that is more than okay. The reason I can write this sort of thing out is because I've had a fair bit of practice with it, and you'll be the same too. If you're starting out now, you might be looking at all these functions, or this code, and going, "Oh my goodness, there is so much to remember," but the beautiful thing is that it's all here, it's available for you, you can run it in a Jupyter notebook and you can practice as much as you like. So really, your only roadblock is just putting in the work, practicing and learning. And don't forget, learning something new, especially machine learning, takes time, and it's not going away, so you've got plenty of time.
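Here's a rough sketch of what that setup cell ends up looking like. The filename passed to read_csv is an assumption on my part (the heart_disease DataFrame was loaded earlier in the course's notebook); everything else follows what's typed out above.

```python
import numpy as np
import pandas as pd

# Evaluation metric functions
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Model and data-splitting utility
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

np.random.seed(42)

# Assumed: heart_disease was loaded earlier in the notebook from a CSV like this
heart_disease = pd.read_csv("heart-disease.csv")

# Features (everything except the label column) and labels
X = heart_disease.drop("target", axis=1)
y = heart_disease["target"]
```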
So, what we're doing here, we've seen this code before: importing some metrics, specifically accuracy_score, precision_score, recall_score and f1_score, importing a model, importing train_test_split, splitting our data into X and y, splitting it into train and test sets, instantiating a RandomForestClassifier and fitting it to the training data. Beautiful, running machine learning code. And we'll make some predictions, and then we'll go y_preds, because remember, what is an evaluation metric doing? If you said comparing our model's predictions to the truth labels, the actual labels, you would be correct.

Evaluate the classifier. So now what we're going to do is take advantage of these built-in functions here. Right, we could use the score method, we could use the scoring parameter, but we've already covered those. This is using Scikit-Learn's metric functions. So, "evaluate the classifier": what we'll do is print out something nice, maybe "Classifier metrics on the test set", wonderful. And then we're going to print out, we'll do an f-string: accuracy is going to be the accuracy_score function on y_test and y_preds. Wonderful. Then we're going to times that by 100 so it comes out as a nice, neat percentage, because I prefer that, or we'll prefer that, over the decimals. Maybe you don't. Okay, here we go.

We're getting precision now. Again, we could turn something like this into a function, and we probably will in a future video, but just for example's sake we'll type it out, we'll practice typing it out. Ah, it needs the end of the string, there we go. Now we're going to do recall. How would you do this one if I start you off with the f-string? That's right, we'll keep going: recall_score on y_test and y_preds, remember, just comparing our predictions to the test labels, the truth labels. And then finally we're going to go F1.

So this is something you might do if you're reporting to your colleague, or to your boss, or to your manager, or to the greater public, how your model is doing: you might give them all these different evaluation metrics so they can start to understand. Okay, the accuracy is a certain thing, but the precision is there so they have an idea of how many false positives there are, and the recall is there so they have an idea of how many false negatives there are.
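As a reference, here's roughly what that classification cell looks like once it's all typed out. The variable name clf, the 80/20 test split and the decimal formatting are assumptions on my part; the metric calls and print statements follow the video.

```python
# Split the data into training and test sets (80/20 split assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Instantiate a classifier and fit it to the training data
clf = RandomForestClassifier()
clf.fit(X_train, y_train)

# Make predictions on the test set
y_preds = clf.predict(X_test)

# Evaluate the classifier by comparing predictions to the truth labels
print("Classifier metrics on the test set")
print(f"Accuracy: {accuracy_score(y_test, y_preds) * 100:.2f}%")
print(f"Precision: {precision_score(y_test, y_preds):.2f}")
print(f"Recall: {recall_score(y_test, y_preds):.2f}")
print(f"F1: {f1_score(y_test, y_preds):.2f}")
```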
And the F1 score is kind of a combination of the precision and recall. So we'll hit Shift and Enter, voila. Now we've taken advantage of the third method of evaluating models, and that's by directly using functions such as accuracy_score, precision_score, recall_score and f1_score. If we were to peer into the documentation, this is a metric function, right: classification metrics. Here we go, there's a whole bunch more there if you want to check them out, but these are some of the main ones that we've covered, and the principle is still the exact same for the rest of them.

And so if you come in here, we're going to do "Regression evaluation functions", and turn that into Markdown. So, same thing again. You could almost do this yourself, I reckon, and if not, don't worry, we're about to type it out, but how would you go about it? If we look at our classification evaluation functions, what you might do is: from sklearn.metrics, import some regression functions; then import the regression model; then import train_test_split; create the data; split it into training and test sets; instantiate your regression model; make some predictions; and then evaluate them. But this time, instead of evaluating a classifier, you're evaluating the regression model using regression metrics.

But just for completeness, let's type it out again: from sklearn.metrics import r2_score, wonderful, mean absolute error, and we've seen these before, mean_absolute_error, mean_squared_error. Beautiful. Then from sklearn.ensemble import RandomForestRegressor. So this is the kind of workflow you might do for your own problems, right? If you're working on a regression problem, you might have some sort of import statement like this at the top of your notebook, and import train_test_split. In our case we've already got our data in a DataFrame; you may have some more lines of code getting your data into a proper DataFrame.

Oh, we almost forgot: np.random.seed. That said, you actually don't need a random seed, I just like to have one, and you'll see them all over the place, just so if you run the cells you get the same results as what someone else was getting. Then X equals boston_df.drop("target", axis=1) and y equals boston_df["target"]. Beautiful. And that's another reason for the random seed, right?
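For reference, here's a rough sketch of that regression setup, assuming boston_df is the Boston housing DataFrame created earlier in the notebook with the target values stored in a "target" column, as described above:

```python
# Regression metric functions
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Regression model
from sklearn.ensemble import RandomForestRegressor

np.random.seed(42)

# boston_df is assumed to exist from earlier in the notebook,
# with the house values stored in a "target" column
X = boston_df.drop("target", axis=1)
y = boston_df["target"]
```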
So, if I ran these cells, and then you took this notebook as a resource for the course, and then you wanted to compare your results to mine, without the random seed they'd probably be different, because all of the randomness in this notebook, such as train_test_split randomly splitting our data into training and test sets, would use different samples for each of us, and so we'd get different numbers, and that would cause confusion, which is not what we're about, right? We're all about communicating what we're finding.

So we're instantiating a model here, a RandomForestRegressor. Wonderful. Then we go model.fit(X_train, y_train), beautiful, and make predictions using our regression model: model.predict(X_test). Yes, yes, yes. And now, evaluate the regression model. So I'm going to go here, print "Regression model metrics on the test set". Again, you could turn this into a function and pass in your regression model as well as the metrics, but we're just going to write it out here, just for good practice: r2_score on y_test and y_preds. Wonderful. Now we just need to end the string. We can do the same for mean absolute error, so MAE equals mean... tab complete that one, of course we will, y_preds, wonderful. Okay, print, f-string, MSE, mean squared error, comparing our predictions to the actual labels, finish it off with the end of the string, and boom.

Oh, we've got an error, of course we do. This is going to give us a warning because our n_estimators is not equal to 100, and what is our other error? "Found input variables with inconsistent numbers of samples: 102..." What has happened here? 102 for y_test versus 61 for y_preds. You know what's happening? 61 is the number of... ah, here we go, there we go. You know how I knew that? Because if we go up here, to our classification problem, and we go len(y_preds) before we instantiate our regression problem, it's 61. So because we didn't set y_preds here, right, previously this was just model.predict(X_test) on its own, it was using y_preds from above. That's where I got caught out, from using the same variable names throughout the notebook, right?
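Here's roughly what that regression cell looks like once it's working, including the fix for the error above: assigning the predictions to y_preds so the metrics aren't comparing against the classifier's predictions from earlier. The 80/20 split and the decimal formatting are assumptions; the rest follows the video.

```python
# Split the regression data into training and test sets (80/20 split assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Instantiate and fit the regression model
model = RandomForestRegressor()
model.fit(X_train, y_train)

# Make predictions and assign them to y_preds — leaving this assignment out
# is what caused the "inconsistent numbers of samples" error, because the
# metrics below then reused the classifier's y_preds from earlier
y_preds = model.predict(X_test)

# Evaluate the regression model
print("Regression model metrics on the test set")
print(f"R^2: {r2_score(y_test, y_preds):.2f}")
print(f"MAE: {mean_absolute_error(y_test, y_preds):.2f}")
print(f"MSE: {mean_squared_error(y_test, y_preds):.2f}")
```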
Ideally we'd have different variable names for our classification and regression problems, but this is just for illustration purposes; the predictions variable is usually called y_preds or something of the like. And there we go: regression model metrics on the test set. Now, we've seen similar metrics before, so what can we do now?

Well, we've covered a whole bunch, right? And the reason being is because evaluating a machine learning model is paramount. It's one thing to train one, but there's nothing worse than training a machine learning model and optimizing it for the wrong evaluation metric. So keep the metrics and evaluation methods we've gone through in mind when training your future models. Make sure you keep them in mind, go through them, and have a little read here. This is probably the most important section that you'll read in the entire Scikit-Learn documentation.

But after you've done that, you'll naturally start to ask: how do we improve these numbers? How do we make them better? They're kind of stagnant. We've been using a random seed and we've been seeing the same numbers for accuracy, precision, recall and F1 over and over, and the same with R squared, MAE and MSE. So that's what we're going to cover in the next section. If we look back at our list of what we're covering, number five, we're up to improving a model.

All right, so take a little break, have a look at the Scikit-Learn documentation for "Metrics and scoring: quantifying the quality of predictions". You can find it by going to this URL here, or by searching "sklearn evaluate a model", but otherwise, get ready for the next section. We're going to see how to improve our models.