1
00:00:00,390 --> 00:00:05,220
We've seen how to make predictions with our machine learning model once it's learned patterns from

2
00:00:05,220 --> 00:00:08,390
the data, a.k.a. using those patterns.

3
00:00:08,430 --> 00:00:14,820
Now, how do we figure out whether those predictions are valid? For example, could we use them in production?

4
00:00:14,850 --> 00:00:19,780
Or is our model just making things up? Those predictions, do they actually hold water?

5
00:00:19,800 --> 00:00:24,680
So what we're going to cover in this section is step four: evaluating a model.

6
00:00:24,690 --> 00:00:35,690
So we'll get rid of this, but we'll put in a little heading here: evaluating a machine learning model. Beautiful.

7
00:00:35,760 --> 00:00:40,860
Now the first place we're going to have a look at is up here, and this is the scikit-learn documentation.

8
00:00:40,920 --> 00:00:48,070
We can actually find this by searching "scikit-learn evaluate a model". That should come up with 3.3,

9
00:00:48,130 --> 00:00:49,480
metrics and scoring.

10
00:00:49,480 --> 00:00:51,800
That's what we're after.

11
00:00:51,870 --> 00:00:58,050
So as you can see here, there are three different APIs for evaluating the quality of a model's predictions.

12
00:00:58,050 --> 00:00:59,880
We're going to have a look at each of these.

13
00:00:59,880 --> 00:01:02,250
So we've got the estimator score method.

14
00:01:02,250 --> 00:01:06,510
We've got the scoring parameter and we've got metric functions.

15
00:01:09,370 --> 00:01:14,030
We could read through this, but I prefer, as you probably do, getting hands-on with the

16
00:01:14,030 --> 00:01:14,440
code.

17
00:01:14,460 --> 00:01:22,280
Let's see it in action. All right, now to do so we're going to bring back our heart disease classification

18
00:01:22,280 --> 00:01:22,770
problem.

19
00:01:22,790 --> 00:01:24,400
We could scroll up and copy it.

20
00:01:24,620 --> 00:01:28,690
But again, I want you to get some practice writing it out, right?

21
00:01:28,690 --> 00:01:32,830
Because this is what we're here for. We're here to practice writing machine learning code.

22
00:01:32,840 --> 00:01:36,230
So we're going to import... this is a markdown cell.

23
00:01:36,230 --> 00:01:38,320
I'm going to change that to code.

24
00:01:38,360 --> 00:01:41,540
We're going to import the RandomForestClassifier.

25
00:01:41,600 --> 00:01:45,230
We've seen this before, and then we're going to set up a random seed.

26
00:01:45,470 --> 00:01:46,830
Wonderful.

27
00:01:47,270 --> 00:01:50,810
And then we're going to create our X and y. X, our feature variables:

28
00:01:50,810 --> 00:01:52,550
heart_disease.drop.

29
00:01:52,550 --> 00:01:57,500
We've already imported the heart disease data. And axis equals one.

30
00:01:57,590 --> 00:01:58,130
Beautiful.

31
00:01:58,130 --> 00:02:00,080
And we'll create our labels.

32
00:02:00,110 --> 00:02:02,010
heart_disease.

33
00:02:02,180 --> 00:02:04,400
This is target.

34
00:02:04,460 --> 00:02:05,500
Excellent.

35
00:02:05,510 --> 00:02:07,600
Then we'll split it into train and test.

36
00:02:07,600 --> 00:02:19,850
So X_train, X_test, y_train, y_test equals train_test_split: X, y, test_size equals 0.2.

37
00:02:19,940 --> 00:02:21,080
Wonderful.

38
00:02:21,080 --> 00:02:27,100
Then we'll instantiate our RandomForestClassifier. RandomForest... we can probably press Tab here.

39
00:02:27,410 --> 00:02:28,340
We certainly can.

40
00:02:28,340 --> 00:02:29,870
And then we're going to fit it.

41
00:02:29,870 --> 00:02:30,860
That wasn't too hard, right?

42
00:02:30,870 --> 00:02:35,600
By now we're becoming experts at writing this little section of code and I'm being realistic here.

43
00:02:35,600 --> 00:02:38,000
That's a full blown machine learning pipeline right there.

44
00:02:38,540 --> 00:02:43,660
As long as the data's in the right format and we've got the target column we can do this pretty quickly.

45
00:02:43,670 --> 00:02:45,690
So now we run this.

46
00:02:45,860 --> 00:02:48,020
We can see that our model fits itself to the data.

47
00:02:48,020 --> 00:02:54,060
So basically it's finding the patterns in X train and Y train or between those two.
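
As a rough sketch of the cell typed out above: the heart disease DataFrame isn't reproduced in this transcript, so the example below assumes synthetic stand-in data from `make_classification`, and the seed value of 42 is also an assumption. It only aims to show the shape of the pipeline (seed, split, instantiate, fit), not the exact code from the video.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Set up a random seed (the value 42 is an assumption)
np.random.seed(42)

# Stand-in data (assumption): in the video, X comes from
# heart_disease.drop(..., axis=1) and y from the target column
# of the already-imported heart disease DataFrame.
X, y = make_classification(n_samples=300, n_features=13, random_state=42)

# Split into train and test sets, holding out 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Instantiate the classifier and fit it to the training data
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
```

With a real dataset, only the two lines that create `X` and `y` would need to change.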

48
00:02:54,140 --> 00:02:58,750
And so now we can use the scoring parameter. What we might do, actually, is copy this.

49
00:02:58,760 --> 00:03:07,480
We'll go: three ways to evaluate scikit-learn models slash estimators.

50
00:03:07,490 --> 00:03:10,880
Now this is just from the documentation.

51
00:03:10,960 --> 00:03:16,220
So number one is the estimator score method.

52
00:03:16,220 --> 00:03:23,620
This is what we'll have a look at first, and then two is the scoring parameter.

53
00:03:23,630 --> 00:03:31,820
We'll have a look at that shortly, and then three is problem-specific metric functions.

54
00:03:31,820 --> 00:03:32,800
Beautiful.

55
00:03:32,840 --> 00:03:34,990
Now we've got a little heading, we know what we're working with.

56
00:03:35,030 --> 00:03:40,670
So first things first, we're going to check out the score method. Maybe we'll put another heading in

57
00:03:40,670 --> 00:03:41,610
here.

58
00:03:41,660 --> 00:03:50,860
One, two, three: 4.1, evaluating a model with the score method.

59
00:03:50,870 --> 00:03:56,670
Now we've already seen this one, right? Because this is basically the default. It's a way to get a quick

60
00:03:56,670 --> 00:03:59,580
sniff, a quick understanding of how our model is doing.

61
00:03:59,580 --> 00:04:07,110
So if we call clf.score, right, because that's the score method, every estimator in scikit-learn has

62
00:04:07,110 --> 00:04:08,670
this little score method.

63
00:04:08,880 --> 00:04:14,160
So once you've instantiated a machine learning model here and you've fit it to some sort of data, you can

64
00:04:14,160 --> 00:04:15,030
get its score.

65
00:04:15,330 --> 00:04:22,880
So look, we could even get its score on the training data. How does it go on here? 1.0, so it fits the training

66
00:04:22,880 --> 00:04:24,230
data perfectly.

67
00:04:24,230 --> 00:04:25,370
And then if we go here.

68
00:04:25,370 --> 00:04:30,740
Score on the test data: 85 percent.

69
00:04:30,770 --> 00:04:35,610
So we've seen these figures before, right? Now what is happening here?

70
00:04:35,640 --> 00:04:36,330
Well let's have a look.

71
00:04:36,360 --> 00:04:37,560
Let's press Shift+Tab.

72
00:04:37,570 --> 00:04:42,560
Remember you can press Shift+Tab within any method to see what it does or see its docstring: returns

73
00:04:42,560 --> 00:04:46,280
the mean accuracy on the given test data and labels.

74
00:04:46,500 --> 00:04:52,160
And so what's happening here is that a model that would predict perfectly would get 100 percent here

75
00:04:52,410 --> 00:04:57,750
and actually I would say you should be skeptical of any model that always gets 100 percent, because no

76
00:04:57,750 --> 00:05:01,890
model is perfect, right? If your machine learning model is always getting its predictions right,

77
00:05:01,890 --> 00:05:06,060
I'd say there's some sort of error in your data or some sort of error in the way you've trained it.

78
00:05:06,120 --> 00:05:12,030
So our model doesn't get everything correct, but at 85 percent it's still far better than just guessing,

79
00:05:12,180 --> 00:05:12,380
right?

80
00:05:12,390 --> 00:05:16,020
Because remember we've got two labels heart disease or not.

81
00:05:16,230 --> 00:05:20,090
And so guessing would be just getting about 50 percent. Now

82
00:05:20,190 --> 00:05:25,650
let's do the same as above except with some regression code, and this time I'll let you off for this video.

83
00:05:25,680 --> 00:05:31,350
We've already typed out a little machine learning pipeline, so we'll come up here and we'll copy our regression

84
00:05:31,350 --> 00:05:32,070
code.

85
00:05:32,070 --> 00:05:37,210
We know it's regression because we've got the RandomForestRegressor, and so we'll write in a little

86
00:05:37,210 --> 00:05:38,260
comment here.

87
00:05:38,260 --> 00:05:41,470
Let's do the same but for regression.

88
00:05:45,780 --> 00:05:50,920
Beautiful. What we want to do is just fit it all.

89
00:05:51,000 --> 00:05:54,810
We've already got the fit function there, so I'm confusing myself.

90
00:05:54,810 --> 00:05:55,930
Beautiful.

91
00:05:56,170 --> 00:05:57,330
So the model is now fit.

92
00:05:57,390 --> 00:06:03,780
And now we'll do the score. Because we've run this cell, our X_test data has been replaced with the Boston

93
00:06:04,080 --> 00:06:05,850
data frame rather than the heart disease.

94
00:06:05,850 --> 00:06:12,500
So now we can just call it on X_test and then y_test. Wonderful.

95
00:06:12,610 --> 00:06:17,050
And so you might be thinking well these numbers are quite similar here.

96
00:06:17,050 --> 00:06:23,050
0.85, 0.87. And you're right, they are pretty close, but in fact our regression model

97
00:06:23,050 --> 00:06:25,580
is, when we call score,

98
00:06:25,880 --> 00:06:33,340
actually using a different metric: it returns the coefficient of determination, or R squared, of the

99
00:06:33,340 --> 00:06:34,660
prediction.
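
For reference, the coefficient of determination compares the model's squared errors with the squared errors you'd get by always predicting the mean of the true values. Here is a minimal pure-Python sketch of that formula; the helper name `r_squared` and the numbers are made up for illustration and are not from the video.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Sum of squared errors of the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Sum of squared errors of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]          # illustrative values only

perfect = r_squared(y_true, y_true)     # predictions match exactly -> 1.0
baseline = r_squared(y_true, [2.5] * 4) # always predicting the mean -> 0.0
```

So a score of 1.0 means perfect predictions, 0.0 means no better than predicting the mean, and a model that does worse than the mean gets a negative value.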

100
00:06:34,660 --> 00:06:39,410
Now we'll dive a little bit deeper into some specific metrics per problem.

101
00:06:39,640 --> 00:06:46,270
But the thing to remember here is that the score function on every machine learning model has some kind

102
00:06:46,270 --> 00:06:49,450
of default evaluation metric built into it.

103
00:06:49,480 --> 00:06:51,640
So if we call the RandomForestRegressor,

104
00:06:51,940 --> 00:06:57,590
chances are it will use the coefficient of determination as the default score metric.

105
00:06:57,600 --> 00:06:59,280
Now if we call any regressor,

106
00:06:59,410 --> 00:07:06,760
if we go back to our machine learning map and call any one of these estimators here in the green boxes,

107
00:07:07,300 --> 00:07:14,350
the default metric will likely be the coefficient of determination because they will all be regression

108
00:07:14,350 --> 00:07:17,160
models. And the same goes for classification here.

109
00:07:17,280 --> 00:07:17,990
Right.

110
00:07:18,010 --> 00:07:21,660
Returns the mean accuracy on the given test data and labels.

111
00:07:22,000 --> 00:07:28,840
So for all of these classification models in the green squares here the default evaluation metric is

112
00:07:28,930 --> 00:07:30,010
accuracy.

113
00:07:30,010 --> 00:07:37,180
And so what happens when the score method gets called? The model makes predictions on X_test, creates

114
00:07:37,180 --> 00:07:44,710
y predictions like we've seen up here before, y_preds, and then it compares those predictions to the test,

115
00:07:44,710 --> 00:07:50,320
to the actual labels and then returns back some sort of metric to compare how well our model actually

116
00:07:50,320 --> 00:07:51,600
did.
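
To make that predict-then-compare step concrete, here's a small sketch showing that `score` matches `accuracy_score` for a classifier and `r2_score` for a regressor when you make the predictions yourself. The data is synthetic (an assumption, standing in for the heart disease and Boston data used in the video), and the seed and variable names are just for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split

np.random.seed(42)  # seed value is an assumption

# Classification: score() should equal accuracy on our own predictions
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = RandomForestClassifier().fit(X_train, y_train)
y_preds = clf.predict(X_test)
clf_score = clf.score(X_test, y_test)
manual_accuracy = accuracy_score(y_test, y_preds)

# Regression: score() should equal R^2 on our own predictions
X_r, y_r = make_regression(n_samples=300, n_features=13, noise=10.0,
                           random_state=42)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(X_r, y_r, test_size=0.2)
reg = RandomForestRegressor().fit(Xr_train, yr_train)
reg_score = reg.score(Xr_test, yr_test)
manual_r2 = r2_score(yr_test, reg.predict(Xr_test))

print(np.isclose(clf_score, manual_accuracy), np.isclose(reg_score, manual_r2))
```

Both comparisons come out equal because `score` is doing the same two steps internally: predict on the test features, then compare with the true labels.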

117
00:07:51,750 --> 00:07:52,170
All right.

118
00:07:52,360 --> 00:07:57,640
That's the score method in a nutshell: it makes some predictions, compares them to the actual real labels,

119
00:07:57,760 --> 00:08:00,510
and then gives us an idea of how well our model is doing.

120
00:08:00,520 --> 00:08:04,900
So this is probably the first one that you'll call when you first train and fit a model.

121
00:08:05,080 --> 00:08:06,580
You call the score method.

122
00:08:06,580 --> 00:08:11,350
That's why it's listed as the first one here in 3.3 of the scikit-learn documentation for

123
00:08:11,350 --> 00:08:13,380
metrics and scoring.

124
00:08:13,420 --> 00:08:18,460
So now we've seen score, let's check out the scoring parameter.

125
00:08:18,670 --> 00:08:22,030
So what we'll do here is create another heading, ready for the next video.

126
00:08:22,150 --> 00:08:28,510
4.2, evaluating a model using the scoring parameter.

127
00:08:29,830 --> 00:08:35,890
So before we get into that one, I would say press Shift+Tab here and have a read of the docstring here

128
00:08:35,890 --> 00:08:41,140
and see if you can figure out what the coefficient of determination is and the same thing goes for the

129
00:08:41,140 --> 00:08:45,280
accuracy here or the classification default score metric.

130
00:08:45,280 --> 00:08:49,600
Press Shift+Tab and have a read through here and see if you can understand what's going on.

131
00:08:49,600 --> 00:08:51,880
If the docstring doesn't really help you, try:

132
00:08:51,880 --> 00:08:53,620
Check out the documentation here.

133
00:08:53,620 --> 00:08:55,030
Model evaluation.

134
00:08:55,240 --> 00:08:57,550
But otherwise, I'll see you in the next video.