Wonderful. We've seen how to quickly get a gauge of how our machine learning model is doing by evaluating it with the score method, which returns a default evaluation metric depending on the problem we're working on: in regression it returns the coefficient of determination, and in classification it returns the mean accuracy. However, when you get further into a problem, it's likely you'll want to start using some more powerful metrics to evaluate your model's performance. And so naturally, the next step up from using score is to use a custom scoring parameter. So if we see here, we've looked at this one — the estimator score method — and now we're going to have a look at using a scoring parameter, and model evaluation tools that use cross-validation. We'll see what that is in a moment. So let's dive into it. What we're going to do is work this out in code. Now, you'll have to bear with me for the next few videos, because we're going to cover a fair bit here. The reason is that evaluating a machine learning model is such an important step. It's one thing to call a fit function on some data, but the next most important thing is asking: hey, is that model actually working? Is it learning something? Could we use it to predict in the future? And that's why we're making sure we cover a lot of ground here. So let's get started.
What we're going to do is go from sklearn.model_selection import cross_val_score. And as always, we're going to run the code first before we dive into what's actually going on. What we do need is some classification code, so we can copy this — you're allowed to copy this, by the way, because we've written it a fair few times now. So the only thing different here from what we've done before is that we've imported cross_val_score from sklearn's model_selection. Now, the next thing we're going to do, because we've called fit there, is put a little semicolon so we don't get a big output. And you know what, this warning keeps coming up, so I might just keep changing n_estimators to equal 100. Wonderful. And so cross_val_score — it has the word score in it, but we've seen score; what is this "cross val" doing? Well, let's have a look. We'll call clf.score on the test data first, so we can compare, and then we'll do the same but this time using cross_val_score. And cross_val_score takes our classifier, it takes X data, and it takes y data — not the test and not the train sets; we'll see what's going on in a second. Huh, what's happening here? We're getting another warning again — this is just to say that the default value of cv will change from 3 to 5.
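As a sketch of the setup described above — using a synthetic dataset as a stand-in for the patient records from the video, and a RandomForestClassifier with n_estimators=100 as mentioned:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the classification data used in the video
X, y = make_classification(n_samples=300, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train);  # trailing semicolon suppresses the repr in a notebook

# score() returns a single number; cross_val_score returns one score per split.
# Note cross_val_score gets the full X and y, not the train/test splits.
single_score = clf.score(X_test, y_test)
cv_scores = cross_val_score(clf, X, y, cv=5)
print(single_score)
print(cv_scores)
```

The exact numbers will differ from the video's, since the data here is synthetic.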
So if we have a look at this with Shift+Tab and read the docstring: "Evaluate a score by cross-validation". What even is cross-validation? And the cv parameter is what's giving us that warning — this is where it's coming from. The default is cv=3; if we change it to 5, that warning is going to go away, and then we're going to get back an array of five different scores. So that's really the first difference you'll notice: cross_val_score returns an array, whereas score only returns a single number. Okay, so how can we figure this out? We need to figure out what cross-validation is, because that's, after all, what cross_val_score is doing. Come back into the docstring — "Evaluate a score by cross-validation". Luckily, here's one I prepared earlier to demonstrate. And as always, I understand things better visually — you might be the same — so let's use a picture to demonstrate what cross-validation is doing. So here's what we've done before in our normal train/test split.
So we've split our data into training and test sets. Say we started off with 100 patient records. We'd split off a training set — in our case we've used 80 percent — and this would contain X_train and y_train, 80 samples. I've used the number 100 here (we've really got more than that) just because the numbers work out visually better. And in our test data set we've got 20 percent of the data, which would contain X_test and y_test. Now, the difference here with cross-validation — and in our case this image is demonstrating five-fold cross-validation. What you'll probably see cross-validation referred to as is k-fold, where k is an arbitrary number, and the reason we're using five is because, if we come back here, we set cv=5 there — cv stands for cross-validation. And this is what I'm talking about: splitting our data into training and test sets using 20 percent for the test size. So that means naturally the other 80 percent goes to the training data, which is where this split here comes from. Now, what cross-validation does is make five different splits. So it will use the first 20 percent — actually, see how this one here is using the last 20 percent: it'll create a test data set here and a training data set here, then it'll move over here and use this section as the test data set, and then again and again and again.
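To make those five splits concrete, here's a small sketch using KFold on 100 hypothetical record indices — each fold holds out a different 20 records for testing. (For classifiers, cross_val_score actually stratifies the folds by class, but the splitting idea is the same.)

```python
import numpy as np
from sklearn.model_selection import KFold

kf = KFold(n_splits=5)
records = np.arange(100)  # 100 hypothetical patient record indices

splits = list(kf.split(records))
for fold, (train_idx, test_idx) in enumerate(splits):
    # Each fold: 80 records for training, a different contiguous 20 for testing
    print(f"Fold {fold}: {len(train_idx)} train, "
          f"test records {test_idx[0]}-{test_idx[-1]}")
```

Across the five folds, every record appears in a test set exactly once.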
And so what happens is that cross-validation trains five different versions of the model, and then it evaluates each of those models — each trained on a different training split — on five different versions of the test data. So what's the purpose of this? Well, as you can imagine, if we're only training one model, it could be a lucky split. Say this 80 percent of rows happened to contain a whole bunch of information the model was able to learn really well from — these 80 patient records — and then it got a really good score on this test set. Is that a true reflection of how well our model understands the data, or has figured out the patterns in the data? Well, not really, right? Because it's just luck. If we're splitting this randomly and somehow a bunch of easy patient records have ended up here, and the model's figured out certain patterns, and it's gone over to this test data set and got an amazing score — we could be tricking ourselves, we could be fooling ourselves, into thinking that our model is far better than it actually is. So that's where cross-validation comes into play. It aims to provide a solution: not training on just some of the data, and avoiding getting lucky scores on just a single split of data. It creates five different splits, so no matter what, our model is going to be trained on all of the data and evaluated on all of the data.
And so this is why, if we come back to cross_val_score, it gives us back five different scores here. So that's starting to make sense: if we call the score method on only our X_test data and y_test data, it gives back one score. But if we call cross_val_score — referring back to the graphic here — it's going to make five different splits. And remember, five-fold is just an arbitrary number: you could do 10-fold, you could do 3-fold, you could even do 100-fold, but five-fold is the default of the library, and it's usually pretty good depending on the size of your data. So we'll use five-fold to demonstrate here. Just to prove it to you, we can go in here to cross_val_score — we could even do 10. So this means it's just going to make 10 different splits, exactly the same idea as this, and then return 10 different scores. So you see here — this is a great example — on split 1 it's got a score of 0.9, so that could be 90 percent, which is higher than before, but on a later split it's got something much lower, 0.72.
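As a sketch (again with synthetic data standing in for the video's dataset), changing cv simply changes how many splits — and therefore how many scores — come back:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)

# cv controls the number of folds, and hence the length of the returned array
scores_5 = cross_val_score(clf, X, y, cv=5)
scores_10 = cross_val_score(clf, X, y, cv=10)
print(len(scores_5))   # 5
print(len(scores_10))  # 10
```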
And so what we do here, to get a more reliable performance metric — or evaluation metric — for our model, is take the average of these five scores. Let's see it happen. We'll do it all in one cell: set a random seed, and then get a single train/test split score — we're going to make clf_single_score equal clf.score(X_test, y_test); it's going to use the same data here. Wonderful. And then we're going to take the mean of the five-fold cross-validation scores: clf_cross_val_score equals np.mean of cross_val_score — we need to pass it our classifier, X, y, and cv=5 — and then compare the two: clf_single_score and then clf_cross_val_score. Boom. So what do you see here? Well, in our case, our original single score — which is now down here, the exact same number, because we're using a random seed and the same test data — is 0.85. But when we use cross-validation, when we use five splits because cv=5, we get a score of 0.82. So it's slightly lower, but in this case, if you were asked to report the accuracy of your model, even though it is lower, you'd prefer the cross-validation metric over the non-cross-validation metric. Now wait — we haven't even used the scoring parameter at all. Well, that's because by default it's set to None.
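The comparison described above can be sketched like this (synthetic data stands in for the video's patient records, so the exact 0.85 / 0.82 numbers won't reproduce):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

np.random.seed(42)
X, y = make_classification(n_samples=300, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

# Single train/test split score
clf_single_score = clf.score(X_test, y_test)

# Mean of the 5-fold cross-validation scores
clf_cross_val_score = np.mean(cross_val_score(clf, X, y, cv=5))

print(clf_single_score, clf_cross_val_score)
```

If asked to report one number, the cross-validated mean is the more honest choice, since it doesn't depend on a single (possibly lucky) split.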
Let's have a look at the scoring parameter, set to None by default. So if we call cross_val_score(clf, X, y, cv=5), we can pass scoring here, and it's going to be set to None. How do I know this? Well, if we do Shift+Tab on this, luckily the docstring comes in handy — see, by default it's set to None. So if we keep scrolling down and move our notebook: scoring : string, callable or None, optional, default: None. Okay — a string (see the model evaluation documentation; that's what we've had to look up here), or a scorer callable object/function with signature scorer(estimator, X, y), which should return only a single value. If None, the estimator's default scorer (if available) is used. Okay, now this is why we know that this is accuracy: because if the scoring parameter of cross_val_score is None, it uses the default scoring metric of our estimator. In our case, what is the default scoring metric of a classifier? Mean accuracy. And where have we seen that before? We saw that in the last video — go to clf.score and hit Shift+Tab: "Returns the mean accuracy on the given test data and labels". So that means, when we have scoring set to None,
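A quick sketch of the point above — passing scoring=None explicitly is the same as leaving it out, and for a classifier both fall back to the estimator's default scorer, mean accuracy (synthetic data again stands in for the video's):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)

default_scores = cross_val_score(clf, X, y, cv=5)
none_scores = cross_val_score(clf, X, y, cv=5, scoring=None)

# Same folds, same (seeded) estimator, same default metric -> same scores
print(np.allclose(default_scores, none_scores))
```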
it's going to use the default evaluation metric of our classifier for cross-validation. So if we hit Shift+Enter, it's going to return the same kind of values — although they might be slightly different, right, because we haven't set up a seed in this cell, so these values are going to be different from the cross_val_score we see up there. If we'd run it with a seed in here, we would have seen similar values. Whoo — what have we covered here? Well, as you might have guessed, the scoring parameter can be changed. As the docstring says, we can pass in our own scoring parameter here — we can change this to something other than None. That is what we're going to start to cover in the next few videos: some other classification model evaluation metrics that we can use with cross-validation. And so why do we use cross-validation? Well, as we saw in the picture, cross-validation aims to solve the problem of not training on all the data — we're creating five models, so we end up having the model trained on all of the data — and avoiding getting lucky scores from training on a single split. And we saw that in action, with our single-split score here getting a slightly higher score than our cross-validation score. Whoo. That's a lot.
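As a small preview of changing the scoring parameter (a sketch on synthetic binary-classification data — the specific metric names here, "accuracy" and "precision", are standard scikit-learn scorer strings; the next videos cover the metrics themselves):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)

# scoring accepts a string naming the metric instead of None
acc_scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
prec_scores = cross_val_score(clf, X, y, cv=5, scoring="precision")
print(acc_scores.mean(), prec_scores.mean())
```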
Now, we've still got a bit more to go. Let's get some in-depth classification metrics happening. I'll see you in the next video.