Now we've tuned the hyperparameters of our logistic regression model and it's getting a pretty good score here. But remember, this score, the default score for a classifier, is accuracy. What we want to do now is create some evaluation metrics around our model that go a little bit beyond accuracy. More specifically, let's evaluate our tuned machine learning model. Actually, machine learning classifier is a better name, because these evaluation metrics are specific to classification. So we're going to go beyond accuracy. More specifically, we want a ROC curve and an AUC score. We also want a confusion matrix, which is pretty fun because it actually looks like it's from The Matrix, with lots of numbers, you know, that flowing green screen. We want a classification report. What else do we want? We want precision, recall and F1 score, and it would be great if cross-validation was used where possible. So let's turn that into markdown; that's what we're going to be working on. We'll use our tuned grid search logistic regression model, as well as the best hyperparameters for it, and we'll see where they come into play in a second. So first of all, when we evaluate a model, we're always comparing how a trained model's predictions compare to the truth labels.
So what we have to do is make some predictions first, so we can compare them to the truth labels, a.k.a. the labels in the y_test dataset. So let's do that, and write a note here: to make comparisons and evaluate our trained model, first we need to make predictions. So what we'll do is make predictions with our chosen model, and there's a beautiful function called predict that we can use for that. We'll save the results to y_preds. We use gs_log_reg, which is just the trained version of our grid search model. So gs_log_reg.predict, and we're going to predict on the test data, X_test. Wonderful. Let's see them, just to make sure we're not going crazy. Beautiful. And now we need to compare them to the test dataset, so let's have a look at this here. OK, so if we go to position 0, we know it's got this one wrong, because that's supposed to be a zero there, because these are the truth labels here. We could keep going through like that, but we're not going to; we want to use some code to do it. First things first, we want a ROC curve, because remember we've got this little list up here: ROC curve and AUC score. What is a ROC curve? Well, we can look that up. So, what is a ROC curve?
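Putting that prediction step into code, a minimal sketch might look like this. The heart disease data from the video isn't reproduced here, so a synthetic dataset stands in for it; gs_log_reg mirrors the fitted GridSearchCV model from the video, with an assumed grid over C:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the heart disease dataset used in the video
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# A tuned grid search logistic regression model (gs_log_reg in the video)
gs_log_reg = GridSearchCV(LogisticRegression(max_iter=1000),
                          param_grid={"C": np.logspace(-4, 4, 20)},
                          cv=5)
gs_log_reg.fit(X_train, y_train)

# Make predictions with the trained model so we can compare them
# to the truth labels in y_test
y_preds = gs_log_reg.predict(X_test)
print(y_preds[:10])  # predicted labels
print(y_test[:10])   # truth labels
```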
We went through this process of understanding ROC curves before, so if we were to read through this, what we'd get is: the ROC curve is created by plotting the true positive rate against the false positive rate at various threshold settings. Okay, beautiful, so let's see how we do that. The ROC curve is a way of understanding how your model is performing by comparing the true positive rate to the false positive rate. And if we want to figure out what a true positive and a false positive are, we can have a look at our confusion matrix anatomy for that. So a true positive is when the model predicts 1 and the truth is 1, and a false positive is when the model predicts 1 but the truth is supposed to be 0. And a perfect model is going to get an AUC score, which we'll see in a second, of 1.0. So let's see how we do it: import the ROC curve function from sklearn.metrics. That's where it's from; that's where it lives. But we've actually already done this, as with all the other sklearn functions we've been using, right back up at the top. So this is model evaluation. Right now, in this next section, this next video, we're going to tackle everything from here.
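The true and false positive rates behind a ROC curve can be computed straight from confusion matrix counts. Here's a small sketch with toy labels (not the heart disease data) just to make the definitions concrete:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy truth labels and predictions to illustrate the rates behind a ROC curve
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])

# ravel() flattens the 2x2 matrix into (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # true positive rate: predicted 1 when the truth is 1
fpr = fp / (fp + tn)  # false positive rate: predicted 1 when the truth is 0

print(tpr, fpr)  # → 0.75 0.25
```

A ROC curve is just these two numbers recalculated at many different probability thresholds and plotted against each other.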
So we can see here we've got RandomizedSearchCV, GridSearchCV, confusion_matrix, plot_roc_curve (that's what we're about to use), precision_score, recall_score, f1_score. So these are for our classification model. So let's go, we can save ourselves a line of code; we don't actually have to import that. Come right back down here. We can just go plot_roc_curve. Now, this is a relatively new addition to the scikit-learn library, which I'm very, very happy with, because usually you had to write these on your own. This function, plot_roc_curve, actually calculates the AUC metric for us, the area under the curve metric. Let's see it in action: plot_roc_curve, gs_log_reg. And if we look at this in the documentation, it tells us what it does: pass it an estimator, which is our machine learning model, pass it X, pass it y, and it plots a receiver operating characteristic, a.k.a. ROC, curve. That's what it's going to do for us. Beautiful. So gs_log_reg, X_test, because we want to do it on the test dataset. Always evaluate your machine learning models on the test dataset. And let's see... oh, beautiful. Okay, so remember, a perfect ROC curve, as we've seen in previous videos, goes up to this corner and then across, like that. Ours is shaping up pretty well. Then we've got the area under the curve.
So if we calculated all this area under here, we'd get 0.93. Of course, a perfect model would achieve 1.0. So our model is not perfect, but it's got an AUC score of 0.93, where just tossing a coin, essentially, would average 0.5. So we're edging close to a perfect model. Not bad for a model that just came out of the box. All right, the next thing we want to do is a confusion matrix. So let's say that: confusion matrix. How could we do that? I think scikit-learn has a function... and I don't just think, I know scikit-learn has a function. We can go confusion_matrix, and we want to compare the ground truth labels with the predicted labels. OK, we could just leave it at that, but it's a bit bland. We can improve the visualization of this confusion matrix using Seaborn. That's what we want. So we want to go... I believe we already have Seaborn imported up here... yes, yes we do. We've used Seaborn before in this project. So what I might do, because I know ahead of time that we need a bigger font size, is set the font size to 1.5. That's just sns.set.
Now we're going to create a little function here, in case we want to make another confusion matrix, because scikit-learn's confusion matrix function isn't quite up to scratch as of making this video. Maybe that's an opportunity for a pull request. So we go here, nice and simple. We're just going to pass this function our test labels and our predicted labels, and then we're going to create a nice-looking plot, a nice-looking confusion matrix, using Seaborn's heatmap. We've seen this before. Heatmap, wonderful. So we go fig, ax = plt.subplots(figsize=(3, 3))... oh, shift and tab, trigger happy again... ax = sns.heatmap. Wonderful. We're going to pass it a scikit-learn function inside: confusion_matrix(y_test, y_preds). Then we want to add annot here, so annotate: yes please, annot=True. Now we're going to go cbar=False, because I've seen the colorbar and it doesn't look too great on the confusion matrix. plt.xlabel("True label"), and then we're going to add plt.ylabel("Predicted label"), and let's see it. So we're going to call our plot function on the confusion matrix: y_test, y_preds... oh, what have we done? Font size... ah, we've got font_size wrong. What is it? font_scale.
There we go, tab and autocomplete... classic. And it's giving us this little cutoff thing here, so we can fix this, I believe, by going bottom, top = ax.get_ylim(), yeah, and then we want to go ax.set_ylim, make it a little bit prettier, set_ylim, and it's going to be bottom + 0.5, top - 0.5. Now of course, if your confusion matrix came out looking fine, you might not need these lines, but it doesn't matter. There we go, that might look a little bit better. So you can see that the model gets confused, i.e. predicts the wrong label, relatively the same amount across both classes. In essence, there are four occasions here where the model predicted 0 as the label, so predicted someone didn't have heart disease when they should have been predicted as 1. So that's a false negative: it's predicting 0 instead of 1. A false negative, remember, if we come back here, is when the model predicts 0 and the truth is 1, and over here are the false positives. So we've got three instances here where the model predicts 1, that someone does have heart disease, when they actually don't.
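Put together, the little confusion matrix plotting function from this section looks roughly like this. Toy labels stand in for the real y_test and y_preds; the get_ylim/set_ylim fix at the end is only needed on the older matplotlib/seaborn combinations that cut off the top and bottom rows, and the axis labels follow the video as typed:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn.metrics import confusion_matrix

sns.set(font_scale=1.5)  # bigger annotations (font_scale, not font_size)

def plot_conf_mat(y_test, y_preds):
    """Plot a nicer-looking confusion matrix with Seaborn's heatmap."""
    fig, ax = plt.subplots(figsize=(3, 3))
    ax = sns.heatmap(confusion_matrix(y_test, y_preds),
                     annot=True,   # annotate each cell with its count
                     cbar=False)   # the colorbar doesn't look great here
    plt.xlabel("True label")
    plt.ylabel("Predicted label")
    # Fix for older versions that cut off the top and bottom rows
    bottom, top = ax.get_ylim()
    ax.set_ylim(bottom + 0.5, top - 0.5)
    return fig, ax

# Toy labels just to exercise the function
y_test = np.array([0, 1, 1, 0, 1, 0])
y_preds = np.array([0, 1, 0, 0, 1, 1])
fig, ax = plot_conf_mat(y_test, y_preds)
```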
And you can see these are both things that we want to avoid, right? Especially when thinking about something as severe or as serious as heart disease, even predicting it when it's not present. So here, if we predict 0, so no disease, when it is present, okay, that's bad. But also predicting that it is there when it's not actually there, that's also bad. So that's something you have to consider when you're building these types of models: is a false negative worse, or is a false positive worse? And again, a perfect model would have none of these, but in reality you're probably going to end up with some sort of confusion in your model; ideally, all the predictions would fall on the diagonal here. Now we've done a confusion matrix and got that there. We could share that with our boss, and we could share the ROC curve with our boss too. What's next? Classification report. Okay, and it would be great if cross-validation was used where possible. All right, so that's what we might tackle in the next video.