1
00:00:00,450 --> 00:00:06,250
OK, so we've seen how to make predictions using our trained machine learning model and the predict function.

2
00:00:06,270 --> 00:00:10,250
Now let's have a look at how we can use the predict_proba function. More specifically,

3
00:00:10,260 --> 00:00:13,370
what exactly is the predict_proba function?

4
00:00:13,390 --> 00:00:20,580
Now what I might do is move this comment into a markdown cell, because this will make a little

5
00:00:20,580 --> 00:00:23,280
bit more sense and that is a function.

6
00:00:23,280 --> 00:00:32,420
So that's going to get bold, and then we're going to go here: predict_proba returns

7
00:00:35,260 --> 00:00:36,250
probabilities

8
00:00:38,590 --> 00:00:42,880
probabilities of a classification label.

9
00:00:42,910 --> 00:00:44,020
Don't take my word for it.

10
00:00:44,440 --> 00:00:48,570
Let's have a look at the scikit-learn documentation, because this is good practice.

11
00:00:48,620 --> 00:00:54,820
So, scikit-learn predict_proba.

12
00:00:54,850 --> 00:00:57,050
There we go here.

13
00:00:57,160 --> 00:01:03,550
We want predict_proba: "Probability estimates. The returned estimates for all classes are ordered

14
00:01:03,550 --> 00:01:05,770
by the label of classes."

15
00:01:05,770 --> 00:01:08,500
OK so probability estimates.

16
00:01:08,520 --> 00:01:09,640
Mm hmm.

17
00:01:09,760 --> 00:01:10,650
What does that mean?

18
00:01:10,660 --> 00:01:13,610
Scratches beard majestically.

19
00:01:14,170 --> 00:01:14,960
Let's have a look.

20
00:01:15,040 --> 00:01:17,520
Run the code first: predict_proba.

21
00:01:18,160 --> 00:01:20,470
What does it take? If we go back to here,

22
00:01:20,680 --> 00:01:22,180
It takes X.

23
00:01:22,210 --> 00:01:22,560
All right.

24
00:01:22,570 --> 00:01:29,940
So that's what we can do. It takes X, so maybe we'll pass it the test data, just like we did with predict. Whoa,

25
00:01:29,940 --> 00:01:37,460
we get a lot. Maybe we only want to do the first five. OK.

26
00:01:37,860 --> 00:01:41,640
So predict_proba returns the probabilities of a classification label.

27
00:01:41,640 --> 00:01:47,020
Now in scikit-learn they've used "the returned estimates for all classes".

28
00:01:47,040 --> 00:01:50,710
So a class would be not heart disease.

29
00:01:50,730 --> 00:01:53,040
And the other class would be heart disease.

30
00:01:53,040 --> 00:01:57,420
So "classes" is just another word for the different labels.

31
00:01:57,420 --> 00:01:59,150
Now what do we have here?

32
00:01:59,160 --> 00:02:02,130
Well we have probability estimates.

33
00:02:02,190 --> 00:02:04,640
Now what exactly is this?

34
00:02:05,070 --> 00:02:09,650
Let's predict on the same data.

35
00:02:10,610 --> 00:02:18,220
So it's probably easier to understand in contrast when we use just the normal predict function and we

36
00:02:18,220 --> 00:02:23,560
want the first five of X_test. All right.

37
00:02:23,750 --> 00:02:27,290
So it returns the probabilities of a classification label.

38
00:02:27,290 --> 00:02:28,640
That's what we got here.

39
00:02:28,640 --> 00:02:34,550
So if we look, let's line this up: predict_proba returns an array of five different samples.

40
00:02:34,550 --> 00:02:34,930
Right.

41
00:02:34,940 --> 00:02:40,970
So this mega array here, with the two end braces here, contains five smaller arrays.

42
00:02:41,030 --> 00:02:49,610
Because we've used five here, using slicing. But within this array here there are five arrays

43
00:02:49,700 --> 00:02:51,430
of two numbers.

44
00:02:51,470 --> 00:02:55,260
But this only has one array of five numbers.
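To see that contrast in code, here is a minimal sketch. It assumes a synthetic dataset from make_classification and a RandomForestClassifier standing in for the heart disease data and the model used in the video, neither of which is shown in this transcript.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: synthetic two-class data standing in for the heart disease dataset
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumption: a random forest stands in for the trained model from the video
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

probs = clf.predict_proba(X_test[:5])  # one row per sample, one column per class
labels = clf.predict(X_test[:5])       # one label per sample

print(probs.shape)   # (5, 2)
print(labels.shape)  # (5,)
```

So predict_proba gives five small arrays of two numbers, while predict gives one array of five numbers.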

45
00:02:55,260 --> 00:02:56,530
Mm hmm.

46
00:02:56,660 --> 00:02:57,890
What's happening here?

47
00:02:58,490 --> 00:03:04,320
Well this is what it means by returns the probabilities of a classification label.

48
00:03:04,340 --> 00:03:11,210
So if we look at this, let's line up sample one, or sample zero, with this first array. We can see that

49
00:03:11,210 --> 00:03:18,230
the number on the left, a.k.a. 0.89, is far greater than 0.11.

50
00:03:18,230 --> 00:03:18,590
All right.

51
00:03:18,620 --> 00:03:20,000
Now let's see if there's a trend.

52
00:03:20,000 --> 00:03:28,630
Here, if we go to index 1, the value on the right is bigger.

53
00:03:28,710 --> 00:03:30,390
And now this is a one.

54
00:03:30,400 --> 00:03:33,870
Now if we go to index 2, a.k.a. label 1.

55
00:03:33,870 --> 00:03:34,130
OK.

56
00:03:34,140 --> 00:03:36,340
The value on the right is bigger again.

57
00:03:36,360 --> 00:03:38,690
So that's index 1 of this array.

58
00:03:39,180 --> 00:03:40,390
And then we've got zero.

59
00:03:40,470 --> 00:03:42,640
This value is bigger.

60
00:03:42,720 --> 00:03:48,930
And then again for the final one, it's label 1 and the value at index 1 is greater.

61
00:03:49,260 --> 00:03:50,450
Mm hmm.

62
00:03:50,690 --> 00:03:57,350
So what this is is it's making predictions on the same data but instead of just returning the label

63
00:03:57,770 --> 00:04:02,200
it's returning the probability of that label being true.

64
00:04:02,210 --> 00:04:04,150
So remember our labels are 0 and 1.

65
00:04:04,160 --> 00:04:16,500
So if we go here, heart_disease target, and we want value_counts, we've got 1 for heart disease and

66
00:04:16,500 --> 00:04:18,530
zero for not heart disease.

67
00:04:18,540 --> 00:04:27,630
So what predict_proba is doing is going: hey, I'm looking at the first five rows of these, so X_test,

68
00:04:27,810 --> 00:04:29,740
I'm looking at these samples.

69
00:04:30,000 --> 00:04:37,260
From what I've learned on the training data, if I look at this sample here, I'm giving it label 0, so not

70
00:04:37,260 --> 00:04:43,470
heart disease, and I'm predicting that label 0 with a probability of 0.89.

71
00:04:43,990 --> 00:04:49,110
And so if we added these two together, the maximum probability you can get is 1.

72
00:04:49,140 --> 00:04:55,330
So 0.89 plus 0.11, and then if we did the same for the next one, 0.49

73
00:04:55,330 --> 00:05:02,470
plus 0.51, you kind of get the point there, right? One, one: that's the maximum probability.
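A quick way to check that each row adds up to 1 is sketched below. As an assumption, it uses synthetic data from make_classification and a RandomForestClassifier rather than the heart disease data and model from the video.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: synthetic two-class data and a random forest as stand-ins
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

probs = clf.predict_proba(X_test[:5])

# Sum across the class columns: every sample's probabilities total 1
row_sums = probs.sum(axis=1)
print(np.allclose(row_sums, 1.0))  # True
```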

74
00:05:02,470 --> 00:05:11,640
So what it's saying is that this sample has a 0.89 probability of the label being 0.

75
00:05:11,650 --> 00:05:20,010
And the next sample here, which gets the label 1, has a 0.51, so slightly, only just slightly, does that

76
00:05:20,010 --> 00:05:25,530
have a higher probability of being label 1. So that's why it's assigned 1: when we call predict, we force the

77
00:05:25,530 --> 00:05:27,640
model to give us back one label.

78
00:05:27,690 --> 00:05:30,570
So this is where predict_proba comes in handy.

79
00:05:30,570 --> 00:05:30,960
Right.

80
00:05:31,420 --> 00:05:36,420
We want to figure out the probability that our sample is given a certain label.

81
00:05:36,420 --> 00:05:42,030
So this one here is basically a coin toss, it's almost 50/50, but the model, you could probably say,

82
00:05:42,030 --> 00:05:50,280
is pretty confident on this sample, this one here, being 0, because it's 0.89 versus 0.11. The same

83
00:05:50,280 --> 00:05:50,930
goes for here.

84
00:05:50,940 --> 00:05:51,270
Right.

85
00:05:51,270 --> 00:05:56,040
So this is index three; the third index is given 0.

86
00:05:56,130 --> 00:05:58,020
We see this one here.

87
00:05:58,110 --> 00:06:01,900
So it's pretty damn confident that this one is not heart disease as well.

88
00:06:02,010 --> 00:06:04,240
So that's the difference between predict and predict_proba.

89
00:06:04,860 --> 00:06:08,130
Now say we did have more than two classes here.

90
00:06:08,160 --> 00:06:13,890
So if we had like 10 labels and you called predict_proba on it, you'd get a probability value for

91
00:06:13,890 --> 00:06:19,140
each of those classes that we had. But because we only have two, we're getting back arrays of two values

92
00:06:19,140 --> 00:06:19,850
here.
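As a sketch of the 10-label case, the example below builds a synthetic 10-class problem. The dataset, its parameters, and the RandomForestClassifier are all assumptions for illustration and are not from the video.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: a synthetic dataset with 10 classes (not the heart disease data)
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=8,
    n_classes=10,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

probs = clf.predict_proba(X_test[:5])

print(clf.classes_)  # the 10 labels, in the same order as the columns of probs
print(probs.shape)   # (5, 10): one probability per class for each sample
```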

93
00:06:20,010 --> 00:06:26,260
And so the threshold, because we have two classes, is whichever one has over 0.5.

94
00:06:26,280 --> 00:06:32,200
So this is why this one, which has over 0.5, gets assigned a label of 1.

95
00:06:32,280 --> 00:06:33,530
Same with the next one.

96
00:06:33,690 --> 00:06:35,090
And this one has over 0.5.

97
00:06:35,100 --> 00:06:39,620
So it gets assigned a label of 0, which is the index in this array here.
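That link between the index of the biggest probability and the assigned label can be sketched like this. It assumes synthetic data and a RandomForestClassifier as stand-ins for the video's setup; for this estimator, predict should match the class at the argmax of predict_proba.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: synthetic two-class data and a random forest as stand-ins
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

probs = clf.predict_proba(X_test[:5])
labels = clf.predict(X_test[:5])

# The column index with the largest probability picks the label from clf.classes_
manual_labels = clf.classes_[np.argmax(probs, axis=1)]

print(np.array_equal(labels, manual_labels))  # True
```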

98
00:06:39,870 --> 00:06:46,080
And then finally, this one gets assigned a label of 1 because this one is over 0.5, with

99
00:06:46,080 --> 00:06:52,580
a value of 0.82. Where could you use predict_proba? Well, maybe in the future,

100
00:06:52,590 --> 00:06:57,210
right, you're working on this kind of project and you want to make sure that your model is very confident.

101
00:06:57,440 --> 00:07:03,240
Say we're deploying this to production, right, we're using this in a hospital, and we don't want our model

102
00:07:03,270 --> 00:07:07,750
to give us samples that only have a probability estimate of 0.51.

103
00:07:07,760 --> 00:07:11,220
We want to go: hey, model, only give us the samples...

104
00:07:11,220 --> 00:07:16,310
So this is where we could use predict_proba to only give us the samples that are maybe even higher

105
00:07:16,310 --> 00:07:22,110
than 0.89. Maybe we only want predictions when our model is extremely confident, and then we'll use that

106
00:07:22,110 --> 00:07:27,960
prediction. Or maybe it is helpful to know which samples our model is not sure about. Then maybe we

107
00:07:27,960 --> 00:07:33,870
could look at that sample and go: hey, why is that sample, why is this row, why is the model unclear about

108
00:07:33,870 --> 00:07:35,690
that? Is there something we could fix up?
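One way this filtering could look is sketched below. The 0.9 cutoff, the synthetic data, and the RandomForestClassifier are all assumptions for illustration; the right threshold for a real project would need its own evaluation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: synthetic two-class data and a random forest as stand-ins
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

probs = clf.predict_proba(X_test)
confidence = probs.max(axis=1)  # highest class probability for each sample

threshold = 0.9  # hypothetical cutoff, chosen only for this example
confident_mask = confidence >= threshold

# Keep predictions only where the model is confident enough
confident_preds = clf.classes_[probs.argmax(axis=1)][confident_mask]

# Row positions in X_test the model is unsure about, worth inspecting by hand
uncertain_rows = np.where(~confident_mask)[0]

print(f"Confident predictions: {confident_mask.sum()} of {len(X_test)}")
print(f"Rows to review: {uncertain_rows}")
```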

109
00:07:35,820 --> 00:07:41,730
So that's sort of the difference in value between predict and predict_proba: predict will give you a single

110
00:07:41,820 --> 00:07:48,840
label for each sample, whereas predict_proba returns the probabilities of a classification label.

111
00:07:49,170 --> 00:07:53,030
And remember, the maximum value here, if you add these up, is 1.

112
00:07:53,040 --> 00:07:56,350
So the closer to 1, the more, in inverted commas,

113
00:07:56,370 --> 00:08:02,120
"sure" your model is that the prediction it's made is a certain class.

114
00:08:02,250 --> 00:08:02,780
All right.

115
00:08:03,210 --> 00:08:10,080
So now we've seen the two main ways of making predictions using a classification model, I want you

116
00:08:10,080 --> 00:08:10,620
to have a think.

117
00:08:10,620 --> 00:08:15,540
How can we make predictions using a regression model? To revisit, if you go back up and look at

118
00:08:15,540 --> 00:08:17,190
our regression model code,

119
00:08:17,190 --> 00:08:21,720
how can we make a prediction using our regression model? Say we wanted to predict, on our Boston

120
00:08:21,720 --> 00:08:27,770
housing dataset, the median house price given a row of different characteristics about a town. I'll challenge

121
00:08:27,780 --> 00:08:30,360
you to that. Maybe you'll figure it out before the next video,

122
00:08:30,390 --> 00:08:32,370
but otherwise we'll have a look at it then.