1 00:00:00,450 --> 00:00:06,250 OK, so we've seen how to make predictions using our trained machine learning model and the predict function. 2 00:00:06,270 --> 00:00:10,250 Now let's have a look at how we can use the predict_proba function. Most distinctly, 3 00:00:10,260 --> 00:00:13,370 what exactly is the predict_proba function? 4 00:00:13,390 --> 00:00:20,580 Now what I might do is move this comment into a markdown cell because this will make a little 5 00:00:20,580 --> 00:00:23,280 bit more sense, and that is a function. 6 00:00:23,280 --> 00:00:32,420 So that's going to get bad and then we're going to go here and predict_proba returns 7 00:00:35,260 --> 00:00:36,250 probabilities 8 00:00:38,590 --> 00:00:42,880 probabilities of a classification label. 9 00:00:42,910 --> 00:00:44,020 Don't take my word for it. 10 00:00:44,440 --> 00:00:48,570 Let's have a look at the scikit-learn documentation because this is good practice. 11 00:00:48,620 --> 00:00:54,820 So scikit-learn predict_proba. 12 00:00:54,850 --> 00:00:57,050 There we go, here. 13 00:00:57,160 --> 00:01:03,550 We want predict_proba: "Probability estimates. The returned estimates for all classes are ordered 14 00:01:03,550 --> 00:01:05,770 by the label of classes." 15 00:01:05,770 --> 00:01:08,500 OK, so probability estimates. 16 00:01:08,520 --> 00:01:09,640 Mm hmm. 17 00:01:09,760 --> 00:01:10,650 What does that mean? 18 00:01:10,660 --> 00:01:13,610 Scratches beard majestically. 19 00:01:14,170 --> 00:01:14,960 Let's have a look. 20 00:01:15,040 --> 00:01:17,520 Run the code first. predict_proba. 21 00:01:18,160 --> 00:01:20,470 What does it take? If we go back to here, 22 00:01:20,680 --> 00:01:22,180 it takes X. 23 00:01:22,210 --> 00:01:22,560 All right. 24 00:01:22,570 --> 00:01:29,940 So that's what we can do. It takes X, so maybe we'll pass it the test data just like we did with predict. Whoa, 25 00:01:29,940 --> 00:01:37,460 we get a lot. Maybe we only want to do the first five. Okay.
26 00:01:37,860 --> 00:01:41,640 So predict_proba returns the probabilities of a classification label. 27 00:01:41,640 --> 00:01:47,020 Now in scikit-learn they've used "the returned estimates for all classes". 28 00:01:47,040 --> 00:01:50,710 So a class would be not heart disease. 29 00:01:50,730 --> 00:01:53,040 And the other class would be heart disease. 30 00:01:53,040 --> 00:01:57,420 So that's just another word for different labels: classes. 31 00:01:57,420 --> 00:01:59,150 Now what do we have here? 32 00:01:59,160 --> 00:02:02,130 Well, we have probability estimates. 33 00:02:02,190 --> 00:02:04,640 Now what exactly is this? 34 00:02:05,070 --> 00:02:09,650 Let's predict on the same data. 35 00:02:10,610 --> 00:02:18,220 So it's probably easier to understand in contrast, when we use just the normal predict function and we 36 00:02:18,220 --> 00:02:23,560 want the first five of X_test. All right. 37 00:02:23,750 --> 00:02:27,290 So it returns the probabilities of a classification label. 38 00:02:27,290 --> 00:02:28,640 That's what we got here. 39 00:02:28,640 --> 00:02:34,550 So if we look, let's line this up: predict_proba returns an array of five different samples. 40 00:02:34,550 --> 00:02:34,930 Right. 41 00:02:34,940 --> 00:02:40,970 So this mega array here, between the two end braces here, contains five smaller arrays. 42 00:02:41,030 --> 00:02:49,610 Because we've used five here, using slicing. But within this array here there are five arrays 43 00:02:49,700 --> 00:02:51,430 of two numbers. 44 00:02:51,470 --> 00:02:55,260 But this only has one array of five numbers. 45 00:02:55,260 --> 00:02:56,530 Mm hmm. 46 00:02:56,660 --> 00:02:57,890 What's happening here? 47 00:02:58,490 --> 00:03:04,320 Well, this is what it means by "returns the probabilities of a classification label".
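To make the shape difference concrete, here is a minimal sketch that calls both functions on the same five rows. The synthetic dataset from `make_classification`, the choice of `RandomForestClassifier`, and the names `clf`, `X_test`, `probabilities`, and `labels` are assumptions for illustration; the notebook in the video uses the heart disease data instead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic two-class data standing in for the heart disease table (assumption).
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Any scikit-learn classifier exposing predict_proba would do here.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# One row per sample, one column per class: shape (5, 2).
probabilities = clf.predict_proba(X_test[:5])

# One label per sample: shape (5,).
labels = clf.predict(X_test[:5])

print(probabilities.shape, labels.shape)
print(probabilities)
print(labels)
```

The columns of `probabilities` follow the order of `clf.classes_`, so with labels 0 and 1 the left column belongs to label 0 and the right column to label 1.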
48 00:03:04,340 --> 00:03:11,210 So if we look at this, let's line up sample one, or sample zero, with this first array. We can see that 49 00:03:11,210 --> 00:03:18,230 the number on the left, a.k.a. 0.89, is far greater than 0.11. 50 00:03:18,230 --> 00:03:18,590 All right. 51 00:03:18,620 --> 00:03:20,000 Now let's see if there's a trend. 52 00:03:20,000 --> 00:03:28,630 If we go to here, index 1, the value on the right is bigger. 53 00:03:28,710 --> 00:03:30,390 And now this is a one. 54 00:03:30,400 --> 00:03:33,870 Now if we go to index 2, a.k.a. label 1. 55 00:03:33,870 --> 00:03:34,130 OK. 56 00:03:34,140 --> 00:03:36,340 The value on the right is bigger again. 57 00:03:36,360 --> 00:03:38,690 So that's index 1 of this array. 58 00:03:39,180 --> 00:03:40,390 And then we've got zero. 59 00:03:40,470 --> 00:03:42,640 This value is bigger. 60 00:03:42,720 --> 00:03:48,930 And then again for the final one, it's label 1 and the value at index 1 is greater. 61 00:03:49,260 --> 00:03:50,450 Mm hmm. 62 00:03:50,690 --> 00:03:57,350 So what this is doing is making predictions on the same data, but instead of just returning the label 63 00:03:57,770 --> 00:04:02,200 it's returning the probability of that label being true. 64 00:04:02,210 --> 00:04:04,150 So remember, our labels are 0 and 1. 65 00:04:04,160 --> 00:04:16,500 So if we go here, heart_disease target, and we want value_counts, we've got one for heart disease and 66 00:04:16,500 --> 00:04:18,530 zero for not heart disease. 67 00:04:18,540 --> 00:04:27,630 So what predict_proba is doing is going, hey, I'm looking at the first five rows of this, so X_test, 68 00:04:27,810 --> 00:04:29,740 I'm looking at these samples. 69 00:04:30,000 --> 00:04:37,260 From what I've learned on the training data, if I look at this sample here, I'm giving it label zero, so not 70 00:04:37,260 --> 00:04:43,470 heart disease, and I'm predicting that label zero with a probability of 0.89.
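The pattern traced above (the label is the index of the larger value in each row) can be sketched in plain Python. The rows below are illustrative: only 0.89/0.11, 0.49/0.51, and 0.82 are quoted in the video, and the remaining values are made up so that the labels match the ones read out. The helper name `row_to_label` is hypothetical.

```python
# Probability rows in [P(label 0), P(label 1)] order.
# Only 0.89/0.11, 0.49/0.51 and 0.82 come from the video; the rest are made up.
probabilities = [
    [0.89, 0.11],
    [0.49, 0.51],
    [0.43, 0.57],
    [0.84, 0.16],
    [0.18, 0.82],
]


def row_to_label(row):
    """Return the index of the largest probability, which is the predicted label."""
    return max(range(len(row)), key=lambda i: row[i])


labels = [row_to_label(row) for row in probabilities]
print(labels)  # [0, 1, 1, 0, 1]
```

Because the index of the largest value is the label, the same helper works unchanged for rows with more than two classes.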
71 00:04:43,990 --> 00:04:49,110 And so if we added these two together, the maximum probability you can get is one. 72 00:04:49,140 --> 00:04:55,330 So 0.89 plus 0.11, and then if we did the same for the next one, 0.49 73 00:04:55,330 --> 00:05:02,470 plus 0.51, you kind of get the point there, right? One, one. That's the maximum probability. 74 00:05:02,470 --> 00:05:11,640 So what it's saying is that this sample has a 0.89 probability of the label being zero. 75 00:05:11,650 --> 00:05:20,010 And the next sample here, which gets the label 1, has 0.51, so slightly, only just slightly, does that 76 00:05:20,010 --> 00:05:25,530 have a higher probability of being label 1. So that's why it's assigned one. When we call predict, we force the 77 00:05:25,530 --> 00:05:27,640 model to give us back one label. 78 00:05:27,690 --> 00:05:30,570 So this is where predict_proba comes in handy. 79 00:05:30,570 --> 00:05:30,960 Right. 80 00:05:31,420 --> 00:05:36,420 We want to figure out the probability that our sample is given a certain label. 81 00:05:36,420 --> 00:05:42,030 So this one here is basically a coin toss, it's almost 50/50, but the model, you could probably say, 82 00:05:42,030 --> 00:05:50,280 is pretty confident on this sample, this one here being zero, because it's 0.89 versus 0.11. The same 83 00:05:50,280 --> 00:05:50,930 goes for here. 84 00:05:50,940 --> 00:05:51,270 Right. 85 00:05:51,270 --> 00:05:56,040 So this is number three; the third index is given zero. 86 00:05:56,130 --> 00:05:58,020 We see this one here. 87 00:05:58,110 --> 00:06:01,900 So it's pretty damn confident that this one is not heart disease as well. 88 00:06:02,010 --> 00:06:04,240 So that's the difference between predict and predict_proba. 89 00:06:04,860 --> 00:06:08,130 Also, say we did have more than two classes here.
90 00:06:08,160 --> 00:06:13,890 So if we had like 10 labels, if you called predict_proba on it you'd get a probability value for 91 00:06:13,890 --> 00:06:19,140 each of those classes that we had, but because we only have two, we're getting back arrays of two values 92 00:06:19,140 --> 00:06:19,850 here. 93 00:06:20,010 --> 00:06:26,260 And so the threshold, because we have two classes, is whichever one has over 0.5. 94 00:06:26,280 --> 00:06:32,200 So this is why this one has over 0.5 and it gets assigned a label of one. 95 00:06:32,280 --> 00:06:33,530 Same with the next one. 96 00:06:33,690 --> 00:06:35,090 And this one has over 0.5. 97 00:06:35,100 --> 00:06:39,620 So it gets assigned a label of zero, which is the index in this array here. 98 00:06:39,870 --> 00:06:46,080 And then finally for this one, it gets assigned a label of one because this one is over 0.5, with 99 00:06:46,080 --> 00:06:52,580 the value of 0.82. Where could you use predict_proba? Well, maybe in the future, 100 00:06:52,590 --> 00:06:57,210 right, you're working on this kind of project and you want to make sure that your model is very confident. 101 00:06:57,440 --> 00:07:03,240 Say we're deploying this to production, right, we're using this in a hospital, and we don't want our model 102 00:07:03,270 --> 00:07:07,750 to give us samples that only have a probability estimate of 0.51. 103 00:07:07,760 --> 00:07:11,220 We want to go, hey model, only give us the samples.
104 00:07:11,220 --> 00:07:16,310 So this is where we could use predict_proba to only give us the samples that are maybe even higher 105 00:07:16,310 --> 00:07:22,110 than 0.89. Maybe we only want when our model is extremely confident, and then we'll use that 106 00:07:22,110 --> 00:07:27,960 prediction. Or maybe it is helpful to know which samples our model is not sure about. Then maybe we 107 00:07:27,960 --> 00:07:33,870 could look at that sample and go, hey, why is that sample, why is this row, why is the model unclear about 108 00:07:33,870 --> 00:07:35,690 that? Is there something we could fix up? 109 00:07:35,820 --> 00:07:41,730 So that's sort of the value there between predict and predict_proba. predict will give you a single 110 00:07:41,820 --> 00:07:48,840 label for each sample, whereas predict_proba returns the probabilities of a classification label. 111 00:07:49,170 --> 00:07:53,030 And remember, the maximum value here, if you add these up, is one. 112 00:07:53,040 --> 00:07:56,350 So the closer to 1, the more, in inverted commas, 113 00:07:56,370 --> 00:08:02,120 "sure" your model is that the prediction it's made is a certain class. 114 00:08:02,250 --> 00:08:02,780 All right. 115 00:08:03,210 --> 00:08:10,080 So now we've seen the two main ways of making predictions using a classification model. I want you 116 00:08:10,080 --> 00:08:10,620 to have a think. 117 00:08:10,620 --> 00:08:15,540 How can we make predictions using a regression model? To revisit, if you go back up and look at 118 00:08:15,540 --> 00:08:17,190 our regression model code, 119 00:08:17,190 --> 00:08:21,720 how can we make a prediction using our regression model? Say we wanted to predict on our Boston 120 00:08:21,720 --> 00:08:27,770 housing dataset the median house price given a row and different characteristics about a town. I'll challenge 121 00:08:27,780 --> 00:08:30,360 you to that. Maybe you'll figure it out before the next video.
122 00:08:30,390 --> 00:08:32,370 But otherwise we'll have a look at it then.
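As a recap of the confidence idea from this video (keep only the predictions the model is very sure about, and set aside the ones it is unsure about for a closer look), here is a minimal sketch in plain Python. The function name `split_by_confidence`, the threshold value, and the hard-coded rows are assumptions for illustration; in practice the rows would come from a classifier's `predict_proba` output.

```python
def split_by_confidence(probabilities, threshold=0.89):
    """Split predictions into confident and uncertain groups.

    Each entry is a tuple of (sample index, predicted label, probability).
    """
    confident, uncertain = [], []
    for index, row in enumerate(probabilities):
        # The predicted label is the index of the largest probability.
        label = max(range(len(row)), key=lambda i: row[i])
        confidence = row[label]
        if confidence >= threshold:
            confident.append((index, label, confidence))
        else:
            uncertain.append((index, label, confidence))
    return confident, uncertain


# Illustrative rows in [P(label 0), P(label 1)] order.
probabilities = [[0.89, 0.11], [0.49, 0.51], [0.18, 0.82]]

confident, uncertain = split_by_confidence(probabilities, threshold=0.8)
print(confident)  # [(0, 0, 0.89), (2, 1, 0.82)]
print(uncertain)  # [(1, 1, 0.51)]
```

The near coin-toss row (0.49 versus 0.51) lands in `uncertain`, which is the kind of sample worth inspecting by hand. Where to set the threshold is a judgment call that depends on the application.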