1 00:00:00,780 --> 00:00:06,960 In this lesson we're going to talk about another metric to complement our recall and accuracy metrics. 2 00:00:07,080 --> 00:00:14,700 This metric is called precision or it's also known as positive predictive value. 3 00:00:14,700 --> 00:00:24,300 Precision is defined as the true positives divided by the sum of the true positives and the false positives. 4 00:00:25,260 --> 00:00:32,730 In other words we have a ratio of correctly predicted spam messages to the total number of times we 5 00:00:32,730 --> 00:00:36,440 predicted that an email will span. 6 00:00:36,490 --> 00:00:42,970 Now this equation looks very very similar to recall but instead of using the false negatives we are 7 00:00:42,970 --> 00:00:50,880 using the false positives in the denominator and we also see that looking at this definition a high 8 00:00:50,880 --> 00:00:56,340 precision relates to a low false positive rate. 9 00:00:56,340 --> 00:01:04,650 Remember that in our case a false positive is a non spam message that ends up in the spam folder. 10 00:01:04,650 --> 00:01:09,100 This is why many people think of precision as a measure of exactness. 11 00:01:09,120 --> 00:01:16,240 Precision is a measure of quality let's have another look at our chart and see where those false positives 12 00:01:16,360 --> 00:01:19,380 end up showing up the true positives. 13 00:01:19,420 --> 00:01:26,290 As we've discussed before are in this upper triangle and we've got them here showing up in green. 14 00:01:26,500 --> 00:01:33,430 The false positives on the other hand are the non spam messages that are in this upper triangle and 15 00:01:33,430 --> 00:01:42,580 I've colored them here in Orange and they're sitting alongside the correctly identified true spam messages. 16 00:01:42,610 --> 00:01:48,250 Now having looked at this a stylized slide here let's go back to our Jupiter notebook and look at our 17 00:01:48,250 --> 00:01:56,380 actual data and see if we can identify some of these false positives and false negatives on the actual 18 00:01:56,380 --> 00:01:59,040 chart that we generated. 19 00:01:59,080 --> 00:02:06,040 So if I have a look up here I can see some false negatives already I can see that with these red crosses 20 00:02:06,190 --> 00:02:14,920 showing up in our non spam section we've definitely got some spam emails that manage to inbox and down 21 00:02:14,920 --> 00:02:18,350 here I can actually spot a false positive as well. 22 00:02:18,400 --> 00:02:25,250 This blue dot is classified as spam but in actuality is a non spam email. 23 00:02:26,030 --> 00:02:31,870 But one thing that you already notice is that we have far more spam emails that got classified as non 24 00:02:31,870 --> 00:02:34,750 spam than the other way around. 25 00:02:34,930 --> 00:02:41,530 In a previous lesson we've talked about how an imbalance dataset can give us a misleadingly high accuracy 26 00:02:41,530 --> 00:02:42,610 metric. 27 00:02:42,640 --> 00:02:47,800 We had this hypothetical cancer model that simply labeled everyone as not having cancer. 28 00:02:49,190 --> 00:02:55,010 And given the prevalence of cancer in the UK and the population we calculated that the accuracy would 29 00:02:55,010 --> 00:03:04,500 be around 96 percent but if you think about how recall and precision are calculated what would the recall 30 00:03:04,500 --> 00:03:09,900 metric and the precision metric for this hypothetical model look like. 31 00:03:09,900 --> 00:03:12,780 Have a think about this and I'll tell you the answer shortly. 32 00:03:15,770 --> 00:03:22,670 Well the answer is that we would have a value of zero for precision and we would also have a big fat 33 00:03:22,670 --> 00:03:24,930 zero for recall. 34 00:03:24,950 --> 00:03:26,030 Why. 35 00:03:26,030 --> 00:03:29,720 Well let's take a look at these two equations in our thought experiment. 36 00:03:29,720 --> 00:03:32,810 We actually have no true positives. 37 00:03:32,870 --> 00:03:37,060 There's not a single patient with cancer that was correctly identified. 38 00:03:37,130 --> 00:03:40,920 Therefore both recall and precision are equal to zero. 39 00:03:41,030 --> 00:03:49,270 While the accuracy is still very high and this actually leads me onto my next question for you Do you 40 00:03:49,270 --> 00:03:53,720 think that there is a tradeoff between precision and recall. 41 00:03:53,860 --> 00:04:01,990 In other words if precision goes down does recall go up well let me ask you this. 42 00:04:02,210 --> 00:04:09,500 What happens in our cancer thought experiment where we label everyone as not having cancer but then 43 00:04:09,500 --> 00:04:15,360 somehow we're able to correctly identify a single person who has cancer. 44 00:04:15,380 --> 00:04:20,560 In other words we now predict that no one has cancer except one really unlucky guy. 45 00:04:20,600 --> 00:04:23,610 I'm sorry Mr. Jones I've got terrible news for you. 46 00:04:23,870 --> 00:04:25,270 That kind of stuff. 47 00:04:25,310 --> 00:04:27,590 So what we get in this case 48 00:04:31,120 --> 00:04:38,080 well our precision school in this case would be equal to one because we have one true positive and no 49 00:04:38,110 --> 00:04:39,410 false positive. 50 00:04:39,430 --> 00:04:47,830 We've maxed out precision however our recall is still very very very low because we've missed everyone 51 00:04:47,860 --> 00:04:55,170 else and we have a high false negative rate conversely what if we change it around. 52 00:04:55,170 --> 00:04:58,330 What if instead our model always predicted cancer. 53 00:04:58,350 --> 00:05:01,610 Everybody's got cancer cancer cancer cancer cancer. 54 00:05:01,710 --> 00:05:05,140 This by the way seemed to be the inevitable conclusion. 55 00:05:05,280 --> 00:05:11,970 If I was left to my own devices looking up my symptoms on a Web M.D. or at least that's how I remember 56 00:05:11,970 --> 00:05:13,690 it used to be. 57 00:05:14,010 --> 00:05:16,750 What's the what's our recall score in this case. 58 00:05:16,770 --> 00:05:18,090 What's the recall score. 59 00:05:18,090 --> 00:05:24,110 Where the model always predicts cancer well recall score would actually be maxed out. 60 00:05:24,140 --> 00:05:27,420 It's one we've correctly identified everyone with cancer. 61 00:05:27,710 --> 00:05:35,300 However our precision is very very low and low precision was also the Shepherd's boy problem. 62 00:05:35,310 --> 00:05:43,330 He cried wolf so many times but he only had one true positive and this brings us to why we can't maximize 63 00:05:43,450 --> 00:05:47,450 our recall score simply by labeling every email as spam. 64 00:05:47,590 --> 00:05:52,240 If we label every email as spam our model suffers from low precision. 65 00:05:52,240 --> 00:05:59,010 And conversely if we have fewer false positives then we are not classifying legitimate e-mail as spam. 66 00:05:59,200 --> 00:06:01,210 And we have higher precision. 67 00:06:01,210 --> 00:06:04,300 So is there a tradeoff between precision and recall. 68 00:06:04,300 --> 00:06:10,360 Well the answer is that yes very often there is when you're increasing one you're very often decreasing 69 00:06:10,360 --> 00:06:11,690 the other. 70 00:06:11,740 --> 00:06:17,440 And now that we've talked about the theory let's calculate precision in our naive based model Let's 71 00:06:17,530 --> 00:06:27,060 insert a subheading here called precision school and after that I'd like to pose a challenge for you. 72 00:06:27,310 --> 00:06:34,810 Can you calculate the precision of our naive based model store the result in a variable called precision 73 00:06:34,900 --> 00:06:42,460 underscore score and then print out the precision score as a decimal number and format it so that it 74 00:06:42,460 --> 00:06:45,610 is rounded to three decimal places. 75 00:06:45,610 --> 00:06:49,290 I'll give you a few seconds to pause the video and give this a go. 76 00:06:52,040 --> 00:06:52,400 All right. 77 00:06:52,460 --> 00:06:54,460 So here's the solution. 78 00:06:54,590 --> 00:06:59,800 Precision underscore Gore can't spell. 79 00:06:59,800 --> 00:07:12,020 As usual is equal to the true positives summed up divided by the sum of the true positives summed up 80 00:07:12,710 --> 00:07:18,410 plus the false positives summed up. 81 00:07:18,760 --> 00:07:20,620 And that's equal to our precision. 82 00:07:20,720 --> 00:07:23,730 Just the formula from the slides right. 83 00:07:23,990 --> 00:07:27,890 And printing this out with precision. 84 00:07:27,890 --> 00:07:31,850 School is curly braces colon. 85 00:07:31,850 --> 00:07:34,470 Point three for three decimal places. 86 00:07:34,580 --> 00:07:39,310 No percent sign and then the don't format. 87 00:07:39,710 --> 00:07:42,560 Precision on a score score. 88 00:07:42,560 --> 00:07:43,750 There we go. 89 00:07:43,880 --> 00:07:49,470 And here we have a precision score of zero point nine seven nine. 90 00:07:49,490 --> 00:07:51,100 Not bad right. 91 00:07:51,170 --> 00:07:58,070 Well in the next lesson we're going to talk about how to reconcile the precision and the recall scores 92 00:07:58,370 --> 00:08:00,420 into a single metric. 93 00:08:00,440 --> 00:08:02,510 I hope you're as excited as I am. 94 00:08:02,510 --> 00:08:03,170 See you there.