1 00:00:01,260 --> 00:00:06,650 In this video, we will discuss some of the terms which are used to characterize the performance of a 2 00:00:06,800 --> 00:00:07,520 classifier. 3 00:00:08,970 --> 00:00:14,080 So we saw how to create a confusion matrix. In the confusion matrix, on one side 4 00:00:14,100 --> 00:00:16,350 we had the true values; on the other, 5 00:00:16,380 --> 00:00:17,700 we had the predicted values. 6 00:00:18,240 --> 00:00:26,340 And the cross-section of these classes gave us this confusion matrix. In this confusion matrix, 7 00:00:26,940 --> 00:00:32,400 we first assigned the names true negative, false positive, false negative and true positive, 8 00:00:32,880 --> 00:00:34,140 depending on two things. 9 00:00:34,530 --> 00:00:37,470 Negative or positive always refers to the predicted 10 00:00:37,470 --> 00:00:37,700 one. 11 00:00:38,130 --> 00:00:41,950 So in the predicted, if it is negative, we'll call it negative. 12 00:00:42,120 --> 00:00:44,040 If it is positive, we'll call it positive. 13 00:00:45,300 --> 00:00:49,650 This true and false is whether we were correct or whether we were wrong. 14 00:00:50,070 --> 00:00:54,900 So it is true negative because predicted was negative and actual was also negative. 15 00:00:55,020 --> 00:00:56,790 So it is true and negative. 16 00:00:58,560 --> 00:01:03,210 This is true positive because predicted was positive and actual was also positive. 17 00:01:03,270 --> 00:01:05,040 So it is true and positive. 18 00:01:06,000 --> 00:01:09,270 The other two are false positive and false negative. 19 00:01:11,140 --> 00:01:18,480 The total actual negatives we are denoting here with capital N, and the total actual positives we are 20 00:01:18,600 --> 00:01:20,680 denoting here with capital P. 21 00:01:21,810 --> 00:01:29,640 The predicted total negatives we are denoting with N star, and the predicted positives 
22 00:01:29,750 --> 00:01:36,600 we are denoting with P star. There are several terms which can be used to characterize the performance. 23 00:01:37,590 --> 00:01:40,890 This table will help you avoid confusion in all these terms. 24 00:01:42,600 --> 00:01:45,570 The first term is called false positive rate. 25 00:01:45,990 --> 00:01:47,880 It has two other names: 26 00:01:48,120 --> 00:01:51,270 type I error and one minus specificity. 27 00:01:53,070 --> 00:01:54,960 It is calculated as FP by 28 00:01:55,050 --> 00:01:57,630 N, that is, false positive 29 00:01:58,080 --> 00:01:58,710 divided by 30 00:01:58,930 --> 00:02:01,930 N. So let me give you an example 31 00:02:01,950 --> 00:02:04,680 to understand all these performance measures. 32 00:02:06,450 --> 00:02:13,620 So suppose you are the manager of this real estate company and you want to use such a classifier to 33 00:02:13,620 --> 00:02:18,420 tell you whether you should choose to pick any particular property to sell or not. 34 00:02:19,680 --> 00:02:26,880 Why are you thinking of using a classifier? See, when you pick up a property, you will assign an agent 35 00:02:26,880 --> 00:02:27,720 to the property. 36 00:02:27,870 --> 00:02:33,300 You will do marketing for that property and do various other activities to get it sold. 37 00:02:34,560 --> 00:02:36,960 You will incur cost for all these activities. 38 00:02:37,230 --> 00:02:40,980 Now, if that property does not sell, you have a sunk cost, 39 00:02:41,160 --> 00:02:43,260 and that will be real money lost. 40 00:02:44,970 --> 00:02:52,770 So from your classifier, you want that whenever it predicts that the house will be sold, it should 41 00:02:52,770 --> 00:02:53,540 be accurate 42 00:02:54,150 --> 00:02:54,910 most of the time. 43 00:02:56,400 --> 00:02:56,760 So, 44 00:02:57,040 --> 00:03:00,180 can you think which performance measure that will be? 
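The cell names, the totals N, P, N star and P star, and the false positive rate described so far can be sketched in a few lines of Python. This is only an illustration: the two label lists are made-up data, and the helper name `confusion_counts` is hypothetical, not something from the video.

```python
# Made-up example data (an assumption for illustration):
# 1 = property sold (positive), 0 = property not sold (negative).
actual    = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 1, 1, 0]


def confusion_counts(actual, predicted):
    """Return (TN, FP, FN, TP). Hypothetical helper for this sketch."""
    tn = fp = fn = tp = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1   # predicted positive, and we were correct
        elif p == 1 and a == 0:
            fp += 1   # predicted positive, and we were wrong
        elif p == 0 and a == 1:
            fn += 1   # predicted negative, and we were wrong
        else:
            tn += 1   # predicted negative, and we were correct
    return tn, fp, fn, tp


tn, fp, fn, tp = confusion_counts(actual, predicted)

n = tn + fp        # N: total actual negatives
p = fn + tp        # P: total actual positives
n_star = tn + fn   # N star: total predicted negatives
p_star = fp + tp   # P star: total predicted positives

false_positive_rate = fp / n   # FP by N, also one minus specificity

print(tn, fp, fn, tp)
print(n, p, n_star, p_star)
print(false_positive_rate)
```

With this made-up data, two of the six properties that did not sell were predicted as sold, so FP by N is 2 by 6.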
45 00:03:01,640 --> 00:03:09,770 We want, whenever my classifier is predicting positive, that is, out of these P stars, I want to see 46 00:03:09,770 --> 00:03:10,640 the accuracy, 47 00:03:11,360 --> 00:03:12,760 that is, true positive. 48 00:03:13,360 --> 00:03:17,330 So TP by P star is one of the measures that I am interested in. 49 00:03:18,800 --> 00:03:20,700 It should be high for my classifier. 50 00:03:21,680 --> 00:03:24,800 That is known as precision, TP by P star. 51 00:03:25,850 --> 00:03:33,020 In other words, FP by P star is the amount of sunk cost that you are incurring 52 00:03:33,110 --> 00:03:41,540 due to your classifier. The other way is, out of how many times it was not sold, 53 00:03:41,810 --> 00:03:44,870 your model saved you by saying that it will not be sold. 54 00:03:45,500 --> 00:03:51,650 So that is TN by N. One minus TN by N will be FP by N, 55 00:03:51,770 --> 00:03:54,870 that is, the false positive rate. TN by N 56 00:03:55,040 --> 00:03:56,480 is called specificity. 57 00:03:56,840 --> 00:04:01,450 That is how many times your model saved you from the sunk cost. 58 00:04:02,030 --> 00:04:04,250 So that measure is specificity. 59 00:04:06,530 --> 00:04:13,880 On the other hand, if you decide not to pick up a property and it is sold easily, someone else earns the 60 00:04:13,950 --> 00:04:17,690 commission that you could have earned. This is opportunity cost. 61 00:04:17,930 --> 00:04:23,630 That is, you will not lose real money, but you lose the opportunity to earn money. 62 00:04:25,940 --> 00:04:31,850 So can you guess which performance measure will give that? This was the total opportunity of earning 63 00:04:31,850 --> 00:04:32,180 money, 64 00:04:32,910 --> 00:04:33,160 this P. 65 00:04:34,340 --> 00:04:35,030 And out of this, 
66 00:04:35,150 --> 00:04:44,180 how many times did you actually earn? That is TP. So TP by P is giving you the share of times you actually earned 67 00:04:44,360 --> 00:04:51,710 from the available opportunity, whereas this FN by P is the opportunity cost you incurred. So TP by 68 00:04:51,710 --> 00:04:52,400 P 69 00:04:52,670 --> 00:04:56,180 is the performance measure here, which is known as sensitivity. 70 00:04:57,620 --> 00:05:03,350 So all these performance measures have some business meaning inside them and correspondingly, there 71 00:05:03,350 --> 00:05:07,430 are some costs incurred for the inefficiencies of your model. 72 00:05:08,540 --> 00:05:15,020 So depending on which cost is high and which is low, you have to select the corresponding performance 73 00:05:15,020 --> 00:05:17,750 measure and choose the appropriate classifier. 74 00:05:20,630 --> 00:05:28,380 Another common parameter, often stated while specifying the performance of a classifier, is area under 75 00:05:28,380 --> 00:05:30,200 the curve of the ROC curve. 76 00:05:31,950 --> 00:05:34,940 So this curve is known as the ROC curve. 77 00:05:35,910 --> 00:05:40,410 The name ROC is historic, and it comes from communications theory. 78 00:05:41,100 --> 00:05:48,960 It is an acronym for receiver operating characteristics. To create this ROC curve, on the X axis 79 00:05:49,050 --> 00:05:53,340 we use this false positive rate, that is, one minus specificity. 80 00:05:54,090 --> 00:05:56,590 And on the Y axis, we use this true positive rate, 81 00:05:57,060 --> 00:05:58,350 that is, sensitivity. 82 00:05:59,730 --> 00:06:03,780 So on the Y axis, we have this sensitivity; on the X axis 83 00:06:03,810 --> 00:06:07,110 we have false positive rate, or one minus specificity. 84 00:06:08,010 --> 00:06:09,540 And we draw this curve. 85 00:06:10,350 --> 00:06:12,030 This curve is called the ROC curve. 86 00:06:14,400 --> 00:06:21,630 This bend of the curve should be as close as possible to this top left corner for the classifier to be good. 
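The measures from the property example (precision, specificity, sensitivity) and the two quantities used as ROC axes can be computed directly from the four cell counts. The following is a minimal sketch; the function name `rates` and the counts passed to it are assumptions chosen for illustration, not figures from the video.

```python
def rates(tn, fp, fn, tp):
    """Hypothetical helper: performance measures from confusion matrix counts."""
    n = tn + fp        # N: total actual negatives
    p = fn + tp        # P: total actual positives
    p_star = fp + tp   # P star: total predicted positives
    return {
        # Of the properties we picked up, the share that actually sold.
        "precision": tp / p_star,
        # Of the properties that did not sell, the share the model warned us about.
        "specificity": tn / n,
        # Of the properties that sold, the share we picked up (Y axis of the ROC curve).
        "sensitivity": tp / p,
        # One minus specificity (X axis of the ROC curve).
        "false_positive_rate": fp / n,
        # Share of the earning opportunity that was missed.
        "missed_opportunity": fn / p,
    }


# Made-up counts, assumed only for this example.
m = rates(tn=4, fp=2, fn=1, tp=3)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Which of these numbers matters most depends on which cost is higher: a low precision or specificity means more sunk cost, while a low sensitivity means more opportunity cost.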
87 00:06:22,410 --> 00:06:28,510 That is, the best classifier will have this point very near to this top left corner. 88 00:06:31,130 --> 00:06:38,720 In other words, the area under this curve should be as close as possible to the area of this whole square for this 89 00:06:38,720 --> 00:06:40,370 classifier to work perfectly. 90 00:06:41,300 --> 00:06:43,580 If this curve is very close to the straight line, 91 00:06:45,110 --> 00:06:47,840 the performance of this classifier is not very good, 92 00:06:48,260 --> 00:06:51,350 and it is almost similar to a random classifier. 93 00:06:53,040 --> 00:06:59,310 So whenever we want to compare the performance of different models, we can use the ROC curve. 94 00:07:00,120 --> 00:07:03,270 We will specify the area under the curve of the ROC curve, 95 00:07:03,390 --> 00:07:05,840 that is, the AUC of ROC. 96 00:07:06,930 --> 00:07:12,240 Whichever AUC is highest, that classifier will be considered the best. 97 00:07:13,590 --> 00:07:19,320 So for the best scenario, where this curve is exactly following these two axes, 98 00:07:20,010 --> 00:07:26,340 the AUC value will be one, that is, the area under the curve will be equal to this whole square, 99 00:07:26,670 --> 00:07:27,690 which is one into one, 100 00:07:27,840 --> 00:07:28,380 that is one. 101 00:07:30,800 --> 00:07:35,000 For a random classifier, the AUC value is nearly half. 102 00:07:35,660 --> 00:07:38,300 So it will have a curve like the straight line, 103 00:07:38,720 --> 00:07:40,580 and the area under the curve will be half.
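To make the area under the curve concrete, here is a minimal pure-Python sketch. It assumes the classifier outputs a score for each item and that both classes are present; it sweeps the decision threshold to get the points (false positive rate, sensitivity) and sums the area with the trapezoid rule. The function names `roc_points` and `auc` and all the data are hypothetical, chosen only for this illustration.

```python
def roc_points(actual, scores):
    """ROC points (FPR, TPR), one per distinct threshold, starting at (0, 0).

    Assumes actual contains at least one 1 and at least one 0.
    """
    p = sum(actual)          # P: total actual positives
    n = len(actual) - p      # N: total actual negatives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Predict positive whenever the score reaches the threshold.
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / n, tp / p))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


actual = [0, 0, 1, 1]

# Scores that separate the classes perfectly: the curve follows the two axes.
perfect = auc(roc_points(actual, [0.1, 0.2, 0.8, 0.9]))

# The same score for every item carries no information: the curve is the diagonal.
uninformative = auc(roc_points(actual, [0.5, 0.5, 0.5, 0.5]))

# A classifier in between, with one negative scored above one positive.
in_between = auc(roc_points(actual, [0.1, 0.4, 0.35, 0.8]))

print(perfect, uninformative, in_between)
```

The perfect scores give an area equal to the whole one-by-one square, the uninformative scores give half of it, and the in-between classifier lands between the two, which is why the model with the highest AUC is preferred when comparing classifiers.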