1
00:00:01,260 --> 00:00:06,650
In this video, we will discuss some of the terms which are used to characterize the performance of

2
00:00:06,800 --> 00:00:07,520
a classifier.

3
00:00:08,970 --> 00:00:14,080
So we saw how to create a confusion matrix. In the confusion matrix, on one side,

4
00:00:14,100 --> 00:00:16,350
we had the true values; on the other,

5
00:00:16,380 --> 00:00:17,700
we had the predicted values.

6
00:00:18,240 --> 00:00:26,340
And the cross-section of these four classes gave us this confusion matrix. In this confusion matrix,

7
00:00:26,940 --> 00:00:32,400
we first assigned the names true negative, false positive, false negative and true positive,

8
00:00:32,880 --> 00:00:34,140
depending on two things.

9
00:00:34,530 --> 00:00:37,470
Negative or positive always refers to the predicted

10
00:00:37,470 --> 00:00:37,700
value.

11
00:00:38,130 --> 00:00:41,950
So in the predicted value, if it is negative, we'll call it negative.

12
00:00:42,120 --> 00:00:44,040
If it is positive, we'll call it positive.

13
00:00:45,300 --> 00:00:49,650
This true and false is whether we were correct or whether we were wrong.

14
00:00:50,070 --> 00:00:54,900
So it is true negative because predicted was negative and actual was also negative.

15
00:00:55,020 --> 00:00:56,790
So it is true and negative.

16
00:00:58,560 --> 00:01:03,210
This is true positive because predicted was positive and actual was also positive.

17
00:01:03,270 --> 00:01:05,040
So it is true and positive.

18
00:01:06,000 --> 00:01:09,270
The other two are false positive and false negative.
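
The naming rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `confusion_counts`, the 1/0 label encoding, and the sample data are assumptions made for this example and do not come from the video.

```python
def confusion_counts(actual, predicted):
    """Count TN, FP, FN and TP for binary labels (1 = positive, 0 = negative)."""
    counts = {"TN": 0, "FP": 0, "FN": 0, "TP": 0}
    for a, p in zip(actual, predicted):
        correct = "T" if a == p else "F"  # true/false: was the prediction right?
        sign = "P" if p == 1 else "N"     # positive/negative: what was predicted
        counts[correct + sign] += 1
    return counts


# Hypothetical data: 1 = property sold, 0 = property not sold
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)
```

Note that the second letter of each name is taken from the predicted value alone, while the first letter only records whether the prediction matched the actual value.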

19
00:01:11,140 --> 00:01:18,480
The total actual negatives we are denoting here with capital N, and the total actual positives we are

20
00:01:18,600 --> 00:01:20,680
denoting here with capital P.

21
00:01:21,810 --> 00:01:29,640
The predicted total negatives we are denoting with N*, and the predicted positives

22
00:01:29,750 --> 00:01:36,600
we are denoting with P*. There are several terms which can be used to characterize the performance.

23
00:01:37,590 --> 00:01:40,890
This table will help you avoid confusion in all these terms.

24
00:01:42,600 --> 00:01:45,570
The first term is called false positive rate.

25
00:01:45,990 --> 00:01:47,880
It has two other names.

26
00:01:48,120 --> 00:01:51,270
Type I error and one minus specificity.

27
00:01:53,070 --> 00:01:54,960
It is calculated as FP by

28
00:01:55,050 --> 00:01:57,630
N, that is, false positives

29
00:01:58,080 --> 00:01:58,710
divided by

30
00:01:58,930 --> 00:02:01,930
N. So let me give you an example

31
00:02:01,950 --> 00:02:04,680
to understand all these performance measures.

32
00:02:06,450 --> 00:02:13,620
So suppose you are the manager of this real estate company and you want to use such a classifier to

33
00:02:13,620 --> 00:02:18,420
tell you whether you should choose to pick any particular property to sell or not.

34
00:02:19,680 --> 00:02:26,880
Why are you thinking of using a classifier? Say, when you pick up a property, you will assign an agent

35
00:02:26,880 --> 00:02:27,720
to the property.

36
00:02:27,870 --> 00:02:33,300
You will do marketing for that property and do various other activities to get it sold.

37
00:02:34,560 --> 00:02:36,960
You will incur cost for all these activities.

38
00:02:37,230 --> 00:02:40,980
Now, if that property does not sell, you have a sunk cost.

39
00:02:41,160 --> 00:02:43,260
And that will be real money lost.

40
00:02:44,970 --> 00:02:52,770
So from your classifier, you want that whenever it predicts that the house will be sold, it should

41
00:02:52,770 --> 00:02:53,540
be accurate

42
00:02:54,150 --> 00:02:54,910
most of the time.

43
00:02:56,400 --> 00:02:56,760
So

44
00:02:57,040 --> 00:03:00,180
can you think which performance measure that will be?

45
00:03:01,640 --> 00:03:09,770
Whenever my classifier is predicting positive, that is, out of these P*, I want to see

46
00:03:09,770 --> 00:03:10,640
the accuracy,

47
00:03:11,360 --> 00:03:12,760
that is, the true positives.

48
00:03:13,360 --> 00:03:17,330
So TP by P* is one of the measures that I am interested in.

49
00:03:18,800 --> 00:03:20,700
That should be high for my classifier.

50
00:03:21,680 --> 00:03:24,800
That is known as precision: TP by P*.

51
00:03:25,850 --> 00:03:33,020
In other words, FP by P* is the amount of sunk cost that you are incurring

52
00:03:33,110 --> 00:03:41,540
due to your classifier. The other way is, out of how many times it was not sold,

53
00:03:41,810 --> 00:03:44,870
how often your model saved you by saying that it will not be sold.

54
00:03:45,500 --> 00:03:51,650
So that is TN by N. One minus TN by N will be FP by N,

55
00:03:51,770 --> 00:03:54,870
that is, the false positive rate. TN by N

56
00:03:55,040 --> 00:03:56,480
is called specificity.

57
00:03:56,840 --> 00:04:01,450
That is how many times your model saved you from the sunk cost.

58
00:04:02,030 --> 00:04:04,250
So that measure is specificity.

59
00:04:06,530 --> 00:04:13,880
On the other hand, if you decide not to pick up a property and it is sold easily, someone else earns

60
00:04:13,950 --> 00:04:17,690
the commission that you could have earned. This is opportunity cost.

61
00:04:17,930 --> 00:04:23,630
That is, you will not lose real money, but you lose the opportunity to earn money.

62
00:04:25,940 --> 00:04:31,850
So can you guess which performance measure will give that? This was the total opportunity of earning

63
00:04:31,850 --> 00:04:32,180
money,

64
00:04:32,910 --> 00:04:33,160
this P.

65
00:04:34,340 --> 00:04:35,030
Out of this,

66
00:04:35,150 --> 00:04:44,180
how many times you actually earned is TP. TP is giving you the number of times you actually earned

67
00:04:44,360 --> 00:04:51,710
from the available opportunity, whereas this FN is the opportunity cost you incurred. So TP by

68
00:04:51,710 --> 00:04:52,400
P

69
00:04:52,670 --> 00:04:56,180
is the performance measure here, which is known as sensitivity.

70
00:04:57,620 --> 00:05:03,350
So all these performance measures have some business meaning inside them and correspondingly, there

71
00:05:03,350 --> 00:05:07,430
are some costs incurred for the inefficiencies of your model.

72
00:05:08,540 --> 00:05:15,020
So depending on which cost is high and which is low, you have to select the corresponding performance

73
00:05:15,020 --> 00:05:17,750
measure and choose the appropriate classifier.
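
As a rough illustration, the four measures discussed so far can be computed from the confusion matrix counts as follows. The function name `performance_measures` and the counts passed to it are hypothetical values chosen for this sketch, and it assumes that N, P and P* are all non-zero.

```python
def performance_measures(tn, fp, fn, tp):
    """Compute the measures from the four confusion matrix counts."""
    n = tn + fp       # N: total actual negatives
    p = fn + tp       # P: total actual positives
    p_star = fp + tp  # P*: total predicted positives
    return {
        "false_positive_rate": fp / n,  # FP / N, equal to 1 - specificity
        "specificity": tn / n,          # TN / N: how often the model saved a sunk cost
        "sensitivity": tp / p,          # TP / P: share of the opportunity actually earned
        "precision": tp / p_star,       # TP / P*: how often a "will sell" prediction was right
    }


# Hypothetical counts for the real estate example
measures = performance_measures(tn=50, fp=10, fn=5, tp=35)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

In the real estate example, a manager who fears sunk cost would look mainly at precision and specificity, while one who fears opportunity cost would look mainly at sensitivity.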

74
00:05:20,630 --> 00:05:28,380
Another common parameter, often stated while specifying the performance of a classifier, is the area under

75
00:05:28,380 --> 00:05:30,200
the curve of the ROC curve.

76
00:05:31,950 --> 00:05:34,940
So this curve is known as the ROC curve.

77
00:05:35,910 --> 00:05:40,410
The name ROC is historic and it comes from communications theory.

78
00:05:41,100 --> 00:05:48,960
It is an acronym for receiver operating characteristics. To create this ROC curve, on the x axis

79
00:05:49,050 --> 00:05:53,340
we use the false positive rate, that is, one minus specificity.

80
00:05:54,090 --> 00:05:56,590
And on the y axis, we use the true positive rate,

81
00:05:57,060 --> 00:05:58,350
that is, sensitivity.

82
00:05:59,730 --> 00:06:03,780
So on the y axis, we have this sensitivity; on the x axis,

83
00:06:03,810 --> 00:06:07,110
we have the false positive rate, or one minus specificity.

84
00:06:08,010 --> 00:06:09,540
And we draw this curve.

85
00:06:10,350 --> 00:06:12,030
This curve is called the ROC curve.

86
00:06:14,400 --> 00:06:21,630
This bend of the curve should be as close as possible to this top left corner for the classifier to be good.

87
00:06:22,410 --> 00:06:28,510
That is, the best classifier will have this point very near to this top left corner.

88
00:06:31,130 --> 00:06:38,720
In other words, the area under this curve should be as close as possible to the area of this whole square for this

89
00:06:38,720 --> 00:06:40,370
classifier to work perfectly.

90
00:06:41,300 --> 00:06:43,580
If this curve is very close to the straight line,

91
00:06:45,110 --> 00:06:47,840
the performance of this classifier is not very good,

92
00:06:48,260 --> 00:06:51,350
and it is almost the same as a random classifier.

93
00:06:53,040 --> 00:06:59,310
So whenever we want to compare the performance of different models, we can use the ROC curve.

94
00:07:00,120 --> 00:07:03,270
We will specify the area under the curve of the ROC curve,

95
00:07:03,390 --> 00:07:05,840
that is, the AUC of the ROC.

96
00:07:06,930 --> 00:07:12,240
Whichever AUC is highest, that classifier will be considered the best.

97
00:07:13,590 --> 00:07:19,320
So for the best scenario, where this curve is exactly following these two axes,

98
00:07:20,010 --> 00:07:26,340
the AUC value will be one, that is, the area under the curve will be equal to this whole square,

99
00:07:26,670 --> 00:07:27,690
which is one times one,

100
00:07:27,840 --> 00:07:28,380
that is, one.

101
00:07:30,800 --> 00:07:35,000
For a random classifier, the AUC value is nearly half.

102
00:07:35,660 --> 00:07:38,300
So it will have a curve like the straight line.

103
00:07:38,720 --> 00:07:40,580
And the area under the curve will be half.
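
To make the two extreme cases concrete, here is a minimal sketch that builds the ROC points by sweeping a decision threshold over predicted scores and then measures the area with the trapezoidal rule. The helper names `roc_points` and `auc` and the score lists are assumptions made for this illustration; the sketch also assumes the data contains at least one actual positive and one actual negative.

```python
def roc_points(actual, scores):
    """Return (false positive rate, sensitivity) points for every threshold."""
    p = sum(actual)        # P: total actual positives
    n = len(actual) - p    # N: total actual negatives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / n, tp / p))
    return points


def auc(points):
    """Area under the piecewise-linear curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


actual = [1, 1, 1, 0, 0, 0]
perfect_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]  # every positive outranks every negative
uninformative_scores = [0.5] * 6                 # same score for everything

perfect_auc = auc(roc_points(actual, perfect_scores))
uninformative_auc = auc(roc_points(actual, uninformative_scores))
print(perfect_auc, uninformative_auc)
```

The perfectly ranked scores trace the two axes and give an area of one, while scores that carry no information give the straight diagonal line with an area of one half.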