Now, in the last lecture, we standardized our data. Now it's time to build our SVM classification model.

First, we need to import SVM from sklearn, and we will use the SVC function of SVM to create the classification model.

Now, if you remember, for regression we were using SVR, but for classification we need to use SVC. The other steps are almost the same. We first need to create our object, then we need to train it using the X_train and y_train data. The next step is to predict the values of y_train and y_test using this trained object, and then we will use some evaluation metrics to check the performance of our model.

Now, let's start by importing SVM.

In this cell, we are first creating the object. Our object's variable name is clf_svm_l; here "l" stands for linear. This is just a variable name, so you can give your own variable name as well.

Then we are creating this object using the SVC function. In SVC, if you open the help, you can see the parameters and their default values. By default, the kernel is radial ('rbf'). Since we are creating our first model as a linear model, we need to provide the parameter kernel equal to 'linear'.
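The step above can be sketched as follows; this is a minimal example, assuming the variable name clf_svm_l used in the lecture (any name works):

```python
from sklearn.svm import SVC

# Create a linear-kernel SVM classifier. The default kernel is 'rbf'
# (radial), so we pass kernel='linear' explicitly.
clf_svm_l = SVC(kernel='linear', C=0.01)

print(clf_svm_l.kernel)  # linear
print(clf_svm_l.C)       # 0.01
```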
Then, if you remember, one of the most important parameters of SVM is the cost parameter, which is denoted by C. And here we are providing the value of C as 0.01.

So let's create this object. And let's also take a look at the standardized X_train and the y_train data.

Now, remember to use the standardized X_train data, not the raw X_train, to train this model. Standardization is very important before training your SVM model.

Now we have trained our model: we have fitted our X_train and y_train data in this object.

Now, let's predict the values of y using our trained model. Predicting values is very easy. You just have to use the .predict method of the fitted, trained model. So I can write clf_svm_l, the same object as above, and then I can use the .predict method, so I write .predict, and then we just have to provide the values of X.

If I provide the values of X_train, I will get predicted y_train data, and if I provide the values of X_test, I will get the predicted values of the y_test data.

So here I am saving my predicted values into the y_train_pred and y_test_pred variables. Let's run this to get the predicted values.
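The fit-and-predict workflow described above can be sketched end to end. This is a self-contained example on synthetic data (an assumption: the lecture's real X_train/X_test come from the previous lecture's standardization step, so here we recreate that step with StandardScaler):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the lecture's dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize before training, as the lecture stresses.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# Create and train the linear SVM classifier.
clf_svm_l = SVC(kernel='linear', C=0.01)
clf_svm_l.fit(X_train_std, y_train)

# Predict on both the training and the test data.
y_train_pred = clf_svm_l.predict(X_train_std)
y_test_pred = clf_svm_l.predict(X_test_std)
print(y_test_pred[:10])
```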
And if you want to look at the predicted values, you can just use either of these two variables. You can see these are the predicted values of our test dataset.

Now, the next step is to evaluate the performance of our model. In regression, we were using the R-squared and MSE values. In classification, we generally use the accuracy score and the confusion matrix.

Now, the accuracy score is the percentage of observations that our model is predicting correctly. The other metric we want to look at is the confusion matrix. Let's import these two functions from sklearn.metrics.

So sklearn has already provided us these two functions; we do not have to manually compute the accuracy score and confusion matrix using our predicted and actual values. We can just import the functions, and let us draw the confusion matrix on our test dataset.

The process is very simple. You just write the confusion_matrix function, and if you look at the parameters, you first have to provide the actual values of y, and then the predicted values of y. Since we want the confusion matrix for our test data, we just need to provide y_test, since our y_test contains the actual values, and then the second parameter is the predicted values of y_test, which we have calculated here.

Let's run this.
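The two metric functions can be sketched like this; the tiny hand-made label vectors here are illustrative, not the lecture's data:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Small illustrative labels: actual values first, predicted values second.
y_test = [0, 0, 1, 1, 1, 0]
y_test_pred = [0, 1, 1, 1, 0, 0]

# Rows of the result are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_test_pred)
print(cm)

# Fraction of observations predicted correctly.
acc = accuracy_score(y_test, y_test_pred)
print(acc)
```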
So this is the confusion matrix. Here, the rows represent the actual values and the columns represent the predicted values, and these numbers are the number of observations that fall into each category.

So, for example, if I select this first row: the first row stands for the number of actual zeros. There are 44 observations in this first row. That means in our dataset there are 44 actual zeros, and there are 58 actual ones.

Now, the columns stand for the predicted values. So among the predicted values, there are 16 zeros in total, and there are 86 ones in total. You can also calculate this using this table here, where I have put the counts of our y_test predicted variable. In this also, there are 16 zeros and 86 ones.

So, in short, this confusion matrix is telling me that for these 11 records, the actual value is zero and the predicted value is also zero, since this is in the first row and the first column. For these 5 records, the actual value is one, but the predicted value is zero. For these 33 records, the actual value is zero, since this is in the first row, but the predicted value is one, since this is in the second column. And for these 53 records, the actual value is also one and the predicted value is also one.
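The row and column totals described above can be checked directly from the matrix. This sketch hard-codes the counts read out in the lecture (11, 33, 5, 53):

```python
import numpy as np

# The lecture's confusion matrix: rows are actual values, columns predicted.
cm = np.array([[11, 33],    # actual 0: 11 predicted as 0, 33 predicted as 1
               [ 5, 53]])   # actual 1:  5 predicted as 0, 53 predicted as 1

print(cm.sum(axis=1))  # actual counts per class: 44 zeros, 58 ones
print(cm.sum(axis=0))  # predicted counts per class: 16 zeros, 86 ones
```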
So we are correctly identifying these 53 records and these 11 records, and we are not able to accurately predict these 5 records and these 33 records.

So our accuracy will be 53 plus 11, that is 64, divided by the total number of observations. So you can compute it manually, but you can also use this accuracy_score function. And here also, you first have to provide the actual values and then the predicted values. Let's find out the accuracy.

The accuracy of the model is 62 percent. If you calculate it from here also, you will get the same result.

And one more thing about the confusion matrix: since the actual value of these 11 records is zero, and we are also predicting them as zero, we call them true negatives. That is, the predicted value is zero and the actual value is also zero, which is the negative class. For these 53 records, the actual value is one and the predicted value is also one; therefore, we call them true positives. That is, the actual value is one and we are also predicting it as one, so they are true positives.

Now, for an SVM model, you can also find out the number of support vectors you have in your model.
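The manual accuracy calculation above can be written out as a short check, again using the lecture's counts:

```python
import numpy as np

cm = np.array([[11, 33],
               [ 5, 53]])

# Unpack the four cells: true negatives, false positives,
# false negatives, true positives (row-major order).
tn, fp, fn, tp = cm.ravel()

# Accuracy = (true negatives + true positives) / total observations.
accuracy = (tn + tp) / cm.sum()
print(accuracy)  # 64 / 102, about 0.627
```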
So to find out the number of support vectors, you need to use the attribute n_support_. So just write your model name, and then write this attribute, .n_support_, and run this.

So you will get two numbers. The first number is the number of support vectors for your first class, and the second number is the number of support vectors for your second class. Since we have two classes, zero and one, this is the number of support vectors for the zero class, and this is the number of support vectors for the one class. So in total, we have 375 support vectors.

If you want to learn more, you can also have a look at the official sklearn documentation on support vector classification. There you will get details of all the parameters. We have already discussed a few of the important ones, such as C, kernel, etc.

And then you have attributes also. So after you have trained your model, you can call these attributes to get more information about your model.

We used this attribute to get the number of support vectors. If you want to view your support vectors, you can just write your model name followed by .support_vectors_.
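These two attributes can be sketched on synthetic data (an assumption: the lecture's 375 support vectors come from its own dataset, so the counts here will differ):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
clf = SVC(kernel='linear', C=0.01).fit(X, y)

# Number of support vectors per class, in class order [0, 1].
print(clf.n_support_)

# Total support vectors across both classes.
print(clf.n_support_.sum())

# The support vectors themselves, one row per support vector.
print(clf.support_vectors_.shape)
```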
And it will give you all the support vectors.

So you can have a look at these attributes and parameters to get more information about SVM classification models.

In the next lecture, we will use grid search to optimize this value of C.