1 00:00:01,440 --> 00:00:08,070 So now that we have split the data into two parts, we are going to use the trainC dataset to train 2 00:00:08,070 --> 00:00:08,650 the model. 3 00:00:09,210 --> 00:00:15,120 There is one important thing that you should note. As I told you earlier, for now we are doing 4 00:00:15,120 --> 00:00:18,240 classification, and later on we'll be doing regression. 5 00:00:20,400 --> 00:00:26,630 The function that we'll be using to run the SVM model will be the same for classification and regression. 6 00:00:29,040 --> 00:00:37,230 However, that function identifies whether it has to run regression or classification based on the type 7 00:00:37,230 --> 00:00:42,630 of the dependent variable, that is, the variable that we are predicting. 8 00:00:43,590 --> 00:00:48,840 If that variable is categorical, then it will do classification. 9 00:00:50,220 --> 00:00:54,090 And if that variable is numeric, it will do regression. 10 00:00:57,090 --> 00:00:59,220 So if you look at my trainC dataset. 11 00:01:02,180 --> 00:01:03,120 We go to the right. 12 00:01:03,530 --> 00:01:05,960 This Start_Tech_Oscar is the dependent variable, 13 00:01:06,080 --> 00:01:07,760 that is, the variable that we want to predict. 14 00:01:08,400 --> 00:01:16,490 But if I hover over the name of this variable, you can see that it has numeric values with range zero 15 00:01:16,550 --> 00:01:17,000 to one. 16 00:01:19,430 --> 00:01:26,810 If I use this variable as my dependent variable, the svm function will see that this is numeric and 17 00:01:26,810 --> 00:01:28,030 it will run a regression. 18 00:01:29,600 --> 00:01:36,380 So it is important that we change this to a categorical variable, which in R is identified as a factor. 19 00:01:38,180 --> 00:01:45,440 So we need to run these two lines, which convert this numeric data into factors. 20 00:01:45,920 --> 00:01:51,680 So if I run this command and go and look at the trainC dataset
21 00:01:53,490 --> 00:01:54,360 in the data viewer. 22 00:01:55,230 --> 00:02:01,060 Now, if I scroll, the Start_Tech_Oscar variable is now a factor 23 00:02:01,220 --> 00:02:03,890 with two levels, 0 and 1. 24 00:02:06,030 --> 00:02:07,950 So this is now a categorical variable. 25 00:02:08,310 --> 00:02:10,500 The same thing we will do with testC. 26 00:02:12,610 --> 00:02:14,530 And both of these datasets 27 00:02:14,990 --> 00:02:18,950 now have the Start_Tech_Oscar variable as a factor. 28 00:02:20,650 --> 00:02:23,950 Now this variable can be used to do classification. 29 00:02:25,660 --> 00:02:30,240 So let us move on to SVM. To run SVM, 30 00:02:30,670 --> 00:02:32,350 we need to use this package. 31 00:02:32,590 --> 00:02:35,590 The package is called e1071. 32 00:02:37,890 --> 00:02:40,160 Most probably you don't have this package installed. 33 00:02:40,380 --> 00:02:44,610 You need to install this package. As I showed you earlier, to install a package 34 00:02:44,640 --> 00:02:50,310 we need to write install.packages, and within the brackets, in single quotes, we write the name, 35 00:02:50,400 --> 00:02:52,680 which is e1071. 36 00:02:59,140 --> 00:03:03,760 So this package is downloaded and installed. To make it active, 37 00:03:03,910 --> 00:03:11,650 we will run the library command, and the package is active. Now we can use this package. 38 00:03:12,450 --> 00:03:16,450 So to train an SVM model, we use this svm function. 39 00:03:16,720 --> 00:03:20,800 This svm function is part of this e1071 library. 40 00:03:23,090 --> 00:03:28,880 We will store the output of this svm function into this svmfit variable. 41 00:03:30,620 --> 00:03:34,190 So the svm function takes all these parameters. 42 00:03:35,390 --> 00:03:37,920 The first parameter is the formula.
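The factor conversion described above can be sketched in a couple of lines. This is a minimal sketch; the dataset and column names trainC, testC, and Start_Tech_Oscar follow the ones used in this lecture.

```r
# Convert the 0/1 numeric target into a factor so that svm()
# runs classification instead of regression
trainC$Start_Tech_Oscar <- as.factor(trainC$Start_Tech_Oscar)
testC$Start_Tech_Oscar  <- as.factor(testC$Start_Tech_Oscar)

# Check: class() should now report "factor", with levels "0" and "1"
class(trainC$Start_Tech_Oscar)
levels(trainC$Start_Tech_Oscar)
```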
43 00:03:39,550 --> 00:03:46,660 In this we need to specify which is the dependent variable and which are the independent variables 44 00:03:46,810 --> 00:03:49,240 that are to be used as predictor variables. 45 00:03:52,080 --> 00:03:56,790 So this tilde sign: you can find this tilde sign 46 00:03:57,840 --> 00:04:00,150 on your keyboard, above the Tab key. 47 00:04:01,530 --> 00:04:06,120 This tilde sign is used to separate dependent and independent variables. 48 00:04:06,750 --> 00:04:10,410 Anything to the left of the tilde is the dependent variable. 49 00:04:11,160 --> 00:04:17,100 And all the independent variables that you want to use to predict this variable will be on the right 50 00:04:17,100 --> 00:04:17,640 of the tilde. 51 00:04:19,560 --> 00:04:22,340 So I want to predict Start_Tech_Oscar. 52 00:04:22,710 --> 00:04:23,920 So that is on the left of 53 00:04:24,500 --> 00:04:24,840 the tilde. 54 00:04:26,000 --> 00:04:29,690 And I want to use all other variables from my dataset. 55 00:04:30,210 --> 00:04:32,100 That is why I have put a dot. 56 00:04:33,090 --> 00:04:40,380 So to represent that I want to use all the variables as predictor variables, I have put a dot. 57 00:04:41,420 --> 00:04:47,630 If I wanted to use only one variable, say, budget, I would have written budget here. 58 00:04:48,170 --> 00:04:54,020 If I wanted to use two variables, I could have written budget plus time taken. 59 00:04:55,310 --> 00:05:00,160 So using plus symbols, I can add multiple independent variables here. 60 00:05:01,070 --> 00:05:06,410 But since I want to use all the other variables, and it does not make sense to write the names of 18 other 61 00:05:06,410 --> 00:05:08,480 variables here joined by plus symbols, 62 00:05:09,910 --> 00:05:15,760 we have just put a dot here, which signifies that we want to use all of the variables. 63 00:05:17,420 --> 00:05:20,490 The second parameter is data, which is the data to be used.
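The formula variants described above could look like this. The predictor names Budget and Time_taken are illustrative spellings of the variables mentioned in the lecture; use the exact column names from your own dataset.

```r
# Dependent variable on the left of the tilde, predictors on the right
Start_Tech_Oscar ~ .                    # dot: use all other variables
Start_Tech_Oscar ~ Budget               # a single predictor
Start_Tech_Oscar ~ Budget + Time_taken  # two predictors joined by +
```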
64 00:05:20,860 --> 00:05:22,550 The data is the trainC dataset. 65 00:05:23,640 --> 00:05:31,350 The third parameter that we are giving is kernel. We have given kernel as linear here. 66 00:05:31,710 --> 00:05:37,360 The linear kernel SVM is the same as the support vector classifier model. 67 00:05:38,100 --> 00:05:41,650 So running this command will be the same as running a support vector classifier. 68 00:05:42,840 --> 00:05:45,540 Also, this is a linear support vector machine. 69 00:05:46,950 --> 00:05:51,240 So here we are using the kernel equal to linear parameter. 70 00:05:53,730 --> 00:06:01,050 The fourth parameter is the cost parameter. As we discussed in the theory lecture, we use budget 71 00:06:01,320 --> 00:06:11,040 or cost as a hyperparameter so that we control the width of the margin, and we allow some of the points 72 00:06:11,940 --> 00:06:19,230 to be misclassified. So this cost equal to one is giving us the cost of misclassification. 73 00:06:21,660 --> 00:06:24,320 The last parameter that we are using is scale. 74 00:06:25,380 --> 00:06:32,280 We have put scale equal to TRUE, which means that we will be scaling all the variables in our dataset. 75 00:06:33,150 --> 00:06:34,620 When we scale the variables, 76 00:06:35,250 --> 00:06:41,430 we change their values so that they have a mean of zero and a standard deviation of one. 77 00:06:42,420 --> 00:06:46,590 We do this because SVM is scale sensitive. 78 00:06:47,790 --> 00:06:57,030 What this means is, if you have a variable in your dataset which has very large values, say it is in 79 00:06:57,030 --> 00:07:02,370 millions, and there is another variable in your dataset which is very small, 80 00:07:02,760 --> 00:07:07,230 say it is of the range of 0.001 to 0.01.
81 00:07:08,980 --> 00:07:16,500 Now, when we use these two variables, since SVM will be calculating distances, the variable which 82 00:07:16,710 --> 00:07:20,040 has the very high scale will be given more importance. 83 00:07:21,780 --> 00:07:28,050 Another issue that you will face in a scale-sensitive model is, for example, if you have currency 84 00:07:28,140 --> 00:07:36,390 in your dataset: if you have one dollar in the dataset, it will be treated differently than if 85 00:07:36,390 --> 00:07:43,110 that same value is in euros or in rupees or yen and so on. 86 00:07:44,760 --> 00:07:48,240 So we have to make this data scale indifferent. 87 00:07:49,770 --> 00:07:57,540 To do that, we use scale equal to TRUE. Very few times the data does not need scaling; in such a 88 00:07:57,540 --> 00:07:58,070 scenario, 89 00:07:58,170 --> 00:08:00,340 we can use scale equal to FALSE also. 90 00:08:01,500 --> 00:08:03,300 You can add a few more parameters 91 00:08:03,300 --> 00:08:13,530 also. If you click anywhere on this function and press F1, you will see that the help for this function 92 00:08:13,530 --> 00:08:14,100 opens up. 93 00:08:16,610 --> 00:08:20,850 And in this help you can see all the arguments that you can give. 94 00:08:21,930 --> 00:08:32,250 So I've discussed the important ones: formula, data, scale, kernel, etc. This type argument will 95 00:08:32,250 --> 00:08:33,420 be chosen by default. 96 00:08:33,930 --> 00:08:39,350 So as I told you, depending on the type of our dependent variable, that is, whether 97 00:08:39,560 --> 00:08:41,370 it is numeric or factor, 98 00:08:41,990 --> 00:08:46,260 the type is selected, whether it should be classification or regression. 99 00:08:47,880 --> 00:08:52,770 So if you want to specifically give the type, you can also specify that. 100 00:08:54,660 --> 00:08:56,580 So all these arguments are available. 101 00:08:57,180 --> 00:09:00,960 But for now, these are the arguments we use.
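Putting the parameters together, the training call walked through in this lecture looks roughly like this. The package e1071 and the svm() arguments are real; the object and dataset names svmfit and trainC are the ones used in the video.

```r
install.packages('e1071')   # one-time install of the package
library(e1071)              # load it so svm() is available

# Linear-kernel SVM; classification is chosen automatically
# because the dependent variable is a factor
svmfit <- svm(Start_Tech_Oscar ~ ., data = trainC,
              kernel = "linear", cost = 1, scale = TRUE)

summary(svmfit)   # shows kernel, cost, support vectors, classes
```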
102 00:09:01,620 --> 00:09:03,720 You can check out the other arguments also. 103 00:09:03,930 --> 00:09:05,860 These are the ones that you must know about. 104 00:09:07,230 --> 00:09:08,180 So I'll run this 105 00:09:08,190 --> 00:09:08,550 command. 106 00:09:12,510 --> 00:09:18,720 And you can see that the svmfit variable is now created, and svmfit contains the information 107 00:09:19,140 --> 00:09:21,400 of the SVM model. 108 00:09:22,290 --> 00:09:27,330 If you want to get a summary of the information in this model, you can run the summary command. 109 00:09:27,990 --> 00:09:33,770 So summary, and within brackets we write the name of the model, and 110 00:09:36,040 --> 00:09:36,760 you can see 111 00:09:39,810 --> 00:09:47,980 that we ran a classification type of model with a linear kernel and the cost set to one. 112 00:09:50,490 --> 00:09:55,840 We have 304 support vectors out of the 398 observations in the trainC dataset. 113 00:09:56,880 --> 00:10:04,130 So within the margins that this model created, we have 304 points. 114 00:10:04,140 --> 00:10:11,640 Out of these 304 points, 154 points are on one side and 150 points are on the 115 00:10:11,790 --> 00:10:18,300 other side of the hyperplane. The number of classes is two, and the levels are 0 and 1. 116 00:10:19,520 --> 00:10:25,080 So this is the summary of the information in this SVM model. 117 00:10:26,160 --> 00:10:32,430 So now that we have trained the model and the information is stored in svmfit, we can check 118 00:10:32,430 --> 00:10:36,870 its performance on the test set. To get the performance, 119 00:10:36,990 --> 00:10:43,920 we will first use this model and the independent variables in the test dataset to get the predicted 120 00:10:43,920 --> 00:10:46,730 values for those observations in that dataset. 121 00:10:48,510 --> 00:10:50,880 Then we will compare these predictions
122 00:10:51,270 --> 00:10:53,380 with the actual values in the test 123 00:10:53,410 --> 00:10:53,720 set. 124 00:10:55,560 --> 00:11:04,320 So this first line, which is ypred equal to predict, svmfit comma testC, is storing the predicted 125 00:11:04,320 --> 00:11:07,260 values in this ypred variable. 126 00:11:08,550 --> 00:11:15,660 The values are predicted using this predict function. The first parameter to this function is the model 127 00:11:15,660 --> 00:11:15,960 name. 128 00:11:16,470 --> 00:11:18,930 And the second parameter is the test set. 129 00:11:20,130 --> 00:11:25,050 Here, you need not provide which will be the dependent variable and which will be the independent 130 00:11:25,050 --> 00:11:31,650 variables, because the model already knows which were the independent variables. It will automatically 131 00:11:31,650 --> 00:11:36,000 take the values of those independent variables from this dataset. 132 00:11:37,370 --> 00:11:42,350 And using those values of the independent variables, it will predict the values 133 00:11:43,510 --> 00:11:44,720 of the dependent variable 134 00:11:45,070 --> 00:11:47,630 and store them into the ypred 135 00:11:47,730 --> 00:11:47,890 variable. 136 00:11:49,730 --> 00:11:54,460 Now, the actual values are in the testC dataset, 137 00:11:55,310 --> 00:11:55,650 in the 138 00:11:55,850 --> 00:12:01,370 Start_Tech_Oscar variable, and the predicted values are in my ypred variable. 139 00:12:02,800 --> 00:12:09,160 To compare these predictions against the actual values, we use this table function. 140 00:12:10,970 --> 00:12:14,190 And with this table function, on the rows 141 00:12:14,250 --> 00:12:18,020 we will get the predicted values, and on the columns 142 00:12:18,330 --> 00:12:20,090 we will get the actual values. 143 00:12:21,700 --> 00:12:25,530 Creating this table is also called creating a confusion matrix.
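The prediction and comparison steps can be sketched as follows, again using the svmfit, testC, and ypred names from the lecture.

```r
# Predict the dependent variable for the test observations;
# predict() picks the predictor columns it needs from testC itself
ypred <- predict(svmfit, testC)

# Confusion matrix: predicted values on the rows, actual on the columns
table(predict = ypred, truth = testC$Start_Tech_Oscar)
```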
144 00:12:26,690 --> 00:12:33,560 Let me first run this predict function to get the predicted values, and I will then run this table 145 00:12:33,560 --> 00:12:39,730 function, and you can see that the predicted values are on the rows. 146 00:12:40,250 --> 00:12:40,700 So, 147 00:12:42,050 --> 00:12:49,190 for these 45 cases, my model predicted that these 45 cases will not get an Oscar. 148 00:12:50,480 --> 00:12:58,070 Out of the 45 cases, actually 32 did not get an Oscar and 13 did get the Oscar. 149 00:13:00,260 --> 00:13:12,080 And my model predicted that 63 movies will get an Oscar, out of which 37 actually got the Oscar and 26 150 00:13:12,890 --> 00:13:13,990 did not get the Oscar. 151 00:13:15,560 --> 00:13:18,440 This matrix is also called a confusion matrix. 152 00:13:18,980 --> 00:13:22,430 This is used to compare the performance of classification models. 153 00:13:23,990 --> 00:13:33,620 So here you can see that 69 of the cases were correctly predicted by our model and 39 cases were 154 00:13:33,740 --> 00:13:35,450 incorrectly predicted by the model. 155 00:13:36,950 --> 00:13:41,670 So if you want to calculate the prediction accuracy of our model, you can simply write 156 00:13:42,780 --> 00:13:46,340 69 divided by 108, 157 00:13:47,660 --> 00:13:54,230 since we correctly predicted 69 cases and we have 108 observations 158 00:13:54,280 --> 00:13:54,590 in the 159 00:13:54,690 --> 00:13:55,150 test set. 160 00:13:55,960 --> 00:14:02,960 So if I run this command, it is telling us that we are getting an accuracy of nearly 64 percent. 161 00:14:04,550 --> 00:14:11,900 So using this model with an accuracy of 64 percent, I can predict whether a particular movie is going 162 00:14:11,900 --> 00:14:13,640 to get an Oscar or not. 163 00:14:14,390 --> 00:14:18,260 This last line here, which is svmfit dollar index:
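In the lecture the accuracy is typed in by hand as 69/108; a slightly more general sketch reads the correct predictions off the diagonal of the confusion matrix instead, so the same line works for any split.

```r
cm <- table(predict = ypred, truth = testC$Start_Tech_Oscar)

# Accuracy = correctly predicted cases / all cases;
# the diagonal of the confusion matrix holds the correct predictions
accuracy <- sum(diag(cm)) / sum(cm)
accuracy   # in this lecture: 69 / 108, nearly 64 percent
```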
164 00:14:19,010 --> 00:14:26,480 this is used to check which of the observations are support vectors. Since in this scenario we are 165 00:14:26,480 --> 00:14:28,250 getting a lot of support vectors, 166 00:14:28,910 --> 00:14:36,920 this may not make much sense. But if you have very few observations, out of which two or three are support 167 00:14:36,920 --> 00:14:42,260 vectors, you may want to find out which of the observations are support vectors, 168 00:14:42,900 --> 00:14:46,850 since your model completely depends on those few observations. 169 00:14:47,690 --> 00:14:51,440 So a slight change in any of those observations can change your model. 170 00:14:51,980 --> 00:15:00,290 So you may want to find out which observations are your support vectors. To find out which observations 171 00:15:00,380 --> 00:15:05,540 are the support vectors, we will run this command, svmfit dollar index. 172 00:15:06,680 --> 00:15:11,560 And you can see here the indices printed. 173 00:15:12,350 --> 00:15:19,730 All these are the indices of the observations which are your support vectors. In this video, 174 00:15:19,790 --> 00:15:25,850 we have seen how to train the SVM model with a linear kernel, and we then 175 00:15:25,940 --> 00:15:32,570 saw how to use that trained model to predict values on a test set or a new dataset. 176 00:15:33,590 --> 00:15:41,540 And then we used those predicted values on the test dataset to find out the prediction accuracy of our model. 177 00:15:42,980 --> 00:15:49,040 And lastly, we also saw how to find out which of the observations are the support vectors. 178 00:15:50,680 --> 00:15:57,640 In the next video, we will see how to find the value of this hyperparameter, 179 00:15:58,020 --> 00:16:00,290 this cost parameter. 180 00:16:01,200 --> 00:16:06,270 We want to find such a value of this cost parameter which gives us maximum accuracy. 181 00:16:07,350 --> 00:16:10,380 So we will learn how to tune this hyperparameter.
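The support-vector lookup shown in this video is a one-liner on the fitted model object:

```r
# Row indices (within trainC) of the observations that are
# support vectors of the fitted model
svmfit$index

# How many support vectors there are in total
length(svmfit$index)   # 304 in this lecture's model
```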