So you have seen how to train a model with a linear kernel. Now we are going to see how to train a model with a polynomial kernel.

It is not a necessity that every time we use polynomial or radial kernels the model will perform better. It largely depends upon the type of data and the relationship our independent variables have with the dependent variable. If the relationship is more linear, polynomial and radial kernels will perform worse than the linear kernel. So it depends on the type of relationship that our independent and dependent variables have between them.

However, when we are running a polynomial kernel, if you remember from the theory, there will be an additional hyperparameter, D, which is the degree of the polynomial. And as you know, to tune hyperparameters we can do a grid search. That is, we can try multiple values of the hyperparameter, check the performance of the model over the different values of the degree, and find out which value of the degree D gives us the best performance on the test set. Now, if the best performance is found at a degree of one, that is, a linear relationship, then the grid search will give us an output of D equal to one.
So we can always run a polynomial kernel with a grid search and find out whether the relationship is of a linear type or not, depending on the value of the degree of the polynomial that comes out best.

So let's start. First of all, we will run a simple svm function in which we will specify the values of two hyperparameters: one is the cost and one is the degree. The function is the same svm function as before; its output will be stored in this svmfit.P variable. I have put the P to signify that this belongs to the polynomial kernel. The first argument is the formula; the second is the data, which is trainC. Only the kernel changes, from linear to polynomial. Cost I have set to one, and degree I have set to two, so it will be a polynomial of degree two that we are trying to fit when we train the model.

So I'll run this command. You can see that a similar svmfit variable is created, like the one we created for the linear kernel, and this contains the information of an SVM model of degree two.

Now, you can use the information in this svmfit variable to predict the values on the test dataset and then check its performance using a confusion matrix. You know how to do it; I'm not doing it here to save time. But if you're curious, I think you should try it on your own.
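The command described above can be sketched with R's e1071 package. The lecture's actual dataset and variable names are not fully audible, so the training frame `trainC` and outcome column `y` below are synthetic stand-ins; only the call shape (polynomial kernel, cost = 1, degree = 2) follows the lecture:

```r
library(e1071)  # provides svm()

set.seed(1)
# Synthetic stand-in for the lecture's training set "trainC"
trainC <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
trainC$y <- factor(ifelse(trainC$x1 + trainC$x2 + rnorm(200, sd = 0.3) > 0,
                          "award", "no_award"))

# Same call as the linear-kernel case; only the kernel changes,
# plus the degree hyperparameter that a polynomial kernel adds
svmfit.P <- svm(y ~ ., data = trainC,
                kernel = "polynomial", cost = 1, degree = 2)
summary(svmfit.P)
```

As in the linear case, the fitted object can later be passed to predict() and checked with a confusion matrix.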
But here we are going to do hyperparameter tuning. Last time we tuned only one parameter, which is cost, since the linear kernel has only one hyperparameter. But when we are using a polynomial kernel, there will be two hyperparameters. One is the cost, for which I am creating a sequence of five values: 0.001, 0.01, 0.1, 1 and 5; you can choose values of your own choice. For degree I have also chosen five values.

Another parameter I have included, which I did not include earlier, is cross equal to four. As I told you earlier, this tune function uses cross-validation to find the cross-validation error, and it selects the best model as the one with the least cross-validation error. To perform cross-validation, by default it splits the data into 10 parts, uses nine of the parts to train the model and the tenth part to find the cross-validation error. It runs the model 10 times, using nine of the parts as the training set and one part as the test set each time.

So basically, if we have a cross-validation value of ten, it will run each model ten times. We already have five values of cost and five values of degree, so in this tuning we are already fitting five times five, that is 25, combinations.
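The tuning step described above can be sketched with e1071's tune() function, again on synthetic stand-in data; the degree grid in the lecture is not clearly audible, so the values below are illustrative. Note that in tune() the number of cross-validation folds is set through tune.control():

```r
library(e1071)  # provides svm(), tune(), tune.control()

set.seed(1)
# Synthetic stand-in for the lecture's training set "trainC"
trainC <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
trainC$y <- factor(ifelse(trainC$x1 + trainC$x2 + rnorm(200, sd = 0.3) > 0,
                          "award", "no_award"))

# Grid search over cost and degree; 5 x 5 = 25 combinations,
# each evaluated with 4-fold cross-validation (cross = 4)
tune.out.P <- tune(svm, y ~ ., data = trainC,
                   kernel = "polynomial",
                   ranges = list(cost   = c(0.001, 0.01, 0.1, 1, 5),
                                 degree = c(1, 2, 3, 4, 5)),
                   tunecontrol = tune.control(cross = 4))
summary(tune.out.P)  # cross-validation error for all 25 combinations
```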
If I have a cross-validation count of ten here, it will have to run the model 250 times. So I have reduced the cross-validation count here to four, so that this command does not take as much time. This basically means that the training set will be split into four parts: three parts will be used to train the model and the fourth part will be used to find the cross-validation error.

So when you have limited computational power and you want to check more values of cost and degree, and perhaps some other hyperparameters too, you have the option of reducing this cross parameter, which is the number of cross-validation folds.

So I'll run this command. It is still running.

So the command has run, and you can see that we have a new variable called tune.out.P, P for polynomial. You can see that it took a lot of time; if we had a cross value of ten, it would have taken even more time, probably 2.5 times more than this command did.

So now tune.out.P has the information of all the different models that we created. So let us find out which is the best model; we will store the information of the best model in bestmod.P. So now bestmod.P has the information of the best polynomial model. Let us look at the summary of this bestmod.P.

So in this best model, we see that the best model has a cost value of 10 and a degree of one.
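Extracting the winning model from the tuning object is one line: tune() refits the model with the best hyperparameters on the full training set and stores it in the $best.model slot. The data and value grids below are stand-ins, as before:

```r
library(e1071)

set.seed(1)
# Synthetic stand-in for the lecture's training set "trainC"
trainC <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
trainC$y <- factor(ifelse(trainC$x1 + trainC$x2 > 0, "award", "no_award"))

# Small illustrative grid with 4-fold cross-validation
tune.out.P <- tune(svm, y ~ ., data = trainC, kernel = "polynomial",
                   ranges = list(cost = c(0.1, 1, 10), degree = c(1, 2, 3)),
                   tunecontrol = tune.control(cross = 4))

# The model refit with the best cost/degree combination
bestmod.P <- tune.out.P$best.model
summary(bestmod.P)          # reports the chosen kernel, cost and degree
tune.out.P$best.parameters  # the winning hyperparameter pair
```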
This means that even if you are using a polynomial kernel, the best polynomial is coming out to be of degree one, which is the same as a linear kernel. So even the polynomial kernel is telling us that the best relationship between the dependent and independent variables is linear.

It is also telling us that cost equal to 10 is giving us the best cross-validation result, and that the number of support vectors is 310 out of 398; 157 of them are on one side and 153 of them are on the other side. And it ran a fourfold cross-validation on the training data, and these are the accuracy values.

So let us predict the values of the response variable on the test set and compare the performance of this model against the linear model that we created earlier. ypred will be the predicted values from this polynomial-based model, and we'll create this table, which is also called the confusion matrix.

In this confusion matrix, you can see that we predicted for 43 cases that the movie will not get an Oscar, and out of them, for 22 cases we are correct. Also, for 65 cases the model predicted that the movie will get an Oscar, and in 49 of those cases we made the right prediction.
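The prediction and confusion-matrix step can be sketched as below. The test frame `testC` and the best hyperparameters (cost = 10, degree = 1) mirror the lecture, but the data here are synthetic stand-ins, so the counts printed will differ from the 71-out-of-108 result discussed above:

```r
library(e1071)

set.seed(1)
# Synthetic stand-ins for the lecture's train/test sets
make_data <- function(n) {
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
  d$y <- factor(ifelse(d$x1 + d$x2 + rnorm(n, sd = 0.3) > 0,
                       "award", "no_award"))
  d
}
trainC <- make_data(200)
testC  <- make_data(108)

# Refit with the best values found by the grid search in the lecture
bestmod.P <- svm(y ~ ., data = trainC, kernel = "polynomial",
                 cost = 10, degree = 1)

ypred <- predict(bestmod.P, testC)
conf  <- table(predicted = ypred, actual = testC$y)  # confusion matrix
conf
sum(diag(conf)) / sum(conf)  # overall prediction accuracy
```

The diagonal of the table holds the correct predictions, so accuracy is the diagonal sum divided by the total number of test cases.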
So overall, the prediction accuracy is 71 divided by 108, which comes out to about 65 percent.

So a linear polynomial, that is degree one, with a cost value of 10 is giving us a prediction accuracy of about 65 percent. Last time, when we created a table for the linear model, we correctly predicted 67 cases; this time we are correctly predicting 71 cases.

So this is how we train our model with a polynomial kernel SVM. We use the tune function to tune the hyperparameters; here we have two hyperparameters, cost and degree. We find the best values of cost and degree, use that best model to predict the values on a test or new set, and if we have the actual values, we can compare the performance of our predictions against the actual values using the table function, which creates the confusion matrix, telling us how many predictions were correct and how many were wrong.

In the next video, we will learn about radial kernels.