So you have seen how to train a model with a linear kernel. Now we are going to see how to train a model with a polynomial kernel.

It is not a necessity that every time we use polynomial or radial kernels the model will perform better. It largely depends upon the type of data and the relationship our independent variables have with the dependent variable. If the relationship is more linear, polynomial and radial kernels will perform worse than the linear kernel. So it depends on the type of relationship that our independent and dependent variables have between them.

However, when we are running a polynomial kernel, if you remember from the theory, there will be an additional hyperparameter, D, which is the degree of the polynomial. And as you know, to tune hyperparameters we can do a grid search. That is, we can try multiple values of the hyperparameter, check the performance of the model over the different values of the degree, and find out which value of the degree D gives us the best performance on the test set. Now, if the best performance is found at a degree of one, that is, a linear relationship, then the grid search will give us an output of D equal to one.
So we can always run a polynomial kernel with a grid search and find out whether the relationship is of a linear type or not, depending on the value of the degree of the polynomial that comes out best.

So let's start. First of all, we will run a simple svm function in which we will specify the values of two hyperparameters: one is the cost and one is the degree. The function is the same svm function as before; its output will be stored in this svmfit.P variable. I have put the P to signify that this belongs to the polynomial kernel. The first argument is the formula; the second is the data, which is trainC. Only the kernel changes, from linear to polynomial. Cost I have set to one, and degree I have set to two, so it will be a polynomial of degree two that we are trying to fit when we train the model.

So I'll run this command. You can see that a similar svmfit variable is created, like the one we created for the linear kernel, and this contains the information of an SVM model of degree two.

Now, you can use the information in this svmfit variable to predict the values on the test dataset and then check its performance using a confusion matrix. You know how to do it; I'm not doing it here to save time. But if you're curious, I think you should try it on your own.
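The command described above can be sketched with R's e1071 package. The lecture's actual dataset and variable names are not fully audible, so the training frame `trainC` and outcome column `y` below are synthetic stand-ins; only the call shape (polynomial kernel, cost = 1, degree = 2) follows the lecture:

```r
library(e1071)  # provides svm()

set.seed(1)
# Synthetic stand-in for the lecture's training set "trainC"
trainC <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
trainC$y <- factor(ifelse(trainC$x1 + trainC$x2 + rnorm(200, sd = 0.3) > 0,
                          "award", "no_award"))

# Same call as the linear-kernel case; only the kernel changes,
# plus the degree hyperparameter that a polynomial kernel adds
svmfit.P <- svm(y ~ ., data = trainC,
                kernel = "polynomial", cost = 1, degree = 2)
summary(svmfit.P)
```

As in the linear case, the fitted object can later be passed to predict() and checked with a confusion matrix.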
But here we are going to do hyperparameter tuning. Last time we tuned only one parameter, which is cost, since the linear kernel has only one hyperparameter. But when we are using a polynomial kernel, there will be two hyperparameters. One is the cost, for which I am creating a sequence of five values: 0.001, 0.01, 0.1, 1 and 5; you can choose values of your own choice. For degree I have also chosen five values.

Another parameter I have included, which I did not include earlier, is cross equal to four. As I told you earlier, this tune function uses cross-validation to find the cross-validation error, and it selects the best model as the one with the least cross-validation error. To perform cross-validation, by default it splits the data into 10 parts, uses nine of the parts to train the model and the tenth part to find the cross-validation error. It runs the model 10 times, using nine of the parts as the training set and one part as the test set each time.

So basically, if we have a cross-validation value of ten, it will run each model ten times. We already have five values of cost and five values of degree, so in this tuning we are already fitting five times five, that is 25, combinations.
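The tuning step described above can be sketched with e1071's tune() function, again on synthetic stand-in data; the degree grid in the lecture is not clearly audible, so the values below are illustrative. Note that in tune() the number of cross-validation folds is set through tune.control():

```r
library(e1071)  # provides svm(), tune(), tune.control()

set.seed(1)
# Synthetic stand-in for the lecture's training set "trainC"
trainC <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
trainC$y <- factor(ifelse(trainC$x1 + trainC$x2 + rnorm(200, sd = 0.3) > 0,
                          "award", "no_award"))

# Grid search over cost and degree; 5 x 5 = 25 combinations,
# each evaluated with 4-fold cross-validation (cross = 4)
tune.out.P <- tune(svm, y ~ ., data = trainC,
                   kernel = "polynomial",
                   ranges = list(cost   = c(0.001, 0.01, 0.1, 1, 5),
                                 degree = c(1, 2, 3, 4, 5)),
                   tunecontrol = tune.control(cross = 4))
summary(tune.out.P)  # cross-validation error for all 25 combinations
```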
If I have a cross-validation count of ten here, it will have to run the model 250 times. So I have reduced the cross-validation count here to four, so that this command does not take as much time. This basically means that the training set will be split into four parts: three parts will be used to train the model and the fourth part will be used to find the cross-validation error.

So when you have limited computational power and you want to check more values of cost and degree, and perhaps some other hyperparameters too, you have the option of reducing this cross parameter, which is the number of cross-validation folds.

So I'll run this command. It is still running.

So the command has run, and you can see that we have a new variable called tune.out.P, P for polynomial. You can see that it took a lot of time; if we had a cross value of ten, it would have taken even more time, probably 2.5 times more than this command did.

So now tune.out.P has the information of all the different models that we created. So let us find out which is the best model; we will store the information of the best model in bestmod.P. So now bestmod.P has the information of the best polynomial model. Let us look at the summary of this bestmod.P.

So in this best model, we see that the best model has a cost value of 10 and a degree of one.
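Extracting the winning model from the tuning object is one line: tune() refits the model with the best hyperparameters on the full training set and stores it in the $best.model slot. The data and value grids below are stand-ins, as before:

```r
library(e1071)

set.seed(1)
# Synthetic stand-in for the lecture's training set "trainC"
trainC <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
trainC$y <- factor(ifelse(trainC$x1 + trainC$x2 > 0, "award", "no_award"))

# Small illustrative grid with 4-fold cross-validation
tune.out.P <- tune(svm, y ~ ., data = trainC, kernel = "polynomial",
                   ranges = list(cost = c(0.1, 1, 10), degree = c(1, 2, 3)),
                   tunecontrol = tune.control(cross = 4))

# The model refit with the best cost/degree combination
bestmod.P <- tune.out.P$best.model
summary(bestmod.P)          # reports the chosen kernel, cost and degree
tune.out.P$best.parameters  # the winning hyperparameter pair
```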
This means that even if you are using a polynomial kernel, the best polynomial is coming out to be of degree one, which is the same as a linear kernel. So even the polynomial kernel is telling us that the best relationship between the dependent and independent variables is linear.

It is also telling us that cost equal to 10 is giving us the best cross-validation result, and that the number of support vectors is 310 out of 398; 157 of them are on one side and 153 of them are on the other side. And it ran a fourfold cross-validation on the training data, and these are the accuracy values.

So let us predict the values of the response variable on the test set and compare the performance of this model against the linear model that we created earlier. ypred will be the predicted values from this polynomial-based model, and we'll create this table, which is also called the confusion matrix.

In this confusion matrix, you can see that we predicted for 43 cases that the movie will not get an Oscar, and out of them, for 22 cases we are correct. Also, for 65 cases the model predicted that the movie will get an Oscar, and in 49 of those cases we made the right prediction.
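The prediction and confusion-matrix step can be sketched as below. The test frame `testC` and the best hyperparameters (cost = 10, degree = 1) mirror the lecture, but the data here are synthetic stand-ins, so the counts printed will differ from the 71-out-of-108 result discussed above:

```r
library(e1071)

set.seed(1)
# Synthetic stand-ins for the lecture's train/test sets
make_data <- function(n) {
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
  d$y <- factor(ifelse(d$x1 + d$x2 + rnorm(n, sd = 0.3) > 0,
                       "award", "no_award"))
  d
}
trainC <- make_data(200)
testC  <- make_data(108)

# Refit with the best values found by the grid search in the lecture
bestmod.P <- svm(y ~ ., data = trainC, kernel = "polynomial",
                 cost = 10, degree = 1)

ypred <- predict(bestmod.P, testC)
conf  <- table(predicted = ypred, actual = testC$y)  # confusion matrix
conf
sum(diag(conf)) / sum(conf)  # overall prediction accuracy
```

The diagonal of the table holds the correct predictions, so accuracy is the diagonal sum divided by the total number of test cases.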
So overall, the prediction accuracy is 71 divided by 108, which comes out to about 65 percent.

So a linear polynomial, that is degree one, with a cost value of 10 is giving us a prediction accuracy of about 65 percent. Last time, when we created a table for the linear model, we correctly predicted 67 cases; this time we are correctly predicting 71 cases.

So this is how we train our model with a polynomial kernel SVM. We use the tune function to tune the hyperparameters; here we have two hyperparameters, cost and degree. We find the best values of cost and degree, use that best model to predict the values on a test or new set, and if we have the actual values, we can compare the performance of our predictions against the actual values using the table function, which creates the confusion matrix, telling us how many predictions were correct and how many were wrong.

In the next video, we will learn about radial kernels.