Now, in the last lecture, we standardized our data. Now it's time to build our SVM classification model.

First, we need to import SVM from sklearn, and we will use the SVC function of SVM to create the classification model.

Now, if you remember, for regression we were using SVR, but for classification we need to use SVC. The other steps are almost the same. We first need to create our object, then we need to train it using the X_train and y_train data. The next step is to predict the values of y_train and y_test using this trained object, and then we will use some evaluation metrics to check the performance of our model.

Now, let's start by importing SVM.

In this cell, we are first creating the object. Our object's variable name is clf_svm_l; here "l" stands for linear. This is just a variable name, so you can give your own variable name as well.

Then we are creating this object using the SVC function. In SVC, if you open the help, you can see the parameters and their default values. By default, the kernel is radial ('rbf'). Since we are creating our first model as a linear model, we need to provide the parameter kernel equal to 'linear'.
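The step above can be sketched as follows; this is a minimal example, assuming the variable name clf_svm_l used in the lecture (any name works):

```python
from sklearn.svm import SVC

# Create a linear-kernel SVM classifier. The default kernel is 'rbf'
# (radial), so we pass kernel='linear' explicitly.
clf_svm_l = SVC(kernel='linear', C=0.01)

print(clf_svm_l.kernel)  # linear
print(clf_svm_l.C)       # 0.01
```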
Then, if you remember, one of the most important parameters of SVM is the cost parameter, which is denoted by C. And here we are providing the value of C as 0.01.

So let's create this object. And let's also take a look at the standardized X_train and the y_train data.

Now, remember to use the standardized X_train data, not the raw X_train, to train this model. Standardization is very important before training your SVM model.

Now we have trained our model: we have fitted our X_train and y_train data in this object.

Now, let's predict the values of y using our trained model. Predicting values is very easy. You just have to use the .predict method of the fitted, trained model. So I can write clf_svm_l, the same object as above, and then I can use the .predict method, so I write .predict, and then we just have to provide the values of X.

If I provide the values of X_train, I will get predicted y_train data, and if I provide the values of X_test, I will get the predicted values of the y_test data.

So here I am saving my predicted values into the y_train_pred and y_test_pred variables. Let's run this to get the predicted values.
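The fit-and-predict workflow described above can be sketched end to end. This is a self-contained example on synthetic data (an assumption: the lecture's real X_train/X_test come from the previous lecture's standardization step, so here we recreate that step with StandardScaler):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the lecture's dataset.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize before training, as the lecture stresses.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# Create and train the linear SVM classifier.
clf_svm_l = SVC(kernel='linear', C=0.01)
clf_svm_l.fit(X_train_std, y_train)

# Predict on both the training and the test data.
y_train_pred = clf_svm_l.predict(X_train_std)
y_test_pred = clf_svm_l.predict(X_test_std)
print(y_test_pred[:10])
```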
And if you want to look at the predicted values, you can just use either of these two variables. You can see these are the predicted values of our test dataset.

Now, the next step is to evaluate the performance of our model. In regression, we were using the R-squared and MSE values. In classification, we generally use the accuracy score and the confusion matrix.

Now, the accuracy score is the percentage of observations that our model is predicting correctly. The other metric we want to look at is the confusion matrix. Let's import these two functions from sklearn.metrics.

So sklearn has already provided us these two functions; we do not have to manually compute the accuracy score and confusion matrix using our predicted and actual values. We can just import the functions, and let us draw the confusion matrix on our test dataset.

The process is very simple. You just write the confusion_matrix function, and if you look at the parameters, you first have to provide the actual values of y, and then the predicted values of y. Since we want the confusion matrix for our test data, we just need to provide y_test, since our y_test contains the actual values, and then the second parameter is the predicted values of y_test, which we have calculated here.

Let's run this.
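The two metric functions can be sketched like this; the tiny hand-made label vectors here are illustrative, not the lecture's data:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Small illustrative labels: actual values first, predicted values second.
y_test = [0, 0, 1, 1, 1, 0]
y_test_pred = [0, 1, 1, 1, 0, 0]

# Rows of the result are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_test_pred)
print(cm)

# Fraction of observations predicted correctly.
acc = accuracy_score(y_test, y_test_pred)
print(acc)
```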
So this is the confusion matrix. Here, the rows represent the actual values and the columns represent the predicted values, and these numbers are the number of observations that fall into each category.

So, for example, if I select this first row: the first row stands for the number of actual zeros. There are 44 observations in this first row. That means in our dataset there are 44 actual zeros, and there are 58 actual ones.

Now, the columns stand for the predicted values. So among the predicted values, there are 16 zeros in total, and there are 86 ones in total. You can also calculate this using this table here, where I have put the counts of our y_test predicted variable. In this also, there are 16 zeros and 86 ones.

So, in short, this confusion matrix is telling me that for these 11 records, the actual value is zero and the predicted value is also zero, since this is in the first row and the first column. For these 5 records, the actual value is one, but the predicted value is zero. For these 33 records, the actual value is zero, since this is in the first row, but the predicted value is one, since this is in the second column. And for these 53 records, the actual value is also one and the predicted value is also one.
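The row and column totals described above can be checked directly from the matrix. This sketch hard-codes the counts read out in the lecture (11, 33, 5, 53):

```python
import numpy as np

# The lecture's confusion matrix: rows are actual values, columns predicted.
cm = np.array([[11, 33],    # actual 0: 11 predicted as 0, 33 predicted as 1
               [ 5, 53]])   # actual 1:  5 predicted as 0, 53 predicted as 1

print(cm.sum(axis=1))  # actual counts per class: 44 zeros, 58 ones
print(cm.sum(axis=0))  # predicted counts per class: 16 zeros, 86 ones
```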
So we are correctly identifying these 53 records and these 11 records, and we are not able to accurately predict these 5 records and these 33 records.

So our accuracy will be 53 plus 11, that is 64, divided by the total number of observations. So you can compute it manually, but you can also use this accuracy_score function. And here also, you first have to provide the actual values and then the predicted values. Let's find out the accuracy.

The accuracy of the model is 62 percent. If you calculate it from here also, you will get the same result.

And one more thing about the confusion matrix: since the actual value of these 11 records is zero, and we are also predicting them as zero, we call them true negatives. That is, the predicted value is zero and the actual value is also zero, which is the negative class. For these 53 records, the actual value is one and the predicted value is also one; therefore, we call them true positives. That is, the actual value is one and we are also predicting it as one, so they are true positives.

Now, for an SVM model, you can also find out the number of support vectors you have in your model.
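The manual accuracy calculation above can be written out as a short check, again using the lecture's counts:

```python
import numpy as np

cm = np.array([[11, 33],
               [ 5, 53]])

# Unpack the four cells: true negatives, false positives,
# false negatives, true positives (row-major order).
tn, fp, fn, tp = cm.ravel()

# Accuracy = (true negatives + true positives) / total observations.
accuracy = (tn + tp) / cm.sum()
print(accuracy)  # 64 / 102, about 0.627
```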
So to find out the number of support vectors, you need to use the attribute n_support_. So just write your model name, and then write this attribute, .n_support_, and run this.

So you will get two numbers. The first number is the number of support vectors for your first class, and the second number is the number of support vectors for your second class. Since we have two classes, zero and one, this is the number of support vectors for the zero class, and this is the number of support vectors for the one class. So in total, we have 375 support vectors.

If you want to learn more, you can also have a look at the official sklearn documentation on support vector classification. There you will get details of all the parameters. We have already discussed a few of the important ones, such as C, kernel, etc.

And then you have attributes also. So after you have trained your model, you can call these attributes to get more information about your model.

We used this attribute to get the number of support vectors. If you want to view your support vectors, you can just write your model name followed by .support_vectors_.
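These two attributes can be sketched on synthetic data (an assumption: the lecture's 375 support vectors come from its own dataset, so the counts here will differ):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
clf = SVC(kernel='linear', C=0.01).fit(X, y)

# Number of support vectors per class, in class order [0, 1].
print(clf.n_support_)

# Total support vectors across both classes.
print(clf.n_support_.sum())

# The support vectors themselves, one row per support vector.
print(clf.support_vectors_.shape)
```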
And it will give you all the support vectors.

So you can have a look at these attributes and parameters to get more information about SVM classification models.

In the next lecture, we will use grid search to optimize this value of C.