1
00:00:00,980 --> 00:00:10,170
In this video, we will learn how to build a KNN-based classifier. The KNN classifier uses a function

2
00:00:10,500 --> 00:00:15,540
called knn, and this function is part of the class package.

3
00:00:16,650 --> 00:00:21,040
You can see on the right there is a class package. If it is installed,

4
00:00:21,090 --> 00:00:22,020
it will be shown here.

5
00:00:22,110 --> 00:00:24,960
If it is not installed, you first have to install it

6
00:00:25,140 --> 00:00:28,500
using the install.packages function. Once you have installed it,

7
00:00:28,680 --> 00:00:30,930
just tick it here so that it is active.

8
00:00:33,890 --> 00:00:38,610
So this function, the knn function, requires four inputs.

9
00:00:40,320 --> 00:00:45,570
The first input is a matrix containing the predictors associated with the training data.

10
00:00:47,250 --> 00:00:49,890
That is, all the independent variables.

11
00:00:50,220 --> 00:00:57,210
The X variables should be segregated from the training data and put into a separate variable.

12
00:00:57,800 --> 00:01:00,060
I'll name this variable trainX.

13
00:01:04,170 --> 00:01:05,610
So our trainX variable

14
00:01:06,420 --> 00:01:14,700
will have all the variables of the training data set except the Sold variable. The Sold variable is in

15
00:01:14,700 --> 00:01:16,890
the sixteenth column position.

16
00:01:17,010 --> 00:01:20,340
If you're not sure about it, you can open that data,

17
00:01:23,950 --> 00:01:28,210
scroll to the right and hover over the name of the variable.

18
00:01:28,840 --> 00:01:29,950
This is column 16.

19
00:01:30,940 --> 00:01:32,610
So we will remove the 16th column.

20
00:01:34,170 --> 00:01:37,240
To do that, we will write train,

21
00:01:37,600 --> 00:01:44,350
and then, in square brackets, the first parameter will be blank.

22
00:01:44,600 --> 00:01:49,040
That is, we want all rows. Then a comma, then minus 16.

23
00:01:49,450 --> 00:01:51,610
That is, we drop the 16th column.

24
00:01:51,670 --> 00:01:54,940
Everything else we keep.

25
00:01:56,560 --> 00:02:01,120
The second parameter we require is the test predictor variables.

26
00:02:01,240 --> 00:02:02,320
That is testX.

27
00:02:05,930 --> 00:02:14,510
These observations will be classified by the KNN classifier. So this testX will

28
00:02:14,510 --> 00:02:18,890
be created from the test set, and the test set

29
00:02:19,790 --> 00:02:22,340
also has this Sold variable, which has to be removed.

30
00:02:22,400 --> 00:02:24,530
So we'll remove it by writing minus 16.

31
00:02:27,620 --> 00:02:32,030
The third parameter the knn function will take is a vector containing

32
00:02:34,550 --> 00:02:36,740
class labels for the training observations.

33
00:02:37,130 --> 00:02:37,880
What does that mean?

34
00:02:40,010 --> 00:02:46,400
It contains the Y variable, that is, the Sold variable, for all the training observations.

35
00:02:46,760 --> 00:02:47,810
So trainy

36
00:02:51,400 --> 00:02:55,640
is equal to train dollar Sold.

37
00:02:57,020 --> 00:03:01,850
So this is the dependent variable for which we know the values.

38
00:03:05,450 --> 00:03:10,370
And the testy variable, again, will be the dependent variable of the test set.

39
00:03:11,360 --> 00:03:19,060
This will be used to compare the performance of the predicted values of Y against this test set's

40
00:03:19,060 --> 00:03:19,280
Y values.

41
00:03:21,420 --> 00:03:26,250
But this testy is not the fourth parameter. The fourth parameter is k.

42
00:03:26,620 --> 00:03:29,760
When I taught you K-nearest neighbors,

43
00:03:29,880 --> 00:03:32,880
I told you that we fix the number of nearest neighbors

44
00:03:32,990 --> 00:03:35,970
we consider. That is the value of k.

45
00:03:36,690 --> 00:03:38,660
We have to input that value of k.

46
00:03:39,540 --> 00:03:43,220
So here we will use a k value of three first.

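The splitting steps described so far can be sketched in R roughly as follows. This is a minimal sketch, not the video's actual script: the `train` and `test` data frames are made-up stand-ins built by a hypothetical helper `make_data`, and the only things taken from the narration are the label name `Sold`, its position in column 16, and the variable names `trainX`, `testX`, `trainy` and `testy`.

```r
# Hypothetical stand-in data: 15 numeric predictors plus a label in column 16.
# The data set used in the video is not shown here, so these values are made up.
set.seed(1)
make_data <- function(n) {
  df <- as.data.frame(matrix(rnorm(n * 15), nrow = n))
  df$Sold <- factor(sample(c(0, 1), n, replace = TRUE))
  df
}
train <- make_data(100)
test  <- make_data(40)

# A blank before the comma keeps all rows; -16 drops the 16th column.
trainX <- train[, -16]
testX  <- test[, -16]

# Class labels: used for fitting (trainy) and for evaluation only (testy).
trainy <- train$Sold
testy  <- test$Sold
```

Negative indexing drops by position, so it is worth confirming with `names(train)[16]` that column 16 really is the label before relying on it.
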
47
00:03:43,760 --> 00:03:46,920
We'll create a variable called k, equal to three.

48
00:03:49,140 --> 00:03:57,720
I also told you that since the KNN classifier uses distances, it is important that we standardize these

49
00:03:57,720 --> 00:04:03,200
variables so that all the variables have an equal impact in terms of their scale.

50
00:04:05,440 --> 00:04:09,030
So to standardize the variables, we use a function called scale.

51
00:04:10,470 --> 00:04:14,040
So the standardized version of trainX will be trainX

52
00:04:14,280 --> 00:04:14,910
underscore s,

53
00:04:21,080 --> 00:04:25,590
which is equal to scale, and within brackets

54
00:04:25,640 --> 00:04:28,780
we will give this trainX variable.

55
00:04:33,250 --> 00:04:35,270
Similarly, we will do it for testX.

56
00:04:36,860 --> 00:04:44,300
We do not need to do this for the Y variables because they are categorical and they do not need scaling.

57
00:04:53,150 --> 00:04:58,250
One last thing that we have to do before running the KNN classifier is setting the seed.

58
00:04:58,760 --> 00:05:06,770
As I told you, whenever there is a tie while assigning the class to an observation, R assigns the class

59
00:05:06,860 --> 00:05:07,430
randomly.

60
00:05:08,060 --> 00:05:15,260
So when you are doing it and when I am doing it, we should both get the same results. To do that,

61
00:05:15,500 --> 00:05:19,910
we will set a seed by writing set dot seed, zero.

62
00:05:21,470 --> 00:05:24,920
If you do this, your result and my result will be exactly the same.

63
00:05:25,490 --> 00:05:28,520
So for the reproducibility of the result, we are setting the seed.

64
00:05:31,310 --> 00:05:33,820
Now we are ready to run the KNN classifier.

65
00:05:36,440 --> 00:05:38,230
We will write knn dot pred.

66
00:05:38,600 --> 00:05:43,790
This is the variable name which will contain the result of the KNN model.

67
00:05:45,350 --> 00:05:46,790
It starts with the knn function.

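The standardization and seed steps described above can be sketched as follows. The predictor columns `price` and `area` are hypothetical and exist only to show two variables on very different scales; `scale()` and `set.seed()` are the base R functions named in the narration.

```r
# Hypothetical predictors on very different scales (made-up values).
set.seed(1)
trainX <- data.frame(price = rnorm(100, 50, 10), area = rnorm(100, 2000, 500))
testX  <- data.frame(price = rnorm(40, 50, 10),  area = rnorm(40, 2000, 500))

# Number of nearest neighbours to consider.
k <- 3

# scale() centres each column to mean 0 and rescales it to standard deviation 1.
trainX_s <- scale(trainX)
testX_s  <- scale(testX)

# Fix the random number generator so that random tie-breaking is repeatable.
set.seed(0)
```

Here each set is standardized with its own column means and standard deviations, as in the video. A common variant, not used here, reuses the training set's centre and scale for the test set so that both are transformed identically.
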
68
00:05:48,680 --> 00:05:53,720
The first parameter is trainX underscore s.

69
00:05:57,030 --> 00:05:59,520
The second parameter is testX underscore s.

70
00:06:05,090 --> 00:06:08,030
The third parameter is trainy.

71
00:06:15,200 --> 00:06:19,070
And the last parameter is the k value, so k is equal to k.

72
00:06:21,630 --> 00:06:23,400
So this k is

73
00:06:24,450 --> 00:06:30,500
the parameter name for this function, and this k is the variable name that I have assigned.

74
00:06:31,110 --> 00:06:34,840
So here you can put k equal to one or k equal to three directly also.

75
00:06:35,460 --> 00:06:42,240
I've just separated it here so that whenever I change the value of k here, I can

76
00:06:42,340 --> 00:06:43,320
rerun the whole analysis.

77
00:06:43,440 --> 00:06:46,950
So let's run this.

78
00:06:50,710 --> 00:06:53,300
So the model's predictions are stored in knn.pred.

79
00:06:54,700 --> 00:07:01,410
If you want to create the confusion matrix, we will use this knn.pred and the testy variable. So

80
00:07:01,490 --> 00:07:05,040
we write table, knn.pred,

81
00:07:09,480 --> 00:07:10,920
comma testy.

82
00:07:11,610 --> 00:07:12,550
So knn.pred

83
00:07:12,580 --> 00:07:18,070
has the predicted values, and testy has the actual values.

84
00:07:23,800 --> 00:07:26,950
So here is our confusion matrix using the KNN classifier.

85
00:07:28,550 --> 00:07:38,840
You can see we are getting 66 out of 120 correct predictions and 54 incorrect predictions.

86
00:07:40,820 --> 00:07:43,940
Now let's change the value of k from three to one

87
00:07:52,100 --> 00:07:53,040
and then run this again.

88
00:07:55,220 --> 00:08:00,950
You can see that now we are getting only fifty-nine correct predictions on the test set.

89
00:08:01,850 --> 00:08:04,670
Earlier we were getting 66 correct predictions.

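The knn call and the confusion matrix described above might look like this in R. It is a sketch on made-up data: the two predictors `x1` and `x2`, the helper `make_data` and the `conf_mat` and `accuracy` names are assumptions, so the counts it prints will not match the 66 and 59 quoted in the video. The `knn()` call itself uses the `class` package's arguments in the order given in the narration: training predictors, test predictors, training labels, then `k`.

```r
library(class)  # provides knn()

# Hypothetical data: two classes whose predictors are shifted apart.
set.seed(1)
make_data <- function(n) {
  y <- sample(c(0, 1), n, replace = TRUE)
  data.frame(x1 = rnorm(n) + 2 * y, x2 = rnorm(n) + 2 * y, Sold = factor(y))
}
train <- make_data(100)
test  <- make_data(40)

# In this toy data the label is column 3, not column 16.
trainX_s <- scale(train[, -3])
testX_s  <- scale(test[, -3])
trainy <- train$Sold
testy  <- test$Sold

k <- 3
set.seed(0)
knn.pred <- knn(trainX_s, testX_s, trainy, k = k)

# Rows are predicted classes, columns are actual classes.
conf_mat <- table(knn.pred, testy)
print(conf_mat)

# Correct predictions sit on the diagonal of the confusion matrix.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

Keeping `k` in its own variable, as the narration suggests, means that changing one line and rerunning the script is enough to compare different neighbour counts.
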
90
00:08:06,170 --> 00:08:12,410
You can see that by changing the value of k, that is, by increasing the flexibility of the KNN model,

91
00:08:13,130 --> 00:08:16,010
we get a different accuracy for our KNN model.

92
00:08:18,750 --> 00:08:22,560
So you can see this is the template of the K-nearest neighbors model.

93
00:08:23,820 --> 00:08:26,760
We first get this class package.

94
00:08:28,020 --> 00:08:31,800
Then we create trainX, testX, trainy and testy.

95
00:08:32,490 --> 00:08:34,200
And we assign a k value.

96
00:08:34,620 --> 00:08:40,110
These four, trainX, testX, trainy and k, will go into the knn function to give us the predicted values.

97
00:08:41,430 --> 00:08:47,130
Remember to standardize the values of all the independent variables before putting them into the KNN

98
00:08:47,130 --> 00:08:47,370
model.

99
00:08:48,240 --> 00:08:48,690
That's it

100
00:08:48,990 --> 00:08:49,590
for this video.
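
The template summarized above can be wrapped in a small function to compare several values of k in one go. This is only an illustrative sketch: `knn_accuracy`, `make_data` and the toy columns are hypothetical names that do not appear in the video, and the data is made up.

```r
library(class)  # provides knn()

# Hypothetical helper (not from the video): runs the template for one k and
# returns the share of test observations that were classified correctly.
knn_accuracy <- function(train, test, label_col, k) {
  trainX_s <- scale(train[, -label_col])  # standardize the predictors
  testX_s  <- scale(test[, -label_col])
  set.seed(0)                             # repeatable tie-breaking
  pred <- knn(trainX_s, testX_s, train[[label_col]], k = k)
  mean(as.character(pred) == as.character(test[[label_col]]))
}

# Made-up data with the label in column 3.
set.seed(1)
make_data <- function(n) {
  y <- sample(c(0, 1), n, replace = TRUE)
  data.frame(x1 = rnorm(n) + 2 * y, x2 = rnorm(n) + 2 * y, Sold = factor(y))
}
train <- make_data(100)
test  <- make_data(40)

# Accuracy for k = 1 and k = 3, in that order.
k_values <- c(1, 3)
accuracies <- sapply(k_values, function(k) knn_accuracy(train, test, 3, k))
print(data.frame(k = k_values, accuracy = accuracies))
```

A smaller k gives a more flexible model, so the accuracy on the test set can move in either direction when k changes, which is why it is worth trying more than one value.
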