1 00:00:00,466 --> 00:00:00,766 All right. 2 00:00:00,766 --> 00:00:03,666 So that was the name of the module you had to find. 3 00:00:03,666 --> 00:00:07,100 And now the next question is which of these classes. 4 00:00:07,100 --> 00:00:10,933 Because you see, these are all the classes of this neighbors module. 5 00:00:10,933 --> 00:00:13,933 That's actually the name of the module: sklearn.neighbors. 6 00:00:14,233 --> 00:00:18,066 And these are all the classes that allow you to build machine learning tools, 7 00:00:18,066 --> 00:00:21,066 you know, in this nearest neighbors branch of machine learning. 8 00:00:21,600 --> 00:00:21,900 All right. 9 00:00:21,900 --> 00:00:27,833 So of course, the one we're interested in is this one, KNeighborsClassifier. 10 00:00:28,133 --> 00:00:30,666 There you go. Congratulations if you found it. 11 00:00:30,666 --> 00:00:35,100 So let's click it and let's see the whole documentation. 12 00:00:35,100 --> 00:00:36,466 So feel free to read it if you want. 13 00:00:36,466 --> 00:00:40,033 You can see all the parameters and also the attributes. 14 00:00:40,300 --> 00:00:43,533 But what we actually simply need is this, you know, 15 00:00:43,900 --> 00:00:47,700 the whole name of the class and the module and the library, scikit-learn. 16 00:00:47,700 --> 00:00:52,300 Because the only thing that we really need to build and train the k-NN model 17 00:00:52,466 --> 00:00:55,500 is the name of this class to, you know, create the object, 18 00:00:55,500 --> 00:00:59,366 and also the parameters here. We need to know which parameters 19 00:00:59,366 --> 00:01:03,000 we have to enter here in order to build a relevant k-NN model. 20 00:01:03,000 --> 00:01:03,633 All right. 
21 00:01:03,633 --> 00:01:09,000 So first let's copy this, and let's go back to our implementation to paste it here 22 00:01:09,366 --> 00:01:13,200 and then adapt it by, you know, doing this: from scikit-learn, 23 00:01:13,200 --> 00:01:16,200 and then from the neighbors module of scikit-learn, 24 00:01:16,400 --> 00:01:21,600 we will import this class, the KNeighborsClassifier. 25 00:01:21,833 --> 00:01:22,933 That's the class. 26 00:01:22,933 --> 00:01:27,500 And then, you know, the next natural step is to create an object of this class, 27 00:01:27,500 --> 00:01:32,466 which will represent exactly the k-NN model itself, the classifier. 28 00:01:32,800 --> 00:01:35,800 And that's why we call it classifier. 29 00:01:35,966 --> 00:01:36,600 And then 30 00:01:36,600 --> 00:01:40,266 to create an object of this class, well, we just need to call the class again. 31 00:01:40,466 --> 00:01:44,833 So I'm copying this, pasting it here and then adding some parentheses. 32 00:01:44,966 --> 00:01:45,266 All right. 33 00:01:45,266 --> 00:01:49,000 So that's the first information we need to get from the scikit-learn API. 34 00:01:49,000 --> 00:01:53,700 But then the second thing we need to check also are the parameters here. 35 00:01:53,700 --> 00:01:55,733 And you have all the descriptions here. 36 00:01:55,733 --> 00:01:58,800 So for example, the first one, n_neighbors equals five. 37 00:01:59,000 --> 00:02:02,800 And n_neighbors is of course the number of neighbors of your k-NN model. 38 00:02:02,800 --> 00:02:04,900 You remember from the intuition lectures, 39 00:02:04,900 --> 00:02:07,800 you have the neighbors that you use to make your predictions. 40 00:02:07,800 --> 00:02:08,433 And we have to 41 00:02:08,433 --> 00:02:12,500 choose a number of neighbors, and well, you know, we can just try this value, 42 00:02:12,500 --> 00:02:15,466 five. I actually know that we will get good results with this. 
43 00:02:15,466 --> 00:02:18,500 But you know, in your future machine learning projects, 44 00:02:18,500 --> 00:02:21,666 if you're using a k-NN model, well, I recommend tuning it 45 00:02:21,666 --> 00:02:24,600 with several values, but five is usually good. 46 00:02:24,600 --> 00:02:25,800 So let's do this. 47 00:02:25,800 --> 00:02:29,066 First parameter, n_neighbors 48 00:02:29,366 --> 00:02:32,333 equals five. Good. 49 00:02:32,333 --> 00:02:33,433 Then next parameter. 50 00:02:33,433 --> 00:02:35,866 Let's see, weights equals uniform. 51 00:02:35,866 --> 00:02:37,666 So uniform is the default value of weights. 52 00:02:37,666 --> 00:02:41,000 And weights is the weight function used in prediction. 53 00:02:41,000 --> 00:02:44,466 And well, here we will actually keep the default value, uniform, 54 00:02:44,466 --> 00:02:49,800 which means that all the points in each neighborhood are weighted equally. Okay. 55 00:02:49,800 --> 00:02:51,600 So they have the same importance. 56 00:02:51,600 --> 00:02:53,366 So we will keep that. That's fine. 57 00:02:53,366 --> 00:02:55,066 Then algorithm equals auto. 58 00:02:55,066 --> 00:02:56,066 What does that mean? 59 00:02:56,066 --> 00:02:59,566 Well, that's basically the algorithm used to compute the nearest neighbors. 60 00:02:59,566 --> 00:03:03,900 And auto is the best value to choose because it will decide automatically 61 00:03:04,066 --> 00:03:08,400 the most appropriate algorithm based on the values passed to the fit method. 62 00:03:08,533 --> 00:03:11,533 You know, the method that trains your model on the training set. 63 00:03:11,533 --> 00:03:16,000 So definitely here it will be simple if we choose auto. And then you have 64 00:03:16,000 --> 00:03:20,633 some other parameters: leaf_size, for which we will keep the default value, and p, 65 00:03:20,900 --> 00:03:24,366 which is the power parameter for the Minkowski metric. 66 00:03:24,366 --> 00:03:25,833 So there we go. That's important. 
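[Editor's sketch] The recommendation above to tune the model with several values of n_neighbors could look roughly like the following. This is only a minimal sketch under stated assumptions: GridSearchCV is not used in the lecture and is swapped in here as one common way to compare values, and the synthetic dataset from make_classification and the names candidate_ks, search and best_k are hypothetical stand-ins for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Assumption: a small synthetic two-feature dataset stands in for the course data.
X, y = make_classification(
    n_samples=200, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

# Hypothetical list of neighbor counts to compare, including the default of 5.
candidate_ks = [1, 3, 5, 7, 9]

# 5-fold cross-validation over n_neighbors; all other parameters keep their defaults.
search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": candidate_ks},
    cv=5,
)
search.fit(X, y)

best_k = search.best_params_["n_neighbors"]
print("best n_neighbors:", best_k)
print("mean cross-validated accuracy:", search.best_score_)
```

Which value comes out best depends entirely on the data, so the result on this synthetic set says nothing about the course dataset.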
67 00:03:25,833 --> 00:03:27,100 Those are the other parameters 68 00:03:27,100 --> 00:03:31,100 we will enter. The last two parameters I actually want to enter are this one, 69 00:03:31,100 --> 00:03:35,733 metric equals minkowski, and p, because indeed metric 70 00:03:35,733 --> 00:03:39,566 is actually the distance you want to use to compute, you know, 71 00:03:39,566 --> 00:03:42,566 the distance between your observation points and the neighbors. 72 00:03:42,733 --> 00:03:45,566 And we actually want to choose the Euclidean distance, 73 00:03:45,566 --> 00:03:48,566 which is, you know, the classic distance equal to the square root 74 00:03:48,566 --> 00:03:51,566 of the sum of the squared differences between the coordinates. 75 00:03:51,833 --> 00:03:55,333 And in order to take that classic Euclidean distance, well, 76 00:03:55,333 --> 00:03:58,700 we have to choose a Minkowski metric with p equals two. 77 00:03:59,100 --> 00:04:02,100 So basically we're keeping all the default 78 00:04:02,100 --> 00:04:05,100 values of this KNeighborsClassifier class. 79 00:04:05,300 --> 00:04:09,966 But in order to make sure that we are using them and just to highlight them, 80 00:04:10,200 --> 00:04:13,500 well, let's just write these parameters with their default values anyway, 81 00:04:13,700 --> 00:04:15,866 because it's important to see what we're dealing with. 82 00:04:15,866 --> 00:04:19,166 You know, what version of k-NN we're dealing with. Okay. 83 00:04:19,166 --> 00:04:20,466 So let's do this quickly. 84 00:04:20,466 --> 00:04:24,500 Metric equals minkowski. 85 00:04:24,966 --> 00:04:27,833 And then p equals two. 86 00:04:27,833 --> 00:04:28,800 Perfect. 87 00:04:28,800 --> 00:04:30,233 And so now we have basically 88 00:04:30,233 --> 00:04:34,000 a classic K-nearest neighbors model with five neighbors 89 00:04:34,000 --> 00:04:37,000 and the classic Euclidean distance. Okay. 90 00:04:37,133 --> 00:04:39,766 And now you perfectly know how to finish this. 
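[Editor's sketch] To make the point about the metric concrete, here is a small plain-Python check that the Minkowski distance with p equals two is the classic Euclidean distance, the square root of the sum of the squared coordinate differences. The helper minkowski_distance and the two sample points are hypothetical and written only for illustration; this is not scikit-learn's internal implementation.

```python
import math


def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two points of equal length."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


# Two hypothetical points: an observation and one of its neighbors.
point = (1.0, 2.0)
neighbor = (4.0, 6.0)

# Minkowski distance with p = 2 ...
d_minkowski = minkowski_distance(point, neighbor, p=2)

# ... compared with the Euclidean distance written out directly.
d_euclidean = math.sqrt(sum((x - y) ** 2 for x, y in zip(point, neighbor)))

print(d_minkowski, d_euclidean)  # both are 5.0 for these points
```

With p equals one, the same helper would give the Manhattan distance instead, which is why p is the parameter that selects the Euclidean case.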
91 00:04:39,766 --> 00:04:42,800 The last step here is of course to train our classifier, 92 00:04:42,800 --> 00:04:46,833 which indeed we have built so far but which is not trained yet on the training set. 93 00:04:47,100 --> 00:04:50,700 So that's exactly what we need to do as a final step. 94 00:04:50,966 --> 00:04:51,733 And so there you go. 95 00:04:51,733 --> 00:04:56,866 We take our classifier, from which we're going to call the fit method, 96 00:04:57,133 --> 00:04:59,566 which as usual takes as input, 97 00:04:59,566 --> 00:05:02,566 first, the matrix of features X_train, 98 00:05:03,000 --> 00:05:07,800 and second, the dependent variable vector y_train of the training set. 99 00:05:07,800 --> 00:05:10,300 Of course. All right. Perfect. 100 00:05:10,300 --> 00:05:11,266 And that's it. 101 00:05:11,266 --> 00:05:13,566 You know, we are done with this implementation. 102 00:05:13,566 --> 00:05:15,566 All the rest is the same. 103 00:05:15,566 --> 00:05:19,300 We don't have to change anything else here because indeed, since we called 104 00:05:19,300 --> 00:05:21,233 our k-NN model classifier, 105 00:05:21,233 --> 00:05:22,800 well, here to make the predictions, 106 00:05:22,800 --> 00:05:26,033 we already have the right name of the variable, classifier. 107 00:05:26,266 --> 00:05:29,266 And then same here to predict the test results, classifier. 108 00:05:29,266 --> 00:05:33,200 And then same for the confusion matrix, y_test and y_pred, which result 109 00:05:33,200 --> 00:05:37,566 from our same classifier, and then same for the visualization of the results. 110 00:05:37,566 --> 00:05:38,800 Sorry, I just showed them to you. 111 00:05:38,800 --> 00:05:42,066 I hope you didn't see, but we're going to get to that in a second. 112 00:05:42,333 --> 00:05:42,900 There you go. 113 00:05:42,900 --> 00:05:47,100 Those are the same variable names: classifier, X_train, y_train. 114 00:05:47,100 --> 00:05:48,500 So all the rest is the same. 
115 00:05:48,500 --> 00:05:51,533 And that's why I like to call it a good code template.
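[Editor's sketch] Put together, the steps described in this lecture (import the class, create the classifier with n_neighbors, metric and p written out, fit it on the training set, then reuse the same classifier variable for predictions and the confusion matrix) might look like the code below. The variable names classifier, X_train, y_train, y_test and y_pred follow the lecture; the synthetic data from make_classification and the train_test_split settings are assumptions that stand in for the course dataset and the rest of the template, which are not shown in this transcript.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: synthetic two-feature data replaces the course dataset.
X, y = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Build the k-NN model: five neighbors, Minkowski metric with p = 2 (Euclidean).
# These are the class defaults, written out to make the chosen variant explicit.
classifier = KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2)

# Train the classifier on the training set.
classifier.fit(X_train, y_train)

# Predict the test set results and build the confusion matrix.
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(accuracy_score(y_test, y_pred))
```

Because every later step only refers to the variable classifier, swapping in a different model would only require changing the import and the line that creates the object.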