1
00:00:00,466 --> 00:00:00,766
All right.

2
00:00:00,766 --> 00:00:03,666
So that was the name of the module
you had to find.

3
00:00:03,666 --> 00:00:07,100
And now the next question is:
which of these classes?

4
00:00:07,100 --> 00:00:10,933
Because you see these are all the classes
of this neighbors module.

5
00:00:10,933 --> 00:00:13,933
That's actually the name of the module,
sklearn.neighbors.

6
00:00:14,233 --> 00:00:18,066
And these are all the classes that allow
you to build machine learning tools.

7
00:00:18,066 --> 00:00:21,066
You know, in this nearest neighbors
branch of machine learning.

8
00:00:21,600 --> 00:00:21,900
All right.

9
00:00:21,900 --> 00:00:27,833
So of course, the one we're interested in
is this one, KNeighborsClassifier.

10
00:00:28,133 --> 00:00:30,666
There you go. Congratulations
if you found it.

11
00:00:30,666 --> 00:00:35,100
So let's click it
and let's see the whole documentation.

12
00:00:35,100 --> 00:00:36,466
So feel free to read it if you want.

13
00:00:36,466 --> 00:00:40,033
You can see what are all the parameters
and also attributes.

14
00:00:40,300 --> 00:00:43,533
But what we actually simply need is this,
you know,

15
00:00:43,900 --> 00:00:47,700
the whole name of the class, the module,
and the library, scikit-learn.

16
00:00:47,700 --> 00:00:52,300
Because the only thing that we really need
to build and train the k-NN model

17
00:00:52,466 --> 00:00:55,500
is the name of this class
to, you know, create the object

18
00:00:55,500 --> 00:00:59,366
and also the parameters here
we need to know which parameters

19
00:00:59,366 --> 00:01:03,000
we have to enter here
in order to build a relevant k-NN model.

20
00:01:03,000 --> 00:01:03,633
All right.

21
00:01:03,633 --> 00:01:09,000
So first let's copy this, go back
to our implementation, paste it here,

22
00:01:09,366 --> 00:01:13,200
and then adapt it by,
you know, doing this: from scikit-learn,

23
00:01:13,200 --> 00:01:16,200
and then from the neighbors
module of scikit-learn,

24
00:01:16,400 --> 00:01:21,600
we will import this class,
KNeighborsClassifier.

25
00:01:21,833 --> 00:01:22,933
That's the class.

26
00:01:22,933 --> 00:01:27,500
And then, you know, the next natural step
is to create an object of this class

27
00:01:27,500 --> 00:01:32,466
which will represent exactly the k-NN
model itself, the classifier.

28
00:01:32,800 --> 00:01:35,800
And that's why we call it classifier.

29
00:01:35,966 --> 00:01:36,600
And then

30
00:01:36,600 --> 00:01:40,266
to create an object of this class,
well we just need to call the class again.

31
00:01:40,466 --> 00:01:44,833
So I'm copying this, pasting it here,
and then adding some parentheses.
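A minimal sketch of the import and the object creation described so far. It assumes scikit-learn is installed; the `defaults` variable and the printing loop are only there for illustration and are not part of the lecture's code.

```python
# Import the class from the neighbors module of scikit-learn.
from sklearn.neighbors import KNeighborsClassifier

# Empty parentheses: every parameter keeps its default value.
classifier = KNeighborsClassifier()

# get_params() returns the parameter values the object was built with.
defaults = classifier.get_params()
for name in ("n_neighbors", "weights", "algorithm", "leaf_size", "metric", "p"):
    print(name, "=", defaults[name])
```

Printing the defaults this way is a quick check of which parameters still need a decision before training.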

32
00:01:44,966 --> 00:01:45,266
All right.

33
00:01:45,266 --> 00:01:49,000
So that's the first piece of information
we need to get from the scikit-learn API.

34
00:01:49,000 --> 00:01:53,700
But then the second thing we need to check
also are the parameters here.

35
00:01:53,700 --> 00:01:55,733
And you have all the descriptions here.

36
00:01:55,733 --> 00:01:58,800
So for example,
the first one, n_neighbors equals five.

37
00:01:59,000 --> 00:02:02,800
n_neighbors is of course the number
of neighbors of your k-NN model.

38
00:02:02,800 --> 00:02:04,900
You remember from the intuition lectures:

39
00:02:04,900 --> 00:02:07,800
you have the neighbors that you use
to make your predictions.

40
00:02:07,800 --> 00:02:08,433
And we have to

41
00:02:08,433 --> 00:02:12,500
choose a number of neighbors and well,
you know, we can just try this value.

42
00:02:12,500 --> 00:02:15,466
Five. I actually know that
we will get good results with this.

43
00:02:15,466 --> 00:02:18,500
But you know, in your future machine
learning projects,

44
00:02:18,500 --> 00:02:21,666
if you're using a k-NN model, well,
I recommend tuning it

45
00:02:21,666 --> 00:02:24,600
with several values,
but five is usually good.
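One possible way to try several values of n_neighbors is a cross-validated grid search. This is only a sketch: the iris dataset stands in for your own data, and the candidate values and the 5-fold split are assumptions, not choices made in the lecture.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset (assumption): replace with your own features and labels.
X, y = load_iris(return_X_y=True)

# Try a few candidate neighbor counts with 5-fold cross-validation.
search = GridSearchCV(
    KNeighborsClassifier(metric="minkowski", p=2),
    param_grid={"n_neighbors": [3, 5, 7, 9]},
    cv=5,
)
search.fit(X, y)

# Best candidate and its mean cross-validated accuracy.
print(search.best_params_, round(search.best_score_, 3))
```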

46
00:02:24,600 --> 00:02:25,800
So let's do this.

47
00:02:25,800 --> 00:02:29,066
First parameter: n_neighbors

48
00:02:29,366 --> 00:02:32,333
equals five. Good.

49
00:02:32,333 --> 00:02:33,433
Then next parameter.

50
00:02:33,433 --> 00:02:35,866
Let's see: weights equals uniform.

51
00:02:35,866 --> 00:02:37,666
So uniform
is the default value of weights.

52
00:02:37,666 --> 00:02:41,000
And weights is the weight
function used in prediction.

53
00:02:41,000 --> 00:02:44,466
And, well, here we will actually keep
the default value, uniform,

54
00:02:44,466 --> 00:02:49,800
which means that all the points in each
neighborhood are weighted equally. Okay.

55
00:02:49,800 --> 00:02:51,600
So they have the same importance.
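To illustrate what equal weighting means, here is a small pure-Python sketch of a uniform vote. The function name `uniform_vote` and the neighbor labels are hypothetical; scikit-learn does this internally in its own way.

```python
from collections import Counter


def uniform_vote(neighbor_labels):
    """Return the most common label; every neighbor counts exactly once."""
    return Counter(neighbor_labels).most_common(1)[0][0]


# Hypothetical class labels of the 5 nearest neighbors of a new point.
neighbor_labels = [1, 0, 1, 1, 0]
predicted = uniform_vote(neighbor_labels)
print(predicted)  # class 1 wins with 3 votes against 2
```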

56
00:02:51,600 --> 00:02:53,366
So we will keep that. That's fine.

57
00:02:53,366 --> 00:02:55,066
Then algorithm equals auto.

58
00:02:55,066 --> 00:02:56,066
What does that mean?

59
00:02:56,066 --> 00:02:59,566
Well that's basically the algorithm used
to compute the nearest neighbors.

60
00:02:59,566 --> 00:03:03,900
And auto is the best value to choose
because it will decide automatically

61
00:03:04,066 --> 00:03:08,400
the most appropriate algorithm based on
the values passed to the fit method.

62
00:03:08,533 --> 00:03:11,533
You know, the method that trains
your model on the training set.

63
00:03:11,533 --> 00:03:16,000
So definitely here it will be simple
if we choose auto. And then you have

64
00:03:16,000 --> 00:03:20,633
some other parameters: leaf_size, for
which we'll keep the default value, and p,

65
00:03:20,900 --> 00:03:24,366
which is the power parameter
for the Minkowski metric.

66
00:03:24,366 --> 00:03:25,833
So there we go. That's important.

67
00:03:25,833 --> 00:03:27,100
That's one of the other parameters

68
00:03:27,100 --> 00:03:31,100
we will enter. The last two parameters
I actually want to enter are this one,

69
00:03:31,100 --> 00:03:35,733
metric equals minkowski, and p,
because indeed metric

70
00:03:35,733 --> 00:03:39,566
is actually the distance
you want to use to compute, you know,

71
00:03:39,566 --> 00:03:42,566
the distance between your observation
points and the neighbors.

72
00:03:42,733 --> 00:03:45,566
And we actually want to choose
the Euclidean distance,

73
00:03:45,566 --> 00:03:48,566
which is, you know, the classic distance
equal to the square root

74
00:03:48,566 --> 00:03:51,566
of the sum of the squared differences
between the coordinates

75
00:03:51,833 --> 00:03:55,333
and in order to take that classic
Euclidean distance, well,

76
00:03:55,333 --> 00:03:58,700
we have to choose a Minkowski metric
with p equals two.
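A quick pure-Python check of the claim that the Minkowski metric with p equals two is the Euclidean distance. The helper `minkowski` and the two sample points are written here for illustration only.

```python
import math


def minkowski(a, b, p):
    """Minkowski distance of order p between two points of equal length."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


point_a = (0.0, 0.0)
point_b = (3.0, 4.0)

# p = 2: square root of the sum of the squared coordinate differences.
d_p2 = minkowski(point_a, point_b, p=2)
d_euclid = math.sqrt(sum((x - y) ** 2 for x, y in zip(point_a, point_b)))

# p = 1 gives the Manhattan distance instead, for comparison.
d_p1 = minkowski(point_a, point_b, p=1)

print(d_p2, d_euclid, d_p1)
```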

77
00:03:59,100 --> 00:04:02,100
So basically we're keeping all the default

78
00:04:02,100 --> 00:04:05,100
values of this k
neighbors classifier class.

79
00:04:05,300 --> 00:04:09,966
But in order to make sure that we are
using them and just to highlight them,

80
00:04:10,200 --> 00:04:13,500
well, let's just write these parameters
with their default values anyway

81
00:04:13,700 --> 00:04:15,866
because it's important
to see what we're dealing with.

82
00:04:15,866 --> 00:04:19,166
You know, what version of k-NN
we're dealing with. Okay.

83
00:04:19,166 --> 00:04:20,466
So let's do this quickly.

84
00:04:20,466 --> 00:04:24,500
Metric equals Minkowski.

85
00:04:24,966 --> 00:04:27,833
And then p equals two.

86
00:04:27,833 --> 00:04:28,800
Perfect.

87
00:04:28,800 --> 00:04:30,233
And so now we have basically

88
00:04:30,233 --> 00:04:34,000
a classic K-nearest neighbors model
with five neighbors.

89
00:04:34,000 --> 00:04:37,000
And the classic Euclidean distance. Okay.

90
00:04:37,133 --> 00:04:39,766
And now you perfectly know
how to finish this.

91
00:04:39,766 --> 00:04:42,800
The last step here is
of course to train our classifier,

92
00:04:42,800 --> 00:04:46,833
which indeed we built so far but
is not trained yet on the training set.

93
00:04:47,100 --> 00:04:50,700
So that's exactly what we need to do
as a final step.

94
00:04:50,966 --> 00:04:51,733
And so there you go.

95
00:04:51,733 --> 00:04:56,866
We call our classifier from which
we're going to call our fit method,

96
00:04:57,133 --> 00:04:59,566
which, as usual, takes as input:

97
00:04:59,566 --> 00:05:02,566
First, the matrix of features, X_train.

98
00:05:03,000 --> 00:05:07,800
And second, the dependent variable vector,
y_train, of the training set.

99
00:05:07,800 --> 00:05:10,300
Of course. All right. Perfect.

100
00:05:10,300 --> 00:05:11,266
And that's it.

101
00:05:11,266 --> 00:05:13,566
You know
we are done with this implementation.

102
00:05:13,566 --> 00:05:15,566
All the rest is the same.

103
00:05:15,566 --> 00:05:19,300
We don't have to change anything else here
because indeed since we called

104
00:05:19,300 --> 00:05:21,233
our k-NN model classifier,

105
00:05:21,233 --> 00:05:22,800
well, here, to make the predictions,

106
00:05:22,800 --> 00:05:26,033
we already have the right name
of the variable classifier.

107
00:05:26,266 --> 00:05:29,266
And then same here
to predict the test results: classifier.

108
00:05:29,266 --> 00:05:33,200
And then same for the confusion
matrix: y_test and y_pred, which results

109
00:05:33,200 --> 00:05:37,566
from our same classifier, and then same
for the visualization of the results.

110
00:05:37,566 --> 00:05:38,800
Sorry, I just showed them to you.

111
00:05:38,800 --> 00:05:42,066
I hope you didn't see, but
we're going to get to that in a second.

112
00:05:42,333 --> 00:05:42,900
There you go.

113
00:05:42,900 --> 00:05:47,100
Those are the same names
of the variables: classifier, X_train, y_train.

114
00:05:47,100 --> 00:05:48,500
So all the rest is the same.

115
00:05:48,500 --> 00:05:51,533
And that's why
I like to call it a good code template.
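Putting the steps together, a minimal end-to-end sketch might look like this. The toy data below is invented so the example is self-contained; in the course template, X_train, y_train, X_test and y_test come from the earlier preprocessing steps.

```python
from sklearn.metrics import confusion_matrix
from sklearn.neighbors import KNeighborsClassifier

# Invented toy data (assumption): two well-separated groups of points.
X_train = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 2], [2, 0],
           [10, 10], [10, 11], [11, 10], [11, 11], [10, 12], [12, 10]]
y_train = [0] * 6 + [1] * 6
X_test = [[0.5, 0.5], [10.5, 10.5]]
y_test = [0, 1]

# Build the model with the defaults written out explicitly, then train it.
classifier = KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2)
classifier.fit(X_train, y_train)

# The rest of the template only relies on the variable name `classifier`.
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(y_pred)
print(cm)
```

Because the later steps only refer to `classifier`, swapping in another scikit-learn classifier would leave the prediction and confusion matrix lines unchanged.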