1
00:00:01,050 --> 00:00:05,730
In this video, we will learn the intuition behind the K-nearest neighbors classifier.

2
00:00:07,930 --> 00:00:13,640
KNN is another approach that attempts to approximate the Bayes classifier.

3
00:00:13,900 --> 00:00:20,440
That is, for a given observation, based on the conditional probability of belonging to each class,

4
00:00:21,070 --> 00:00:24,100
it will classify the observation into one of the classes,

5
00:00:24,220 --> 00:00:25,680
as we saw in LDA.

6
00:00:27,600 --> 00:00:31,590
But unlike LDA, KNN is a non-parametric method.

7
00:00:32,180 --> 00:00:33,320
That is, in KNN,

8
00:00:33,470 --> 00:00:36,520
we do not assume any functional form of the relationship.

9
00:00:37,680 --> 00:00:41,520
Therefore, the final curve can have a very complex shape.

10
00:00:42,940 --> 00:00:48,280
Also, potentially, the final curve can have very high accuracy, too.

11
00:00:49,810 --> 00:00:54,250
Let me explain the concept behind KNN using this simple diagram.

12
00:00:56,580 --> 00:01:03,130
For simplicity, we are assuming that we have only two predictor variables so that I can show them on a

13
00:01:03,130 --> 00:01:03,810
2D plot,

14
00:01:05,170 --> 00:01:08,920
although the same concept can be extended to any number of predictors.

15
00:01:10,790 --> 00:01:18,350
So suppose I have one predictor on the x-axis and the other one on the y-axis, and the color of these

16
00:01:18,440 --> 00:01:24,380
small circles, which represent each data point, is telling us the class of the response variable.

17
00:01:24,500 --> 00:01:27,840
That is, some circles are in orange color,

18
00:01:28,280 --> 00:01:30,860
so that is one class, and some are in blue color.

19
00:01:33,050 --> 00:01:41,620
Now, I have this point, which is marked as X, and I want to classify it into either the blue class or the orange

20
00:01:41,620 --> 00:01:44,780
class. In K-nearest neighbors,

21
00:01:45,640 --> 00:01:47,860
we decide the value of K,

22
00:01:48,310 --> 00:01:54,330
that is, how many points near that particular point we want to consider.

23
00:01:56,950 --> 00:01:59,440
So suppose I take K equal to three.

24
00:01:59,680 --> 00:02:06,790
That is, I will take the three nearest points to that point, and out of those three,

25
00:02:07,210 --> 00:02:10,840
I find the conditional probability of each class, blue or orange.

26
00:02:12,160 --> 00:02:18,310
So if you look at this point, I have drawn the smallest circle which can encompass three points,

27
00:02:18,430 --> 00:02:26,470
so that I have the three nearest neighbors. Out of these three, two belong to the blue category.

28
00:02:27,040 --> 00:02:33,070
Therefore, the conditional probability of blue is two by three, and one belongs to orange.

29
00:02:33,190 --> 00:02:34,520
Therefore, its conditional probability is

30
00:02:34,540 --> 00:02:35,250
one by three.

31
00:02:36,260 --> 00:02:42,230
Since the conditional probability of the blue category is higher, I will assign this point,

32
00:02:42,430 --> 00:02:44,840
marked as a cross, to the blue category.
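
The K equal to three vote described here can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `knn_class_probabilities`, the toy coordinates, and the use of Euclidean distance are assumptions, not something shown in the video.

```python
import math
from collections import Counter


def knn_class_probabilities(train_points, train_labels, query, k):
    """Return {label: fraction of the k nearest neighbors with that label}."""
    # Rank training points by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    counts = Counter(nearest_labels)
    return {label: count / k for label, count in counts.items()}


# Hypothetical toy data: two predictors per point (x-axis, y-axis).
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (4.0, 4.0), (4.5, 3.5), (2.4, 2.2)]
labels = ["blue", "blue", "blue", "orange", "orange", "orange"]
query = (2.2, 2.0)  # the point marked as a cross

probs = knn_class_probabilities(points, labels, query, k=3)
predicted = max(probs, key=probs.get)  # class with the highest estimated probability
print(probs, predicted)
```

With this toy data, two of the three nearest neighbors are blue, so the cross is assigned to the blue category, while K equal to one would pick the single nearest orange point.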

33
00:02:48,500 --> 00:02:54,880
If I had K equal to one, that is, if I decide only on the basis of one nearest neighbor,

34
00:02:55,610 --> 00:03:03,020
then I will find the point nearest to this cross point, which is probably this orange circle.

35
00:03:04,160 --> 00:03:09,200
In that case, I will assign the orange category to this cross.

36
00:03:11,130 --> 00:03:17,720
If we take K equal to two, then these two points will be closest to

37
00:03:17,850 --> 00:03:24,750
this cross point. In that case, the orange category will have a conditional probability of

38
00:03:24,750 --> 00:03:25,090
0.5,

39
00:03:25,170 --> 00:03:26,880
and the blue will also have a conditional

40
00:03:26,880 --> 00:03:29,280
probability of 0.5. In such a case,

41
00:03:29,340 --> 00:03:34,020
our software package will assign the class randomly.

42
00:03:36,810 --> 00:03:44,610
So when I'm running this KNN in my software package, I will be setting the seed. By setting the seed,

43
00:03:44,730 --> 00:03:47,550
we will both be getting the same random solutions.

44
00:03:47,850 --> 00:03:53,460
So whenever the conditional probabilities are the same for two classes and the software package randomly assigns

45
00:03:53,490 --> 00:03:56,340
the class, we both will get the same answers.
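
A minimal sketch of seeded tie-breaking, assuming Python's standard `random` module stands in for whatever random number generator the software package uses. The function name `vote_with_random_ties` and the seed value 42 are hypothetical.

```python
import random
from collections import Counter


def vote_with_random_ties(neighbor_labels, rng):
    """Majority vote over neighbor labels; ties are broken at random using rng."""
    counts = Counter(neighbor_labels)
    top = max(counts.values())
    # Sort the tied labels so the outcome depends only on the seed,
    # not on the order in which the neighbors were listed.
    tied = sorted(label for label, count in counts.items() if count == top)
    return rng.choice(tied)


# K equal to two with one orange and one blue neighbor: probability 0.5 each.
neighbors = ["orange", "blue"]

# Two people using the same seed get the same randomly assigned class.
first = vote_with_random_ties(neighbors, random.Random(42))
second = vote_with_random_ties(neighbors, random.Random(42))
print(first, second, first == second)
```

When there is no tie, the random generator has only one label to choose from, so the seed has no effect on the result.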

46
00:03:56,850 --> 00:04:01,230
So setting the seed will help us in getting the same answers, that is, in reproducing

47
00:04:01,230 --> 00:04:03,320
the results. To draw

48
00:04:03,390 --> 00:04:09,690
this graph, that is, to identify the boundaries of this nearest neighbor classifier,

49
00:04:11,520 --> 00:04:13,940
we have created a grid of points.

50
00:04:14,820 --> 00:04:20,730
So for all the different values of X and Y, we have created a grid of points,

51
00:04:21,000 --> 00:04:24,180
and we assigned a class to each of these points.

52
00:04:24,960 --> 00:04:27,640
So you see all these points are in blue color.

53
00:04:28,110 --> 00:04:33,900
All these points are again in blue color, and these points are in orange color. And all these points

54
00:04:33,900 --> 00:04:41,110
have been assigned classes based on this concept only. When we assign categories to all these grid points,

55
00:04:41,850 --> 00:04:43,980
wherever the color of the grid points is changing,

56
00:04:44,430 --> 00:04:48,310
I've drawn this boundary, and this is how we will get

57
00:04:48,330 --> 00:04:51,050
the boundary of the classifier

58
00:04:51,570 --> 00:04:52,680
in K-nearest neighbors.

59
00:04:57,480 --> 00:05:03,670
One of the most important parameters in K-nearest neighbors is the value of K. K is often called

60
00:05:03,680 --> 00:05:12,450
the hyperparameter of the KNN classifier. K controls the flexibility of this boundary.

61
00:05:13,380 --> 00:05:22,230
So if you look at this curve with K equal to one, my classifier will closely follow each individual

62
00:05:22,230 --> 00:05:22,680
point.

63
00:05:23,010 --> 00:05:30,270
So if I have a blue point here, it will closely follow this and the boundary will be very complicated

64
00:05:31,360 --> 00:05:33,600
and will have a lot of twists and turns.

65
00:05:35,610 --> 00:05:42,650
Whereas if I use a very high value of key, the boundary will not be very sensitive to individual data

66
00:05:42,650 --> 00:05:43,130
points.

67
00:05:45,270 --> 00:05:47,430
It has very few turns.

68
00:05:49,690 --> 00:05:54,710
So the flexibility of this boundary is governed by the value of K.

69
00:05:55,240 --> 00:06:01,900
It is very important that we choose the optimal value of K so that we get the value for which the

70
00:06:01,900 --> 00:06:04,000
test error rate

71
00:06:05,610 --> 00:06:12,240
is at its minimum. So you see, when we have K equal to 100, a lot of points are getting

72
00:06:12,240 --> 00:06:15,420
misclassified. When we have K equal to one,

73
00:06:16,350 --> 00:06:17,100
a lot of points

74
00:06:17,160 --> 00:06:18,780
again get misclassified.

75
00:06:22,760 --> 00:06:27,230
Although the training error rate will be very low when K is equal to one,

76
00:06:27,810 --> 00:06:35,450
my test error rate will be very high, because this curve is too dependent on these individual values

77
00:06:36,380 --> 00:06:42,140
and may not actually be following the true functional relationship between the predictors and the response variable.

78
00:06:42,200 --> 00:06:45,740
So both of these will have

79
00:06:46,360 --> 00:06:47,950
a high error rate on the test data.

80
00:06:49,310 --> 00:06:55,070
It is very important that we get the optimum value of K so that our test error rate is minimum.
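
One way to look for such a K is to compare the training and test error rates over a few candidate values. The sketch below uses only the Python standard library; the synthetic two-class data, the helper names, the seed, and the candidate values of K are all assumptions made for illustration, not the data or package used in the video.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Predict the majority class among the k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def error_rate(train_x, train_y, eval_x, eval_y, k):
    """Fraction of evaluation points that the KNN classifier gets wrong."""
    wrong = sum(
        knn_predict(train_x, train_y, x, k) != y for x, y in zip(eval_x, eval_y)
    )
    return wrong / len(eval_x)


def make_data(n, rng):
    """Hypothetical data: two overlapping Gaussian clouds, one per class."""
    xs, ys = [], []
    for _ in range(n):
        label = rng.choice(["blue", "orange"])
        center = 0.0 if label == "blue" else 1.5
        xs.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        ys.append(label)
    return xs, ys


rng = random.Random(0)  # fixed seed so the run is reproducible
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(200, rng)

results = {}
for k in (1, 5, 15, 51):  # odd values avoid ties between the two classes
    train_err = error_rate(train_x, train_y, train_x, train_y, k)
    test_err = error_rate(train_x, train_y, test_x, test_y, k)
    results[k] = (train_err, test_err)
    print(f"K={k:3d}  training error={train_err:.3f}  test error={test_err:.3f}")

# Pick the candidate K with the lowest test error rate.
best_k = min(results, key=lambda k: results[k][1])
print("K with the lowest test error:", best_k)
```

With K equal to one, each training point is its own nearest neighbor, so the training error is zero even though the test error is usually not. That is why K is chosen on the test error rather than the training error.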

81
00:06:58,150 --> 00:07:04,850
Another important thing to notice: because the KNN classifier predicts the class of a given test observation

82
00:07:05,330 --> 00:07:08,060
by identifying the observations that are nearest to it,

83
00:07:08,900 --> 00:07:11,030
the scale of the variables matters.

84
00:07:12,590 --> 00:07:18,500
Any variables that are on a large scale will have a much larger effect on the distance between the observations,

85
00:07:19,070 --> 00:07:24,000
and hence on the KNN classifier, than the variables that are on a smaller scale.

86
00:07:24,890 --> 00:07:30,260
For instance, imagine a dataset that contains two variables, salary and age.

87
00:07:32,030 --> 00:07:38,300
As far as KNN is concerned, a difference of a thousand dollars in salary is enormous compared to a

88
00:07:38,300 --> 00:07:40,400
difference of 50 years in age.

89
00:07:42,080 --> 00:07:48,380
Consequently, salary will drive the KNN classification results, and age will have almost no effect.

90
00:07:49,910 --> 00:07:55,880
This is contrary to our intuition that a salary difference of a thousand dollars is quite small compared

91
00:07:55,880 --> 00:07:58,100
to an age difference of 50 years.

92
00:07:59,210 --> 00:08:05,150
A good way to handle this problem is to standardize the data so that all variables are given a mean

93
00:08:05,150 --> 00:08:07,670
of zero and a standard deviation of one.

94
00:08:09,350 --> 00:08:16,490
Then all variables will be on a comparable scale. To standardize the data in the background,

95
00:08:16,580 --> 00:08:19,370
our software package will be using a formula like this.

96
00:08:20,480 --> 00:08:22,020
We do not need to bother about it.

97
00:08:23,510 --> 00:08:27,920
I am just showing it for the students who would like to know what happens in the background.

98
00:08:29,630 --> 00:08:34,070
We just follow this formula to standardize all the variables in our dataset.
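
The on-screen formula is not captured in this transcript. A common choice that gives a mean of zero and a standard deviation of one is z = (x - mean) / standard deviation, and the sketch below assumes that form. The salary and age values are hypothetical, and the sample standard deviation is used, which may differ slightly from what a given software package does.

```python
import math
import statistics


def standardize(values):
    """Rescale values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation (n - 1)
    return [(v - mean) / sd for v in values]


def nearest_other(rows, index):
    """Index of the row closest (Euclidean distance) to rows[index], excluding itself."""
    others = [i for i in range(len(rows)) if i != index]
    return min(others, key=lambda i: math.dist(rows[i], rows[index]))


# Hypothetical people: person 1 differs from person 0 by 1000 dollars and 50 years.
salary = [30000.0, 31000.0, 52000.0, 75000.0, 48000.0]
age = [25.0, 75.0, 40.0, 33.0, 58.0]

raw = list(zip(salary, age))
scaled = list(zip(standardize(salary), standardize(age)))

nearest_raw = nearest_other(raw, 0)        # salary dominates the distance
nearest_scaled = nearest_other(scaled, 0)  # both variables now contribute
print("nearest to person 0 on raw data:", nearest_raw)
print("nearest to person 0 on standardized data:", nearest_scaled)
```

On the raw data, the person with a 50-year age gap is still the nearest neighbor because the salary gap is only a thousand dollars. After standardization the age gap counts, and a different person becomes the nearest neighbor.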

99
00:08:35,540 --> 00:08:39,620
We will learn how to standardize variables in the software package

100
00:08:40,090 --> 00:08:40,880
in the upcoming video.