1
00:00:01,290 --> 00:00:08,220
So we have a two-dimensional predictor space and we want to find the hyperplane which can separate

2
00:00:08,310 --> 00:00:12,170
this space so that each class is in a different part.

3
00:00:15,820 --> 00:00:21,790
Probably you'd have guessed that in such a scenario, where the data is perfectly separable,

4
00:00:22,570 --> 00:00:24,730
we can draw infinite hyperplanes.

5
00:00:26,110 --> 00:00:33,670
Just take any hyperplane and make tiny shifts or tiny rotations, and you would get more

6
00:00:33,670 --> 00:00:34,120
hyperplanes.

7
00:00:35,380 --> 00:00:39,570
Here you can see three such hyperplanes in this figure.

8
00:00:41,620 --> 00:00:45,010
But which of these hyperplanes should we choose, and why?

9
00:00:46,510 --> 00:00:52,560
One reasonable choice is selecting the hyperplane which is farthest from the training observations.

10
00:00:54,160 --> 00:00:57,750
That is, we have a hyperplane.

11
00:00:58,780 --> 00:01:07,300
We find the perpendicular distance of all the observations from this hyperplane. The smallest

12
00:01:07,540 --> 00:01:12,250
of these distances is called the margin. In this figure,

13
00:01:12,340 --> 00:01:16,220
you can see that this point is the closest to the hyperplane.

14
00:01:16,930 --> 00:01:21,400
So the distance of this point from the hyperplane is the margin.

15
00:01:22,690 --> 00:01:30,520
In other words, the margin is the minimum perpendicular distance of the observations from the hyperplane.

16
00:01:31,780 --> 00:01:39,390
So whichever hyperplane has the maximum value of the margin, that hyperplane will be selected.

17
00:01:41,320 --> 00:01:47,050
And then whatever is to the left of this plane is classified as blue, or class one,

18
00:01:47,560 --> 00:01:52,480
and whatever is to the right is classified as purple, or class two.

19
00:01:54,540 --> 00:01:58,480
This is known as the maximal margin classifier.
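The margin computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the lecture: the data points, the normal vector `w`, and the offset `b` are made-up values for one candidate hyperplane w·x + b = 0, and the margin is simply the smallest perpendicular distance of the observations from it.

```python
import numpy as np

# Hypothetical 2-D observations (two separable classes) -- made up for illustration.
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])

# One candidate hyperplane w . x + b = 0 (also made up).
w = np.array([1.0, 1.0])   # normal vector of the hyperplane
b = -8.0                   # offset

# Perpendicular distance of each observation from the hyperplane:
# |w . x + b| / ||w||
distances = np.abs(X @ w + b) / np.linalg.norm(w)

# The margin of this hyperplane is the smallest of these distances.
margin = distances.min()
print(margin)
```

Repeating this for every candidate hyperplane and keeping the one with the largest margin is exactly the selection rule the lecture describes.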
20
00:01:59,820 --> 00:02:02,250
You can now see what the name stands for.

21
00:02:03,610 --> 00:02:07,750
Because we choose the hyperplane with the maximum value of the margin,

22
00:02:08,410 --> 00:02:10,670
this is called the maximal margin classifier.

23
00:02:13,120 --> 00:02:14,770
Now, I have chosen the hyperplane.

24
00:02:15,940 --> 00:02:19,630
I have drawn the hyperplane and the two margins on this graph,

25
00:02:20,290 --> 00:02:24,340
and we notice that there are three points which lie on the margin.

26
00:02:26,690 --> 00:02:31,140
If these points were not there, we would have obtained a wider margin.

27
00:02:31,890 --> 00:02:39,360
These points are called support vectors because, in a way, these points are supporting these margin

28
00:02:39,360 --> 00:02:39,990
boundaries.

29
00:02:41,370 --> 00:02:46,440
In fact, if you think about it, the other points are not important anymore.

30
00:02:47,120 --> 00:02:51,420
Our classifier is completely dependent only on these support vectors.

31
00:02:52,960 --> 00:03:01,150
Any slight movement in any of these support vectors would mean that the classifier will change. Identification

32
00:03:01,150 --> 00:03:06,470
of such points, and classification on the basis of only these few points,

33
00:03:07,030 --> 00:03:14,440
is a special characteristic of support vector classifiers and machines, which separates this technique

34
00:03:14,440 --> 00:03:16,570
from any other conventional technique.
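The dependence on support vectors alone can be seen in code. Below is a sketch using scikit-learn (not the lecturer's own code): a linear `SVC` with a very large `C` approximates the hard-margin maximal margin classifier on a made-up separable data set, and its `support_vectors_` attribute exposes exactly the few points the decision boundary depends on.

```python
import numpy as np
from sklearn.svm import SVC

# Toy separable data -- made up for illustration.
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 1.0],
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves essentially no slack, approximating the
# hard-margin (maximal margin) classifier.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# Only these points determine the hyperplane; moving any other
# point slightly would leave the classifier unchanged.
print(clf.support_vectors_)

# For the fitted hyperplane w . x + b = 0, the margin on each
# side of the boundary is 1 / ||w||.
w = clf.coef_[0]
print(1.0 / np.linalg.norm(w))
```

Refitting after deleting a non-support point reproduces the same hyperplane, which is the "other points are not important anymore" observation from the lecture.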