1
00:00:00,610 --> 00:00:07,660
The maximal margin classifier has two major limitations, due to which it is not used in the majority of

2
00:00:07,660 --> 00:00:08,980
real-world scenarios.

3
00:00:10,420 --> 00:00:17,320
The first major limitation comes from the fact that we said we want to find a linear hyperplane

4
00:00:17,800 --> 00:00:21,340
which perfectly separates the classes in the predictor space.

5
00:00:23,940 --> 00:00:26,880
But what if the classes are not perfectly separable?

6
00:00:27,750 --> 00:00:34,350
That is, there is no hyperplane that can separate these observations perfectly.

7
00:00:35,970 --> 00:00:39,240
This is often the case in real-world problems.

8
00:00:41,500 --> 00:00:45,070
Suppose the observations are as shown in this graph.

9
00:00:46,590 --> 00:00:51,360
Can you find any straight line that can separate the blue dots from the purple dots?

10
00:00:53,720 --> 00:00:54,660
It is not possible.

11
00:00:56,140 --> 00:00:58,000
Then how do we classify in this scenario?

12
00:00:59,110 --> 00:01:02,370
We cannot do it using the maximal margin classifier.

13
00:01:05,420 --> 00:01:09,770
The second major limitation is the sensitivity of this classifier.

14
00:01:11,380 --> 00:01:12,830
Consider this scenario on the left.

15
00:01:14,600 --> 00:01:19,910
We had a distribution on which we drew this hyperplane with the maximal margin.

16
00:01:21,740 --> 00:01:29,000
But notice, in the right figure, that just the introduction of one new point changes the hyperplane

17
00:01:29,120 --> 00:01:29,720
immensely.

18
00:01:31,430 --> 00:01:33,310
This is a completely different classifier.

19
00:01:35,400 --> 00:01:40,150
Also, notice that this new hyperplane will have a very small margin.

20
00:01:41,210 --> 00:01:43,850
These are the points very near to the hyperplane.

21
00:01:44,090 --> 00:01:47,450
So the margin will be very small around this hyperplane.

22
00:01:49,700 --> 00:01:56,240
This is a problem because the distance of an observation from the hyperplane can be considered as a

23
00:01:56,240 --> 00:02:00,430
measure of confidence that the observation was correctly classified.

24
00:02:02,890 --> 00:02:10,390
Also, such sensitivity to one observation makes the maximal margin classifier very prone to overfitting.

25
00:02:11,920 --> 00:02:14,530
Therefore, due to these two limitations,

26
00:02:15,660 --> 00:02:23,920
we will move on from the maximal margin classifier to the support vector classifier, and we will see how the support

27
00:02:23,930 --> 00:02:26,880
vector classifier handles these limitations.