1 00:00:00,610 --> 00:00:07,660 The maximal margin classifier has two major limitations, due to which it is not used in the majority of 2 00:00:07,660 --> 00:00:08,980 real-world scenarios. 3 00:00:10,420 --> 00:00:17,320 The first major limitation comes from the fact that we said we want to find a linear hyperplane 4 00:00:17,800 --> 00:00:21,340 which perfectly separates the classes in the predictor space. 5 00:00:23,940 --> 00:00:26,880 But what if the classes are not perfectly separable? 6 00:00:27,750 --> 00:00:34,350 That is, there is no hyperplane which can separate these observations perfectly. 7 00:00:35,970 --> 00:00:39,240 This is often the case in real-world problems. 8 00:00:41,500 --> 00:00:45,070 Suppose the observations are as shown in this graph. 9 00:00:46,590 --> 00:00:51,360 Can you find any straight line that can separate the blue dots from the purple dots? 10 00:00:53,720 --> 00:00:54,660 It is not possible. 11 00:00:56,140 --> 00:00:58,000 Then how do we classify in this scenario? 12 00:00:59,110 --> 00:01:02,370 We cannot do it using the maximal margin classifier. 13 00:01:05,420 --> 00:01:09,770 The second major limitation is the sensitivity of this classifier. 14 00:01:11,380 --> 00:01:12,830 Consider the scenario on the left. 15 00:01:14,600 --> 00:01:19,910 We had a distribution on which we drew this hyperplane with maximal margin. 16 00:01:21,740 --> 00:01:29,000 But notice in the right figure that, just with the introduction of one new point, the hyperplane changes 17 00:01:29,120 --> 00:01:29,720 immensely. 18 00:01:31,430 --> 00:01:33,310 This is a completely different classifier. 19 00:01:35,400 --> 00:01:40,150 Also, notice that this new hyperplane will have a very small margin. 20 00:01:41,210 --> 00:01:43,850 These are the points very near to the hyperplane. 21 00:01:44,090 --> 00:01:47,450 So the margin will be very small around this hyperplane.
22 00:01:49,700 --> 00:01:56,240 This is a problem because the distance of an observation from the hyperplane can be considered as a 23 00:01:56,240 --> 00:02:00,430 measure of confidence that the observation was correctly classified. 24 00:02:02,890 --> 00:02:10,390 Also, such sensitivity to one observation makes the maximal margin classifier very prone to overfitting. 25 00:02:11,920 --> 00:02:14,530 Therefore, due to these two limitations, 26 00:02:15,660 --> 00:02:23,920 we will move on from the maximal margin classifier to the support vector classifier, and we will see how the support 27 00:02:23,930 --> 00:02:26,880 vector classifier handles these limitations.
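
Both limitations can be illustrated with a minimal sketch in one dimension, where the separating "hyperplane" reduces to a single threshold placed midway between the two closest points of opposite classes. The function name `max_margin_1d` and the sample data are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
def max_margin_1d(neg, pos):
    """Maximal margin classifier for 1-D data (illustrative sketch).

    Assumes the negative class should lie to the left of the positive class.
    Returns (threshold, margin), or None when no threshold separates the classes.
    """
    left = max(neg)    # negative-class point closest to the positive class
    right = min(pos)   # positive-class point closest to the negative class
    if left >= right:
        return None    # classes overlap: no perfectly separating threshold exists
    threshold = (left + right) / 2   # midpoint is equally far from both closest points
    margin = (right - left) / 2      # distance from the threshold to the closest points
    return threshold, margin


# Hypothetical data: two well-separated classes.
neg = [1.0, 2.0, 3.0]
pos = [7.0, 8.0, 9.0]

print(max_margin_1d(neg, pos))           # (5.0, 2.0)

# Sensitivity: one new positive point near the negative class moves the
# threshold and shrinks the margin drastically.
print(max_margin_1d(neg, pos + [3.5]))   # (3.25, 0.25)

# Non-separable case: one negative point inside the positive class.
print(max_margin_1d(neg + [7.5], pos))   # None
```

A single added observation moves the threshold from 5.0 to 3.25 and shrinks the margin from 2.0 to 0.25, and a single overlapping observation leaves no valid threshold at all. These are the two limitations described above, in their simplest form.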