1 00:00:00,066 --> 00:00:03,033 And now we're actually going to understand why. 2 00:00:03,033 --> 00:00:07,300 Why didn't the SVM model beat the K-NN? 3 00:00:07,566 --> 00:00:10,200 Okay. So we'll actually figure that out in a bit. 4 00:00:10,200 --> 00:00:11,166 Or actually, you know, 5 00:00:11,166 --> 00:00:15,433 I can just show it to you here on the original implementation of the SVM. 6 00:00:15,666 --> 00:00:18,900 But you're going to understand now why it didn't beat it. 7 00:00:19,100 --> 00:00:20,333 Well, there you go. 8 00:00:20,333 --> 00:00:24,400 It didn't beat it because once again, since we chose a linear kernel, 9 00:00:24,566 --> 00:00:26,366 well, the prediction boundary 10 00:00:26,366 --> 00:00:30,166 or, you know, the decision boundary is, once again, a straight line. 11 00:00:30,166 --> 00:00:34,666 And therefore even if you rotate it either this way or this way, well, 12 00:00:34,666 --> 00:00:37,800 it won't be able to catch the right predictions for, you know, 13 00:00:37,800 --> 00:00:41,100 these green customers here, which should belong to the green region. 14 00:00:41,100 --> 00:00:41,966 And same for this one. 15 00:00:41,966 --> 00:00:45,300 You know, if we rotate it this way, well, we will catch these ones. 16 00:00:45,300 --> 00:00:48,300 But then we will catch more incorrect predictions around here. 17 00:00:48,400 --> 00:00:51,900 And if we rotate this way, you know, from here to here, well, indeed 18 00:00:51,900 --> 00:00:54,233 we will catch these ones in the right prediction region, 19 00:00:54,233 --> 00:00:56,800 but we will get more incorrect ones here. 20 00:00:56,800 --> 00:00:59,733 So that's the problem of linear models here. 
21 00:00:59,733 --> 00:01:00,833 Indeed, it's much better 22 00:01:00,833 --> 00:01:04,800 to have a prediction boundary that does some kind of a curve here 23 00:01:04,900 --> 00:01:09,066 in order to catch only the red customers who should be predicted 24 00:01:09,100 --> 00:01:13,100 not to buy the SUV, and leave the green ones in the green region. 25 00:01:13,200 --> 00:01:13,866 All right. 26 00:01:13,866 --> 00:01:17,900 That's exactly what we got with our K-nearest neighbors, right? 27 00:01:18,200 --> 00:01:22,133 It is not a smooth curve, but it did the job of selecting the right 28 00:01:22,200 --> 00:01:23,700 red customers here. 29 00:01:23,700 --> 00:01:24,966 Still leaving some green ones, 30 00:01:24,966 --> 00:01:28,033 but here catching the right green ones in the right region. 31 00:01:28,033 --> 00:01:28,966 The green region. 32 00:01:28,966 --> 00:01:32,166 And here of course, since we have this straight line, it's impossible to do. 33 00:01:32,466 --> 00:01:34,000 But no worries. 34 00:01:34,000 --> 00:01:38,433 I'm sure you have the intuition that once we use a nonlinear kernel, 35 00:01:38,466 --> 00:01:42,600 well, we'll be able to catch the right green observations here in the right 36 00:01:42,600 --> 00:01:43,466 green region. 37 00:01:43,466 --> 00:01:47,300 And well, that's exactly what we'll find out in the next section 38 00:01:47,433 --> 00:01:50,433 when implementing the kernel SVM model. 39 00:01:50,900 --> 00:01:52,866 All right. So let's see if we're done here. 40 00:01:52,866 --> 00:01:53,566 Yes we are. 41 00:01:53,566 --> 00:01:55,800 So it has finished executing. 42 00:01:55,800 --> 00:01:57,566 And of course we get the same results. 43 00:01:57,566 --> 00:02:00,366 And let's see for the test set. Still running. But 44 00:02:01,500 --> 00:02:03,600 and oh well okay. 45 00:02:03,600 --> 00:02:05,066 So funny timing. 
46 00:02:05,066 --> 00:02:06,866 It just populated here 47 00:02:06,866 --> 00:02:09,866 and, well, you know, it's still the same because of this straight line here. 48 00:02:09,866 --> 00:02:12,600 Well, we have some green customers, you know, customers who 49 00:02:12,600 --> 00:02:17,166 in reality bought the SUV but couldn't be predicted correctly because they fall 50 00:02:17,166 --> 00:02:22,000 in the wrong region because of this prediction boundary that cannot separate 51 00:02:22,000 --> 00:02:24,466 well our two classes. All right. 52 00:02:24,466 --> 00:02:25,333 So same problem. 53 00:02:25,333 --> 00:02:29,000 And you know you might guess that this problem will be fixed once 54 00:02:29,000 --> 00:02:32,700 we choose a nonlinear kernel for our support vector machine model. 55 00:02:32,966 --> 00:02:33,300 All right. 56 00:02:33,300 --> 00:02:37,700 So let's find out about this in the next section on kernel SVM. 57 00:02:37,733 --> 00:02:40,733 And until then enjoy machine learning.
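
The point made in this video, that a straight decision boundary cannot follow a curved class layout however it is rotated, can be tried out with a small sketch. It is not the course's notebook: it assumes scikit-learn is installed and uses the synthetic `make_moons` dataset as a stand-in for the customer data, and the function name `compare_boundaries` and all parameter values are chosen here purely for illustration.

```python
# Illustrative sketch only: synthetic data, not the dataset shown in the video.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_boundaries(seed=0):
    """Return test accuracy for a linear SVM, K-NN and an RBF-kernel SVM."""
    # Two interleaved half-moons: the classes are separated by a curve,
    # so no single straight line can split them cleanly.
    X, y = make_moons(n_samples=400, noise=0.2, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )

    # Fit the scaler on the training set only, then apply it to the test set.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    models = {
        "linear_svm": SVC(kernel="linear", random_state=seed),  # straight boundary
        "knn": KNeighborsClassifier(n_neighbors=5),              # jagged, curved boundary
        "rbf_svm": SVC(kernel="rbf", random_state=seed),         # smooth, curved boundary
    }
    return {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, accuracy in compare_boundaries().items():
        print(f"{name}: test accuracy = {accuracy:.3f}")
```

On data like this, the linear kernel should be expected to trail the two models whose boundaries can bend, which is the same pattern described above for the linear SVM and K-NN. The exact numbers depend on the seed and noise level.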