1 00:00:00,333 --> 00:00:00,633 Okay. 2 00:00:00,633 --> 00:00:02,800 So now we have it, so now we can use it. 3 00:00:02,800 --> 00:00:06,433 And therefore just below we will actually call it. 4 00:00:06,700 --> 00:00:10,400 And in order to do this efficiently we can just take the example here. 5 00:00:10,566 --> 00:00:13,600 And right below 6 00:00:13,633 --> 00:00:17,500 we paste it and we just replace once again y_true 7 00:00:17,500 --> 00:00:19,466 by y_test, 8 00:00:19,466 --> 00:00:24,066 so that we can get indeed the accuracy on the test set. Okay. 9 00:00:24,066 --> 00:00:25,533 And here we don't have to do a print 10 00:00:25,533 --> 00:00:30,233 because this accuracy_score will directly return that accuracy, 11 00:00:30,233 --> 00:00:32,266 you know, that rate of correct predictions. 12 00:00:32,266 --> 00:00:33,066 So there you go. 13 00:00:33,066 --> 00:00:36,266 We can just play this cell, you know, run this cell. 14 00:00:36,266 --> 00:00:38,333 And here you go. Congratulations. 15 00:00:38,333 --> 00:00:42,833 You have here the confusion matrix showing that we have indeed 65 16 00:00:42,833 --> 00:00:46,833 correct predictions of the class zero, meaning the customers of the test set 17 00:00:46,833 --> 00:00:48,533 who didn't buy the new SUV. 18 00:00:48,533 --> 00:00:53,366 Then 24 correct predictions of the class one, meaning 19 00:00:53,600 --> 00:00:56,966 correct predictions of the customers who bought the SUV, 20 00:00:57,300 --> 00:01:01,033 and then three incorrect predictions of the class one, 21 00:01:01,033 --> 00:01:04,033 meaning three customers who 22 00:01:04,166 --> 00:01:07,666 in reality bought the SUV but were predicted not to. 23 00:01:08,033 --> 00:01:13,000 And finally eight incorrect predictions of the class zero, meaning 24 00:01:13,333 --> 00:01:16,433 eight customers who in reality didn't buy 25 00:01:16,433 --> 00:01:19,433 the SUV but were predicted to buy it.
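To make the reading of those four numbers concrete, here is a minimal plain-Python sketch of what a confusion matrix and an accuracy score compute. It is not the notebook cell from the lesson: the label lists are rebuilt from the counts read aloud, the helper names `confusion_matrix_2x2` and `accuracy` are hypothetical, and the layout (rows = actual class, columns = predicted class) is an assumption, so the two off-diagonal cells may sit the other way round in the matrix shown on screen. In the notebook the equivalent scikit-learn calls would presumably be `confusion_matrix(y_test, y_pred)` and `accuracy_score(y_test, y_pred)`, where the name `y_pred` is also an assumption.

```python
# Counts read aloud in the lesson (class 0 = did not buy, class 1 = bought).
correct_0 = 65   # did not buy, predicted not to buy
correct_1 = 24   # bought, predicted to buy
missed_1 = 3     # bought, but predicted not to buy
wrong_1 = 8      # did not buy, but predicted to buy

# Rebuild hypothetical label lists that match those counts.
y_test = [0] * correct_0 + [1] * correct_1 + [1] * missed_1 + [0] * wrong_1
y_pred = [0] * correct_0 + [1] * correct_1 + [0] * missed_1 + [1] * wrong_1


def confusion_matrix_2x2(actual, predicted):
    """Return a 2x2 table with rows = actual class, columns = predicted class."""
    table = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        table[a][p] += 1
    return table


def accuracy(actual, predicted):
    """Rate of correct predictions: matching labels divided by all observations."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


cm = confusion_matrix_2x2(y_test, y_pred)
acc = accuracy(y_test, y_pred)
print(cm)
print(acc)
```

The diagonal of the table holds the correct predictions (65 and 24), and the accuracy is simply their sum divided by the 100 observations of the test set.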
26 00:01:19,533 --> 00:01:23,000 Okay, so you see, the confusion matrix has no mystery. 27 00:01:23,000 --> 00:01:26,033 It's very easy to read, and in a flash we can indeed 28 00:01:26,033 --> 00:01:29,200 get the main information of our predictions. 29 00:01:29,800 --> 00:01:34,100 And finally, that little number that we have here is of course the accuracy. 30 00:01:34,366 --> 00:01:37,566 And here we get 0.89, 31 00:01:37,566 --> 00:01:42,400 which means that we had 89% of correct predictions in the test set. 32 00:01:42,600 --> 00:01:46,400 And actually remember that there are 100 observations in the test set, 33 00:01:46,566 --> 00:01:50,400 which means that we had indeed 89 correct predictions. 34 00:01:50,400 --> 00:01:54,833 Actually, 65 here plus 24 is equal indeed to 89. 35 00:01:54,966 --> 00:01:55,333 All right. 36 00:01:55,333 --> 00:01:58,433 But for any size of the test set, this would mean that 37 00:01:58,433 --> 00:02:01,433 you had 89% of correct predictions. 38 00:02:01,433 --> 00:02:03,300 And that's exactly the accuracy. 39 00:02:03,300 --> 00:02:06,433 The accuracy is the rate of correct predictions. 40 00:02:07,166 --> 00:02:07,600 All right. 41 00:02:07,600 --> 00:02:12,066 So now you know how to, you know, quickly evaluate a classification model. 42 00:02:12,066 --> 00:02:13,866 You know the accuracy is usually 43 00:02:13,866 --> 00:02:17,900 the right metric to use when evaluating your classification models. 44 00:02:18,066 --> 00:02:20,133 So now you have it in the toolkit. 45 00:02:20,133 --> 00:02:23,766 And so here we go for our final step of the journey. 46 00:02:23,933 --> 00:02:26,933 We're going to visualize not only the training 47 00:02:26,933 --> 00:02:30,000 set results but also the test set results.
48 00:02:30,000 --> 00:02:33,766 And this will be super interesting because we will actually see 49 00:02:34,033 --> 00:02:36,800 how the logistic regression classifier 50 00:02:36,800 --> 00:02:40,200 was actually trained to classify our customers, 51 00:02:40,200 --> 00:02:45,366 you know, our observations, into the two different classes, you know, 0 or 1. 52 00:02:45,766 --> 00:02:49,433 We will have super nice results showing all the real results 53 00:02:49,433 --> 00:02:51,000 in both the training set and the test 54 00:02:51,000 --> 00:02:55,333 set, and also the prediction regions, you know, the regions where 55 00:02:55,466 --> 00:02:59,466 our predictions are zero and the other region where the predictions are one. 56 00:02:59,800 --> 00:03:03,000 And you will see that the curve that separates 57 00:03:03,000 --> 00:03:06,166 these two regions is exactly the classification curve. 58 00:03:06,166 --> 00:03:09,866 And you will see that it will be a straight line for linear models 59 00:03:09,966 --> 00:03:13,066 and something other than a straight line for nonlinear models. 60 00:03:13,066 --> 00:03:15,600 I really, really can't wait to show you this 61 00:03:15,600 --> 00:03:19,666 because you will really see and visualize the difference between 62 00:03:19,666 --> 00:03:23,466 linear classification models and nonlinear classification models. 63 00:03:23,833 --> 00:03:24,600 So there you go. 64 00:03:24,600 --> 00:03:27,466 Take a little break now, and we'll tackle together 65 00:03:27,466 --> 00:03:30,933 this final step to visualize indeed these two results. 66 00:03:30,933 --> 00:03:32,900 And until then, enjoy machine learning.
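As a small preview of why the separating curve is a straight line for a linear model, here is a hedged sketch using only the standard library. It assumes a logistic regression on two features with made-up coefficients `W1`, `W2` and `B` (illustrative values, not taken from the lesson or from any trained model), and the function names are hypothetical. The prediction switches from one class to the other exactly where the linear score `W1*x1 + W2*x2 + B` equals zero, and that set of points is a line.

```python
import math

# Hypothetical coefficients of a two-feature logistic regression
# (illustrative values only, not taken from the lesson).
W1, W2, B = 1.5, -2.0, 0.5


def probability(x1, x2):
    """Sigmoid of the linear score W1*x1 + W2*x2 + B."""
    return 1.0 / (1.0 + math.exp(-(W1 * x1 + W2 * x2 + B)))


def predict(x1, x2):
    """Class 1 when the probability reaches 0.5, otherwise class 0."""
    return 1 if probability(x1, x2) >= 0.5 else 0


def boundary_x2(x1):
    """x2 value on the boundary, where the linear score is exactly zero."""
    return -(W1 * x1 + B) / W2


# Sample a few points on the boundary and measure the slope between neighbours.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
boundary = [(x1, boundary_x2(x1)) for x1 in xs]
slopes = [
    (boundary[i + 1][1] - boundary[i][1]) / (boundary[i + 1][0] - boundary[i][0])
    for i in range(len(boundary) - 1)
]

for x1, x2 in boundary:
    # One unit below the line and one unit above it fall in different regions.
    print(x1, x2, predict(x1, x2 - 1.0), predict(x1, x2 + 1.0))
print(slopes)
```

Every slope between neighbouring boundary points is the same, which is what makes the two prediction regions half-planes split by a straight line. A nonlinear classifier has no such constraint, so its boundary can bend.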