1 00:00:00,166 --> 00:00:02,800 Hello and welcome to this R tutorial. 2 00:00:02,800 --> 00:00:06,266 So in the previous tutorial we predicted our test results 3 00:00:06,266 --> 00:00:10,500 and now in this tutorial we are going to evaluate those predictions 4 00:00:10,866 --> 00:00:14,133 by making the confusion matrix, which will count 5 00:00:14,133 --> 00:00:17,700 the number of correct predictions and the number of incorrect predictions. 6 00:00:18,200 --> 00:00:18,866 So let's do it. 7 00:00:18,866 --> 00:00:20,900 Let's make the matrix. It's very simple. 8 00:00:20,900 --> 00:00:23,166 It will just take us one line. 9 00:00:23,166 --> 00:00:26,166 So let's call it cm, equals. 10 00:00:26,633 --> 00:00:30,866 Then very practically we're going to use the table function in R. 11 00:00:31,533 --> 00:00:35,433 And in this table I will first input the real values, 12 00:00:35,433 --> 00:00:39,466 which are the test_set, brackets. 13 00:00:39,466 --> 00:00:42,933 And then I have to select the column of the real results, which is, 14 00:00:43,466 --> 00:00:47,700 if we go to the test set, this column, because this column contains 15 00:00:47,700 --> 00:00:48,600 the real results: 16 00:00:48,600 --> 00:00:53,533 whether the user bought the SUV, yes or no. And this column has index three. 17 00:00:53,733 --> 00:00:55,566 So here I will put 18 00:00:56,566 --> 00:00:59,266 comma and three. 19 00:00:59,266 --> 00:01:00,833 All right. So that's my first argument. 20 00:01:00,833 --> 00:01:02,300 That's the real values. 21 00:01:02,300 --> 00:01:06,300 And then as the second argument I'm going to input the predicted values, 22 00:01:06,633 --> 00:01:09,633 which is of course the y_pred vector here. 23 00:01:09,833 --> 00:01:13,066 So here I'm going to add y_pred. 24 00:01:13,766 --> 00:01:17,000 And I don't have to select any index because y_pred is already 25 00:01:17,000 --> 00:01:19,033 the vector of predictions.
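The one-line call described in the narration can be sketched in R roughly as follows. Only the shape of the call, `table(test_set[, 3], y_pred)`, comes from the narration. The small data frame, its column names, and the prediction values are made-up stand-ins (assumptions), not the tutorial's actual dataset.

```r
# Hypothetical stand-in for the tutorial's test set: the third column
# holds the real outcome (0 = did not buy, 1 = bought).
test_set <- data.frame(
  Age = c(25, 40, 35, 52, 29, 47),
  EstimatedSalary = c(30000, 90000, 60000, 120000, 45000, 80000),
  Purchased = c(0, 1, 0, 1, 0, 1)
)

# Hypothetical predictions for the same six observations, in the same order.
y_pred <- c(0, 1, 1, 1, 0, 0)

# Confusion matrix: the first argument (real values) gives the rows,
# the second argument (predictions) gives the columns.
cm <- table(test_set[, 3], y_pred)
print(cm)
```

Both arguments need to be the same length and in the same observation order, since `table` pairs them element by element.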
26 00:01:19,033 --> 00:01:24,266 So in short this is the vector of real values for all the observations. 27 00:01:24,633 --> 00:01:27,533 And this is the vector of predictions 28 00:01:27,533 --> 00:01:30,533 for the same observations as in this vector. 29 00:01:31,433 --> 00:01:33,033 Okay. And that's it. 30 00:01:33,033 --> 00:01:34,966 The confusion matrix is ready. 31 00:01:34,966 --> 00:01:37,966 So now let's select this line and execute. 32 00:01:38,500 --> 00:01:41,500 Here it is, table created. 33 00:01:41,600 --> 00:01:44,533 Now let's go to the console and have a look at it: 34 00:01:44,533 --> 00:01:47,266 cm, enter. 35 00:01:47,266 --> 00:01:47,933 And here it is. 36 00:01:48,933 --> 00:01:50,700 The most important thing to understand here 37 00:01:50,700 --> 00:01:55,700 is that the 57 and the 26 here are the correct predictions. 38 00:01:56,166 --> 00:01:59,866 And the ten and the seven here are the incorrect predictions. 39 00:02:00,733 --> 00:02:05,333 So what's interesting here at first sight is that the classifier made 40 00:02:05,666 --> 00:02:09,000 57 plus 26 equals 83 41 00:02:09,300 --> 00:02:14,233 correct predictions, and ten plus seven equals 17 incorrect predictions. 42 00:02:14,766 --> 00:02:15,166 All right. 43 00:02:15,166 --> 00:02:17,900 So 17 incorrect predictions on the test set. 44 00:02:17,900 --> 00:02:20,133 That's not bad. But we can do better. 45 00:02:20,133 --> 00:02:23,133 And we will do better with other classifiers. 46 00:02:23,400 --> 00:02:25,800 You will see that in the next sections. Okay. 47 00:02:25,800 --> 00:02:28,333 So we're done with the confusion matrix. 48 00:02:28,333 --> 00:02:30,866 And finally now it's time for the best part, 49 00:02:30,866 --> 00:02:34,666 because in the next tutorial we will be graphically looking at our results: 50 00:02:34,933 --> 00:02:39,500 we will plot this very cool chart that will allow us to make an awesome 51 00:02:39,500 --> 00:02:41,300 interpretation of the results.
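The counts read off the confusion matrix (57 and 26 correct, ten and seven incorrect) can be checked with a little arithmetic. The sketch below rebuilds the matrix by hand instead of from the tutorial's data. Which off-diagonal cell holds the ten and which holds the seven is an assumption, since the narration does not say. The `actual` and `predicted` labels are also illustrative names.

```r
# Counts quoted in the narration. The placement of 10 and 7 in the
# off-diagonal cells is assumed; only the diagonal matters for accuracy.
cm <- matrix(
  c(57, 10, 7, 26),  # filled column by column
  nrow = 2,
  dimnames = list(actual = c("0", "1"), predicted = c("0", "1"))
)

correct <- sum(diag(cm))        # 57 + 26
incorrect <- sum(cm) - correct  # 10 + 7
accuracy <- correct / sum(cm)   # share of correct predictions

print(c(correct = correct, incorrect = incorrect, accuracy = accuracy))
```

With 100 test observations in total, the 83 correct predictions correspond to an accuracy of 0.83.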
52 00:02:41,300 --> 00:02:43,700 So I look forward to seeing you in the next tutorial 53 00:02:43,700 --> 00:02:46,133 where we will make this chart. 54 00:02:46,133 --> 00:02:47,833 Until then, enjoy machine learning.