1 00:00:00,300 --> 00:00:00,600 All right. 2 00:00:00,600 --> 00:00:01,733 So that's for the training set. 3 00:00:01,733 --> 00:00:04,400 And now let's quickly look at the test set. 4 00:00:04,400 --> 00:00:06,900 All right. This is already executed. Perfect. 5 00:00:06,900 --> 00:00:09,566 So here are the results of Naive Bayes on the test set. 6 00:00:09,566 --> 00:00:13,466 And actually we realize that here we were pretty unlucky. 7 00:00:13,500 --> 00:00:14,933 You know, Naive Bayes was pretty unlucky 8 00:00:14,933 --> 00:00:18,733 because the three predictions that it missed, you know, 9 00:00:19,066 --> 00:00:22,566 this red customer that falls incorrectly in the green region, 10 00:00:22,566 --> 00:00:26,666 and these two green customers that fall incorrectly in the red region, 11 00:00:26,933 --> 00:00:29,900 well, Naive Bayes was pretty unlucky on these 12 00:00:29,900 --> 00:00:32,366 because, you know, they're almost on the prediction boundary, 13 00:00:32,366 --> 00:00:36,000 so it came very close to beating the record. 14 00:00:36,000 --> 00:00:37,166 And same for this one actually. 15 00:00:37,166 --> 00:00:39,766 You know, this one was very narrowly missed. Right. 16 00:00:39,766 --> 00:00:43,133 Because it is an incorrect prediction of a customer who in reality 17 00:00:43,233 --> 00:00:46,866 didn't buy the SUV because it's red, but was predicted to buy the SUV 18 00:00:46,866 --> 00:00:48,866 because it falls in the green region. 19 00:00:48,866 --> 00:00:50,000 So pretty unlucky. 20 00:00:50,000 --> 00:00:52,733 Which explains the accuracy of 90%. 21 00:00:52,733 --> 00:00:56,500 I think that if we tuned Naive Bayes a bit, we could beat the record. 22 00:00:56,733 --> 00:01:01,000 But once again, this will be important. And now the next step will be to build 23 00:01:01,000 --> 00:01:03,166 the last two classification models, 24 00:01:03,166 --> 00:01:06,766 which belong to a branch of machine learning called ensemble models. 
25 00:01:07,033 --> 00:01:09,166 I'm talking, of course, about decision tree 26 00:01:09,166 --> 00:01:12,166 classification and random forest classification. 27 00:01:12,200 --> 00:01:15,900 And actually, with these tools, and especially random forest, 28 00:01:15,900 --> 00:01:19,166 we might have a chance to beat that record accuracy. 29 00:01:19,166 --> 00:01:21,433 You know, we might beat 93%. 30 00:01:21,433 --> 00:01:24,466 So that's what's on the menu for the end of part three. 31 00:01:24,500 --> 00:01:26,266 So I can't wait to see you 32 00:01:26,266 --> 00:01:29,933 in the next part to first build the decision tree classification model 33 00:01:30,066 --> 00:01:33,066 and then finally the random forest classification model. 34 00:01:33,500 --> 00:01:35,300 And until then, enjoy machine learning.
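
The link made in this video between missed predictions and the accuracy figure can be sketched in a few lines of plain Python. This is only an illustrative sketch: the helper names `confusion_counts` and `accuracy` and the toy labels are hypothetical and are not the course's code or dataset. Following the video's color convention, label 0 stands for a red customer (did not buy) and label 1 for a green customer (bought).

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count correct and incorrect predictions for binary labels (0 or 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tn": 0, "fp": 0, "fn": 0, "tp": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["tp"] += 1  # green customer in the green region
        elif actual == 0 and predicted == 0:
            counts["tn"] += 1  # red customer in the red region
        elif actual == 0 and predicted == 1:
            counts["fp"] += 1  # red customer that falls in the green region
        else:
            counts["fn"] += 1  # green customer that falls in the red region
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Share of predictions that were correct."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no predictions to score")
    return (counts["tp"] + counts["tn"]) / total


# Hypothetical toy labels (NOT the course dataset): 1 = bought, 0 = did not buy.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(accuracy(counts))
```

Every point that lands on the wrong side of the prediction boundary adds to `fp` or `fn` and lowers the accuracy, however close to the boundary it sits.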