1 00:00:00,566 --> 00:00:00,900 All right. 2 00:00:00,900 --> 00:00:03,500 So that's the only thing we want to input here. 3 00:00:03,500 --> 00:00:05,733 For the rest we're going to keep the default values. 4 00:00:05,733 --> 00:00:08,066 Feel free to read them if you need to learn more. 5 00:00:08,066 --> 00:00:09,966 But that's the main parameter. 6 00:00:09,966 --> 00:00:12,533 That's mostly what we have to select here. 7 00:00:12,533 --> 00:00:16,600 And then remember that we will also add a random state parameter 8 00:00:16,633 --> 00:00:20,666 to make sure that we have the same results displayed in our notebook. 9 00:00:20,933 --> 00:00:22,200 All right. So let's do this. 10 00:00:22,200 --> 00:00:26,533 Criterion equals, in quotes, entropy. 11 00:00:27,200 --> 00:00:28,000 Perfect. 12 00:00:28,000 --> 00:00:34,300 And then the second one, the random state parameter, which we set equal to zero. 13 00:00:34,600 --> 00:00:35,233 Great. 14 00:00:35,233 --> 00:00:38,233 And now, final step, you know exactly what to do. 15 00:00:38,266 --> 00:00:40,533 We take our classifier. 16 00:00:40,533 --> 00:00:44,066 And from this classifier we call the fit method 17 00:00:44,300 --> 00:00:47,166 to train our decision tree classifier 18 00:00:47,166 --> 00:00:51,066 on the training set, which is composed, as expected by the fit method, 19 00:00:51,433 --> 00:00:54,566 of X_train and 20 00:00:55,000 --> 00:00:58,066 y_train, exactly the same as before. 21 00:00:58,266 --> 00:01:01,266 And now once again, we're done very efficiently 22 00:01:01,366 --> 00:01:05,033 with this implementation, so I can't wait to see the results. 23 00:01:05,033 --> 00:01:08,700 I don't think we will beat the accuracy record, but let's see. 24 00:01:08,700 --> 00:01:09,600 We never know. 25 00:01:09,600 --> 00:01:12,066 So let's click this folder button here. 
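[Editor's note] The two steps just described (building the classifier with the entropy criterion and a fixed random state, then calling fit on the training set) might look roughly like the sketch below. It assumes scikit-learn's DecisionTreeClassifier is the class being used; the training data here is a synthetic stand-in generated with NumPy, not the dataset from the video.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training set (an assumption for illustration,
# NOT the Social_Network_Ads.csv data): two numeric features and a 0/1 label.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

# Only criterion and random_state are set; every other parameter keeps its default.
classifier = DecisionTreeClassifier(criterion="entropy", random_state=0)

# fit takes the feature matrix first and the label vector second.
classifier.fit(X_train, y_train)
```

Fixing random_state is what makes repeated runs of the notebook produce the same tree.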
26 00:01:12,066 --> 00:01:15,533 And then, you know, right now it is connecting to a runtime to enable 27 00:01:15,533 --> 00:01:19,833 file browsing so that, you know, we can access your files on your machine. 28 00:01:20,100 --> 00:01:23,033 And in a second we should be able to get the upload button. 29 00:01:23,033 --> 00:01:26,033 There we go. As usual, upload. 30 00:01:26,133 --> 00:01:27,833 And so that's the right data set. 31 00:01:27,833 --> 00:01:29,500 Let me show you the path again. 32 00:01:29,500 --> 00:01:31,200 That's the whole Machine Learning A-Z folder. 33 00:01:31,200 --> 00:01:32,833 Please find it on your machine. 34 00:01:32,833 --> 00:01:35,100 And then we're going to go to part three, classification. 35 00:01:35,100 --> 00:01:36,900 Then decision tree classification. 36 00:01:36,900 --> 00:01:40,733 Then Python and then Social_Network_Ads.csv. 37 00:01:41,500 --> 00:01:43,000 All right, let's press OK. 38 00:01:43,000 --> 00:01:44,166 And now here we go. 39 00:01:44,166 --> 00:01:49,066 We are ready to run all the cells by clicking this runtime button. 40 00:01:49,066 --> 00:01:51,900 And then Run all. All right. 41 00:01:51,900 --> 00:01:54,133 And now it is training the decision tree classification model. 42 00:01:54,133 --> 00:01:55,200 Here we go. 43 00:01:55,200 --> 00:01:58,933 We have it now, you know, with all the default values of the parameters 44 00:01:58,933 --> 00:02:01,933 except criterion, which we set equal to entropy. 45 00:02:02,366 --> 00:02:04,400 Then what about that new result? Great. 46 00:02:04,400 --> 00:02:05,600 We got the right prediction. 47 00:02:05,600 --> 00:02:11,666 Remember that customer of age 30 and estimated salary $87,000 didn't buy 48 00:02:11,666 --> 00:02:15,000 in reality, and was predicted not to buy it either. 49 00:02:15,266 --> 00:02:16,333 So perfect. 
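[Editor's note] Predicting a single new customer, as with the age 30 and $87,000 example, could be sketched as follows. The training rows below are made-up toy values chosen only so the example runs on its own; they are not taken from Social_Network_Ads.csv, and the notebook in the video may apply preprocessing that is not shown here.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy rows of [age, estimated_salary] with made-up purchase labels
# (illustrative assumption, NOT the course dataset).
X_train = [
    [19, 19000], [35, 20000], [26, 43000], [27, 57000],
    [30, 80000], [32, 150000], [47, 25000], [45, 26000],
    [46, 28000], [48, 41000], [49, 88000], [52, 90000],
]
y_train = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

classifier = DecisionTreeClassifier(criterion="entropy", random_state=0)
classifier.fit(X_train, y_train)

# predict expects a 2D input: one row per customer, hence the double brackets.
prediction = classifier.predict([[30, 87000]])
print(prediction)
```

A result of 0 stands for "did not buy" and 1 for "bought" in this toy labelling.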
50 00:02:16,333 --> 00:02:20,666 Then when predicting the test results, we indeed get a lot of good predictions 51 00:02:21,100 --> 00:02:23,600 except some incorrect ones here, for example. 52 00:02:23,600 --> 00:02:26,333 And then, well, it looks actually pretty good. 53 00:02:26,333 --> 00:02:28,200 Maybe, you know, we will beat the accuracy record. 54 00:02:28,200 --> 00:02:31,200 That's another one. All right. Another one. 55 00:02:31,800 --> 00:02:33,800 And okay, let's see, okay. 56 00:02:33,800 --> 00:02:38,300 Because, you know, there are actually also, when you scroll up, some more predictions. 57 00:02:38,300 --> 00:02:39,933 But let's see, I'm very curious actually. 58 00:02:39,933 --> 00:02:41,600 Maybe I spoke too fast. 59 00:02:41,600 --> 00:02:44,700 We're about to find out right now with the confusion matrix. 60 00:02:44,700 --> 00:02:45,666 Are you ready? 61 00:02:45,666 --> 00:02:48,900 The accuracy of the decision tree classification model 62 00:02:49,200 --> 00:02:52,700 is 91%. Wow. 63 00:02:52,700 --> 00:02:57,166 Okay, so it's actually on the podium, you know, right after K-NN 64 00:02:57,166 --> 00:03:02,300 and kernel SVM, which got the best accuracy of 93%. Wow. 65 00:03:02,300 --> 00:03:03,000 So that's really good. 66 00:03:03,000 --> 00:03:07,266 Actually this is really a good sign for Random Forest because random forest 67 00:03:07,266 --> 00:03:10,900 is basically a team of decision trees making the predictions. 68 00:03:10,900 --> 00:03:14,133 And you know how team spirit always improves the results. 69 00:03:14,300 --> 00:03:18,966 So we might have a chance to beat the record accuracy with Random Forest. 70 00:03:19,433 --> 00:03:20,733 So that's pretty exciting. 71 00:03:20,733 --> 00:03:23,733 And now when visualizing the training set results, which we already got, 72 00:03:23,833 --> 00:03:25,600 you know, the execution was not too long. 73 00:03:25,600 --> 00:03:27,100 Let's see what it looks like. 74 00:03:27,100 --> 00:03:30,400 Wow. 
Okay, so that's pretty different from before. 75 00:03:30,400 --> 00:03:33,466 And no wonder it got a pretty good accuracy 76 00:03:33,900 --> 00:03:34,500 because indeed it 77 00:03:34,500 --> 00:03:38,333 looks like it was able to catch, you know, the little observation points 78 00:03:38,333 --> 00:03:41,400 that were really hard to catch by either a straight line, you know, 79 00:03:41,400 --> 00:03:46,833 with linear classifiers or a nice curve like with kernel SVM or Naive Bayes. 80 00:03:47,133 --> 00:03:51,333 Here we actually split this whole grid into smaller subgrids. 81 00:03:51,566 --> 00:03:53,866 And that's because, you know, we have all these splits 82 00:03:53,866 --> 00:03:56,866 in the decision tree classification algorithm. 83 00:03:56,866 --> 00:04:00,800 So no wonder we get all these subgrids and therefore we get separate 84 00:04:00,900 --> 00:04:03,100 prediction regions. It's really interesting. 85 00:04:03,100 --> 00:04:06,300 It captures the observation points very well. 86 00:04:06,600 --> 00:04:10,966 So it catches all the red customers here who in reality didn't buy the SUV. 87 00:04:11,266 --> 00:04:15,700 It also catches all these green customers who in reality bought the SUV 88 00:04:16,033 --> 00:04:20,066 and it catches, you know, these very hard to catch customers here 89 00:04:20,266 --> 00:04:25,333 by creating these subgrids of the grid with the right predicted regions. 90 00:04:25,333 --> 00:04:27,766 So you see how it got that good accuracy. 91 00:04:27,766 --> 00:04:31,233 It really tried to catch everything, even, for example, these green points 92 00:04:31,233 --> 00:04:33,900 that were caught among all these red points, okay. 93 00:04:33,900 --> 00:04:35,800 These red customers, okay. 94 00:04:35,800 --> 00:04:37,400 But let's be careful. 95 00:04:37,400 --> 00:04:40,933 This is the training set, you know, on which the model was trained. 96 00:04:41,200 --> 00:04:43,000 Let's see what happens with the test set. 
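[Editor's note] The reason the splits produce rectangular subgrids can be sketched in plain Python: each split compares one feature against one threshold, so every leaf ends up as a box with sides parallel to the axes. The tree below is written by hand, and its thresholds and labels are hypothetical values for illustration, not the splits learned in the video.

```python
# A hand-written toy tree: each internal node tests one feature against a threshold.
# Thresholds and labels are made up for illustration, not learned from any dataset.
TOY_TREE = {
    "feature": "age", "threshold": 42.5,
    "left": {
        "feature": "salary", "threshold": 90000,
        "left": {"label": 0},
        "right": {"label": 1},
    },
    "right": {"label": 1},
}


def leaf_regions(node, bounds=None):
    """Return a list of (bounds, label); bounds maps each feature to (low, high)."""
    if bounds is None:
        inf = float("inf")
        bounds = {"age": (-inf, inf), "salary": (-inf, inf)}
    if "label" in node:
        return [(bounds, node["label"])]
    low, high = bounds[node["feature"]]
    threshold = node["threshold"]
    # Going left keeps values up to the threshold; going right keeps values above it.
    left_bounds = {**bounds, node["feature"]: (low, min(high, threshold))}
    right_bounds = {**bounds, node["feature"]: (max(low, threshold), high)}
    return (leaf_regions(node["left"], left_bounds)
            + leaf_regions(node["right"], right_bounds))


regions = leaf_regions(TOY_TREE)
for bounds, label in regions:
    print(bounds, "->", label)
```

Each printed entry is one rectangular prediction region; a deeper tree simply cuts these boxes into smaller ones.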
97 00:04:43,000 --> 00:04:45,033 And we already know that we will get good results 98 00:04:45,033 --> 00:04:47,900 because we already know that the accuracy on the test set is 90%. 99 00:04:47,900 --> 00:04:50,000 But still, let's see what we get 100 00:04:50,000 --> 00:04:53,300 with new observations on which the model wasn't trained. 101 00:04:53,933 --> 00:04:54,433 All right. 102 00:04:54,433 --> 00:04:55,466 This is what we get. 103 00:04:55,466 --> 00:04:58,133 And actually here we see things more clearly. 104 00:04:58,133 --> 00:05:01,566 This is the prediction region, you know, which funnily was a good fit 105 00:05:01,566 --> 00:05:04,033 for the training set. But here it is not catching anything. 106 00:05:04,033 --> 00:05:04,833 You know, neither 107 00:05:04,833 --> 00:05:09,600 red customers nor green customers. Here there seem to be two incorrect predictions. 108 00:05:09,600 --> 00:05:11,500 You know, because they fall in the green region. 109 00:05:12,466 --> 00:05:12,966 Then here 110 00:05:12,966 --> 00:05:16,200 that's all good, you know, that's all the customers of young age 111 00:05:16,200 --> 00:05:17,533 and low estimated salary, 112 00:05:17,533 --> 00:05:21,666 who therefore aren't likely to buy the SUV, as is the case here. 113 00:05:21,933 --> 00:05:24,933 And then all these green points are correctly predicted. 114 00:05:25,033 --> 00:05:26,800 This one is incorrectly predicted. 115 00:05:26,800 --> 00:05:30,300 So indeed we have our ten incorrect predictions in all this. 116 00:05:30,766 --> 00:05:32,066 But there you go. 117 00:05:32,066 --> 00:05:32,366 You know, 118 00:05:32,366 --> 00:05:34,033 if I hadn't seen the accuracy first, 119 00:05:34,033 --> 00:05:36,800 I would be afraid that we have some overfitting here. 120 00:05:36,800 --> 00:05:39,033 But no, it doesn't seem to be the case. 121 00:05:39,033 --> 00:05:41,500 Even with new observations of the test set. 
122 00:05:41,500 --> 00:05:44,500 You know, we get great predictions. 123 00:05:44,566 --> 00:05:46,966 But now what I really want to see 124 00:05:46,966 --> 00:05:51,366 is the final accuracy of our final classification model. 125 00:05:51,600 --> 00:05:54,600 Let's find out about this in the next practical activity. 126 00:05:54,833 --> 00:05:56,566 And until then, enjoy machine learning.
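[Editor's note] The accuracy figure read off the confusion matrix in this video is the share of correct predictions among all test observations. A minimal plain-Python sketch of that arithmetic follows; the label lists are short made-up examples, not the test results from the video, and the helper names are hypothetical.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Rows are actual classes (0, 1); columns are predicted classes (0, 1)."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up labels for illustration only.
y_test = [0, 0, 0, 1, 1, 1, 0, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1, 0, 1]

cm = confusion_matrix_2x2(y_test, y_pred)
print(cm)
print(accuracy(cm))
```

The off-diagonal cells count the incorrect predictions, so a quick sanity check is that their sum plus the diagonal equals the size of the test set.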