1 00:00:00,166 --> 00:00:02,166 It beats, by very little, 2 00:00:02,166 --> 00:00:04,800 you know, the accuracy of the kernel SVM. 3 00:00:04,800 --> 00:00:07,800 The kernel SVM made eight incorrect predictions, 4 00:00:07,800 --> 00:00:08,866 five plus three. 5 00:00:08,866 --> 00:00:13,600 And the decision tree classification made four plus three, so seven, incorrect 6 00:00:13,600 --> 00:00:18,300 predictions, resulting in an accuracy almost equal to 96%. 7 00:00:18,600 --> 00:00:19,800 Well, that's really, really good. 8 00:00:19,800 --> 00:00:22,333 You know, this is actually the first time that I'm doing this, 9 00:00:22,333 --> 00:00:26,233 you know, trying this breast cancer dataset with you, with all these models. 10 00:00:26,233 --> 00:00:29,133 I tried it once with logistic regression in another course. 11 00:00:29,133 --> 00:00:32,100 But this is the first time that I'm implementing all these models 12 00:00:32,100 --> 00:00:35,400 and deploying them to do a model selection process on this dataset. 13 00:00:35,666 --> 00:00:38,900 So, you know, that shows you how confident I am in these code templates. 14 00:00:38,900 --> 00:00:40,966 I never tried them before on this dataset. 15 00:00:40,966 --> 00:00:44,466 And now I'm making this demo with you for the first time on this dataset. 16 00:00:45,000 --> 00:00:46,600 All right. Hence the excitement. 17 00:00:46,600 --> 00:00:50,433 And now the final one: I'm very curious to see if we can still beat it 18 00:00:50,433 --> 00:00:52,000 with random forest classification. 19 00:00:52,000 --> 00:00:57,000 So let's do this: Run all, and the final accuracy is, whoa. 20 00:00:57,400 --> 00:01:00,000 Oh. All right, okay. So I'm very disappointed. 21 00:01:00,000 --> 00:01:03,400 The random forest classification really did mess up the teamwork.
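[Editor's note] The accuracy comparison quoted earlier for the kernel SVM and the decision tree can be reproduced with a few lines of arithmetic. This is only a minimal sketch: the error counts (5 + 3 and 4 + 3) come from the video, but the test-set size is not stated in this section, so `N_TEST = 171` is an assumption chosen only because it puts the decision tree close to the quoted 96%. The helper name `accuracy` is also hypothetical.

```python
def accuracy(n_errors: int, n_total: int) -> float:
    """Fraction of correct predictions, given the number of misclassified samples."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_errors <= n_total:
        raise ValueError("n_errors must be between 0 and n_total")
    return (n_total - n_errors) / n_total


# ASSUMPTION: the transcript does not give the test-set size.
# 171 is an illustrative value, not a figure from the video.
N_TEST = 171

# Misclassified samples per model, as quoted in the video
# (the two off-diagonal entries of each confusion matrix).
errors = {
    "Kernel SVM": 5 + 3,
    "Decision Tree": 4 + 3,
}

# Convert error counts to accuracies and pick the best model so far.
accuracies = {name: accuracy(e, N_TEST) for name, e in errors.items()}
best_model = max(accuracies, key=accuracies.get)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.1%}")
print("Best so far:", best_model)
```

With a single extra correct prediction separating the two models, the gap is well under one percentage point, which is why the speaker says the decision tree wins "by very little".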
22 00:01:03,400 --> 00:01:05,333 And that's another exception to the rule because, 23 00:01:05,333 --> 00:01:08,333 you know, usually teamwork is better than individual work. 24 00:01:08,366 --> 00:01:12,500 But no, we had a very powerful decision tree classification model before, 25 00:01:12,600 --> 00:01:15,500 and it didn't need anybody else to perform well. 26 00:01:15,500 --> 00:01:15,866 All right. 27 00:01:15,866 --> 00:01:16,866 So that's very interesting. 28 00:01:16,866 --> 00:01:18,866 Actually, this is a very surprising result. 29 00:01:18,866 --> 00:01:20,366 But that's why it's really, 30 00:01:20,366 --> 00:01:24,400 really important to try all your models and have these very efficient code 31 00:01:24,400 --> 00:01:27,400 templates that allow you to do a quick and efficient 32 00:01:27,400 --> 00:01:30,700 model selection process to quickly figure out the best model. 33 00:01:31,100 --> 00:01:31,600 All right. 34 00:01:31,600 --> 00:01:33,833 You have to understand that there is no rule of thumb. 35 00:01:33,833 --> 00:01:37,033 And for some other datasets, with other machine learning problems, 36 00:01:37,200 --> 00:01:40,500 the best model may well be a different one of these five models. 37 00:01:40,633 --> 00:01:40,966 All right. 38 00:01:40,966 --> 00:01:45,300 So it was really important for me to show you this and to give you this. Okay. 39 00:01:45,300 --> 00:01:47,066 So now, there we go, my friends. 40 00:01:47,066 --> 00:01:50,100 We are at the end of part three, classification. 41 00:01:50,200 --> 00:01:53,400 Big congratulations to you for, well, completing this part 42 00:01:53,400 --> 00:01:56,300 three and making such good progress in this course. 43 00:01:56,300 --> 00:01:57,366 Now, in the next part, 44 00:01:57,366 --> 00:02:01,700 we will start clustering, which will be our first look at unsupervised learning.
45 00:02:01,800 --> 00:02:04,533 Let me remind you that the difference between supervised and unsupervised 46 00:02:04,533 --> 00:02:07,266 is that with supervised learning, you know what to predict. 47 00:02:07,266 --> 00:02:09,366 You know which dependent variable to predict. 48 00:02:09,366 --> 00:02:12,066 And with unsupervised learning, you don't know what to predict, 49 00:02:12,066 --> 00:02:14,533 and you will actually have to find some patterns in the data 50 00:02:14,533 --> 00:02:18,033 to figure out something you could predict as a dependent variable, 51 00:02:18,033 --> 00:02:21,900 one that you don't have a priori and that you create a posteriori. 52 00:02:22,066 --> 00:02:25,500 All right, so no worries, we will see all this in the next part. 53 00:02:25,700 --> 00:02:28,500 And until then, I think you deserve a good break now. 54 00:02:28,500 --> 00:02:32,366 So get some good rest, relax a bit, and as soon as you're back 55 00:02:32,366 --> 00:02:36,566 with some great energy, well, join us in the next part to tackle clustering. 56 00:02:36,933 --> 00:02:38,600 And until then, enjoy machine learning.
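[Editor's note] The supervised versus unsupervised distinction described above can be illustrated with a small, dependency-free sketch. The video only says that clustering comes next; using a tiny one-dimensional k-means here is an assumption made for illustration, and the function name `kmeans_1d` and the sample values are hypothetical. The point is that the supervised data arrives with a dependent variable, while in the unsupervised case the labels are created after the fact from patterns in the data.

```python
def kmeans_1d(values, k=2, n_iter=20):
    """Tiny 1-D k-means. Returns (centroids, labels).

    Illustrative only: deterministic initialisation, fixed iteration count.
    """
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly across the range of the data.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels


# Supervised: every observation comes with a known dependent variable y.
supervised_data = [(1.0, "benign"), (5.0, "malignant")]  # hypothetical (x, y) pairs

# Unsupervised: only the features are available, with no y to predict.
measurements = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]  # hypothetical values

# The cluster labels play the role of a dependent variable created a posteriori.
centroids, labels = kmeans_1d(measurements, k=2)
print("centroids:", centroids)
print("labels:", labels)
```

The resulting cluster indices carry no meaning by themselves; interpreting what each group represents is left to whoever analyses the data afterwards.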