1 00:00:00,066 --> 00:00:01,033 Naive Bayes is ready. 2 00:00:01,033 --> 00:00:02,900 So now let's move on to decision tree classification. 3 00:00:02,900 --> 00:00:04,066 And there you go. 4 00:00:04,066 --> 00:00:07,700 So that happens when you have too many sessions open on Google Colab. 5 00:00:07,700 --> 00:00:11,533 I left it in on purpose because I'm sure you will also encounter this situation. 6 00:00:11,766 --> 00:00:14,366 So what do we do in this situation? Well, no worries at all. 7 00:00:14,366 --> 00:00:16,133 That's very simple. Since Google 8 00:00:16,133 --> 00:00:20,133 Colab actually allows a maximum of only five sessions to run at the same time, 9 00:00:20,166 --> 00:00:23,166 well, what we'll just do here is that for decision tree 10 00:00:23,166 --> 00:00:26,100 classification and random forest, because here that's the same, 11 00:00:26,100 --> 00:00:28,266 well, we'll just close them for now. 12 00:00:28,266 --> 00:00:28,966 All right. 13 00:00:28,966 --> 00:00:32,366 And we will reopen them after we get the best accuracy 14 00:00:32,366 --> 00:00:34,633 from these five models, okay? 15 00:00:34,633 --> 00:00:35,966 So we'll get the best from these five. 16 00:00:35,966 --> 00:00:37,833 Then we'll run the last two: 17 00:00:37,833 --> 00:00:40,700 decision tree classification and random forest classification. 18 00:00:40,700 --> 00:00:42,133 And we'll see which one wins. 19 00:00:42,133 --> 00:00:43,433 Which one is the big winner. 20 00:00:43,433 --> 00:00:44,266 All right. 21 00:00:44,266 --> 00:00:47,733 So now all our implementations are ready. 22 00:00:47,733 --> 00:00:51,600 So of course the next natural step now is to run 23 00:00:51,600 --> 00:00:55,400 all these cells to get all the accuracies of these five models. 24 00:00:55,466 --> 00:00:58,200 All right, so let's start with logistic regression. 25 00:00:58,200 --> 00:00:58,966 Are you ready?
26 00:00:58,966 --> 00:01:03,866 Let's click Runtime and then Run all, and all the cells will be running. 27 00:01:04,133 --> 00:01:06,300 And we shouldn't get any error. Indeed. 28 00:01:06,300 --> 00:01:07,633 And we get, wow. 29 00:01:07,633 --> 00:01:11,400 We start with a very good accuracy because we get an accuracy 30 00:01:11,400 --> 00:01:15,100 close to 95% for logistic regression. 31 00:01:15,100 --> 00:01:15,500 All right. 32 00:01:15,500 --> 00:01:15,800 And indeed 33 00:01:15,800 --> 00:01:20,400 we only have four plus five equals nine errors, nine incorrect predictions. 34 00:01:20,666 --> 00:01:21,466 Well, pretty good. 35 00:01:21,466 --> 00:01:24,433 So let's see what we get with the next ones. You know, 36 00:01:24,433 --> 00:01:26,700 it's really reassuring that we get these high accuracies 37 00:01:26,700 --> 00:01:29,100 because we're doing predictions for breast cancer. 38 00:01:29,100 --> 00:01:33,200 So we really want to, you know, be accurate on predicting 39 00:01:33,200 --> 00:01:36,400 if patients have a benign or malignant tumor. 40 00:01:36,566 --> 00:01:37,066 Okay. 41 00:01:37,066 --> 00:01:39,900 So let's hope that we can do even better than this. 42 00:01:39,900 --> 00:01:41,133 All right. So I'm going to scroll back up. 43 00:01:41,133 --> 00:01:44,200 Oh no, actually, I'm going to leave that here in case we forget. 44 00:01:44,633 --> 00:01:46,200 So 0.947. 45 00:01:46,200 --> 00:01:48,733 Now let's move on to K-nearest neighbors. 46 00:01:48,733 --> 00:01:51,300 Let's click Runtime here, then Run all. 47 00:01:51,300 --> 00:01:52,366 And there we go, my friends. 48 00:01:52,366 --> 00:01:56,666 We're about to get the next accuracy, which is exactly the same one. 49 00:01:56,666 --> 00:01:59,833 I just checked that, you know, I put the right model here.
50 00:01:59,833 --> 00:02:04,500 But we have exactly the same one actually, which, you know, can totally happen 51 00:02:04,500 --> 00:02:07,500 because you just have to make nine incorrect predictions. 52 00:02:07,500 --> 00:02:10,833 You know, two classification models can make the same number of incorrect 53 00:02:10,833 --> 00:02:14,100 predictions, and therefore you will end up with the exact same accuracy. 54 00:02:14,400 --> 00:02:15,400 So that's very interesting. 55 00:02:15,400 --> 00:02:17,833 Actually, this is the first time I've observed this. 56 00:02:17,833 --> 00:02:18,166 All right. 57 00:02:18,166 --> 00:02:22,433 So, well, let's still hope we can beat this with our next classification models. 58 00:02:22,666 --> 00:02:25,833 So now with Support Vector Machine we're going to click Runtime. 59 00:02:25,833 --> 00:02:30,566 And we're going to click Run all to see the next accuracy we're getting. 60 00:02:30,566 --> 00:02:32,533 And, all right, interesting. 61 00:02:32,533 --> 00:02:36,766 This time we get a lower accuracy, but still a very, very good one. 62 00:02:36,766 --> 00:02:40,966 And, you know, that makes me very excited to see what kernel SVM is going to do. 63 00:02:41,100 --> 00:02:42,933 You know, with a nonlinear kernel. 64 00:02:42,933 --> 00:02:46,966 Because indeed here we get ten incorrect predictions as opposed to nine 65 00:02:46,966 --> 00:02:50,633 incorrect predictions before with logistic regression and K-nearest neighbors. 66 00:02:50,933 --> 00:02:53,566 But here with SVM, it's still very, very good. 67 00:02:53,566 --> 00:02:56,500 We get 94% accuracy. 68 00:02:56,500 --> 00:02:58,566 All right. And now let's try kernel SVM. 69 00:02:58,566 --> 00:03:00,600 I look forward to seeing what we're going to get. 70 00:03:00,600 --> 00:03:02,066 So let's click Runtime. 71 00:03:02,066 --> 00:03:03,800 And let's click Run all. 72 00:03:03,800 --> 00:03:07,433 And the accuracy is... yes, we beat it: 73 00:03:07,800 --> 00:03:11,400 95%, 95.3%.
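The point that two models with the same number of incorrect predictions end up with the same accuracy can be checked with a few lines of arithmetic. This is a minimal sketch, not the notebook's code: the function name `accuracy_from_errors` is hypothetical, and the test-set size of 171 is an assumption inferred from the reported figures (nine errors giving 0.947), not something stated in the video.

```python
# Hypothetical helper: accuracy is the share of correct predictions.
def accuracy_from_errors(n_errors: int, n_test: int) -> float:
    return (n_test - n_errors) / n_test

# ASSUMPTION: 171 test samples, inferred from "nine errors -> 0.947".
N_TEST = 171

# Error counts as stated in the video for the first models.
errors = {
    "Logistic Regression": 9,   # four plus five incorrect predictions
    "K-Nearest Neighbors": 9,   # same error count, hence same accuracy
    "SVM": 10,
}

for name, n_errors in errors.items():
    print(f"{name}: {accuracy_from_errors(n_errors, N_TEST):.3f}")
```

Because accuracy depends only on the count of errors and not on which samples were misclassified, logistic regression and K-nearest neighbors report identical values here even if they get different patients wrong.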
74 00:03:11,400 --> 00:03:12,066 That's excellent. 75 00:03:12,066 --> 00:03:14,500 And that was actually expected. Kernel SVM, 76 00:03:14,500 --> 00:03:16,333 you know, is really, really good. 77 00:03:16,333 --> 00:03:18,166 You will get good results with this because, you know, 78 00:03:18,166 --> 00:03:21,900 we get flexibility on the curve to catch the correct predictions. 79 00:03:22,300 --> 00:03:23,400 All right. So very, very good. 80 00:03:23,400 --> 00:03:26,466 But we still have three other classification models. 81 00:03:26,466 --> 00:03:29,866 Let's see what we're going to get with them, starting with Naive Bayes. 82 00:03:30,366 --> 00:03:30,833 All right. 83 00:03:30,833 --> 00:03:33,833 So let's click Runtime and then Run all, 84 00:03:33,900 --> 00:03:37,033 and the next accuracy is... okay. 85 00:03:37,033 --> 00:03:43,200 So, like SVM, ten incorrect predictions, resulting in an accuracy of 94%. 86 00:03:43,366 --> 00:03:44,133 All right. 87 00:03:44,133 --> 00:03:45,133 That's okay. 88 00:03:45,133 --> 00:03:47,933 And now, well, we still have two more chances. 89 00:03:47,933 --> 00:03:49,666 One with decision tree classification 90 00:03:49,666 --> 00:03:52,166 and the other one with random forest classification. 91 00:03:52,166 --> 00:03:55,333 So now what we're going to do is we're going to click Runtime 92 00:03:55,333 --> 00:03:58,333 here, then Manage sessions. 93 00:03:58,433 --> 00:04:01,100 Then we're going to terminate all these sessions here. 94 00:04:01,100 --> 00:04:06,700 Because, you know, we're only allowed to run a maximum of five sessions at the same time. 95 00:04:07,033 --> 00:04:09,000 So I terminated all of them. 96 00:04:09,000 --> 00:04:11,566 You can close it now. And we still keep the accuracy. 97 00:04:11,566 --> 00:04:12,833 So that's totally fine, right? 98 00:04:12,833 --> 00:04:14,866 We keep the accuracy everywhere here. 99 00:04:14,866 --> 00:04:17,633 So we can totally compare with our last two.
100 00:04:17,633 --> 00:04:18,566 So let's do this. 101 00:04:18,566 --> 00:04:21,866 Let's first open, you know, random forest classification, because, 102 00:04:22,500 --> 00:04:25,500 you know, it gives them in that order. 103 00:04:25,666 --> 00:04:27,266 Well, actually, that doesn't really matter here. 104 00:04:27,266 --> 00:04:30,900 But anyway, let's open decision tree classification now. 105 00:04:31,466 --> 00:04:32,233 All right. 106 00:04:32,233 --> 00:04:34,166 And here we go. 107 00:04:34,166 --> 00:04:36,933 We have our last two models. I can't wait to try them 108 00:04:36,933 --> 00:04:39,266 because I can't wait to see who is going to be the big winner, 109 00:04:39,266 --> 00:04:44,233 and if we can even beat that best accuracy of 95.3%. 110 00:04:44,700 --> 00:04:45,066 All right. 111 00:04:45,066 --> 00:04:47,433 So the next step is not to click Runtime here, 112 00:04:47,433 --> 00:04:51,366 because, remember, we haven't yet uploaded the dataset into the notebook. 113 00:04:51,566 --> 00:04:54,566 So no need for refresh here. All good. Upload, 114 00:04:54,800 --> 00:04:57,700 then data.csv, Open. 115 00:04:57,700 --> 00:05:00,800 Then let's quickly do the same for random forest classification. 116 00:05:00,800 --> 00:05:03,866 But first, let's not forget to replace this with data.csv. 117 00:05:03,866 --> 00:05:07,366 All good. Now random forest classification: 118 00:05:07,900 --> 00:05:10,500 little folder here, then Upload, 119 00:05:10,500 --> 00:05:14,333 then data.csv, Open, then OK. 120 00:05:14,633 --> 00:05:17,933 And then let's replace this with data.csv. 121 00:05:18,600 --> 00:05:19,066 All right. 122 00:05:19,066 --> 00:05:22,400 And now, my friends, we're about to reveal the final podium. 123 00:05:22,400 --> 00:05:26,066 You know, the three best models with the three highest accuracies. 124 00:05:26,266 --> 00:05:29,433 So let's do this, starting with decision tree classification.
125 00:05:29,766 --> 00:05:32,500 So let's click Runtime here, then Run all. 126 00:05:32,500 --> 00:05:34,600 And now there we go. 127 00:05:34,600 --> 00:05:36,900 Wow, that's incredible. 128 00:05:36,900 --> 00:05:41,100 We actually beat the accuracy. I didn't... I really didn't expect this. 129 00:05:41,100 --> 00:05:44,066 Usually decision tree classification is not the winner. 130 00:05:44,066 --> 00:05:46,466 But here we have a beautiful exception to the rule. 131 00:05:46,466 --> 00:05:52,300 Indeed, we get a beautiful accuracy of almost 96%: 95.9%.
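Since the accuracies are read off one notebook at a time, it can help to collect them in one place and sort them to get the podium. The sketch below only uses the figures quoted so far in the video; the dictionary and variable names are illustrative, the 94% values are the rounded figures as spoken, and random forest is left out because its result has not been shown at this point.

```python
# Accuracies as reported in the video so far (random forest not yet run).
accuracies = {
    "Logistic Regression": 0.947,
    "K-Nearest Neighbors": 0.947,
    "SVM": 0.94,          # quoted as 94%
    "Kernel SVM": 0.953,
    "Naive Bayes": 0.94,  # quoted as 94%
    "Decision Tree": 0.959,
}

# Sort from highest to lowest accuracy; ties keep their insertion order.
ranked = sorted(accuracies.items(), key=lambda item: item[1], reverse=True)

# The provisional podium is the top three entries.
podium = ranked[:3]
for place, (name, acc) in enumerate(podium, start=1):
    print(f"{place}. {name}: {acc:.1%}")
```

With these numbers the decision tree is provisionally first and kernel SVM second; logistic regression and K-nearest neighbors are tied for third, so the order between them is arbitrary.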