1 00:00:00,166 --> 00:00:00,566 Okay. 2 00:00:00,566 --> 00:00:02,366 Good. So now we're going to test that out. 3 00:00:02,366 --> 00:00:04,600 Of course this whole implementation is over 4 00:00:04,600 --> 00:00:07,333 because all the rest comes from our different templates: 5 00:00:07,333 --> 00:00:10,100 the data preprocessing template, the data preprocessing toolkit 6 00:00:10,100 --> 00:00:12,466 and the logistic regression implementation, 7 00:00:12,466 --> 00:00:13,600 you know, for the rest. 8 00:00:13,600 --> 00:00:18,600 So now we can just run the whole code and see what happens in the end. 9 00:00:18,633 --> 00:00:21,400 So basically we are ready to run this code. 10 00:00:21,400 --> 00:00:25,866 But before that, let's not forget to upload the dataset inside this notebook. 11 00:00:26,133 --> 00:00:28,066 So let's click this button. 12 00:00:28,066 --> 00:00:30,300 Then please find your Machine Learning 13 00:00:30,300 --> 00:00:33,300 A-Z folder on your machine, wherever you downloaded it. 14 00:00:33,466 --> 00:00:34,766 And then let's go inside 15 00:00:34,766 --> 00:00:36,900 Part 9, Dimensionality Reduction, 16 00:00:36,900 --> 00:00:40,833 then Principal Component Analysis, then Python. 17 00:00:41,033 --> 00:00:41,600 And there we go. 18 00:00:41,600 --> 00:00:46,033 Select our Wine dataset, click Open and click OK. 19 00:00:46,466 --> 00:00:49,333 And now you're ready to run this 20 00:00:49,333 --> 00:00:52,333 implementation with a simple Run all. 21 00:00:52,533 --> 00:00:54,400 So let's click Runtime here. 22 00:00:54,400 --> 00:00:55,800 And are you ready? 23 00:00:55,800 --> 00:00:58,600 Let's do this in 3, 2, 1. 24 00:00:58,600 --> 00:01:00,033 Go. Run all. 25 00:01:00,033 --> 00:01:00,400 All right. 26 00:01:00,400 --> 00:01:02,666 Importing the libraries and importing the dataset, 27 00:01:02,666 --> 00:01:05,533 applying the data preprocessing phase, 28 00:01:05,533 --> 00:01:06,566 then applying PCA. 
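For reference, the steps that Run all goes through can be sketched roughly as follows. This is a minimal sketch, not the notebook itself. It assumes scikit-learn is installed and uses `load_wine()` as a stand-in for the uploaded Wine file (its labels are 0, 1, 2 rather than customer segments 1, 2, 3). The split size, `random_state` values and the choice of two components are assumptions, so the numbers it prints may differ from the ones in the video.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's built-in wine data stands in for the uploaded file.
X, y = load_wine(return_X_y=True)

# Data preprocessing: split first, then scale using training statistics only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Applying PCA: keep two principal components (PC1 and PC2).
pca = PCA(n_components=2)
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)

# Training logistic regression on the two extracted features.
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)

# Evaluating on the test set.
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
print(cm)
print(accuracy)
```

Note that the scaler and the PCA object are fitted on the training set only and then applied to the test set, so the test observations stay unseen until prediction.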
29 00:01:06,566 --> 00:01:08,466 Well, it goes way faster than me. 30 00:01:08,466 --> 00:01:10,233 And what do we end up with? 31 00:01:10,233 --> 00:01:15,833 Well, we actually end up with an amazing accuracy of 97%. 32 00:01:16,200 --> 00:01:19,600 So this business owner, you know, of this wine shop definitely 33 00:01:19,600 --> 00:01:24,133 had a great intuition with this idea of applying dimensionality reduction. 34 00:01:24,133 --> 00:01:24,900 Because, you know, 35 00:01:24,900 --> 00:01:29,066 dimensionality reduction doesn't only consist of reducing the complexity 36 00:01:29,066 --> 00:01:32,266 of your dataset; it can also improve the final results, you know, 37 00:01:32,266 --> 00:01:36,066 by combining dimensionality reduction and your predictive model. 38 00:01:36,300 --> 00:01:38,133 And here this is exactly what happens. 39 00:01:38,133 --> 00:01:44,000 We get an amazing accuracy of 97% with actually only one incorrect prediction. 40 00:01:44,266 --> 00:01:46,300 Right. And here, by the way, this is interesting. 41 00:01:46,300 --> 00:01:50,333 This is the first time you see a confusion matrix of three rows and three columns. 42 00:01:50,333 --> 00:01:53,100 And that's of course because this time we have three classes, right? 43 00:01:53,100 --> 00:01:55,900 We have three customer segments: one, two and three. 44 00:01:55,900 --> 00:01:57,866 Therefore we have three classes to predict. 45 00:01:57,866 --> 00:02:01,800 And so this is the number of correct predictions of customer segment one. 46 00:02:02,000 --> 00:02:05,000 This is the number of correct predictions of customer segment two, 47 00:02:05,100 --> 00:02:07,900 and this is the number of correct predictions of customer segment three. 48 00:02:07,900 --> 00:02:12,133 And this is the number of incorrect predictions of customer segment one. 49 00:02:12,333 --> 00:02:12,666 Right. 50 00:02:12,666 --> 00:02:15,600 So anyway, only one incorrect prediction in total. 
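To make the reading of a three-by-three confusion matrix concrete, here is a small sketch in plain Python. The counts in `cm` are illustrative placeholders, chosen only to be consistent with one incorrect prediction and roughly 97% accuracy; they are not copied from the notebook output. The helper names are hypothetical, and the layout assumes the scikit-learn convention where rows are the real classes and columns are the predicted classes.

```python
def accuracy_from_confusion_matrix(cm):
    """Correct predictions sit on the diagonal; accuracy is diagonal / total."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def misclassifications(cm):
    """List (real segment, predicted segment, count) for off-diagonal cells."""
    return [
        (actual + 1, predicted + 1, count)
        for actual, row in enumerate(cm)
        for predicted, count in enumerate(row)
        if actual != predicted and count > 0
    ]


# Illustrative counts only (assumption): rows = real segment, columns = predicted.
cm = [
    [14, 0, 0],   # segment 1 wines
    [1, 15, 0],   # segment 2 wines, one of them predicted as segment 1
    [0, 0, 6],    # segment 3 wines
]

accuracy = accuracy_from_confusion_matrix(cm)
errors = misclassifications(cm)
print(f"{accuracy:.1%}")  # 35 correct out of 36 -> 97.2%
print(errors)             # [(2, 1, 1)]
```

With these placeholder counts, 35 of the 36 predictions are on the diagonal, and the single off-diagonal cell is a segment two wine predicted as segment one.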
51 00:02:15,600 --> 00:02:18,600 And that's why we end up with such a great accuracy. 52 00:02:18,833 --> 00:02:22,133 And so now we should see amazing results on the graphs, 53 00:02:22,133 --> 00:02:25,166 you know, first with visualizing the training set results. 54 00:02:25,433 --> 00:02:29,466 And indeed, well, you know, these are our two principal components, 55 00:02:29,666 --> 00:02:33,733 PC1 on the x axis and PC2 on the y axis. 56 00:02:34,200 --> 00:02:37,200 And then these are, you know, the different prediction regions. 57 00:02:37,233 --> 00:02:42,433 This is the prediction region of class number three, where each wine for which 58 00:02:42,433 --> 00:02:46,233 the extracted coordinates PC1 and PC2 are in this blue 59 00:02:46,233 --> 00:02:49,933 region will be predicted to belong to customer segment number three. 60 00:02:50,333 --> 00:02:53,733 Then this is the prediction region of customer segment number two, 61 00:02:54,033 --> 00:02:59,133 where all the wines for which the PC1 and PC2 extracted features 62 00:02:59,366 --> 00:03:03,000 fall in this green region will be predicted to belong to customer 63 00:03:03,000 --> 00:03:04,066 segment number two. 64 00:03:04,066 --> 00:03:09,033 And finally, this red prediction region corresponds to customer segment number 65 00:03:09,033 --> 00:03:14,733 one, where all the wines for which the extracted features PC1 and PC2 66 00:03:14,900 --> 00:03:19,233 fall in this region will be predicted to belong to customer segment number one. 67 00:03:19,666 --> 00:03:21,766 All right. And then of course all the dots, 68 00:03:21,766 --> 00:03:23,066 you know, the green dots here, 69 00:03:23,066 --> 00:03:27,333 the blue dots here and the red dots here, are the real observations, 70 00:03:27,333 --> 00:03:31,000 you know, the real wines themselves of the training set. 71 00:03:31,000 --> 00:03:34,133 And so here we can indeed see that we have a few incorrect predictions. 
72 00:03:34,133 --> 00:03:36,100 But that's only on the training set. 73 00:03:36,100 --> 00:03:40,366 For example, this is a green wine, meaning a wine that belongs to customer 74 00:03:40,366 --> 00:03:41,400 segment number two, 75 00:03:41,400 --> 00:03:45,833 which was predicted by the model to belong to customer segment number three. 76 00:03:46,166 --> 00:03:49,666 And these are two other incorrect predictions of wines 77 00:03:49,666 --> 00:03:52,333 that belong in reality to customer segment number two, 78 00:03:52,333 --> 00:03:56,000 but were predicted by the model to belong to customer segment number one. 79 00:03:56,466 --> 00:03:56,866 All right. 80 00:03:56,866 --> 00:03:58,500 And then another incorrect prediction here. 81 00:03:58,500 --> 00:04:00,533 But to look at the incorrect predictions, 82 00:04:00,533 --> 00:04:03,533 it's more interesting to check them out on the test set. 83 00:04:03,600 --> 00:04:04,800 And here is the test set. 84 00:04:04,800 --> 00:04:07,633 And indeed, even on new observations, 85 00:04:07,633 --> 00:04:10,900 well, our logistic regression model combined with dimensionality 86 00:04:10,900 --> 00:04:14,800 reduction was able to separate the three classes very well. 87 00:04:15,033 --> 00:04:19,333 And we can see very well that one incorrect prediction that we saw here 88 00:04:19,333 --> 00:04:23,566 in the confusion matrix, which is, you know, right here. 89 00:04:23,633 --> 00:04:27,600 That's the incorrect prediction, and it corresponds to a green wine, 90 00:04:27,600 --> 00:04:31,166 meaning a wine that in reality belongs to customer segment number two, 91 00:04:31,333 --> 00:04:35,166 but was predicted by the model to belong to customer segment number one. 92 00:04:35,533 --> 00:04:36,300 That's okay. 93 00:04:36,300 --> 00:04:40,066 You know, any business owner or data scientist that ends up with such 94 00:04:40,066 --> 00:04:43,533 a result, with only one incorrect prediction, can just be super happy. 
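The prediction regions and dots described above can be reproduced with a sketch along these lines. It is not the notebook's plotting cell. It assumes scikit-learn, NumPy and matplotlib are installed, uses `load_wine()` as a stand-in for the uploaded Wine file (labels 0, 1, 2 are shown as segments 1, 2, 3), and picks the grid step and padding arbitrarily. The idea is to predict a class for every point of a fine grid over the PC1/PC2 plane, color the grid by predicted class, then overlay the real test wines.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: built-in wine data stands in for the uploaded file.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale, then project onto two principal components (fit on training data only).
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=2).fit(scaler.transform(X_train))
Z_train = pca.transform(scaler.transform(X_train))
Z_test = pca.transform(scaler.transform(X_test))

classifier = LogisticRegression(random_state=0).fit(Z_train, y_train)

# Build a grid covering the test points, with some padding around them.
pad, step = 1.0, 0.02
pc1 = np.arange(Z_test[:, 0].min() - pad, Z_test[:, 0].max() + pad, step)
pc2 = np.arange(Z_test[:, 1].min() - pad, Z_test[:, 1].max() + pad, step)
xx, yy = np.meshgrid(pc1, pc2)

# Predict a class for every grid point: these are the prediction regions.
grid = np.column_stack([xx.ravel(), yy.ravel()])
regions = classifier.predict(grid).reshape(xx.shape)

# Red, green and blue for segments one, two and three, as in the video.
colors = ["red", "green", "blue"]
plt.contourf(xx, yy, regions, levels=[-0.5, 0.5, 1.5, 2.5],
             colors=colors, alpha=0.3)

# Overlay the real test wines, colored by their true segment.
for label, color in enumerate(colors):
    mask = y_test == label
    plt.scatter(Z_test[mask, 0], Z_test[mask, 1], color=color,
                edgecolor="black", label=f"Segment {label + 1}")

plt.xlabel("PC1")
plt.ylabel("PC2")
plt.legend()
plt.show()
```

A dot whose color differs from the region it sits in is an incorrect prediction, which is how the misclassified wine is spotted on the graph.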
95 00:04:44,033 --> 00:04:47,333 But let's see if we can beat that with our other 96 00:04:47,333 --> 00:04:51,133 dimensionality reduction techniques, like linear discriminant analysis. 97 00:04:51,500 --> 00:04:56,300 And you know, to beat that, well, we'll simply have to get 100% accuracy. 98 00:04:56,633 --> 00:04:59,866 So we'll see if, you know, the linear discriminants, 99 00:04:59,866 --> 00:05:03,833 which are the extracted features of the LDA technique, can build, you know, 100 00:05:03,833 --> 00:05:08,400 a prediction boundary that can separate, well, all these three classes. 101 00:05:08,633 --> 00:05:10,033 So this will be quite challenging, 102 00:05:10,033 --> 00:05:11,533 but this is doable. 103 00:05:11,533 --> 00:05:15,233 And then we'll check if we can also do the same with kernel PCA, 104 00:05:15,333 --> 00:05:18,366 which is our final dimensionality reduction technique. 105 00:05:18,966 --> 00:05:19,500 All right. 106 00:05:19,500 --> 00:05:23,266 So as soon as you're ready for the next technique, well, I'll be super 107 00:05:23,266 --> 00:05:27,333 happy to join you in the next practical activity to implement LDA. 108 00:05:27,633 --> 00:05:29,700 And until then, enjoy machine learning.