1 00:00:00,166 --> 00:00:02,100 Hello my friends, and welcome to the 2 00:00:02,100 --> 00:00:04,266 final practical activity of 3 00:00:04,266 --> 00:00:07,266 this Part 9, Dimensionality Reduction. 4 00:00:07,333 --> 00:00:08,966 We already built two 5 00:00:08,966 --> 00:00:10,166 dimensionality reduction 6 00:00:10,166 --> 00:00:12,433 models: first, principal 7 00:00:12,433 --> 00:00:16,033 component analysis, and then second, linear discriminant 8 00:00:16,033 --> 00:00:17,133 analysis. 9 00:00:17,133 --> 00:00:18,333 We got amazing 10 00:00:18,333 --> 00:00:22,333 results with both, but slightly better results, and actually perfect results, 11 00:00:22,333 --> 00:00:24,733 with linear discriminant analysis. 12 00:00:24,733 --> 00:00:27,300 So now we're hoping that with our third tool of 13 00:00:27,300 --> 00:00:29,066 the dimensionality reduction 14 00:00:29,066 --> 00:00:32,133 toolkit we get at least the same thing as PCA 15 00:00:32,133 --> 00:00:35,133 or that same perfect result as with LDA. 16 00:00:35,400 --> 00:00:37,000 And you might guess that since 17 00:00:37,000 --> 00:00:40,900 now we're about to add a kernel, and as we saw with SVM 18 00:00:40,900 --> 00:00:44,000 and kernel SVM, adding a kernel usually improves the result, 19 00:00:44,166 --> 00:00:45,900 well, you might guess that we are about to 20 00:00:45,900 --> 00:00:48,000 get amazing results as well. 21 00:00:48,000 --> 00:00:48,566 All right. 22 00:00:48,566 --> 00:00:50,866 So let's start. Let's build that final model. 23 00:00:50,866 --> 00:00:54,033 But before this, let's make sure everyone here is on the same page. 24 00:00:54,033 --> 00:00:57,166 I gave you the link to this whole folder right before this tutorial, 25 00:00:57,166 --> 00:00:58,666 so make sure to connect to it. 26 00:00:58,666 --> 00:00:59,766 And now here we go. 27 00:00:59,766 --> 00:01:01,100 Let's enter Part 28 00:01:01,100 --> 00:01:01,800 9 and then 
29 00:01:01,800 --> 00:01:04,800 Section 45, Kernel PCA. 30 00:01:05,100 --> 00:01:07,266 And as usual we're going to start with Python. 31 00:01:07,266 --> 00:01:10,033 And this Python folder contains two files: 32 00:01:10,033 --> 00:01:11,166 first the kernel 33 00:01:11,166 --> 00:01:14,166 PCA implementation in the .ipynb format, 34 00:01:14,166 --> 00:01:16,233 and of course the same data set, 35 00:01:16,233 --> 00:01:19,700 Wine.csv, which is a data set of 36 00:01:19,700 --> 00:01:21,233 many wines, many different 37 00:01:21,233 --> 00:01:24,200 wines. Each row corresponds to a wine. 38 00:01:24,200 --> 00:01:25,333 And for each of these wines, 39 00:01:25,333 --> 00:01:29,433 we have these features, from the alcohol level to the proline. 40 00:01:29,700 --> 00:01:30,733 And then for each 41 00:01:30,733 --> 00:01:33,633 of these wines, we also have the customer segment, 42 00:01:33,633 --> 00:01:36,733 which is the segment of customers to which 43 00:01:37,000 --> 00:01:40,800 the wine belongs, in the sense that all the customers of each 44 00:01:40,800 --> 00:01:43,200 segment, and we have three segments in total, 45 00:01:43,200 --> 00:01:46,133 have the same preference for such wines. 46 00:01:46,133 --> 00:01:46,833 Okay. 47 00:01:46,833 --> 00:01:48,400 And so now the challenge is 48 00:01:48,400 --> 00:01:50,100 to build a logistic regression 49 00:01:50,100 --> 00:01:53,166 model combined with some dimensionality reduction technique 50 00:01:53,166 --> 00:01:55,000 applied to this data set, so 51 00:01:55,000 --> 00:01:56,133 that we can end up with 52 00:01:56,133 --> 00:01:58,300 a less complex data set, which 53 00:01:58,300 --> 00:02:01,333 at the same time will provide an excellent way 54 00:02:01,333 --> 00:02:04,000 for the logistic regression model to learn 55 00:02:04,000 --> 00:02:07,233 the correlations between all these features and 56 00:02:07,566 --> 00:02:09,366 the dependent variable. 
57 00:02:09,366 --> 00:02:09,833 All right. 58 00:02:09,833 --> 00:02:11,100 And then for each new wine, we 59 00:02:11,100 --> 00:02:12,133 will deploy this 60 00:02:12,133 --> 00:02:13,166 predictive model 61 00:02:13,166 --> 00:02:16,200 to predict the customer segment to which 62 00:02:16,200 --> 00:02:17,666 this wine belongs, 63 00:02:17,666 --> 00:02:18,666 so that the owner of 64 00:02:18,666 --> 00:02:20,700 the wine shop can recommend 65 00:02:20,700 --> 00:02:23,466 each new wine to the right customer. 66 00:02:23,466 --> 00:02:23,833 All right. 67 00:02:23,833 --> 00:02:26,200 So that's the exact same case as before. 68 00:02:26,200 --> 00:02:28,933 And now let's open this implementation 69 00:02:28,933 --> 00:02:31,400 with either Google Colaboratory or 70 00:02:31,400 --> 00:02:32,766 Jupyter Notebook. 71 00:02:32,766 --> 00:02:35,500 As you can see, I kept this PCA 72 00:02:35,500 --> 00:02:39,266 implementation and this LDA implementation so that we can compare. 73 00:02:39,633 --> 00:02:40,400 And now, 74 00:02:40,400 --> 00:02:41,400 well, as usual this 75 00:02:41,400 --> 00:02:43,433 implementation is in read-only mode 76 00:02:43,433 --> 00:02:45,000 because you all have access to it. 77 00:02:45,000 --> 00:02:46,866 So let's create a copy by 78 00:02:46,866 --> 00:02:48,000 clicking File here, 79 00:02:48,000 --> 00:02:50,800 then Save a copy in Drive. Because indeed in this 80 00:02:50,800 --> 00:02:51,566 copy we will 81 00:02:51,566 --> 00:02:55,633 re-implement the cell that implements kernel PCA. 82 00:02:56,133 --> 00:02:58,933 Let's get rid of this so that we can clearly have the three 83 00:02:58,933 --> 00:03:00,933 dimensionality reduction techniques. 84 00:03:00,933 --> 00:03:03,166 And now there we go. Let's implement 85 00:03:03,166 --> 00:03:04,166 kernel PCA. 86 00:03:04,166 --> 00:03:07,866 But this time let's upload the data set first so that 87 00:03:07,866 --> 00:03:08,733 we can, you know, 
88 00:03:08,733 --> 00:03:11,233 get the assistance of Google Colab. 89 00:03:11,233 --> 00:03:13,400 Right now our notebook is connecting to a runtime. 90 00:03:13,400 --> 00:03:15,766 There we go. Then let's click Upload. 91 00:03:15,766 --> 00:03:16,266 We'll end 92 00:03:16,266 --> 00:03:19,166 up in the linear discriminant analysis folder, 93 00:03:19,166 --> 00:03:21,800 so let's just do the whole path again. 94 00:03:21,800 --> 00:03:24,666 So this is the whole Machine Learning A-Z folder. 95 00:03:24,666 --> 00:03:28,900 Let's go inside, and let's go to Part 9, Dimensionality Reduction, and Section 96 00:03:28,900 --> 00:03:30,700 45, Kernel PCA, 97 00:03:30,700 --> 00:03:32,500 Python, and Wine dot 98 00:03:32,500 --> 00:03:34,266 CSV. Open. 99 00:03:34,266 --> 00:03:36,233 Okay, and there we go. 100 00:03:36,233 --> 00:03:39,466 Now our notebook is connected. Okay. 101 00:03:39,466 --> 00:03:40,766 So now we're going to do two things. 102 00:03:40,766 --> 00:03:42,966 First we're going to remove that 103 00:03:42,966 --> 00:03:43,633 cell, 104 00:03:43,633 --> 00:03:46,733 you know, put it in the bin so that we can re-implement it. 105 00:03:46,733 --> 00:03:47,900 But also let's, 106 00:03:47,900 --> 00:03:49,833 you know, remove all the outputs, 107 00:03:49,833 --> 00:03:51,133 but try not to look at them. 108 00:03:51,133 --> 00:03:51,833 You know, 109 00:03:51,833 --> 00:03:54,166 I hope you didn't look at the results. 110 00:03:54,166 --> 00:03:56,600 But anyway, I'm sure you expect 111 00:03:56,600 --> 00:03:58,266 an amazing result as well. 112 00:03:58,266 --> 00:04:00,900 So let's just remove the output here. 113 00:04:00,900 --> 00:04:03,466 That's the visualization of the training set results. 114 00:04:03,466 --> 00:04:06,466 And this one, the visualization of the test set results. 115 00:04:06,866 --> 00:04:09,400 All right, then let's take the table of 116 00:04:09,400 --> 00:04:11,833 contents: Applying Kernel PCA. 
117 00:04:11,833 --> 00:04:12,833 And there we go. 118 00:04:12,833 --> 00:04:15,066 We are ready to implement this. 119 00:04:15,066 --> 00:04:16,900 So let's create a new code cell. 120 00:04:16,900 --> 00:04:22,400 And now, do we want to re-implement this, you know, from scratch, 121 00:04:22,500 --> 00:04:24,633 or do we want to be efficient? 122 00:04:24,633 --> 00:04:25,400 And well, 123 00:04:25,400 --> 00:04:27,766 of course that's really my spirit as a coder. 124 00:04:27,766 --> 00:04:30,600 As a machine learning programmer, I always want to be 125 00:04:30,600 --> 00:04:31,400 efficient. 126 00:04:31,400 --> 00:04:33,300 And by that I mean that, you know, 127 00:04:33,300 --> 00:04:34,566 the kernel PCA 128 00:04:34,566 --> 00:04:35,833 implementation is 129 00:04:35,833 --> 00:04:39,233 super close to the PCA implementation, 130 00:04:39,466 --> 00:04:42,066 because basically it will be almost the same, except 131 00:04:42,066 --> 00:04:43,300 that we will have to add 132 00:04:43,300 --> 00:04:45,700 a kernel as one of the inputs. 133 00:04:45,700 --> 00:04:47,800 So what we're going to do, you know, in that spirit 134 00:04:47,800 --> 00:04:51,333 of efficiency, is we will go to our PCA 135 00:04:51,333 --> 00:04:54,233 implementation. We will copy that cell, 136 00:04:54,233 --> 00:04:58,033 because you're going to see that it's going to be almost the same. 137 00:04:58,366 --> 00:05:00,833 So let's paste it here. 138 00:05:00,833 --> 00:05:03,633 And now the only thing that we have to change 139 00:05:03,633 --> 00:05:05,900 is first the name of the class, 140 00:05:05,900 --> 00:05:08,566 but not the module, because the class we're about 141 00:05:08,566 --> 00:05:10,833 to import to implement kernel PCA 142 00:05:10,833 --> 00:05:13,400 still belongs to this decomposition module 143 00:05:13,400 --> 00:05:14,966 of the scikit-learn library. 144 00:05:14,966 --> 00:05:20,400 And that class is of course KernelPCA, just like that. 
145 00:05:20,666 --> 00:05:22,833 So that's the class. Then, 146 00:05:22,833 --> 00:05:25,100 well, let's give a different name to the object. 147 00:05:25,100 --> 00:05:27,766 We're not going to call it pca, but we can call it, you know, 148 00:05:27,766 --> 00:05:30,500 kpca, as you want, you know. 149 00:05:30,500 --> 00:05:32,400 Then of course here, when we call the class to 150 00:05:32,400 --> 00:05:34,133 create an instance of 151 00:05:34,133 --> 00:05:37,200 this object, which will be this kpca variable, well, 152 00:05:37,233 --> 00:05:37,700 of course we 153 00:05:37,700 --> 00:05:41,100 need to call the right class, which is KernelPCA. 154 00:05:41,733 --> 00:05:44,133 And now inside this class, well, same, 155 00:05:44,133 --> 00:05:45,666 we have to choose a number of 156 00:05:45,666 --> 00:05:47,333 extracted features, which is 157 00:05:47,333 --> 00:05:50,066 still given by this argument n_components. 158 00:05:50,066 --> 00:05:54,300 But since now we're working with a kernel, you know, we're doing kernel PCA, 159 00:05:54,533 --> 00:05:55,800 well, exactly the 160 00:05:55,800 --> 00:05:57,633 same as when we transitioned from 161 00:05:57,633 --> 00:05:59,466 SVM to kernel SVM, 162 00:05:59,466 --> 00:06:01,233 well, we simply need to add a 163 00:06:01,233 --> 00:06:03,833 kernel argument here, and we'll actually 164 00:06:03,833 --> 00:06:04,366 choose 165 00:06:04,366 --> 00:06:07,300 the same kernel as with kernel SVM, meaning the 166 00:06:07,300 --> 00:06:09,866 RBF kernel, which is the radial basis 167 00:06:09,866 --> 00:06:11,100 function kernel. 168 00:06:11,100 --> 00:06:12,433 So there we go. That's our 169 00:06:12,433 --> 00:06:14,633 second argument here: kernel, 170 00:06:16,000 --> 00:06:17,100 in quotes, 171 00:06:17,100 --> 00:06:20,700 well, rbf, radial basis function. 172 00:06:21,366 --> 00:06:24,466 And now let's see what there is left to change. 
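[Editor's sketch] The cell described in this part of the video might look roughly like the following. This is a minimal sketch, not the course notebook itself: it assumes the variable names X_train and X_test carried over from the copied PCA cell, and it uses scikit-learn's built-in load_wine data as a stand-in for Wine.csv so that it runs on its own. The split ratio, random_state and scaling step are likewise assumptions.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: load_wine stands in for Wine.csv (13 features, 3 classes).
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Assumption: features are standardized before the reduction step.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Same shape as the PCA cell: only the class name, the object name
# and the extra kernel argument change.
kpca = KernelPCA(n_components=2, kernel='rbf')
X_train = kpca.fit_transform(X_train)  # unsupervised: y_train is not passed
X_test = kpca.transform(X_test)        # reuse the mapping fitted on the training set

print(X_train.shape, X_test.shape)
```

Only the features are passed to fit_transform, which is what separates this cell from the LDA one; the test set is transformed with the mapping learned on the training set rather than refitted.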
173 00:06:24,600 --> 00:06:25,833 So this line is good. 174 00:06:25,833 --> 00:06:27,000 The next line of code, 175 00:06:27,000 --> 00:06:29,033 well, same: in order to perform 176 00:06:29,033 --> 00:06:30,300 the kernel PCA 177 00:06:30,300 --> 00:06:32,166 dimensionality reduction technique, 178 00:06:32,166 --> 00:06:35,266 well, we only need the features of X_train 179 00:06:35,300 --> 00:06:37,500 and not the dependent variable y_train. 180 00:06:37,500 --> 00:06:38,200 So all good. 181 00:06:38,200 --> 00:06:42,666 You know, that's the same as PCA but not the same as LDA, which required 182 00:06:42,833 --> 00:06:44,500 the dependent variable y_train. 183 00:06:44,500 --> 00:06:47,000 So all good here. However, be careful: we 184 00:06:47,000 --> 00:06:48,733 renamed our object, not 185 00:06:48,733 --> 00:06:51,100 pca but kpca. 186 00:06:51,100 --> 00:06:53,566 So same here: kpca. 187 00:06:53,566 --> 00:06:56,800 And now my friends, this implementation is over. 188 00:06:57,100 --> 00:06:58,033 That's what happens, 189 00:06:58,033 --> 00:06:59,633 you know, when we're being efficient: 190 00:06:59,633 --> 00:07:03,066 the implementation is sometimes completed faster than expected. 191 00:07:03,400 --> 00:07:03,766 And that's 192 00:07:03,766 --> 00:07:06,366 because, as you can see, kernel PCA is very 193 00:07:06,366 --> 00:07:07,933 similar to PCA, 194 00:07:07,933 --> 00:07:10,466 you know, in terms of its implementation. 195 00:07:10,466 --> 00:07:12,033 Okay. So now we're 196 00:07:12,033 --> 00:07:14,233 just ready to run all again. 197 00:07:14,233 --> 00:07:15,866 We have our data set. 198 00:07:15,866 --> 00:07:17,966 Our implementation is all good. 199 00:07:17,966 --> 00:07:19,100 So let's do this. 200 00:07:19,100 --> 00:07:21,400 Let's click Runtime here, 201 00:07:21,400 --> 00:07:22,233 and then, 202 00:07:22,233 --> 00:07:25,233 three, two, one, Run all. Go. 203 00:07:25,566 --> 00:07:26,200 All right. So now 
204 00:07:26,200 --> 00:07:27,400 all the cells are running. 205 00:07:27,400 --> 00:07:29,366 Our logistic regression model is built, 206 00:07:29,366 --> 00:07:30,666 and, 207 00:07:30,666 --> 00:07:34,166 as expected, well, we get of course an accuracy of 208 00:07:34,166 --> 00:07:37,566 100%. I've really seen some cases 209 00:07:37,566 --> 00:07:40,600 where, you know, the non-kernel version of the model 210 00:07:40,633 --> 00:07:43,333 beats the kernel version of the model. 211 00:07:43,333 --> 00:07:45,400 It can happen, but it's very rare. 212 00:07:45,400 --> 00:07:45,933 There you go. 213 00:07:45,933 --> 00:07:51,200 Here of course kernel PCA manages to beat the PCA model. 214 00:07:51,300 --> 00:07:52,333 Thanks to that kernel, 215 00:07:52,333 --> 00:07:55,833 well, we fixed that incorrect prediction which we had, 216 00:07:56,000 --> 00:07:58,966 remember, in PCA right here. 217 00:07:58,966 --> 00:08:01,166 So all good here, we get a one 218 00:08:01,166 --> 00:08:02,700 hundred percent accuracy. 219 00:08:02,700 --> 00:08:04,500 And now let's have a look at the results to know 220 00:08:04,500 --> 00:08:08,433 how kernel PCA was able to separate our classes 221 00:08:08,433 --> 00:08:09,066 in the test 222 00:08:09,066 --> 00:08:10,266 set, right, which 223 00:08:10,266 --> 00:08:13,266 are new observations on which the model wasn't trained. 224 00:08:13,500 --> 00:08:15,033 Well, there you go. That's our two 225 00:08:15,033 --> 00:08:18,000 principal components, PC1 and PC2. 226 00:08:18,000 --> 00:08:20,966 And now in a new dimension once again, 227 00:08:20,966 --> 00:08:23,600 you know, because the wines, the observation points 228 00:08:23,600 --> 00:08:24,533 here, are 229 00:08:24,533 --> 00:08:28,233 arranged in a different way than with PCA. 230 00:08:28,266 --> 00:08:31,100 Right? We have a very different arrangement of the points here. 231 00:08:31,100 --> 00:08:32,333 We can see them more 232 00:08:32,333 --> 00:08:35,333 dispersed 
than here, right, with our PCA. 233 00:08:35,500 --> 00:08:38,033 Well, that's because we are in a new dimension. 234 00:08:38,033 --> 00:08:40,933 We are with different dimensions, PC1, PC2, 235 00:08:40,933 --> 00:08:43,200 meaning different extracted features. 236 00:08:43,200 --> 00:08:44,933 So that's totally normal that 237 00:08:44,933 --> 00:08:47,766 our observation points, you know, the wines here, are 238 00:08:47,766 --> 00:08:49,600 arranged in a very different way. 239 00:08:49,600 --> 00:08:51,766 That's because we are in a different dimension in 240 00:08:51,766 --> 00:08:54,000 which, well, the logistic regression 241 00:08:54,000 --> 00:08:57,600 model was perfectly able to classify our 242 00:08:57,600 --> 00:08:58,800 observation points with 243 00:08:58,800 --> 00:09:01,800 these three prediction regions. 244 00:09:01,800 --> 00:09:03,566 And similar to LDA, the 245 00:09:03,566 --> 00:09:08,000 observation points are arranged differently because once again, these 246 00:09:08,000 --> 00:09:10,233 are some different dimensions. 247 00:09:10,233 --> 00:09:12,633 We are in another dimension here, 248 00:09:12,633 --> 00:09:14,833 thanks to these extracted features. 249 00:09:14,833 --> 00:09:17,300 So you see, this dimensionality reduction technique is 250 00:09:17,300 --> 00:09:17,566 pretty 251 00:09:17,566 --> 00:09:18,833 fascinating because it 252 00:09:18,833 --> 00:09:21,766 basically allows us to create a new space of 253 00:09:21,766 --> 00:09:24,300 dimensions, and in some new dimension, 254 00:09:24,300 --> 00:09:25,633 well, indeed, we can 255 00:09:25,633 --> 00:09:27,166 perfectly classify 256 00:09:27,166 --> 00:09:29,566 some observations, like it is the case for 257 00:09:29,566 --> 00:09:32,700 linear discriminant analysis and kernel PCA. 258 00:09:33,366 --> 00:09:35,100 Now what I recommend for you to 259 00:09:35,100 --> 00:09:38,100 do is to practice this on other data sets. 
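[Editor's sketch] One way to run that kind of practice comparison is sketched below. It is only an illustration under stated assumptions: the data is a synthetic two-circles set from scikit-learn's make_circles, not a data set used in the course; gamma=10 is a hand-picked value for this toy data; and the helper name accuracy_after is hypothetical.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumption: a synthetic, non-linearly separable data set (two concentric circles).
X, y = make_circles(n_samples=1000, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

def accuracy_after(reducer):
    """Fit the reducer on the training set, then score a logistic regression."""
    Z_train = reducer.fit_transform(X_train)
    Z_test = reducer.transform(X_test)
    classifier = LogisticRegression(random_state=0)
    classifier.fit(Z_train, y_train)
    return classifier.score(Z_test, y_test)

acc_pca = accuracy_after(PCA(n_components=2))
# Assumption: gamma=10 is chosen by hand for this toy data, not a general default.
acc_kpca = accuracy_after(KernelPCA(n_components=2, kernel='rbf', gamma=10))

print(f"PCA + logistic regression:        {acc_pca:.2f}")
print(f"Kernel PCA + logistic regression: {acc_kpca:.2f}")
```

On data like this, plain PCA only rotates the two features, so a linear classifier has little to work with, while the RBF kernel can unfold the circles. On real data sets the gap may be smaller, absent, or reversed, so it is worth printing both scores side by side.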
260 00:09:38,100 --> 00:09:39,666 So I recommend, for example, 261 00:09:39,666 --> 00:09:41,333 to check out the UCI 262 00:09:41,333 --> 00:09:43,966 ML Repository platform and go to 263 00:09:43,966 --> 00:09:45,000 the classification 264 00:09:45,000 --> 00:09:47,966 section and try kernel PCA on other 265 00:09:47,966 --> 00:09:49,033 data sets. And you'll 266 00:09:49,033 --> 00:09:51,100 see you may end up with similar results, 267 00:09:51,100 --> 00:09:53,866 with some prediction boundaries like that, 268 00:09:53,866 --> 00:09:55,000 separating well 269 00:09:55,000 --> 00:09:56,833 the classes. Please share your 270 00:09:56,833 --> 00:09:59,266 results in the Q&A, especially the ones where we 271 00:09:59,266 --> 00:09:59,933 clearly see 272 00:09:59,933 --> 00:10:00,900 an improvement 273 00:10:00,900 --> 00:10:03,900 with kernel PCA with respect to PCA. 274 00:10:03,966 --> 00:10:05,433 You know, maybe you'll find some data 275 00:10:05,433 --> 00:10:07,900 sets where PCA performs poorly, but 276 00:10:07,900 --> 00:10:08,433 then by 277 00:10:08,433 --> 00:10:11,933 adding a kernel with kernel PCA, you will get much better results. 278 00:10:12,066 --> 00:10:13,200 So please share this. 279 00:10:13,200 --> 00:10:16,200 I'm actually very interested to see what you get. 280 00:10:16,500 --> 00:10:18,066 All right, thanks in advance. 281 00:10:18,066 --> 00:10:19,733 And now congratulations. 282 00:10:19,733 --> 00:10:22,933 This new chapter on dimensionality reduction is done. 283 00:10:23,266 --> 00:10:24,933 And now we're going to move on to the final 284 00:10:24,933 --> 00:10:27,000 chapter of the course, Part 10, 285 00:10:27,000 --> 00:10:29,066 Model Selection and Boosting, 286 00:10:29,066 --> 00:10:31,033 where you will learn your three 287 00:10:31,033 --> 00:10:32,600 last and very important 288 00:10:32,600 --> 00:10:34,733 tools, which are first k-fold 289 00:10:34,733 --> 00:10:36,000 cross-validation to 
290 00:10:36,000 --> 00:10:38,833 evaluate your machine learning models the best way, 291 00:10:38,833 --> 00:10:41,500 then parameter tuning to find the 292 00:10:41,500 --> 00:10:44,166 best values of your hyperparameters. 293 00:10:44,166 --> 00:10:46,066 And finally, the cherry on the 294 00:10:46,066 --> 00:10:50,833 cake of this course: I will teach you and give you the XGBoost 295 00:10:50,833 --> 00:10:54,466 model, which is one of the best and most powerful machine learning 296 00:10:54,466 --> 00:10:57,000 models for regression or classification. 297 00:10:57,000 --> 00:10:59,000 That will complete your master machine 298 00:10:59,000 --> 00:11:00,800 learning toolkit the best way. 299 00:11:00,800 --> 00:11:02,833 And until then, enjoy machine learning.