1 00:00:00,200 --> 00:00:02,500 Hello and welcome to this R tutorial. 2 00:00:02,500 --> 00:00:06,933 So in the previous section, this PCA feature extraction technique 3 00:00:06,933 --> 00:00:10,166 reduced the dimensionality of our problem by extracting 4 00:00:10,366 --> 00:00:13,366 the variables that explained the most variance. 5 00:00:13,500 --> 00:00:15,566 And now in LDA this is quite different. 6 00:00:15,566 --> 00:00:18,500 We are extracting some new independent variables 7 00:00:18,500 --> 00:00:22,433 that will best separate the classes of the dependent variable. 8 00:00:22,833 --> 00:00:26,166 And therefore, since this time it considers the classes 9 00:00:26,166 --> 00:00:29,266 of the dependent variable, well, that means that it uses 10 00:00:29,266 --> 00:00:33,000 the dependent variable to carry out this feature extraction, 11 00:00:33,133 --> 00:00:38,033 and therefore that makes LDA a supervised dimensionality reduction model. 12 00:00:38,666 --> 00:00:38,966 All right. 13 00:00:38,966 --> 00:00:41,933 So now let's apply LDA in R. 14 00:00:41,933 --> 00:00:45,000 So first very quickly let's set the right folder as working directory. 15 00:00:45,000 --> 00:00:47,500 So we will go to our Machine Learning A-Z folder, 16 00:00:47,500 --> 00:00:49,433 Part 9, Dimensionality Reduction. 17 00:00:49,433 --> 00:00:53,666 And we are now in this Section 44, Linear Discriminant Analysis. 18 00:00:54,000 --> 00:00:55,133 So let's go inside. 19 00:00:55,133 --> 00:00:57,866 And that's the folder you want to set as working directory. 20 00:00:57,866 --> 00:01:00,233 We are still working on the Wine.csv file. 21 00:01:00,233 --> 00:01:02,566 So that's exactly the same business problem 22 00:01:02,566 --> 00:01:05,566 as the one we worked with when implementing PCA. 
23 00:01:05,700 --> 00:01:09,566 And so that will be a good opportunity to compare this dimensionality 24 00:01:09,566 --> 00:01:12,833 reduction technique, LDA, to the previous one, PCA. 25 00:01:13,433 --> 00:01:14,333 So now let's not forget 26 00:01:14,333 --> 00:01:17,800 to click on this More button here and Set As Working Directory. 27 00:01:18,300 --> 00:01:19,000 Perfect. 28 00:01:19,000 --> 00:01:22,600 And so since we are working on the same business problem as before, 29 00:01:22,600 --> 00:01:26,833 as for PCA, well, implementing LDA here is going to be very easy. 30 00:01:27,166 --> 00:01:31,500 We will just take the PCA code, take everything from here 31 00:01:31,733 --> 00:01:37,866 down to the bottom, copy, and we'll go back to LDA and paste the whole thing here. 32 00:01:38,366 --> 00:01:42,400 And inside of this code we will basically just need to replace 33 00:01:42,600 --> 00:01:48,100 this Applying PCA section by a new section dedicated to applying LDA. 34 00:01:48,333 --> 00:01:51,400 So I'm going to remove all this section. 35 00:01:51,600 --> 00:01:54,600 So let's remove this and let's 36 00:01:54,733 --> 00:01:57,466 replace here PCA by LDA. 37 00:01:57,466 --> 00:02:00,466 And now it's time to implement LDA. 38 00:02:00,533 --> 00:02:03,300 So the package that we're going to use to apply LDA 39 00:02:03,300 --> 00:02:06,133 is the MASS package, MASS. 40 00:02:06,133 --> 00:02:10,566 And it's actually a package that is by default in your list of packages. 41 00:02:10,900 --> 00:02:13,133 It's the package right here. 42 00:02:13,133 --> 00:02:16,933 MASS, Support Functions and Datasets for Venables and Ripley's MASS. 43 00:02:17,400 --> 00:02:17,700 All right. 44 00:02:17,700 --> 00:02:19,933 So as you can see this is not imported. 45 00:02:19,933 --> 00:02:22,933 So let's import it by using the library command. 46 00:02:23,533 --> 00:02:24,266 Here we go. 47 00:02:24,266 --> 00:02:27,266 And MASS this way in parentheses. 
48 00:02:27,900 --> 00:02:28,500 All right. 49 00:02:28,500 --> 00:02:32,666 We can already select this to import the package. 50 00:02:33,000 --> 00:02:35,400 And now let's implement LDA. 51 00:02:35,400 --> 00:02:40,766 So the first thing that we have to do is, as for PCA, create an lda variable 52 00:02:40,866 --> 00:02:43,300 that we will use to transform our original data 53 00:02:43,300 --> 00:02:47,266 set into this new data set composed of the linear discriminants. 54 00:02:47,566 --> 00:02:50,933 And so we're going to call this variable lda, equals. 55 00:02:51,333 --> 00:02:54,600 And now we're going to use the lda function, as simple as that. 56 00:02:55,166 --> 00:02:57,566 And let's add some parentheses. 57 00:02:57,566 --> 00:03:00,966 And now let's see the arguments by pressing F1. 58 00:03:01,566 --> 00:03:01,866 All right. 59 00:03:01,866 --> 00:03:03,333 So the arguments are here. 60 00:03:03,333 --> 00:03:05,433 The first argument is formula. 61 00:03:05,433 --> 00:03:09,466 So that's exactly the formula of your dependent variable 62 00:03:09,466 --> 00:03:13,633 with respect to the independent variables, so far the original ones. 63 00:03:13,800 --> 00:03:16,766 So here as a first argument we will input 64 00:03:16,766 --> 00:03:20,100 formula equals Customer_Segment. 65 00:03:21,566 --> 00:03:24,066 Remember that's the name of the dependent variable. 66 00:03:24,066 --> 00:03:26,100 And tilde. 67 00:03:26,100 --> 00:03:28,333 And we can add a dot. 68 00:03:28,333 --> 00:03:31,800 Don't worry, we don't have to write all the names of the independent variables. 69 00:03:31,800 --> 00:03:33,633 The dot is here for us. 70 00:03:33,633 --> 00:03:36,633 So comma and then next argument. 71 00:03:36,933 --> 00:03:38,833 And then the next argument is data. 72 00:03:38,833 --> 00:03:43,300 And as for PCA the data here is going to be the training set. 73 00:03:43,633 --> 00:03:46,566 So let's add here training_set. 
74 00:03:46,566 --> 00:03:47,800 Here we go. 75 00:03:47,800 --> 00:03:50,333 All right. And actually that's all. 76 00:03:50,333 --> 00:03:53,966 And that's for a specific reason, a very important reason that is directly 77 00:03:53,966 --> 00:03:57,233 related to LDA. It's due to the fact 78 00:03:57,233 --> 00:04:00,666 that LDA is a supervised dimensionality reduction technique. 79 00:04:00,966 --> 00:04:02,100 Remember, supervised 80 00:04:02,100 --> 00:04:05,700 means that the LDA model takes into account the dependent variable. 81 00:04:06,200 --> 00:04:08,800 And since it takes into account the dependent variable, 82 00:04:08,800 --> 00:04:13,733 well, it's quite intuitive to understand that the number of linear discriminants 83 00:04:14,000 --> 00:04:15,166 will be related 84 00:04:15,166 --> 00:04:18,900 to the information of the dependent variable, and that information is actually 85 00:04:18,900 --> 00:04:21,666 the number of classes in the dependent variable. 86 00:04:21,666 --> 00:04:25,300 And there is this explicit relationship between the number of linear 87 00:04:25,300 --> 00:04:28,500 discriminants and the information of the dependent variable. 88 00:04:28,766 --> 00:04:31,933 It's that there will be k minus one linear 89 00:04:31,933 --> 00:04:34,933 discriminants, where k is the number of classes. 90 00:04:35,066 --> 00:04:38,866 So here, since we have three classes, that means that we'll get at most 91 00:04:38,866 --> 00:04:41,866 three minus one equals two linear discriminants. 92 00:04:42,000 --> 00:04:46,400 So here, without specifying the number of linear discriminants equal to two, 93 00:04:46,633 --> 00:04:49,866 we will automatically get two linear discriminants. 94 00:04:50,200 --> 00:04:51,600 And therefore that's all. 95 00:04:51,600 --> 00:04:54,266 We don't need to add any other argument. 
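The k minus one rule described above can be checked with a small sketch. It assumes the built-in iris data (three classes, four predictors) as a stand-in for the wine data, which is not available here; the name lda_fit is illustrative and not taken from the course code.

```r
# Illustrative sketch: iris stands in for the course's wine data (assumption).
library(MASS)  # provides lda()

# Fit LDA with the formula interface: class label ~ all other columns.
lda_fit <- lda(formula = Species ~ ., data = iris)

k <- nlevels(iris$Species)                 # number of classes
n_discriminants <- ncol(lda_fit$scaling)   # one column per linear discriminant

print(k)
print(colnames(lda_fit$scaling))
print(n_discriminants)
```

The count equals k minus one as long as there are at least k minus one predictors, which is the case both here and in the 13-variable wine problem.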
96 00:04:54,266 --> 00:04:58,866 So the lda object is ready to be used to transform 97 00:04:59,166 --> 00:05:03,600 our original data set into this new one, composed of the linear discriminants. 98 00:05:03,900 --> 00:05:08,100 We will get two linear discriminants, which is exactly what we want. 99 00:05:08,400 --> 00:05:12,133 As for PCA, we want two new extracted features, and 100 00:05:12,133 --> 00:05:16,433 this time these two new extracted features will best separate the classes. 101 00:05:16,433 --> 00:05:19,433 So we should get very good results as well. 102 00:05:19,433 --> 00:05:22,433 And then that's the same as for PCA. 103 00:05:22,600 --> 00:05:25,600 We need to transform both the training set and the test set 104 00:05:25,866 --> 00:05:28,766 so that we can then use them in the next sections. 105 00:05:28,766 --> 00:05:31,766 So we will use the training set to fit the SVM, 106 00:05:31,966 --> 00:05:34,966 well, actually fit it to the training set, to build the classifier. 107 00:05:35,100 --> 00:05:39,233 And then we will make our predictions, predicting the test results with this test 108 00:05:39,233 --> 00:05:43,133 set that will be transformed as well, and make the confusion matrix. 109 00:05:43,133 --> 00:05:47,533 And most importantly we will visualize the training set and the test set results, 110 00:05:47,766 --> 00:05:51,600 something that we'll be able to do because we now have two features. 111 00:05:52,066 --> 00:05:53,266 All right. So let's do it. 112 00:05:53,266 --> 00:05:57,233 Let's do the same to apply LDA on the training set and the test set. 113 00:05:57,500 --> 00:05:59,733 So first let's start with the training set. 114 00:05:59,733 --> 00:06:01,033 Training set. 115 00:06:01,033 --> 00:06:02,466 So remember we're keeping this 116 00:06:02,466 --> 00:06:04,866 name of the training set so that we don't have to change it 117 00:06:04,866 --> 00:06:06,333 in the rest of the sections. 118 00:06:06,333 --> 00:06:08,700 So training_set equals. 
119 00:06:08,700 --> 00:06:11,600 And then remember we need to use the predict function. 120 00:06:11,600 --> 00:06:14,266 That's actually exactly the same as for PCA. 121 00:06:14,266 --> 00:06:17,966 We will however need to add something to make it work. 122 00:06:18,133 --> 00:06:19,600 And I will explain what it is. 123 00:06:19,600 --> 00:06:23,300 But definitely we are doing this transformation with the predict function. 124 00:06:23,800 --> 00:06:25,200 So parentheses. 125 00:06:25,200 --> 00:06:29,033 And now inside this predict function we need to specify, 126 00:06:29,033 --> 00:06:31,500 you know, the first argument here, which is the object. 127 00:06:31,500 --> 00:06:35,700 So the object is lda, and then comma, and then the second argument. 128 00:06:35,700 --> 00:06:37,000 And the second argument 129 00:06:37,000 --> 00:06:39,700 is the data set on which we want to make the transformation, 130 00:06:39,700 --> 00:06:42,133 that is, extract the new features. 131 00:06:42,133 --> 00:06:44,500 And that's the training set, here it is. 132 00:06:44,500 --> 00:06:45,433 So as a reminder 133 00:06:45,433 --> 00:06:49,266 this is the original data set composed of the 13 independent variables. 134 00:06:49,533 --> 00:06:54,200 And this will be the new training set composed of the two new extracted features 135 00:06:54,466 --> 00:06:57,400 that are the two linear discriminants. 136 00:06:57,400 --> 00:06:58,000 All right. 137 00:06:58,000 --> 00:07:03,166 So we can already do that to see if we get the right training set as we expect. 138 00:07:03,566 --> 00:07:08,800 So before executing this line of course we need to execute the previous sections 139 00:07:08,800 --> 00:07:12,266 because we need to import the data set and apply data preprocessing. 140 00:07:12,600 --> 00:07:13,700 So let's do it. 141 00:07:13,700 --> 00:07:15,300 We don't have anything to change here. 
142 00:07:15,300 --> 00:07:17,166 Everything is already well prepared 143 00:07:17,166 --> 00:07:20,300 thanks to what we did in the previous section with PCA. 144 00:07:20,500 --> 00:07:22,033 So let's execute this. 145 00:07:22,033 --> 00:07:24,033 Here we go. Well executed. 146 00:07:24,033 --> 00:07:28,766 And now let's apply LDA, create the lda object 147 00:07:29,100 --> 00:07:32,500 and then use this object to transform our original training 148 00:07:32,500 --> 00:07:36,166 set into this new training set composed of the two linear discriminants. 149 00:07:36,633 --> 00:07:39,200 So we already imported the MASS package. 150 00:07:39,200 --> 00:07:42,100 So we just need to execute this line of code. 151 00:07:42,100 --> 00:07:43,200 Let's do it. 152 00:07:43,200 --> 00:07:45,833 Here we go, lda object well created. 153 00:07:45,833 --> 00:07:48,833 And now we are ready to transform the training set. 154 00:07:48,900 --> 00:07:51,433 But before we select this line and execute it, 155 00:07:51,433 --> 00:07:56,366 we need to add this one more thing that I just mentioned, which is a function 156 00:07:56,500 --> 00:08:00,500 that we will apply to this whole predict(lda, training_set) here. 157 00:08:01,033 --> 00:08:04,600 And that will set this training set that will be transformed 158 00:08:04,833 --> 00:08:06,566 into a data frame. 159 00:08:06,566 --> 00:08:11,066 Because for PCA, when we did this, when we transformed 160 00:08:11,133 --> 00:08:13,300 here the training set to this new training set 161 00:08:13,300 --> 00:08:17,366 composed of the principal components, well, we got a data frame. 162 00:08:17,633 --> 00:08:20,233 But for LDA that's not the same. 163 00:08:20,233 --> 00:08:21,766 We will get a matrix 164 00:08:21,766 --> 00:08:25,700 and we need a data frame because then, you know, we have the next code sections. 
165 00:08:25,833 --> 00:08:29,766 And inside these code sections the functions that we use expect a data 166 00:08:29,766 --> 00:08:31,366 frame for the training set. Here, 167 00:08:31,366 --> 00:08:33,633 for example, it expects a data frame. 168 00:08:33,633 --> 00:08:37,700 And so we absolutely need to convert this transformed training set. 169 00:08:37,833 --> 00:08:41,666 That will not be a data frame if we execute it this way, but a matrix. 170 00:08:42,000 --> 00:08:44,733 And so to simply convert 171 00:08:44,733 --> 00:08:48,100 this into a data frame, I think we already did it before. 172 00:08:48,266 --> 00:08:52,466 We need to use the function as.data.frame 173 00:08:52,566 --> 00:08:55,566 and parentheses. 174 00:08:56,333 --> 00:08:58,500 And we close the parentheses here. 175 00:08:58,500 --> 00:09:02,100 And that will set this transformed training set as a data frame. 176 00:09:02,466 --> 00:09:06,500 And now we are ready to execute this line of code to obtain our new training 177 00:09:06,500 --> 00:09:09,933 set composed of the extracted features, the linear discriminants. 178 00:09:10,266 --> 00:09:11,266 So let's do it. 179 00:09:11,266 --> 00:09:14,033 Let's select this line and execute. 180 00:09:14,033 --> 00:09:17,600 And as you can see the training set is still a data frame. 181 00:09:17,900 --> 00:09:19,800 Otherwise it would be in Values. 182 00:09:19,800 --> 00:09:23,866 And when I click on it, let's have a look at what we just created. 183 00:09:24,300 --> 00:09:28,333 So first of all, the first thing that we see is this first column here. 184 00:09:28,433 --> 00:09:30,400 That is the dependent variable itself. 185 00:09:30,400 --> 00:09:33,433 I know this is no longer called Customer_Segment, 186 00:09:33,600 --> 00:09:35,566 but it's actually the customer segment column. 187 00:09:35,566 --> 00:09:38,133 That's exactly the same one for the same observations 188 00:09:38,133 --> 00:09:40,200 and the same labels one, two and three. 
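The conversion step described above can be sketched as follows, again assuming iris as a stand-in for the wine data and using illustrative names. In the MASS versions I am aware of, predict() on an lda object returns a list with the components class, posterior and x (rather than a single matrix), and as.data.frame() flattens that list into one data frame.

```r
# Illustrative sketch: iris stands in for the course's wine data (assumption).
library(MASS)

lda_fit <- lda(Species ~ ., data = iris)

raw <- predict(lda_fit, iris)      # not yet a data frame
print(class(raw))
print(names(raw))

# Flatten into one data frame: class, the posterior columns, then the
# discriminant scores, whose names get an "x." prefix.
transformed <- as.data.frame(raw)
print(names(transformed))
```

With three classes this gives six columns, so the two discriminant scores sit in positions five and six, as in the narration. Note that the class component holds the label predicted by the LDA model, which can differ from the true label for some observations.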
189 00:09:40,200 --> 00:09:43,233 But automatically it was called class by the predict function. 190 00:09:43,766 --> 00:09:44,966 So don't worry about that. 191 00:09:44,966 --> 00:09:46,533 That's the dependent variable. 192 00:09:46,533 --> 00:09:48,600 And then the next interesting thing we see 193 00:09:48,600 --> 00:09:52,600 are the two linear discriminants LD1 and LD2. 194 00:09:53,033 --> 00:09:55,900 And as I told you that's the number of linear discriminants 195 00:09:55,900 --> 00:10:00,100 we get due to the fact that we have three classes for the dependent variable. 196 00:10:00,533 --> 00:10:02,266 So that's what matters for us now. 197 00:10:02,266 --> 00:10:04,100 And those are the variables that we'll use. 198 00:10:04,100 --> 00:10:07,900 Those are the new extracted features that we'll use to train the SVM model, 199 00:10:07,900 --> 00:10:08,933 to make the predictions, 200 00:10:08,933 --> 00:10:12,466 to make the confusion matrix and eventually to visualize the results. 201 00:10:13,066 --> 00:10:16,333 And then we have these other three variables, posterior.1, posterior.2 202 00:10:16,333 --> 00:10:17,500 and posterior.3. 203 00:10:17,500 --> 00:10:20,700 Those are just variables derived from the LDA model equations. 204 00:10:20,933 --> 00:10:22,433 So that's not very important here. 205 00:10:22,433 --> 00:10:25,433 What matters is that we have our dependent variable class 206 00:10:25,533 --> 00:10:29,966 and our two new extracted features, the linear discriminants one and two. 207 00:10:30,566 --> 00:10:34,800 And so now what we have to do is set our training set into the right format. 208 00:10:35,033 --> 00:10:38,366 That is, we want the training set that is composed of first the two 209 00:10:38,366 --> 00:10:41,366 extracted features, the two new independent variables, 210 00:10:41,533 --> 00:10:44,633 and then in last position, the dependent variable class, 211 00:10:44,633 --> 00:10:46,766 that is nothing else than the customer segment. 
212 00:10:46,766 --> 00:10:50,866 So basically what we need to do here is the same as what we did for PCA. 213 00:10:51,166 --> 00:10:55,800 That is, play with the indexes to not only set the right order for our columns, 214 00:10:55,966 --> 00:10:57,233 but also to not include 215 00:10:57,233 --> 00:11:00,600 the three columns here, posterior.1, posterior.2 and posterior.3. 216 00:11:01,000 --> 00:11:05,166 So what we'll do here to be efficient is take our PCA model 217 00:11:05,600 --> 00:11:08,600 and we will take this line here. 218 00:11:08,833 --> 00:11:14,266 Copy and go back to LDA and paste that here. 219 00:11:14,566 --> 00:11:15,000 All right. 220 00:11:15,000 --> 00:11:19,200 And now as for PCA we need to include three indexes. 221 00:11:19,500 --> 00:11:22,466 This first index here will be the index of LD1. 222 00:11:22,466 --> 00:11:27,500 So that is the index of this column, that is one, two, three, four and five. 223 00:11:27,633 --> 00:11:29,166 So that's index five. 224 00:11:29,166 --> 00:11:30,633 So let's add it here. 225 00:11:30,633 --> 00:11:32,866 Replace two by five. 226 00:11:32,866 --> 00:11:35,133 Then the second index here should be the index 227 00:11:35,133 --> 00:11:38,633 of the second new extracted feature, that is LD2. 228 00:11:38,866 --> 00:11:40,433 So that is index six. 229 00:11:40,433 --> 00:11:42,266 This column has index six. 230 00:11:42,266 --> 00:11:45,266 So let's replace here three by six. 231 00:11:45,533 --> 00:11:46,466 All right. 232 00:11:46,466 --> 00:11:50,100 And eventually this should be the index of the dependent variable. 233 00:11:50,333 --> 00:11:54,566 And this index is of course one because this is the first column here 234 00:11:54,600 --> 00:11:56,200 that has index one. 235 00:11:56,200 --> 00:11:59,533 So now when I execute this line, here we go. 236 00:11:59,766 --> 00:12:02,033 Look at our new training set. 237 00:12:02,033 --> 00:12:04,800 Well that is this time exactly what we want. 
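The index selection just described (five, six, one) can be sketched like this, with iris once more standing in for the wine data. The by-name variant is an addition of this sketch, not something the narration uses; it is shown only because it does not depend on how many posterior columns there are.

```r
# Illustrative sketch: iris stands in for the course's wine data (assumption).
library(MASS)

lda_fit <- lda(Species ~ ., data = iris)
transformed <- as.data.frame(predict(lda_fit, iris))

# As narrated: positions 5 and 6 hold the two discriminants, position 1 holds
# the class column (the label as reported by predict()).
by_position <- transformed[c(5, 6, 1)]

# Hypothetical alternative: select the same columns by name.
by_name <- transformed[c("x.LD1", "x.LD2", "class")]

print(names(by_position))
print(identical(by_position, by_name))
```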
238 00:12:04,800 --> 00:12:07,500 The first two columns are the new extracted features. 239 00:12:07,500 --> 00:12:10,266 And the last column is the dependent variable vector. 240 00:12:10,266 --> 00:12:13,666 Exactly what is expected in the rest of the code sections. 241 00:12:14,233 --> 00:12:15,233 So perfect. 242 00:12:15,233 --> 00:12:18,233 Our training set is well transformed and ready 243 00:12:18,233 --> 00:12:21,233 to be used to train the SVM model. 244 00:12:21,333 --> 00:12:21,700 All right. 245 00:12:21,700 --> 00:12:23,633 So now we need to do the same for the test set. 246 00:12:23,633 --> 00:12:25,500 So that will be very quick and easy. 247 00:12:25,500 --> 00:12:30,000 We will select these two lines here, copy, paste, 248 00:12:30,433 --> 00:12:33,366 and now we just need to replace training_set 249 00:12:33,366 --> 00:12:36,500 by test_set here, here as well, 250 00:12:37,533 --> 00:12:41,100 here and eventually here also. 251 00:12:41,400 --> 00:12:43,666 And now we can just execute these two lines. 252 00:12:43,666 --> 00:12:46,133 But let's execute them one by one. 253 00:12:46,133 --> 00:12:48,000 This is the test set so far, 254 00:12:48,000 --> 00:12:51,400 composed of the 13 independent variables, the original ones. 255 00:12:51,900 --> 00:12:55,666 Then when we select this line and execute, 256 00:12:56,100 --> 00:12:59,666 well, we only get two new extracted features, 257 00:12:59,700 --> 00:13:03,400 LD1 and LD2, and the three variables here of the equation. 258 00:13:03,700 --> 00:13:05,933 And of course the dependent variable class. 259 00:13:05,933 --> 00:13:09,400 And then when we do this again to take the right 260 00:13:09,400 --> 00:13:12,400 indexes in the correct order, we execute this. 261 00:13:12,466 --> 00:13:16,433 And now we get the test set composed of the two new extracted features 262 00:13:16,433 --> 00:13:21,400 in first positions, LD1 and LD2, and the dependent variable in last position. 
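The point of the step above is that one LDA fit, learned on the training set only, transforms both sets. Here is a hedged sketch under several assumptions: iris replaces the wine data, a base R sample() split replaces the course's split, and to_lda_frame is a hypothetical helper. Unlike the narrated code, it carries the true label over explicitly instead of reusing the class column from predict(), since that column holds the model's predicted class.

```r
# Illustrative sketch: iris and the 80/20 split are assumptions.
library(MASS)

set.seed(123)
train_idx <- sample(nrow(iris), size = 0.8 * nrow(iris))
training_set <- iris[train_idx, ]
test_set <- iris[-train_idx, ]

# Fit on the training set only, then reuse the same fit for both sets.
lda_fit <- lda(Species ~ ., data = training_set)

# Hypothetical helper: discriminant scores first, true label in last position.
to_lda_frame <- function(fit, data, label) {
  scores <- as.data.frame(predict(fit, data)$x)  # columns LD1, LD2
  scores[[label]] <- data[[label]]
  scores
}

training_set <- to_lda_frame(lda_fit, training_set, "Species")
test_set <- to_lda_frame(lda_fit, test_set, "Species")

print(names(training_set))
print(dim(test_set))
```

Both frames end up with the two extracted features first and the label last, which is the layout the later sections expect.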
263 00:13:21,766 --> 00:13:22,866 So that's perfect. 264 00:13:22,866 --> 00:13:28,100 Now we are ready to execute the rest of the sections to build our SVM model. 265 00:13:28,733 --> 00:13:29,833 All right so let's do it. 266 00:13:29,833 --> 00:13:33,366 And actually we don't have much to change in this section. 267 00:13:33,366 --> 00:13:35,066 Do you think we need to change something? 268 00:13:35,066 --> 00:13:40,800 Well the answer is yes, because remember the dependent variable is no longer called 269 00:13:41,000 --> 00:13:44,200 Customer_Segment, even if it is the customer segment variable. 270 00:13:44,400 --> 00:13:47,400 This time it has a different name, which is class. 271 00:13:47,533 --> 00:13:49,933 And that's actually the only thing that we need to change 272 00:13:49,933 --> 00:13:52,933 because the training set still has the same name. 273 00:13:52,933 --> 00:13:55,066 It's the training set that we just transformed here. 274 00:13:55,066 --> 00:13:55,900 So that's fine. 275 00:13:55,900 --> 00:13:58,166 And then it's the same type and the same kernel 276 00:13:58,166 --> 00:14:01,166 because we are building a linear SVM model. 277 00:14:01,600 --> 00:14:01,933 All right. 278 00:14:01,933 --> 00:14:02,400 So perfect. 279 00:14:02,400 --> 00:14:05,400 Let's execute this section. Let's do it. 280 00:14:05,966 --> 00:14:07,933 Done, model created. 281 00:14:07,933 --> 00:14:10,933 And now we are ready to predict the test set results. 282 00:14:11,500 --> 00:14:14,600 So the test set results, do we need to change something here? 283 00:14:14,600 --> 00:14:20,300 Well this time the answer is no, because we have our test set transformed. 284 00:14:20,300 --> 00:14:22,966 It has the same name and the classifier is this one. 285 00:14:22,966 --> 00:14:24,466 So everything is perfect. 286 00:14:24,466 --> 00:14:27,933 We are ready to execute this line of code. 287 00:14:28,566 --> 00:14:30,733 Perfect and simple. Here. 
288 00:14:30,733 --> 00:14:32,266 We don't need to change anything. 289 00:14:32,266 --> 00:14:36,633 We can just make the confusion matrix by executing this line. 290 00:14:36,800 --> 00:14:39,166 Here we go. Confusion matrix created. 291 00:14:39,166 --> 00:14:42,733 Let's see if we also get 100% accuracy. 292 00:14:43,033 --> 00:14:45,600 We will be able to see that in a flash, 293 00:14:45,600 --> 00:14:48,966 because if there is one incorrect prediction, 294 00:14:49,500 --> 00:14:54,233 well, that means that we will not get 100% accuracy as we obtained with PCA. 295 00:14:54,400 --> 00:14:57,400 So that means that it will not be as perfect as PCA. 296 00:14:57,433 --> 00:14:59,066 So let's do it. 297 00:14:59,066 --> 00:15:01,566 Let's type cm here and press enter. 298 00:15:01,566 --> 00:15:05,600 And unfortunately we get one incorrect prediction here. 299 00:15:06,000 --> 00:15:07,766 But that's not such a big deal, 300 00:15:07,766 --> 00:15:11,566 because not only is one incorrect prediction still excellent, 301 00:15:11,566 --> 00:15:16,166 but also remember, when we visualized the training set results with PCA, well, 302 00:15:16,166 --> 00:15:19,400 we had incorrect predictions, so we were just a little lucky 303 00:15:19,400 --> 00:15:22,400 to get zero incorrect predictions with PCA. 304 00:15:22,566 --> 00:15:24,933 So all right, those are still excellent results. 305 00:15:24,933 --> 00:15:27,500 And now let's visualize the training set results. 306 00:15:27,500 --> 00:15:30,066 Now do we need to change something in this section? 307 00:15:30,066 --> 00:15:34,866 Well, try to figure out whether we do, because it's important to understand this 308 00:15:34,866 --> 00:15:36,200 in case you use this code 309 00:15:36,200 --> 00:15:39,900 section here to visualize the results on your problem, on your data set. 
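Reading the confusion matrix "in a flash", as described above, comes down to comparing its diagonal with its total. The sketch below uses made-up label vectors purely for illustration; they are not the wine results, and only the object name cm follows the narration.

```r
# Hypothetical labels for illustration only (not the course's results).
actual    <- factor(c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3))
predicted <- factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3))

# Rows are the true classes, columns the predicted ones.
cm <- table(actual, predicted)
print(cm)

# Correct predictions sit on the diagonal; everything else is an error.
n_errors <- sum(cm) - sum(diag(cm))
accuracy <- sum(diag(cm)) / sum(cm)
print(n_errors)
print(accuracy)
```

Any nonzero off-diagonal cell means accuracy is below 100%, which is the check the narration performs by eye.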
310 00:15:40,400 --> 00:15:45,833 Well, the answer is this time, yes, because this line of code here, colnames, 311 00:15:46,300 --> 00:15:50,100 expects to have the real names of your independent 312 00:15:50,100 --> 00:15:53,333 variables, the new extracted features LD1 and LD2. 313 00:15:53,533 --> 00:15:58,000 So here we need to replace PC1 and PC2 by respectively 314 00:15:58,600 --> 00:16:04,066 x.LD1, because that's the name of the first extracted feature, 315 00:16:04,066 --> 00:16:08,000 the first new independent variable, and x.LD2, 316 00:16:08,100 --> 00:16:11,700 the second new extracted feature, the second new independent variable. 317 00:16:12,100 --> 00:16:15,333 So let's replace them, PC1 318 00:16:15,600 --> 00:16:18,600 by x.LD1 319 00:16:18,866 --> 00:16:21,866 and PC2 by x.LD2. 320 00:16:22,200 --> 00:16:23,266 So that's very important. 321 00:16:23,266 --> 00:16:26,233 That's the only thing that you will need to change, the colnames. 322 00:16:26,233 --> 00:16:29,233 You must have the real names of your independent variables 323 00:16:29,366 --> 00:16:31,033 when you visualize the results. 324 00:16:31,033 --> 00:16:33,266 And then do we need to change something else? 325 00:16:33,266 --> 00:16:35,400 Well, this we can also change. 326 00:16:35,400 --> 00:16:36,466 But that's not compulsory. 327 00:16:36,466 --> 00:16:40,300 That's just for the labels of the x axis and the y axis. 328 00:16:40,500 --> 00:16:41,733 So let's do it anyway. 329 00:16:41,733 --> 00:16:45,200 And this time we don't need to specify the real name of the independent variable. 330 00:16:45,400 --> 00:16:49,033 We can just replace PC1 by LD1 to specify 331 00:16:49,033 --> 00:16:52,433 that it's the linear discriminant and not the principal component. 332 00:16:52,833 --> 00:16:55,966 And same here, we can replace PC2 by LD2. 333 00:16:56,600 --> 00:16:57,000 All right. 334 00:16:57,000 --> 00:16:57,900 And now that's done. 
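Why the colnames line matters can be sketched as follows: the plotting grid has to carry the same column names the classifier was trained on, otherwise predict() cannot match the variables. Assumptions in this sketch: iris stands in for the wine data, the grid step of 0.5 is arbitrary, and an LDA classifier (clf) stands in for the course's SVM so that only MASS is needed.

```r
# Illustrative sketch: iris, the grid step and the stand-in classifier are assumptions.
library(MASS)

lda_fit <- lda(Species ~ ., data = iris)
# Here class is the label reported by predict(), as in the narrated code.
training_set <- as.data.frame(predict(lda_fit, iris))[c("x.LD1", "x.LD2", "class")]

# Stand-in classifier trained on the two extracted features.
clf <- lda(class ~ ., data = training_set)

# Grid covering the range of both discriminants.
X1 <- seq(min(training_set$x.LD1) - 1, max(training_set$x.LD1) + 1, by = 0.5)
X2 <- seq(min(training_set$x.LD2) - 1, max(training_set$x.LD2) + 1, by = 0.5)
grid_set <- expand.grid(X1, X2)
print(colnames(grid_set))   # default names, not the feature names

# Rename to the real feature names so predict() can find them.
colnames(grid_set) <- c("x.LD1", "x.LD2")
y_grid <- predict(clf, grid_set)$class
print(table(y_grid))
```

The axis labels passed to the plot are free text, which is why the narration treats those as optional while the colnames change is required.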
335 00:16:57,900 --> 00:17:00,133 Now this code is ready to be executed. 336 00:17:00,133 --> 00:17:05,066 So let's make the same quick changes for the visualization of the test set results. 337 00:17:05,366 --> 00:17:09,100 So let's replace PC1 here by x.LD1, 338 00:17:09,766 --> 00:17:13,233 then PC2 by x.LD2, 339 00:17:13,500 --> 00:17:19,400 and replace PC1 by LD1 and PC2 by LD2. 340 00:17:19,766 --> 00:17:21,333 And now everything is ready. 341 00:17:21,333 --> 00:17:23,166 We don't need to change anything more. 342 00:17:23,166 --> 00:17:24,600 We can grab a cup of coffee 343 00:17:24,600 --> 00:17:28,366 and visualize the training set results and test set results. 344 00:17:28,566 --> 00:17:29,900 So let's do it. 345 00:17:29,900 --> 00:17:31,800 Let's hope that everything is okay. 346 00:17:31,800 --> 00:17:35,133 So I'm going to select all the section up to here. 347 00:17:35,600 --> 00:17:37,800 So that's to visualize the training set results. 348 00:17:37,800 --> 00:17:40,433 Let's do it. And here we go. 349 00:17:40,433 --> 00:17:42,300 All right. So it's executing. 350 00:17:42,300 --> 00:17:44,366 It always takes a little time. 351 00:17:44,366 --> 00:17:45,500 But we will get there. 352 00:17:45,500 --> 00:17:47,866 We can already click on Plots. 353 00:17:47,866 --> 00:17:48,200 All right. 354 00:17:48,200 --> 00:17:51,166 So the computations are being made. Almost done. 355 00:17:52,966 --> 00:17:54,833 And here are the results. 356 00:17:54,833 --> 00:17:56,200 Beautiful results. 357 00:17:56,200 --> 00:17:58,966 The three classes are well separated, almost 358 00:17:58,966 --> 00:18:00,166 perfectly well separated. 359 00:18:00,166 --> 00:18:02,800 We can see the incorrect prediction. But be careful. 360 00:18:02,800 --> 00:18:05,800 That's not the same incorrect prediction we saw with the confusion 361 00:18:05,800 --> 00:18:08,900 matrix, because that one concerns the test set. Here 
362 00:18:08,900 --> 00:18:09,733 it's the training set. 363 00:18:09,733 --> 00:18:13,566 So we also have one incorrect prediction on the training set. 364 00:18:13,566 --> 00:18:14,400 That's this one. 365 00:18:14,400 --> 00:18:16,066 But that's almost perfect, 366 00:18:16,066 --> 00:18:19,066 which is quite intuitive to understand because, as you remember, 367 00:18:19,266 --> 00:18:23,700 LDA tries to best separate the classes of your dependent variable. 368 00:18:24,000 --> 00:18:26,966 So that's why here we can see that the prediction boundary 369 00:18:26,966 --> 00:18:29,833 is kind of equidistant to the majority 370 00:18:29,833 --> 00:18:32,833 of the green points here, and the majority of the blue points here. 371 00:18:33,200 --> 00:18:34,666 So that's perfect. 372 00:18:34,666 --> 00:18:37,966 Each one is in its correct segment of customers. 373 00:18:38,200 --> 00:18:41,200 And therefore this wine business owner can feel pretty confident 374 00:18:41,500 --> 00:18:44,233 at predicting, for each new wine, to which 375 00:18:44,233 --> 00:18:47,233 customer segment he should recommend it. 376 00:18:47,266 --> 00:18:50,500 And not only can he be pretty confident at recommending the new wines 377 00:18:50,500 --> 00:18:53,933 to the right customers, but also, thanks to the feature extraction 378 00:18:53,933 --> 00:18:57,633 technique that allows him to visualize his results in two dimensions, 379 00:18:57,900 --> 00:19:00,900 thanks to this number of two new independent variables, 380 00:19:01,066 --> 00:19:02,533 the linear discriminants, 381 00:19:02,533 --> 00:19:04,400 well, now this wine business owner 382 00:19:04,400 --> 00:19:08,166 can make a clear plot of his different customer segments, 383 00:19:08,166 --> 00:19:11,433 and put in each segment of customers the different wines. 384 00:19:11,733 --> 00:19:14,733 So eventually that can be pretty convenient. 385 00:19:14,900 --> 00:19:16,200 Okay, so perfect. 
386 00:19:16,200 --> 00:19:20,466 We managed to build a great LDA model, so we'll finish on this good note, 387 00:19:20,700 --> 00:19:23,533 and therefore we'll move on to the next section of this course, 388 00:19:23,533 --> 00:19:26,533 which is going to be another feature extraction technique, 389 00:19:26,566 --> 00:19:29,566 but this time adapted to nonlinear problems. 390 00:19:29,733 --> 00:19:32,566 So since this problem is obviously 391 00:19:32,566 --> 00:19:36,600 a linear problem, because we managed to apply very successfully 392 00:19:36,600 --> 00:19:40,566 linear models such as PCA and LDA, well, it won't be relevant 393 00:19:40,566 --> 00:19:44,466 to apply a nonlinear feature extraction model on this data set. 394 00:19:44,866 --> 00:19:48,800 So we will work on another data set that is of course going to be nonlinear. 395 00:19:49,333 --> 00:19:51,833 And this next new feature extraction technique 396 00:19:51,833 --> 00:19:54,900 that we're going to see is going to be kernel PCA. 397 00:19:55,233 --> 00:19:57,666 So I look forward to studying this in the next section. 398 00:19:57,666 --> 00:19:59,433 And until then enjoy machine learning.