1
00:00:00,200 --> 00:00:02,500
Hello and welcome to this R tutorial.

2
00:00:02,500 --> 00:00:06,933
So in the previous section,
this PCA feature extraction technique

3
00:00:06,933 --> 00:00:10,166
reduced the dimensionality of our problem
by extracting

4
00:00:10,366 --> 00:00:13,366
the variables
that explained the most variance.

5
00:00:13,500 --> 00:00:15,566
And now in LDA this is quite different.

6
00:00:15,566 --> 00:00:18,500
We are extracting
some new independent variables

7
00:00:18,500 --> 00:00:22,433
that will best separate
the classes of the dependent variable.

8
00:00:22,833 --> 00:00:26,166
And therefore,
since this time it considers the classes

9
00:00:26,166 --> 00:00:29,266
of the dependent variable, well,
that means that it considers

10
00:00:29,266 --> 00:00:33,000
the dependent variable to proceed
to this feature extraction technique,

11
00:00:33,133 --> 00:00:38,033
and therefore that makes LDA a supervised
dimensionality reduction model.

12
00:00:38,666 --> 00:00:38,966
All right.

13
00:00:38,966 --> 00:00:41,933
So now let's apply LDA in R.

14
00:00:41,933 --> 00:00:45,000
So first very quickly let's set the right
folder as working directory.

15
00:00:45,000 --> 00:00:47,500
So we will go to our Machine
Learning A-Z folder,

16
00:00:47,500 --> 00:00:49,433
Part 9, Dimensionality Reduction.

17
00:00:49,433 --> 00:00:53,666
And we are now in this Section
44, Linear Discriminant Analysis.

18
00:00:54,000 --> 00:00:55,133
So let's go inside.

19
00:00:55,133 --> 00:00:57,866
And that's the folder
you want to set as working directory.

20
00:00:57,866 --> 00:01:00,233
We are still working on the Wine.csv
file.

21
00:01:00,233 --> 00:01:02,566
So that's exactly
the same business problem

22
00:01:02,566 --> 00:01:05,566
as the one we worked with
when implementing PCA.

23
00:01:05,700 --> 00:01:09,566
And so that will be a good opportunity
to compare this dimensionality

24
00:01:09,566 --> 00:01:12,833
reduction technique, LDA,
to the previous one, PCA.

25
00:01:13,433 --> 00:01:14,333
So now let's not forget

26
00:01:14,333 --> 00:01:17,800
to click on this more button here
and set as working directory.

27
00:01:18,300 --> 00:01:19,000
Perfect.

28
00:01:19,000 --> 00:01:22,600
And so since we are working
on the same business problem as before,

29
00:01:22,600 --> 00:01:26,833
as for PCA, well implementing
LDA here is going to be very easy.

30
00:01:27,166 --> 00:01:31,500
We will just take the PCA code,
take everything from here

31
00:01:31,733 --> 00:01:37,866
down to the bottom copy and we'll go back
to LDA and paste the whole thing here.

32
00:01:38,366 --> 00:01:42,400
And inside of this code
we will just need basically to replace

33
00:01:42,600 --> 00:01:48,100
this applying PCA section by a new section
dedicated to applying LDA.

34
00:01:48,333 --> 00:01:51,400
So I'm going to remove all this section.

35
00:01:51,600 --> 00:01:54,600
So let's remove this and let's

36
00:01:54,733 --> 00:01:57,466
replace here PCA by LDA.

37
00:01:57,466 --> 00:02:00,466
And now it's time to implement LDA.

38
00:02:00,533 --> 00:02:03,300
So the package
that we're going to use to apply LDA

39
00:02:03,300 --> 00:02:06,133
is the MASS package, MASS.

40
00:02:06,133 --> 00:02:10,566
And it's actually a package that is by
default in your list of packages.

41
00:02:10,900 --> 00:02:13,133
It's the package right here.

42
00:02:13,133 --> 00:02:16,933
MASS: Support Functions and Datasets
for Venables and Ripley's MASS.

43
00:02:17,400 --> 00:02:17,700
All right.

44
00:02:17,700 --> 00:02:19,933
So as you can see this is not imported.

45
00:02:19,933 --> 00:02:22,933
So let's import it
by using the library command.

46
00:02:23,533 --> 00:02:24,266
Here we go.

47
00:02:24,266 --> 00:02:27,266
And MASS, this way, in parentheses.

48
00:02:27,900 --> 00:02:28,500
All right.

49
00:02:28,500 --> 00:02:32,666
We can already select this
to import the package.

50
00:02:33,000 --> 00:02:35,400
And now let's implement LDA.

51
00:02:35,400 --> 00:02:40,766
So the first thing that we have to do
is as for PCA create an LDA variable

52
00:02:40,866 --> 00:02:43,300
that we will use
to transform our original data

53
00:02:43,300 --> 00:02:47,266
set into this new data
set composed of the linear discriminants.

54
00:02:47,566 --> 00:02:50,933
And so we're going to call this variable
LDA equals.

55
00:02:51,333 --> 00:02:54,600
And now we're going to use the LDA
function as simple as that.

56
00:02:55,166 --> 00:02:57,566
And let's add some parentheses.

57
00:02:57,566 --> 00:03:00,966
And now let's see the arguments
by pressing F1.

58
00:03:01,566 --> 00:03:01,866
All right.

59
00:03:01,866 --> 00:03:03,333
So the arguments are here.

60
00:03:03,333 --> 00:03:05,433
The first argument is formula.

61
00:03:05,433 --> 00:03:09,466
So that's exactly the formula
of your dependent variable

62
00:03:09,466 --> 00:03:13,633
with respect to the independent variables
so far the original ones.

63
00:03:13,800 --> 00:03:16,766
So here as a first argument we will input

64
00:03:16,766 --> 00:03:20,100
formula equals customer segment.

65
00:03:21,566 --> 00:03:24,066
Remember
that's the name of the dependent variable.

66
00:03:24,066 --> 00:03:26,100
And tilde.

67
00:03:26,100 --> 00:03:28,333
And we can add a dot.

68
00:03:28,333 --> 00:03:31,800
Don't worry we don't have to write all
the names of the independent variables.

69
00:03:31,800 --> 00:03:33,633
The dot is here for us.

70
00:03:33,633 --> 00:03:36,633
So comma, and then the next argument.

71
00:03:36,933 --> 00:03:38,833
And then the next argument is data.

72
00:03:38,833 --> 00:03:43,300
And as for PCA the data here
is going to be the training set.

73
00:03:43,633 --> 00:03:46,566
So let's add here the training set.

74
00:03:46,566 --> 00:03:47,800
Here we go.

75
00:03:47,800 --> 00:03:50,333
All right. And actually that's all.

76
00:03:50,333 --> 00:03:53,966
And that's for a specific reason
a very important reason that is directly

77
00:03:53,966 --> 00:03:57,233
related to LDA. It's due to the fact

78
00:03:57,233 --> 00:04:00,666
that LDA is a supervised dimensionality
reduction technique.

79
00:04:00,966 --> 00:04:02,100
Remember supervised

80
00:04:02,100 --> 00:04:05,700
means that the LDA model
takes into account the dependent variable.

81
00:04:06,200 --> 00:04:08,800
And since it takes into account
the dependent variable,

82
00:04:08,800 --> 00:04:13,733
well, it's quite intuitive to understand
that the number of linear discriminants

83
00:04:14,000 --> 00:04:15,166
will be related

84
00:04:15,166 --> 00:04:18,900
to the information of the dependent
variable, and that information is actually

85
00:04:18,900 --> 00:04:21,666
the number of classes
in the dependent variable.

86
00:04:21,666 --> 00:04:25,300
And there is this explicit correlation
between the number of linear

87
00:04:25,300 --> 00:04:28,500
discriminants and the information
of the dependent variable.

88
00:04:28,766 --> 00:04:31,933
It's that there will be k minus one linear

89
00:04:31,933 --> 00:04:34,933
discriminants,
where k is the number of classes.

90
00:04:35,066 --> 00:04:38,866
So here, since we have three classes,
that means that we'll get at most

91
00:04:38,866 --> 00:04:41,866
three minus
one equals two linear discriminants.

92
00:04:42,000 --> 00:04:46,400
So here, without specifying the number
of linear discriminants equal to two,

93
00:04:46,633 --> 00:04:49,866
we will automatically get
two linear discriminants.

94
00:04:50,200 --> 00:04:51,600
And therefore that's all.

95
00:04:51,600 --> 00:04:54,266
We don't need to add any other argument.
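
The call described above can be sketched as follows. This is a minimal sketch, not the tutorial's exact script: it uses R's built-in iris data (three classes) as a stand-in for the tutorial's training set, so Species plays the role of the customer segment column. That substitution is an assumption made only to keep the example runnable.

```r
library(MASS)  # provides lda()

# Stand-in for the tutorial's training set (assumption): iris has
# three classes, so LDA yields at most 3 - 1 = 2 linear discriminants.
training_set = iris

# Dependent variable ~ all remaining columns. No number of
# discriminants is passed to the function.
lda = lda(formula = Species ~ ., data = training_set)

# The fitted object holds one column of coefficients per discriminant.
n_discriminants = ncol(lda$scaling)
print(n_discriminants)
```

With the tutorial's own data, the formula would name the customer segment column on the left of the tilde instead of Species.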

96
00:04:54,266 --> 00:04:58,866
So the LDA object is ready to be used
to transform

97
00:04:59,166 --> 00:05:03,600
our original data set into this new one,
composed of the linear discriminants.

98
00:05:03,900 --> 00:05:08,100
We will get two linear discriminants,
which is exactly what we want.

99
00:05:08,400 --> 00:05:12,133
As for PCA,
we want two new extracted features, and

100
00:05:12,133 --> 00:05:16,433
this time these two new extracted features
will best separate the classes.

101
00:05:16,433 --> 00:05:19,433
So we should get
very good results as well.

102
00:05:19,433 --> 00:05:22,433
And then that's the same as for PCA.

103
00:05:22,600 --> 00:05:25,600
We need to transform the training set
and the test set

104
00:05:25,866 --> 00:05:28,766
so that we can
then use them in the next sections.

105
00:05:28,766 --> 00:05:31,766
So we will use the training set to fit SVM

106
00:05:31,966 --> 00:05:34,966
well, actually fit it to the training set
to build the classifier.

107
00:05:35,100 --> 00:05:39,233
And then we will make our predictions
predicting the test results with this test

108
00:05:39,233 --> 00:05:43,133
set that will be transformed as well
and make the confusion matrix.

109
00:05:43,133 --> 00:05:47,533
And most importantly we will visualize
the training set and the test results,

110
00:05:47,766 --> 00:05:51,600
something that we'll be able to do
because we now have two features.

111
00:05:52,066 --> 00:05:53,266
All right. So let's do it.

112
00:05:53,266 --> 00:05:57,233
Let's do the same to apply LDA
on the training set and the test set.

113
00:05:57,500 --> 00:05:59,733
So first let's start
with the training set.

114
00:05:59,733 --> 00:06:01,033
Training set.

115
00:06:01,033 --> 00:06:02,466
So remember we're keeping this

116
00:06:02,466 --> 00:06:04,866
name of the training set
so that we don't have to change it

117
00:06:04,866 --> 00:06:06,333
in the rest of the sections.

118
00:06:06,333 --> 00:06:08,700
So training set equals.

119
00:06:08,700 --> 00:06:11,600
And then remember
we need to use the predict function.

120
00:06:11,600 --> 00:06:14,266
That's
actually exactly the same as for PCA.

121
00:06:14,266 --> 00:06:17,966
We will however
need to add something to make it work.

122
00:06:18,133 --> 00:06:19,600
And I will explain what it is.

123
00:06:19,600 --> 00:06:23,300
But definitely we are doing this
transformation with the predict function.

124
00:06:23,800 --> 00:06:25,200
So parentheses.

125
00:06:25,200 --> 00:06:29,033
And now inside this predict function
we need to specify

126
00:06:29,033 --> 00:06:31,500
you know the first argument here
which is the object.

127
00:06:31,500 --> 00:06:35,700
So the object is LDA and then comma
and then the second argument.

128
00:06:35,700 --> 00:06:37,000
And the second argument

129
00:06:37,000 --> 00:06:39,700
is the data set on which
we want to make the transformation.

130
00:06:39,700 --> 00:06:42,133
That is, extract the new features.

131
00:06:42,133 --> 00:06:44,500
And that's the training set here it is.

132
00:06:44,500 --> 00:06:45,433
So as a reminder

133
00:06:45,433 --> 00:06:49,266
this is the original data set
composed of the 13 independent variables.

134
00:06:49,533 --> 00:06:54,200
And this will be the new training set
composed of the two new extracted features

135
00:06:54,466 --> 00:06:57,400
that are the two linear discriminants.

136
00:06:57,400 --> 00:06:58,000
All right.

137
00:06:58,000 --> 00:07:03,166
So we can already do that to see if we get
the right training set as we expect.

138
00:07:03,566 --> 00:07:08,800
So before executing this line of course
we need to execute the previous sections

139
00:07:08,800 --> 00:07:12,266
because we need to import the data
set and apply data preprocessing.

140
00:07:12,600 --> 00:07:13,700
So let's do it.

141
00:07:13,700 --> 00:07:15,300
We don't have anything to change here.

142
00:07:15,300 --> 00:07:17,166
Everything is already well prepared

143
00:07:17,166 --> 00:07:20,300
thanks to what we did
in the previous section with PCA.

144
00:07:20,500 --> 00:07:22,033
So let's execute this.

145
00:07:22,033 --> 00:07:24,033
Here we go. Well executed.

146
00:07:24,033 --> 00:07:28,766
And now let's apply
LDA, create the LDA object

147
00:07:29,100 --> 00:07:32,500
and then use this object
to transform our original training

148
00:07:32,500 --> 00:07:36,166
set into this new training set
composed of the two linear discriminants.

149
00:07:36,633 --> 00:07:39,200
So we already imported the MASS package.

150
00:07:39,200 --> 00:07:42,100
So we just need to execute this
line of code.

151
00:07:42,100 --> 00:07:43,200
Let's do it.

152
00:07:43,200 --> 00:07:45,833
Here we go LDA object well created.

153
00:07:45,833 --> 00:07:48,833
And now we are ready
to transform the training set.

154
00:07:48,900 --> 00:07:51,433
But before we select this line
and execute it,

155
00:07:51,433 --> 00:07:56,366
we need to add this one more thing
that I just mentioned, which is a function

156
00:07:56,500 --> 00:08:00,500
that we will apply to this whole predict
LDA training set here.

157
00:08:01,033 --> 00:08:04,600
And that will set this training set
that will be transformed

158
00:08:04,833 --> 00:08:06,566
into a data frame.

159
00:08:06,566 --> 00:08:11,066
Because for PCA, when we did this,
when we transformed

160
00:08:11,133 --> 00:08:13,300
here, the training
set to this new training set

161
00:08:13,300 --> 00:08:17,366
composed of the principal components,
well we got a data frame.

162
00:08:17,633 --> 00:08:20,233
But for LDA that's not the same.

163
00:08:20,233 --> 00:08:21,766
We will get a matrix

164
00:08:21,766 --> 00:08:25,700
and we need a data frame because then
you know we have the next code sections.

165
00:08:25,833 --> 00:08:29,766
And inside these code sections
the functions that we use expect a data

166
00:08:29,766 --> 00:08:31,366
frame for the training set here.

167
00:08:31,366 --> 00:08:33,633
For example it expects a data frame.

168
00:08:33,633 --> 00:08:37,700
And so we absolutely need to convert this
transformed training set.

169
00:08:37,833 --> 00:08:41,666
That will not be a data frame
if we execute it this way, but a matrix.

170
00:08:42,000 --> 00:08:44,733
And so to simply convert

171
00:08:44,733 --> 00:08:48,100
this into a data frame,
I think we already did it before.

172
00:08:48,266 --> 00:08:52,466
We need to use the function
as dot data, dot

173
00:08:52,566 --> 00:08:55,566
frame, and parentheses.

174
00:08:56,333 --> 00:08:58,500
And we close the parenthesis here.

175
00:08:58,500 --> 00:09:02,100
And that will set this transformed training
set as a data frame.

176
00:09:02,466 --> 00:09:06,500
And now we are ready to execute this line
of code to obtain our new training

177
00:09:06,500 --> 00:09:09,933
set composed of the extracted features
the linear discriminants.
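
As a rough illustration of this step, the sketch below wraps the predict call in as.data.frame. It again uses the built-in iris data as a stand-in (an assumption), so the posterior columns are named after the iris species rather than after the tutorial's three customer segments.

```r
library(MASS)

training_set = iris  # stand-in data (assumption)
lda = lda(formula = Species ~ ., data = training_set)

# predict() on an lda object returns a list with the components
# class, posterior and x; posterior and x are matrices.
raw = predict(lda, training_set)
print(class(raw))

# as.data.frame() flattens that list into a single data frame.
# class holds the class assigned by the LDA model itself.
transformed = as.data.frame(raw)
print(names(transformed))
```

The discriminant columns come out prefixed with the component name, which is why they appear as x.LD1 and x.LD2.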

178
00:09:10,266 --> 00:09:11,266
So let's do it.

179
00:09:11,266 --> 00:09:14,033
Let's select this line and execute.

180
00:09:14,033 --> 00:09:17,600
And as you can see,
the training set is still a data frame.

181
00:09:17,900 --> 00:09:19,800
Otherwise it would be in Values.

182
00:09:19,800 --> 00:09:23,866
And when I click on it
let's have a look at what we just created.

183
00:09:24,300 --> 00:09:28,333
So first of all the first thing
that we see is this first column here.

184
00:09:28,433 --> 00:09:30,400
That is the dependent variable itself.

185
00:09:30,400 --> 00:09:33,433
I know this is no longer called
customer segment,

186
00:09:33,600 --> 00:09:35,566
but it's actually the customer
segment column.

187
00:09:35,566 --> 00:09:38,133
That's exactly the same one
for the same observations

188
00:09:38,133 --> 00:09:40,200
and the same labels one, two and three.

189
00:09:40,200 --> 00:09:43,233
But automatically it was called class
by the predict function.

190
00:09:43,766 --> 00:09:44,966
So don't worry about that.

191
00:09:44,966 --> 00:09:46,533
That's the dependent variable.

192
00:09:46,533 --> 00:09:48,600
And then the next interesting thing we see

193
00:09:48,600 --> 00:09:52,600
are the two linear discriminants, LD1 and LD2.

194
00:09:53,033 --> 00:09:55,900
And as I told you
that's the number of linear discriminants

195
00:09:55,900 --> 00:10:00,100
we get due to the fact that we have
three classes for the dependent variable.

196
00:10:00,533 --> 00:10:02,266
So that's what matters for us now.

197
00:10:02,266 --> 00:10:04,100
And that's the variables that we'll use.

198
00:10:04,100 --> 00:10:07,900
That's the new extracted features
that we'll use to train the SVM model

199
00:10:07,900 --> 00:10:08,933
to make the predictions,

200
00:10:08,933 --> 00:10:12,466
to make the confusion matrix
and eventually to visualize the results.

201
00:10:13,066 --> 00:10:16,333
And then we have these other three
variables, posterior one, posterior

202
00:10:16,333 --> 00:10:17,500
two and three.

203
00:10:17,500 --> 00:10:20,700
Those are just variables
derived from the LDA model equations.

204
00:10:20,933 --> 00:10:22,433
So that's not very important here.

205
00:10:22,433 --> 00:10:25,433
What matters is that we have our dependent
variable class

206
00:10:25,533 --> 00:10:29,966
and our two new extracted features
the linear discriminants one and two.

207
00:10:30,566 --> 00:10:34,800
And so now what we have to do is set
our training set into the right format.

208
00:10:35,033 --> 00:10:38,366
That is, we want the training set
that is composed of, first, the two

209
00:10:38,366 --> 00:10:41,366
extracted features,
the two new independent variables.

210
00:10:41,533 --> 00:10:44,633
And then in last position,
the dependent variable class.

211
00:10:44,633 --> 00:10:46,766
That is nothing else
than the customer segment.

212
00:10:46,766 --> 00:10:50,866
So basically what we need to do here
is the same as what we did for PCA.

213
00:10:51,166 --> 00:10:55,800
That is, play with the indexes to not only
set the right order for our columns,

214
00:10:55,966 --> 00:10:57,233
but also to not include

215
00:10:57,233 --> 00:11:00,600
the three columns here, posterior one,
posterior two and posterior three.

216
00:11:01,000 --> 00:11:05,166
So what we'll do here to be
efficient is take our PCA model

217
00:11:05,600 --> 00:11:08,600
and we will take this line here.

218
00:11:08,833 --> 00:11:14,266
Copy and go back to LDA
and paste that here.

219
00:11:14,566 --> 00:11:15,000
All right.

220
00:11:15,000 --> 00:11:19,200
And now as for PCA
we need to include three indexes.

221
00:11:19,500 --> 00:11:22,466
This first index here
will be the index of LD1.

222
00:11:22,466 --> 00:11:27,500
So that is the index of this column,
that is one, two, three, four and five.

223
00:11:27,633 --> 00:11:29,166
So that's index five.

224
00:11:29,166 --> 00:11:30,633
So let's add it here.

225
00:11:30,633 --> 00:11:32,866
Replace two by five.

226
00:11:32,866 --> 00:11:35,133
Then the second index here
should be the index

227
00:11:35,133 --> 00:11:38,633
of the second new extracted feature
that is LD2.

228
00:11:38,866 --> 00:11:40,433
So that is index six.

229
00:11:40,433 --> 00:11:42,266
This column has index six.

230
00:11:42,266 --> 00:11:45,266
So let's replace here three by six.

231
00:11:45,533 --> 00:11:46,466
All right.

232
00:11:46,466 --> 00:11:50,100
And eventually this should be the index
of the dependent variable.

233
00:11:50,333 --> 00:11:54,566
And this index is of course one
because this is the first column here

234
00:11:54,600 --> 00:11:56,200
that has index one.

235
00:11:56,200 --> 00:11:59,533
So now when I execute this line here
we go.

236
00:11:59,766 --> 00:12:02,033
Look at our new training set.

237
00:12:02,033 --> 00:12:04,800
Well that is this time
exactly what we want.

238
00:12:04,800 --> 00:12:07,500
The first two columns
are the new extracted features.

239
00:12:07,500 --> 00:12:10,266
And the last column
is the dependent variable vector.

240
00:12:10,266 --> 00:12:13,666
Exactly as what is expected
in the rest of the code sections.
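
The reordering by index can be sketched like this. The data is once more the built-in iris set, used as a stand-in (an assumption); the column positions 5, 6 and 1 are the ones named in the walkthrough and happen to line up because iris also has three classes.

```r
library(MASS)

training_set = iris  # stand-in data (assumption)
lda = lda(formula = Species ~ ., data = training_set)
training_set = as.data.frame(predict(lda, training_set))

# Columns 5 and 6 are the two discriminants, column 1 is class.
# The three posterior columns (2 to 4) are left out.
# Note: class is the class assigned by the LDA model, which agrees
# with the true label for most rows but not necessarily all of them.
training_set = training_set[c(5, 6, 1)]
print(names(training_set))
```

If the number of classes were different, the posterior block would have a different width and the indexes would shift, so selecting by name, for example c('x.LD1', 'x.LD2', 'class'), may be the safer habit.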

241
00:12:14,233 --> 00:12:15,233
So perfect.

242
00:12:15,233 --> 00:12:18,233
Our training set is well transformed
and ready

243
00:12:18,233 --> 00:12:21,233
to be used to train the SVM model.

244
00:12:21,333 --> 00:12:21,700
All right.

245
00:12:21,700 --> 00:12:23,633
So now we need to do the same
for the test set.

246
00:12:23,633 --> 00:12:25,500
So that will be very quick and easy.

247
00:12:25,500 --> 00:12:30,000
We will select these two lines here
copy paste

248
00:12:30,433 --> 00:12:33,366
and now we just need to replace
training set

249
00:12:33,366 --> 00:12:36,500
by test set here here as well

250
00:12:37,533 --> 00:12:41,100
here and eventually here also.

251
00:12:41,400 --> 00:12:43,666
And now we can just execute
these two lines.

252
00:12:43,666 --> 00:12:46,133
But let's execute them one by one.

253
00:12:46,133 --> 00:12:48,000
This is the test set so far

254
00:12:48,000 --> 00:12:51,400
composed of the 13 independent variables
the original ones.

255
00:12:51,900 --> 00:12:55,666
Then when we select this line and execute

256
00:12:56,100 --> 00:12:59,666
well we only get two new extracted
features

257
00:12:59,700 --> 00:13:03,400
LD1 and LD2, and the three variables
here of the equation.

258
00:13:03,700 --> 00:13:05,933
And of course
the dependent variable class.

259
00:13:05,933 --> 00:13:09,400
And then when we do this again
to take the right

260
00:13:09,400 --> 00:13:12,400
indexes in the correct order
we execute this.

261
00:13:12,466 --> 00:13:16,433
And now we get the test set
composed of the two new extracted features

262
00:13:16,433 --> 00:13:21,400
in first positions, LD1 and LD2, and the
dependent variable in last position.
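
Here is a small sketch of transforming both sets with the same LDA object. The iris data and the simple 80/20 split with sample() are assumptions made for the sake of a runnable example; the tutorial's own preprocessing section is not reproduced here.

```r
library(MASS)

# Stand-in data and a simple 80/20 split (both assumptions).
set.seed(123)
train_idx = sample(nrow(iris), size = 0.8 * nrow(iris))
training_set = iris[train_idx, ]
test_set = iris[-train_idx, ]

# Fit on the training set only, then transform both sets
# with that same fitted object.
lda = lda(formula = Species ~ ., data = training_set)
training_set = as.data.frame(predict(lda, training_set))[c(5, 6, 1)]
test_set = as.data.frame(predict(lda, test_set))[c(5, 6, 1)]

print(dim(training_set))
print(dim(test_set))
```

Both data frames end up with the same three columns in the same order, which is what the later code sections rely on.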

263
00:13:21,766 --> 00:13:22,866
So that's perfect.

264
00:13:22,866 --> 00:13:28,100
Now we are ready to execute the rest
of the sections to build our SVM model.

265
00:13:28,733 --> 00:13:29,833
All right so let's do it.

266
00:13:29,833 --> 00:13:33,366
And actually we don't have much
to change in this section.

267
00:13:33,366 --> 00:13:35,066
Do you think we need to change something?

268
00:13:35,066 --> 00:13:40,800
Well the answer is yes because remember
the dependent variable is no longer called

269
00:13:41,000 --> 00:13:44,200
customer segment even if it is
the customer segment variable.

270
00:13:44,400 --> 00:13:47,400
But this time it has a different name
which is class.

271
00:13:47,533 --> 00:13:49,933
And that's actually the only thing
that we need to change

272
00:13:49,933 --> 00:13:52,933
because the training set
still has the same name.

273
00:13:52,933 --> 00:13:55,066
It's the training set
that we just transformed here.

274
00:13:55,066 --> 00:13:55,900
So that's fine.

275
00:13:55,900 --> 00:13:58,166
And then it's the same type
and the same kernel

276
00:13:58,166 --> 00:14:01,166
because we are building a linear
SVM model.

277
00:14:01,600 --> 00:14:01,933
All right.

278
00:14:01,933 --> 00:14:02,400
So perfect.

279
00:14:02,400 --> 00:14:05,400
Let's execute this section. Let's do it.

280
00:14:05,966 --> 00:14:07,933
Done, model created.

281
00:14:07,933 --> 00:14:10,933
And now we are ready to predict the test
set results.

282
00:14:11,500 --> 00:14:14,600
So, the test set results: do
we need to change something here?

283
00:14:14,600 --> 00:14:20,300
Well this time the answer is no
because we have our test set transformed.

284
00:14:20,300 --> 00:14:22,966
It has the same name and the classifier
is this one.

285
00:14:22,966 --> 00:14:24,466
So everything is perfect.

286
00:14:24,466 --> 00:14:27,933
We are ready to execute this line of code.

287
00:14:28,566 --> 00:14:30,733
Perfect. And same here.

288
00:14:30,733 --> 00:14:32,266
We don't need to change anything.

289
00:14:32,266 --> 00:14:36,633
We can just make the confusion matrix
by executing this line.

290
00:14:36,800 --> 00:14:39,166
Here we go. Confusion matrix created.
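
The classifier, prediction and confusion matrix steps could look roughly like the sketch below. It assumes the SVM comes from the e1071 package, which the walkthrough does not name, and it uses the built-in iris data with a simple random split as a stand-in for the tutorial's data.

```r
library(MASS)
library(e1071)  # assumption: svm() from e1071 is the SVM used

# Stand-in data and split (assumptions).
set.seed(123)
train_idx = sample(nrow(iris), size = 0.8 * nrow(iris))
training_set = iris[train_idx, ]
test_set = iris[-train_idx, ]

lda = lda(formula = Species ~ ., data = training_set)
training_set = as.data.frame(predict(lda, training_set))[c(5, 6, 1)]
test_set = as.data.frame(predict(lda, test_set))[c(5, 6, 1)]

# The dependent variable is now called class, so the formula changes.
classifier = svm(formula = class ~ .,
                 data = training_set,
                 type = 'C-classification',
                 kernel = 'linear')

# Predict from the two discriminant columns, then cross-tabulate
# the class column against the predictions.
y_pred = predict(classifier, newdata = test_set[-3])
cm = table(test_set[, 3], y_pred)
print(cm)
```

Off-diagonal counts in cm are the incorrect predictions; the numbers on this stand-in data will not match the ones shown in the video.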

291
00:14:39,166 --> 00:14:42,733
Let's see if we also get 100% accuracy.

292
00:14:43,033 --> 00:14:45,600
We will be able to see that
in a flash,

293
00:14:45,600 --> 00:14:48,966
because if there is
one incorrect prediction,

294
00:14:49,500 --> 00:14:54,233
well that means that we will not get 100%
accuracy as we obtained with PCA.

295
00:14:54,400 --> 00:14:57,400
So that means that it will not be
as perfect as PCA.

296
00:14:57,433 --> 00:14:59,066
So let's do it.

297
00:14:59,066 --> 00:15:01,566
Let's type CM here and press enter.

298
00:15:01,566 --> 00:15:05,600
And unfortunately we get one incorrect
prediction here.

299
00:15:06,000 --> 00:15:07,766
But that's not such a big deal

300
00:15:07,766 --> 00:15:11,566
because not only is one incorrect
prediction still excellent,

301
00:15:11,566 --> 00:15:16,166
but also, remember, when we visualized
the training set results with PCA, well,

302
00:15:16,166 --> 00:15:19,400
we had incorrect predictions,
so we were just a little lucky

303
00:15:19,400 --> 00:15:22,400
to get zero
incorrect predictions with PCA.

304
00:15:22,566 --> 00:15:24,933
So all right
so that's still excellent results.

305
00:15:24,933 --> 00:15:27,500
And now let's visualize
the training set results.

306
00:15:27,500 --> 00:15:30,066
Now do we need to change
something in the section.

307
00:15:30,066 --> 00:15:34,866
Well, try to figure out whether we do,
because it's important to understand this.

308
00:15:34,866 --> 00:15:36,200
In case you use this code

309
00:15:36,200 --> 00:15:39,900
section here to visualize the results
on your problem, on your data set.

310
00:15:40,400 --> 00:15:45,833
Well, the answer is, this time, yes,
because this line of code here, colnames,

311
00:15:46,300 --> 00:15:50,100
expects to have the real names
of your independent

312
00:15:50,100 --> 00:15:53,333
variables,
the new extracted features LD1 and LD2.

313
00:15:53,533 --> 00:15:58,000
So here we need to replace PC1 and PC2
by respectively

314
00:15:58,600 --> 00:16:04,066
x dot LD1, because that's
the name of the first extracted feature,

315
00:16:04,066 --> 00:16:08,000
the first new independent variable
and x dot LD2.

316
00:16:08,100 --> 00:16:11,700
The second new extracted feature,
the second new independent variable.

317
00:16:12,100 --> 00:16:15,333
So let's replace them PC1

318
00:16:15,600 --> 00:16:18,600
by x dot LD1

319
00:16:18,866 --> 00:16:21,866
and PC2 by x dot LD2.

320
00:16:22,200 --> 00:16:23,266
So that's very important.

321
00:16:23,266 --> 00:16:26,233
That's the only thing that you will need
to change, the colnames.

322
00:16:26,233 --> 00:16:29,233
You must have the real names
of your independent variables

323
00:16:29,366 --> 00:16:31,033
when you visualize the results.

324
00:16:31,033 --> 00:16:33,266
And then do
we need to change something else.

325
00:16:33,266 --> 00:16:35,400
Well this we can also change.

326
00:16:35,400 --> 00:16:36,466
But that's not compulsory.

327
00:16:36,466 --> 00:16:40,300
That's just for the labels of the x axis
and the y axis.

328
00:16:40,500 --> 00:16:41,733
So let's do it anyway.

329
00:16:41,733 --> 00:16:45,200
And this time we don't need to specify
the real name of the independent variable.

330
00:16:45,400 --> 00:16:49,033
We can just replace
PC1 by LD1 to specify

331
00:16:49,033 --> 00:16:52,433
that it's the linear discriminant
and not the principal component.

332
00:16:52,833 --> 00:16:55,966
And same here, we can replace PC2 by LD2.

333
00:16:56,600 --> 00:16:57,000
All right.

334
00:16:57,000 --> 00:16:57,900
And now that's done.

335
00:16:57,900 --> 00:17:00,133
Now this code is ready to be executed.

336
00:17:00,133 --> 00:17:05,066
So let's quickly make the same changes
for the visualization of the test results.

337
00:17:05,366 --> 00:17:09,100
So let's replace PC1 here by x dot LD1

338
00:17:09,766 --> 00:17:13,233
then PC2 by x dot LD2

339
00:17:13,500 --> 00:17:19,400
and replace PC1 by LD1 and PC2 by LD2.

340
00:17:19,766 --> 00:17:21,333
And now everything is ready.

341
00:17:21,333 --> 00:17:23,166
We don't need to change anything more.
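
The renaming discussed above can be seen in context in the following sketch of the training set plot. It is a simplified version under stated assumptions: iris stands in for the tutorial's data, the SVM is taken from the e1071 package, the grid step is coarser than a real plot would use, and the colours are arbitrary.

```r
library(MASS)
library(e1071)  # assumption: svm() from e1071, used as a linear SVM

training_set = iris  # stand-in data (assumption)
lda = lda(formula = Species ~ ., data = training_set)
set = as.data.frame(predict(lda, training_set))[c(5, 6, 1)]
classifier = svm(formula = class ~ ., data = set,
                 type = 'C-classification', kernel = 'linear')

# Grid over the plane of the two discriminants (coarse step for speed).
X1 = seq(min(set[, 1]) - 1, max(set[, 1]) + 1, by = 0.1)
X2 = seq(min(set[, 2]) - 1, max(set[, 2]) + 1, by = 0.1)
grid_set = expand.grid(X1, X2)

# These must be the real feature names the classifier was trained on.
colnames(grid_set) = c('x.LD1', 'x.LD2')
y_grid = predict(classifier, newdata = grid_set)

# Axis labels are free text, so plain LD1 and LD2 are enough here.
plot(set[, -3], main = 'SVM on LDA features (training set)',
     xlab = 'LD1', ylab = 'LD2', xlim = range(X1), ylim = range(X2))
points(grid_set, pch = '.', col = as.integer(y_grid) + 1)
points(set[, -3], pch = 21, bg = as.integer(set[, 3]) + 1)
```

If colnames kept the old PC1 and PC2 names, predict would not find the x.LD1 and x.LD2 columns the model expects, which is why this is the one change that cannot be skipped.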

342
00:17:23,166 --> 00:17:24,600
We can grab a cup of coffee

343
00:17:24,600 --> 00:17:28,366
and visualize the training set results
and test set results.

344
00:17:28,566 --> 00:17:29,900
So let's do it.

345
00:17:29,900 --> 00:17:31,800
Let's hope that everything is okay.

346
00:17:31,800 --> 00:17:35,133
So I'm going to select
all the section up to here.

347
00:17:35,600 --> 00:17:37,800
So that's to visualize the training set
results.

348
00:17:37,800 --> 00:17:40,433
Let's do it. And here we go.

349
00:17:40,433 --> 00:17:42,300
All right. So it's executing.

350
00:17:42,300 --> 00:17:44,366
It's always taking a little time.

351
00:17:44,366 --> 00:17:45,500
But we will get there.

352
00:17:45,500 --> 00:17:47,866
We can already click on plots.

353
00:17:47,866 --> 00:17:48,200
All right.

354
00:17:48,200 --> 00:17:51,166
So the computations
are being made. Almost done.

355
00:17:52,966 --> 00:17:54,833
And here are the results.

356
00:17:54,833 --> 00:17:56,200
Beautiful results.

357
00:17:56,200 --> 00:17:58,966
The three classes were almost
well separated.

358
00:17:58,966 --> 00:18:00,166
Perfectly well separated.

359
00:18:00,166 --> 00:18:02,800
We can see the incorrect prediction.
But be careful.

360
00:18:02,800 --> 00:18:05,800
That's not the same incorrect prediction
we saw with the confusion

361
00:18:05,800 --> 00:18:08,900
matrix,
because that concerns the test set. Here

362
00:18:08,900 --> 00:18:09,733
it's the training set.

363
00:18:09,733 --> 00:18:13,566
So we also have one incorrect prediction
on the training set.

364
00:18:13,566 --> 00:18:14,400
That's this one.

365
00:18:14,400 --> 00:18:16,066
But that's almost perfect

366
00:18:16,066 --> 00:18:19,066
which is quite intuitive to understand
because as you remember

367
00:18:19,266 --> 00:18:23,700
LDA tries to best separate
the classes of your dependent variable.

368
00:18:24,000 --> 00:18:26,966
So that's why here we can see that
the prediction boundary

369
00:18:26,966 --> 00:18:29,833
is kind of equidistant to the majority

370
00:18:29,833 --> 00:18:32,833
of the green points here,
and the majority of the blue points here.

371
00:18:33,200 --> 00:18:34,666
So that's perfect.

372
00:18:34,666 --> 00:18:37,966
Each one is in its correct
segment of customers.

373
00:18:38,200 --> 00:18:41,200
And therefore this wine business owner
can feel pretty confident

374
00:18:41,500 --> 00:18:44,233
at predicting for each new wine to which

375
00:18:44,233 --> 00:18:47,233
customer segment he should recommend it.

376
00:18:47,266 --> 00:18:50,500
And not only he can be pretty confident
at recommending the new wines

377
00:18:50,500 --> 00:18:53,933
to the right customers,
but also thanks to the feature extraction

378
00:18:53,933 --> 00:18:57,633
technique that allows him to visualize
his results in two dimensions.

379
00:18:57,900 --> 00:19:00,900
Thanks to this number
of two new independent variables

380
00:19:01,066 --> 00:19:02,533
the linear discriminants.

381
00:19:02,533 --> 00:19:04,400
Well, now this wine business owner

382
00:19:04,400 --> 00:19:08,166
can make a clear plot
of his different customer segments,

383
00:19:08,166 --> 00:19:11,433
and put in each segment of customers
the different wines.

384
00:19:11,733 --> 00:19:14,733
So eventually
that can be pretty convenient.

385
00:19:14,900 --> 00:19:16,200
Okay, so perfect.

386
00:19:16,200 --> 00:19:20,466
We managed to build a great LDA model,
so we'll finish on this good note,

387
00:19:20,700 --> 00:19:23,533
and therefore we'll move on
to the next section of this course,

388
00:19:23,533 --> 00:19:26,533
which is going to be another feature
extraction technique.

389
00:19:26,566 --> 00:19:29,566
But this time
adapted for nonlinear problems.

390
00:19:29,733 --> 00:19:32,566
So since this problem is obviously

391
00:19:32,566 --> 00:19:36,600
a linear problem because we managed
to apply very successfully

392
00:19:36,600 --> 00:19:40,566
linear models such as PCA and LDA,
well, it won't be relevant

393
00:19:40,566 --> 00:19:44,466
to apply a nonlinear feature
extraction model on this data set.

394
00:19:44,866 --> 00:19:48,800
So we will work on another data set
that is of course going to be nonlinear.

395
00:19:49,333 --> 00:19:51,833
And this next new feature
extraction technique

396
00:19:51,833 --> 00:19:54,900
that we're going to see
is going to be kernel PCA.

397
00:19:55,233 --> 00:19:57,666
So I look forward to studying this
in the next section.

398
00:19:57,666 --> 00:19:59,433
And until then enjoy machine learning.