1
00:00:00,233 --> 00:00:02,500
Hello and welcome to this R tutorial.

2
00:00:02,500 --> 00:00:05,933
So you learned a lot of stuff,
but it would be a shame

3
00:00:05,933 --> 00:00:10,966
to leave this course without having some
introduction to one of the most popular

4
00:00:11,166 --> 00:00:14,900
algorithms in machine learning,
one that became popular quite recently.

5
00:00:14,900 --> 00:00:15,666
But still,

6
00:00:15,666 --> 00:00:20,133
it is definitely a very powerful model,
especially if you work on large data sets.

7
00:00:20,433 --> 00:00:24,166
It will offer you very high performance
while being fast to execute.

8
00:00:24,600 --> 00:00:27,600
And speaking of performance and execution
speed,

9
00:00:27,900 --> 00:00:32,400
it is important to recall that XGBoost
is the most powerful implementation

10
00:00:32,400 --> 00:00:37,066
of gradient boosting in terms of model
performance and execution speed.

11
00:00:37,366 --> 00:00:40,900
Therefore, it's very important
for you to have it in your toolkit.

12
00:00:41,266 --> 00:00:43,133
So let's implement XGBoost.

13
00:00:43,133 --> 00:00:47,066
This is only going to be an introduction,
so we will make a simple

14
00:00:47,066 --> 00:00:48,900
implementation of XGBoost.

15
00:00:48,900 --> 00:00:52,500
But you will have the code template
on your computer and you'll be able

16
00:00:52,500 --> 00:00:55,500
to try it on your problems
on your data sets.

17
00:00:55,533 --> 00:00:58,533
And you'll see that
even with this simple implementation,

18
00:00:58,633 --> 00:01:01,633
it will definitely give you
some excellent performance.

19
00:01:01,833 --> 00:01:05,100
And so now what we're going to do
is take one of the business problems

20
00:01:05,100 --> 00:01:06,733
we dealt with in this course.

21
00:01:06,733 --> 00:01:10,700
This is going to be actually the problem
that we solved in the deep

22
00:01:10,700 --> 00:01:11,700
learning section.

23
00:01:11,700 --> 00:01:14,000
Remember
this is the churn modeling problem

24
00:01:14,000 --> 00:01:17,000
where we need to predict
which customers of the bank

25
00:01:17,033 --> 00:01:18,400
will leave the bank.

26
00:01:18,400 --> 00:01:21,933
So this was a classification problem
where we classify the customers

27
00:01:21,933 --> 00:01:23,000
in two classes,

28
00:01:23,000 --> 00:01:26,500
those who will leave the bank
and those who will not leave the bank.

29
00:01:26,733 --> 00:01:30,766
And so remember for this problem
we obtained an accuracy of 86%.

30
00:01:31,133 --> 00:01:35,600
But that took quite a while because
we trained an artificial neural network

31
00:01:35,733 --> 00:01:39,533
with many epochs, and therefore it took
quite some time to execute.

32
00:01:40,100 --> 00:01:42,600
And so now in this section
we're going to do the same.

33
00:01:42,600 --> 00:01:46,200
We're going to apply XGBoost
on this churn modeling problem.

34
00:01:46,333 --> 00:01:47,766
This data set contains,

35
00:01:47,766 --> 00:01:51,366
if I remember, 13 features,
but that's not a large data set.

36
00:01:51,633 --> 00:01:55,733
And what is important to highlight
is that even if this was a large data set,

37
00:01:55,733 --> 00:01:59,933
a very large data set, well, XGBoost
would be one of the best models

38
00:01:59,933 --> 00:02:05,400
in terms of performance, that is to get
a good accuracy and execution speed.

39
00:02:05,700 --> 00:02:06,500
So for example,

40
00:02:06,500 --> 00:02:11,333
if you are working with a large data set,
I strongly encourage you to test XGBoost.

41
00:02:12,400 --> 00:02:12,733
All right.

42
00:02:12,733 --> 00:02:16,966
So what we're going to do now
is take the pre-processing phase.

43
00:02:16,966 --> 00:02:19,966
So there is only part one
because part two is to implement

44
00:02:20,033 --> 00:02:21,900
the artificial neural network.

45
00:02:21,900 --> 00:02:25,933
And so we just want to preprocess the data
for this churn modeling problem

46
00:02:26,266 --> 00:02:29,266
associated with this churn modeling
CSV file.

47
00:02:29,833 --> 00:02:34,400
But actually we're not going to take
everything in this pre-processing phase.

48
00:02:34,700 --> 00:02:37,566
The reason is that for the artificial

49
00:02:37,566 --> 00:02:40,700
neural network, well, feature
scaling was totally compulsory.

50
00:02:41,000 --> 00:02:42,166
No questions asked.

51
00:02:42,166 --> 00:02:45,266
Feature scaling
must be applied for deep learning.

52
00:02:45,633 --> 00:02:50,600
But the good news is that for XGBoost,
well, since XGBoost is a gradient

53
00:02:50,600 --> 00:02:51,966
boosting model with decision

54
00:02:51,966 --> 00:02:55,600
trees, well, accordingly
feature scaling is totally unnecessary.

55
00:02:55,833 --> 00:02:58,566
And that's one of the very good things
about XGBoost:

56
00:02:58,566 --> 00:03:02,133
besides its high performance
and its fast execution speed,

57
00:03:02,433 --> 00:03:05,933
you can keep the interpretation
of your problem,

58
00:03:06,200 --> 00:03:09,866
of your data set, and of the results
you'll get after building the model.

59
00:03:10,600 --> 00:03:14,000
So we can understand now
why XGBoost is so popular.

60
00:03:14,133 --> 00:03:18,433
It's because it has three qualities:
first quality, high performance;

61
00:03:18,433 --> 00:03:22,166
second quality,
fast execution speed; and third quality,

62
00:03:22,366 --> 00:03:25,733
you can keep all the interpretation
of your problem and your model.

63
00:03:26,133 --> 00:03:29,133
So definitely a model
to have in your toolkit.

64
00:03:29,200 --> 00:03:29,500
All right.

65
00:03:29,500 --> 00:03:32,100
So feature scaling is unnecessary here.

66
00:03:32,100 --> 00:03:37,233
And therefore we will take everything
from here up to the top like this.

67
00:03:37,766 --> 00:03:38,666
Copy.

68
00:03:38,666 --> 00:03:41,700
And we'll paste that in our XGBoost file.

69
00:03:42,233 --> 00:03:42,733
Here we go.

70
00:03:42,733 --> 00:03:45,733
And now we can implement XGBoost.

71
00:03:45,733 --> 00:03:48,733
So first let's introduce a new section:
fitting

72
00:03:49,500 --> 00:03:51,800
XGBoost to the training set.

73
00:03:54,700 --> 00:03:55,500
All right.

74
00:03:55,500 --> 00:03:58,500
And first let's install XGBoost.

75
00:03:58,666 --> 00:04:00,966
So as usual there's a package.

76
00:04:00,966 --> 00:04:05,466
It's the XGBoost package that will allow us
to implement XGBoost very efficiently.

77
00:04:05,733 --> 00:04:06,700
So let's type here.

78
00:04:06,700 --> 00:04:10,200
As usual, install.packages

79
00:04:10,533 --> 00:04:13,200
and inside, the name of the XGBoost
package,

80
00:04:13,200 --> 00:04:16,100
which is simply XGBoost.

81
00:04:16,100 --> 00:04:17,200
Like this.

82
00:04:17,200 --> 00:04:19,733
So then you select this line and press
Command

83
00:04:19,733 --> 00:04:22,733
or Control plus Enter to execute.

84
00:04:22,766 --> 00:04:25,766
And that's installing the XGBoost package.

85
00:04:26,033 --> 00:04:28,800
All right. We can see it's processing.

86
00:04:28,800 --> 00:04:30,633
And here again, the downloaded

87
00:04:30,633 --> 00:04:33,600
binary packages are in this package
folder.

88
00:04:33,600 --> 00:04:36,266
All good, XGBoost is installed.

89
00:04:36,266 --> 00:04:39,266
So let's put that section in comment

90
00:04:40,333 --> 00:04:41,100
there.

91
00:04:41,100 --> 00:04:45,900
And now let's import the XGBoost package
because indeed we installed it.

92
00:04:46,200 --> 00:04:51,200
But if we go down to the bottom
XGBoost is installed but not imported.

93
00:04:51,533 --> 00:04:53,133
And we want to make it automatic.

94
00:04:53,133 --> 00:04:58,166
So as usual we use the library command
and inside, xgboost

95
00:04:58,833 --> 00:05:01,433
and that will import the package.

96
00:05:01,433 --> 00:05:02,133
All right.

97
00:05:02,133 --> 00:05:04,833
And now let's implement XGBoost.

98
00:05:04,833 --> 00:05:07,166
And actually
this is going to take one line

99
00:05:07,166 --> 00:05:11,366
because we're just going to make the
classifier, the XGBoost classifier itself.

100
00:05:11,700 --> 00:05:14,366
And so basically
we just need to create a new variable

101
00:05:14,366 --> 00:05:18,533
that we call as usual classifier
and then equals.

102
00:05:18,533 --> 00:05:22,266
And then we use the XGBoost function
from this XGBoost package.

103
00:05:22,533 --> 00:05:25,766
So XGBoost and parenthesis

104
00:05:26,166 --> 00:05:29,066
and let's click here

105
00:05:29,066 --> 00:05:32,966
press F1 and get some information
about this xgboost function.

106
00:05:33,533 --> 00:05:36,533
So the information
we are interested in is the arguments,

107
00:05:36,600 --> 00:05:40,066
and what arguments do we need here? Okay.

108
00:05:40,066 --> 00:05:40,966
So first we see that

109
00:05:40,966 --> 00:05:45,400
we have this params parameter
which is actually a list of parameters.

110
00:05:45,800 --> 00:05:48,966
And these parameters are
all the parameters that you can see here.

111
00:05:49,200 --> 00:05:52,733
For example the eta parameter
that controls the learning rate,

112
00:05:53,100 --> 00:05:56,100
the gamma parameter
which is the minimum loss reduction.

113
00:05:56,266 --> 00:05:59,000
Well,
you have a lot of these parameters, but

114
00:05:59,000 --> 00:06:02,700
this tutorial is
just an introduction to XGBoost.

115
00:06:02,700 --> 00:06:06,533
So we will not do some tuning
on our XGBoost model in this course.

116
00:06:06,800 --> 00:06:10,600
But I'm sure in some future courses
I will make some more complex

117
00:06:10,600 --> 00:06:14,166
implementations of XGBoost
on some more complex problems,

118
00:06:14,600 --> 00:06:18,433
whereas this course is just meant to end
with a simple introduction to XGBoost

119
00:06:18,600 --> 00:06:21,266
so that you can at least
have some knowledge about it

120
00:06:21,266 --> 00:06:23,266
and have it in your toolkit.

121
00:06:23,266 --> 00:06:26,566
So let's not focus on this now,
and let's move on to the compulsory

122
00:06:26,566 --> 00:06:30,633
parameters,
the first of which is of course data.

123
00:06:31,100 --> 00:06:33,733
So data is of course your training set

124
00:06:33,733 --> 00:06:36,700
the data set on which you want
to train your XGBoost model.

125
00:06:36,700 --> 00:06:38,500
And so let's input that right now.

126
00:06:38,500 --> 00:06:43,533
So first argument data equals
then training set.

127
00:06:44,400 --> 00:06:44,933
Here we go.

128
00:06:44,933 --> 00:06:48,600
And actually here we only need
the features in the training set.

129
00:06:48,600 --> 00:06:51,633
So we will remove the dependent variable
from this training set.

130
00:06:51,633 --> 00:06:55,900
Because this training set contains both
the features and the dependent variable.

131
00:06:56,233 --> 00:07:00,133
But what this data parameter expects
is only the features.

132
00:07:00,366 --> 00:07:04,666
So here we add some brackets
and we remove the dependent variable.

133
00:07:05,033 --> 00:07:06,300
And what is its index?

134
00:07:06,300 --> 00:07:09,366
Well to do this
we need to import the data set.

135
00:07:09,366 --> 00:07:11,000
But first before importing the data

136
00:07:11,000 --> 00:07:14,100
set let's quickly set the right
folder as working directory.

137
00:07:14,333 --> 00:07:18,133
So right now
we're in Part 10, Section 49, XGBoost.

138
00:07:18,433 --> 00:07:19,533
That's the right folder.

139
00:07:19,533 --> 00:07:21,666
Make sure that you have the churn modeling
CSV file.

140
00:07:21,666 --> 00:07:24,666
And then click on Set As Working
Directory here.

141
00:07:24,900 --> 00:07:25,733
And here we go.

142
00:07:25,733 --> 00:07:27,766
Now we can import the data set.

143
00:07:27,766 --> 00:07:30,733
So let's import it.
And that's the data set.

144
00:07:30,733 --> 00:07:33,900
But remember in this data set
we don't take all the independent

145
00:07:33,900 --> 00:07:38,266
variables because we're not interested
in row number, customer ID and surname.

146
00:07:38,266 --> 00:07:42,666
We know that these three variables
have no impact on the dependent variable.

147
00:07:42,900 --> 00:07:44,966
So we remove them.

148
00:07:44,966 --> 00:07:46,866
And that's what we do in this line.

149
00:07:46,866 --> 00:07:49,600
dataset equals dataset[4:14].

150
00:07:49,600 --> 00:07:54,200
That means that we take all the variables
from the fourth variable of the data set.

151
00:07:54,533 --> 00:07:58,866
That is credit score, up to the last
variable, exited, the dependent variable.

152
00:07:58,866 --> 00:08:00,500
That's the dependent variable.

153
00:08:00,500 --> 00:08:03,966
And so let's select this line and execute.
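
Here is a minimal sketch of what that column selection does. The real CSV is replaced by a tiny made-up data frame; the column names are assumed from the ones mentioned in this tutorial plus the usual churn modeling columns, and every value is invented.

```r
# Toy stand-in for the churn CSV: column names are assumptions, values are made up.
dataset <- data.frame(
  RowNumber       = 1:4,
  CustomerId      = c(101, 102, 103, 104),
  Surname         = c("A", "B", "C", "D"),
  CreditScore     = c(600, 650, 700, 550),
  Geography       = c("France", "Spain", "France", "Germany"),
  Gender          = c("Female", "Female", "Male", "Male"),
  Age             = c(40, 35, 52, 28),
  Tenure          = c(2, 1, 8, 5),
  Balance         = c(0, 80000, 150000, 0),
  NumOfProducts   = c(1, 1, 3, 2),
  HasCrCard       = c(1, 0, 1, 0),
  IsActiveMember  = c(1, 1, 0, 0),
  EstimatedSalary = c(100000, 110000, 115000, 90000),
  Exited          = c(1, 0, 1, 0)
)

# Keep columns 4 to 14: this drops the three identifier columns.
dataset <- dataset[4:14]

# Position of the dependent variable after the selection.
exited_index <- which(names(dataset) == "Exited")
print(ncol(dataset))
print(exited_index)
```

With 14 original columns, keeping columns 4 to 14 leaves 11, and the dependent variable sits in the last one.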

154
00:08:04,366 --> 00:08:09,833
And now if we look at our data set well
this contains all the relevant features.

155
00:08:10,133 --> 00:08:12,266
And the dependent variable exited.

156
00:08:12,266 --> 00:08:15,666
And so the challenge is
with all these independent variables here

157
00:08:15,900 --> 00:08:19,700
we want to predict if the customer
will leave or stay in the bank.

158
00:08:20,066 --> 00:08:21,566
And so that's the data set

159
00:08:21,566 --> 00:08:25,333
we consider to train the model
and test its performance.

160
00:08:25,700 --> 00:08:29,300
And therefore the index of the dependent
variable we have to remove

161
00:08:29,300 --> 00:08:32,366
now in the xgboost function for the data
parameter

162
00:08:32,366 --> 00:08:35,366
is the last index here
of the exited column.

163
00:08:35,433 --> 00:08:40,000
And since we have 11 variables,
well that index is 11.

164
00:08:40,800 --> 00:08:44,966
So let's go back to XGBoost
and let's go back to our function.

165
00:08:45,266 --> 00:08:49,000
And therefore here we have to input -11.

166
00:08:49,766 --> 00:08:50,100
All right.

167
00:08:50,100 --> 00:08:53,566
So we have our whole training set
but without the dependent variable.

168
00:08:53,566 --> 00:08:55,766
So that's perfect.
That's exactly what we want.

169
00:08:55,766 --> 00:08:57,133
So now let's go back to help

170
00:08:57,133 --> 00:09:00,233
to see if we need some more info
about this first parameter.

171
00:09:00,733 --> 00:09:04,500
Well indeed there is some very important
information that we need to consider here.

172
00:09:04,833 --> 00:09:10,466
It's that this input data
set needs to be an XGBoost data matrix.

173
00:09:10,466 --> 00:09:13,133
So that's basically a type of matrix.

174
00:09:13,133 --> 00:09:16,633
But we can also see that, in addition,

175
00:09:16,633 --> 00:09:19,633
the data parameter also accepts matrices.

176
00:09:20,200 --> 00:09:22,300
But this is not a matrix.

177
00:09:22,300 --> 00:09:24,000
This is a data frame.

178
00:09:24,000 --> 00:09:27,000
So this won't work
if we input the features this way.

179
00:09:27,166 --> 00:09:33,300
So we can either convert this
into an XGBoost matrix or a simple matrix.

180
00:09:33,500 --> 00:09:35,400
So let's take the simple solution.

181
00:09:35,400 --> 00:09:38,966
Let's convert
this data frame of features into a matrix.

182
00:09:39,200 --> 00:09:40,600
And you know how to do this.

183
00:09:40,600 --> 00:09:43,600
We just need to use the

184
00:09:43,600 --> 00:09:47,500
as.matrix function
and put inside some parentheses,

185
00:09:47,500 --> 00:09:48,766
because it's a function,

186
00:09:48,766 --> 00:09:51,033
this data frame of features.

187
00:09:51,033 --> 00:09:51,666
Here we go.

188
00:09:51,666 --> 00:09:53,400
And now this becomes a matrix.

189
00:09:53,400 --> 00:09:55,533
And that's exactly what we need.
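
The conversion can be sketched on a toy training set. This assumes, as in the preprocessing pasted earlier, that categorical columns were already encoded as numbers; the data below is made up and the label sits at index 3 here instead of 11.

```r
# Toy training set: one numeric feature, one already-encoded category, and the label.
training_set <- data.frame(
  CreditScore = c(600, 650, 700),
  Geography   = as.numeric(factor(c("France", "Spain", "France"),
                                  levels = c("France", "Spain", "Germany"),
                                  labels = c(1, 2, 3))),
  Exited      = c(1, 0, 0)
)

# Drop the dependent variable by negative index, then convert to a matrix.
features <- as.matrix(training_set[-3])

print(class(training_set[-3]))  # still a data frame before conversion
print(is.matrix(features))
print(is.numeric(features))
```

One thing worth checking on your own data: if a text column were left in, as.matrix would return a character matrix, so the numeric encoding step matters.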

190
00:09:55,533 --> 00:09:58,200
All right, perfect. Then, next argument.

191
00:09:58,200 --> 00:10:00,733
So here again
you have a lot of other arguments.

192
00:10:00,733 --> 00:10:02,633
But these are not compulsory.

193
00:10:02,633 --> 00:10:04,400
So we won't focus on them now.

194
00:10:04,400 --> 00:10:08,266
But the next compulsory argument
is this label argument.

195
00:10:08,633 --> 00:10:11,633
Because indeed here
we input the matrix of features.

196
00:10:11,700 --> 00:10:14,566
But of course to train a classification
model we need

197
00:10:14,566 --> 00:10:17,933
not only the matrix of features
but also the dependent variable.

198
00:10:18,200 --> 00:10:21,533
And that's what we put
in this label parameter.

199
00:10:21,733 --> 00:10:26,066
And so as you might expect,
since we input the features into a matrix,

200
00:10:26,333 --> 00:10:30,266
well we need to input this label parameter
as a vector.

201
00:10:30,766 --> 00:10:33,600
And to get our dependent variable
as a vector,

202
00:10:33,600 --> 00:10:37,366
we need to input label equals
our training set.

203
00:10:38,200 --> 00:10:39,166
Then dollar.

204
00:10:39,166 --> 00:10:42,200
And then we take the name
of our dependent variable which is exited.

205
00:10:42,766 --> 00:10:45,000
And this will give us a vector.

206
00:10:45,000 --> 00:10:49,966
So training_set$Exited is the
dependent variable but given as a vector.

207
00:10:50,266 --> 00:10:51,400
So that's exactly what we need.

208
00:10:51,400 --> 00:10:55,400
Because indeed, as you can see
label is expected to be a vector.

209
00:10:55,800 --> 00:10:57,900
The vector of response values.

210
00:10:57,900 --> 00:11:01,500
The response values are of course
the values of the dependent variable.

211
00:11:02,333 --> 00:11:03,066
All right.

212
00:11:03,066 --> 00:11:05,266
Now next argument.

213
00:11:05,266 --> 00:11:06,566
What is the next argument.

214
00:11:06,566 --> 00:11:11,433
Well there is a third compulsory argument
that we need to input here.

215
00:11:11,433 --> 00:11:13,433
And that is actually above.

216
00:11:13,433 --> 00:11:17,133
But I wanted to put the label
after the matrix of features; that made

217
00:11:17,400 --> 00:11:18,733
kind of sense.

218
00:11:18,733 --> 00:11:21,733
And now there is a third argument
that we need to input,

219
00:11:21,966 --> 00:11:24,200
which is the nrounds argument.

220
00:11:24,200 --> 00:11:27,900
And the nrounds argument
is the maximum number of iterations.

221
00:11:28,200 --> 00:11:31,166
So since we're not working on
too complex a problem,

222
00:11:31,166 --> 00:11:34,600
well, a maximum number of ten iterations
will be sufficient.

223
00:11:34,900 --> 00:11:36,166
So we will input here.

224
00:11:36,166 --> 00:11:39,166
nrounds equals ten.

225
00:11:39,400 --> 00:11:42,733
And XGBoost will be trained
in at most ten iterations.

226
00:11:43,533 --> 00:11:44,333
Perfect.

227
00:11:44,333 --> 00:11:48,400
And now actually
this line of code is ready

228
00:11:48,400 --> 00:11:51,933
to be executed
to train the XGBoost classifier.

229
00:11:52,300 --> 00:11:55,533
So even if XGBoost is a very advanced

230
00:11:55,533 --> 00:11:58,633
machine learning model, well,
thanks to this XGBoost package,

231
00:11:58,900 --> 00:12:03,900
you just need a single simple line of code
to implement it very efficiently.
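
As a hedged sketch of this training step: the call dictated above passes data, label and nrounds straight to the xgboost function, but that function's signature has changed between package versions, so the sketch below goes through xgb.DMatrix and xgb.train instead. It also sets the binary:logistic objective explicitly, which is a choice made here and not something stated in the tutorial. The training data is synthetic and the label sits at index 3, not 11.

```r
library(xgboost)

# Synthetic stand-in for the training set (made-up values).
set.seed(123)
n <- 200
training_set <- data.frame(
  Age     = round(runif(n, 18, 80)),
  Balance = round(runif(n, 0, 200000)),
  Exited  = rbinom(n, 1, 0.2)
)

# Features as a matrix, dependent variable as a vector.
dtrain <- xgb.DMatrix(data = as.matrix(training_set[-3]),
                      label = training_set$Exited)

# Ten boosting rounds, as in the tutorial; the objective is an assumption.
classifier <- xgb.train(params = list(objective = "binary:logistic"),
                        data = dtrain,
                        nrounds = 10)

# Predicted probabilities of class 1 on the training data.
probabilities <- predict(classifier, dtrain)
print(head(probabilities))
```

With a logistic objective the predictions are probabilities between 0 and 1, which is what the 0.5 threshold used later relies on.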

232
00:12:04,900 --> 00:12:07,300
All right,
we're not going to execute this line now

233
00:12:07,300 --> 00:12:10,500
because first we need to run the data
preprocessing phase.

234
00:12:10,500 --> 00:12:11,266
And then

235
00:12:11,266 --> 00:12:15,533
I would like to add some code sections
to evaluate our XGBoost model performance.

236
00:12:15,666 --> 00:12:18,666
So we are going to execute the whole thing
in the end.

237
00:12:18,666 --> 00:12:24,066
But for now let's add the last sections
to evaluate the XGBoost performance.

238
00:12:24,066 --> 00:12:24,833
And of course

239
00:12:24,833 --> 00:12:29,333
we are going to take our k fold
cross-validation technique to evaluate it.

240
00:12:29,500 --> 00:12:32,500
And therefore here
I'm going to take the k fold

241
00:12:32,500 --> 00:12:35,500
cross validation section
which is right here.

242
00:12:35,633 --> 00:12:39,033
And we are going to use it
on our XGBoost model.

243
00:12:39,366 --> 00:12:41,733
So here I just need to copy this section.

244
00:12:41,733 --> 00:12:45,166
Go back to my XGBoost model and paste it
here.

245
00:12:45,666 --> 00:12:47,566
And be careful: inside of it,

246
00:12:47,566 --> 00:12:49,333
we need to change the classifier

247
00:12:49,333 --> 00:12:52,200
because right here
that's the kernel SVM classifier.

248
00:12:52,200 --> 00:12:56,800
And so basically we just need to replace
this kernel SVM classifier

249
00:12:57,233 --> 00:13:00,400
by our XGBoost classifier.

250
00:13:00,733 --> 00:13:02,533
So I'm just copying that here.

251
00:13:02,533 --> 00:13:05,600
And go back to my k
fold cross-validation section

252
00:13:06,000 --> 00:13:11,466
and paste the code to train the XGBoost
classifier on the training set right here.

253
00:13:12,200 --> 00:13:12,600
All right.

254
00:13:12,600 --> 00:13:15,600
And then we need to add
another line of code inside this section.

255
00:13:15,900 --> 00:13:19,433
It's related to the fact
that this XGBoost model

256
00:13:19,433 --> 00:13:22,433
will return the predictions
as probabilities.

257
00:13:22,566 --> 00:13:25,566
You know it will return
the probability of class one.

258
00:13:25,666 --> 00:13:28,533
And therefore you know this trick
to convert

259
00:13:28,533 --> 00:13:32,066
the probabilities
into the real predictions 0 or 1.

260
00:13:32,400 --> 00:13:35,400
Well,
we need to add this line of code: y_pred

261
00:13:36,666 --> 00:13:40,200
equals, and then in parentheses, y_pred

262
00:13:41,233 --> 00:13:44,266
larger than 0.5.

263
00:13:44,700 --> 00:13:49,533
So that's if the probability is
larger than 0.5, then y_pred will be one.

264
00:13:49,833 --> 00:13:54,433
And if the probability is lower than 0.5,
then y_pred will be zero.

265
00:13:54,733 --> 00:13:57,700
So that's how we'll get the binary
outcome 0 or 1.

266
00:13:57,700 --> 00:14:01,433
And that's exactly what this k fold
cross-validation section expects.
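
A small sketch of that thresholding trick, on made-up probabilities and made-up true labels. The confusion matrix and accuracy lines are one common way to score the result and are an assumption about what the pasted section does.

```r
# Made-up predicted probabilities and true labels for four customers.
y_pred <- c(0.12, 0.50, 0.73, 0.91)
actual <- c(0, 1, 1, 1)

# Strictly larger than 0.5 becomes TRUE (class 1), everything else FALSE (class 0).
y_pred <- (y_pred > 0.5)

# Confusion matrix: rows are true classes, columns are predicted classes.
cm <- table(actual, y_pred)

# Accuracy: correct predictions (the diagonal) over all predictions.
accuracy <- sum(diag(cm)) / sum(cm)
print(cm)
print(accuracy)
```

Note that a probability of exactly 0.5 falls on the class 0 side with a strict comparison.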

267
00:14:01,866 --> 00:14:04,733
And eventually
before we execute the whole thing,

268
00:14:04,733 --> 00:14:07,066
there are two things
that we still need to change.

269
00:14:07,066 --> 00:14:09,833
First,
it's the fact that since the training set

270
00:14:09,833 --> 00:14:13,500
is expected to be a matrix, well, that's
going to be the same for the test set.

271
00:14:13,766 --> 00:14:17,900
So here we also need to add as.matrix.

272
00:14:18,266 --> 00:14:21,333
And inside of the parentheses
we put our test fold.

273
00:14:22,066 --> 00:14:23,400
So that's the first change.

274
00:14:23,400 --> 00:14:25,366
And now the second change is of course

275
00:14:25,366 --> 00:14:28,366
related to the index
of the dependent variable.

276
00:14:28,366 --> 00:14:29,400
Because three

277
00:14:29,400 --> 00:14:32,733
here was the index of the dependent
variable in our previous problem

278
00:14:33,033 --> 00:14:35,700
where we implemented
k fold cross-validation.

279
00:14:35,700 --> 00:14:39,600
So we need to replace this three index
by the index of the dependent

280
00:14:39,600 --> 00:14:43,266
variable in our new problem
which is not three but 11.

281
00:14:43,666 --> 00:14:48,400
And same right here in the confusion
matrix it is 11.

282
00:14:49,200 --> 00:14:49,800
All right.

283
00:14:49,800 --> 00:14:51,633
And now everything is ready.

284
00:14:51,633 --> 00:14:53,766
We can execute the whole code.

285
00:14:53,766 --> 00:14:54,933
So let's do it.

286
00:14:54,933 --> 00:14:57,566
And let's see which accuracy we get.

287
00:14:57,566 --> 00:14:59,966
So let's go back to the top.

288
00:14:59,966 --> 00:15:01,966
We already imported the data set.

289
00:15:01,966 --> 00:15:06,500
So now let's encode
the categorical variables as factors.

290
00:15:06,933 --> 00:15:08,700
Here we go. Done.

291
00:15:08,700 --> 00:15:11,100
Now let's split
the data set into the training set

292
00:15:11,100 --> 00:15:14,200
and the test set. Here
we go. Done as well.

293
00:15:14,733 --> 00:15:17,933
And now let's fit
the XGBoost to the training set.

294
00:15:18,300 --> 00:15:20,700
So the XGBoost package
was already imported.

295
00:15:20,700 --> 00:15:25,866
So basically we just need
to select this line and execute.

296
00:15:26,400 --> 00:15:27,166
Here we go.

297
00:15:27,166 --> 00:15:30,733
We get the information of the root
mean squared error at each round.

298
00:15:30,966 --> 00:15:33,366
So basically the root
mean squared error is a relevant

299
00:15:33,366 --> 00:15:34,833
computation of the error.

300
00:15:34,833 --> 00:15:36,566
You can picture this as the error.
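
For reference, the root mean squared error can be computed by hand in a few lines. The numbers below are made up and have nothing to do with the values printed during training.

```r
# Made-up true labels and model outputs.
actual    <- c(1, 0, 0, 1)
predicted <- c(0.8, 0.3, 0.1, 0.6)

# Square the differences, average them, then take the square root.
rmse <- sqrt(mean((actual - predicted)^2))
print(rmse)
```

The closer the outputs are to the true labels, the smaller this number gets.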

301
00:15:36,566 --> 00:15:40,266
And of course the lower the error,
the better your model.

302
00:15:40,566 --> 00:15:43,966
And we can see that
from the first iteration to the last one,

303
00:15:43,966 --> 00:15:45,100
the 10th one,

304
00:15:45,100 --> 00:15:49,666
well, the error decreased from
0.41 down to 0.29.

305
00:15:49,800 --> 00:15:52,800
And besides we can see that
the maximum number

306
00:15:52,800 --> 00:15:56,566
of ten iterations was a good choice,
because we can see that

307
00:15:56,566 --> 00:16:00,066
it is more or less converging around
0.30.

308
00:16:00,400 --> 00:16:03,400
Well, feel free
to try with more iterations and try to see

309
00:16:03,400 --> 00:16:06,400
if it's converging to a number
that is less than 0.30.

310
00:16:06,533 --> 00:16:10,566
If you get a number close to 0.30,
then ten iterations was a good choice.

311
00:16:11,166 --> 00:16:15,200
So perfect, XGBoost is implemented
and trained on the training set.

312
00:16:15,500 --> 00:16:18,200
And now let's apply k
fold cross-validation

313
00:16:18,200 --> 00:16:21,566
to evaluate its performance
with the accuracy metric.

314
00:16:22,066 --> 00:16:25,066
And actually I'm noticing
that there is still one thing to change.

315
00:16:25,066 --> 00:16:27,166
It's the name of the dependent variable
here.

316
00:16:27,166 --> 00:16:30,666
Congratulations to those of you
who noticed that we need to replace

317
00:16:30,666 --> 00:16:34,366
purchased here by the real name
of the dependent variable in our problem,

318
00:16:34,666 --> 00:16:37,666
which is not purchased but exited.

319
00:16:37,966 --> 00:16:41,400
So let's replace
purchased here by exited.

320
00:16:42,366 --> 00:16:43,133
And here we go.

321
00:16:43,133 --> 00:16:45,166
Now everything should be fine.

322
00:16:45,166 --> 00:16:48,500
Let's do one last check: as.matrix
for the training set,

323
00:16:48,500 --> 00:16:52,500
as.matrix for the test
set, y_pred converted into a binary outcome

324
00:16:52,500 --> 00:16:55,933
0 or 1, indexes are correct
for the dependent variable.

325
00:16:56,300 --> 00:16:57,533
Everything looks fine.
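
Putting the checked pieces together, a k-fold cross-validation loop could look like the sketch below. It is not the section pasted in the video: the folds are built with base R, the model is trained through xgb.DMatrix and xgb.train with an explicit binary:logistic objective, the data is synthetic, and label_index is a hypothetical name standing for the 11 used in the tutorial.

```r
library(xgboost)

# Synthetic training set (made-up values): older customers leave more often.
set.seed(123)
n <- 300
training_set <- data.frame(
  Age     = round(runif(n, 18, 80)),
  Balance = round(runif(n, 0, 200000))
)
training_set$Exited <- rbinom(n, 1, ifelse(training_set$Age > 50, 0.6, 0.1))
label_index <- 3  # would be 11 for the data set used in the tutorial

# Assign every row to one of k folds at random.
k <- 10
fold_id <- sample(rep(seq_len(k), length.out = n))

cv <- sapply(seq_len(k), function(i) {
  training_fold <- training_set[fold_id != i, ]
  test_fold     <- training_set[fold_id == i, ]

  # as.matrix for the training fold, label passed as a vector.
  dtrain <- xgb.DMatrix(data = as.matrix(training_fold[-label_index]),
                        label = training_fold$Exited)
  classifier <- xgb.train(params = list(objective = "binary:logistic"),
                          data = dtrain, nrounds = 10)

  # as.matrix for the test fold, then probabilities converted to 0 or 1.
  y_pred <- predict(classifier,
                    xgb.DMatrix(as.matrix(test_fold[-label_index])))
  y_pred <- as.numeric(y_pred > 0.5)

  # Accuracy on this fold: share of correct predictions.
  mean(y_pred == test_fold[, label_index])
})

# Final accuracy: mean of the k fold accuracies.
accuracy <- mean(cv)
print(accuracy)
```

The accuracy on synthetic data says nothing about the churn problem itself; the sketch only shows where each of the checked changes lives in the loop.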

326
00:16:57,533 --> 00:17:00,566
Let's select this whole section here

327
00:17:00,900 --> 00:17:04,933
and get the ultimate accuracy
of our XGBoost model.

328
00:17:05,266 --> 00:17:06,766
Here we go.

329
00:17:06,766 --> 00:17:09,033
All executed properly and very fast

330
00:17:09,033 --> 00:17:13,133
and we get a final accuracy of 88%.

331
00:17:13,533 --> 00:17:18,066
So not only was that very efficient,
but also we managed to beat the accuracy

332
00:17:18,066 --> 00:17:23,266
obtained with the ANN. And besides, this value
is the relevant accuracy of XGBoost.

333
00:17:23,266 --> 00:17:26,366
So we can trust this value of 88%.

334
00:17:26,633 --> 00:17:27,700
So that's very good.

335
00:17:27,700 --> 00:17:32,100
Not only was XGBoost very fast,
but also it gave us an amazing accuracy.

336
00:17:32,100 --> 00:17:35,600
Probably the best of all the models
we implemented in this course.

337
00:17:36,100 --> 00:17:38,500
So that was an amazing job.

338
00:17:38,500 --> 00:17:40,800
And now it is time to say goodbye,

339
00:17:40,800 --> 00:17:43,766
because this was actually
the last tutorial of this course.

340
00:17:43,766 --> 00:17:46,766
So that's quite a feeling
because this is the end of this machine

341
00:17:46,766 --> 00:17:50,400
learning journey that I introduced
in my very first tutorial of this course.

342
00:17:50,600 --> 00:17:53,466
So yes, that's right,
that's the end of the journey.

343
00:17:53,466 --> 00:17:57,000
However, I am sure this is not the last
machine learning journey.

344
00:17:57,300 --> 00:17:59,400
This is your first machine
learning journey.

345
00:17:59,400 --> 00:18:01,800
I was so happy to take this adventure
with you.

346
00:18:01,800 --> 00:18:03,066
I really enjoyed that journey.

347
00:18:03,066 --> 00:18:04,700
I hope that's the case for you too

348
00:18:04,700 --> 00:18:06,633
and I'll be very happy
to make some new machine

349
00:18:06,633 --> 00:18:09,666
learning courses to start some new machine
learning journeys.

350
00:18:09,933 --> 00:18:11,633
So I hope I'll see you very soon.

351
00:18:11,633 --> 00:18:13,466
And until then, enjoy machine learning.