1
00:00:00,233 --> 00:00:02,533
Hello and welcome to this R tutorial.

2
00:00:02,533 --> 00:00:03,266
So we just

3
00:00:03,266 --> 00:00:06,266
trained our artificial neural network
on the training set,

4
00:00:06,433 --> 00:00:08,666
and now it's time to make the predictions
on the

5
00:00:08,666 --> 00:00:09,600
test set.

6
00:00:09,600 --> 00:00:13,233
So lucky for us,
we already have everything ready here,

7
00:00:13,266 --> 00:00:15,366
thanks to our classification template

8
00:00:15,366 --> 00:00:17,433
that we pasted in the first tutorial.

9
00:00:17,433 --> 00:00:20,100
So actually, this section predicts

10
00:00:20,100 --> 00:00:21,566
the test set results.

11
00:00:21,566 --> 00:00:24,233
And this section makes the confusion
matrix,

12
00:00:24,233 --> 00:00:25,233
thanks to which we

13
00:00:25,233 --> 00:00:26,400
will obtain the

14
00:00:26,400 --> 00:00:28,433
accuracy on the test set,

15
00:00:28,433 --> 00:00:31,400
that is, the accuracy on new
observations

16
00:00:31,400 --> 00:00:31,933
on which the

17
00:00:31,933 --> 00:00:32,600
artificial

18
00:00:32,600 --> 00:00:35,033
neural network model wasn't trained.

19
00:00:35,033 --> 00:00:36,833
So first, let's take care of this

20
00:00:36,833 --> 00:00:38,366
section, and

21
00:00:38,366 --> 00:00:39,966
let's see what we need to change.

22
00:00:39,966 --> 00:00:44,200
So first of all, this first line
gets the predicted probabilities

23
00:00:44,200 --> 00:00:46,033
thanks to this predict function.

24
00:00:46,033 --> 00:00:47,133
But that was the

25
00:00:47,133 --> 00:00:49,133
predict function used for

26
00:00:49,133 --> 00:00:50,966
built-in R packages.

27
00:00:50,966 --> 00:00:52,200
But here, since we are using

28
00:00:52,200 --> 00:00:54,966
the H2O package,
which is kind of special, there are

29
00:00:54,966 --> 00:00:57,166
actually some things
that we need to change here,

30
00:00:57,166 --> 00:00:58,766
but very few things.

31
00:00:58,766 --> 00:01:00,800
So first, as for

32
00:01:00,800 --> 00:01:01,666
all the functions we

33
00:01:01,666 --> 00:01:04,433
used with this H2O package,

34
00:01:04,433 --> 00:01:06,866
well, you noticed
that when we use a function, we

35
00:01:06,866 --> 00:01:08,533
first take the H2O package name,

36
00:01:08,533 --> 00:01:10,733
then a dot,
and then the name of the function.

37
00:01:10,733 --> 00:01:13,200
Well, we need to do the same here
for the predict

38
00:01:13,200 --> 00:01:13,800
function.

39
00:01:13,800 --> 00:01:19,033
So here we just need to add the h2o. prefix:
h2o.predict.

40
00:01:19,500 --> 00:01:21,700
Okay. So that's the first thing
we need to change.

41
00:01:21,700 --> 00:01:24,100
And then let's see.
Let's go inside the function.

42
00:01:24,100 --> 00:01:26,700
So the first argument is classifier.

43
00:01:26,700 --> 00:01:28,233
Let's press here

44
00:01:28,233 --> 00:01:31,033
F1 to get some information about

45
00:01:31,033 --> 00:01:32,200
the predict

46
00:01:32,200 --> 00:01:35,100
function of the H2O model.

47
00:01:35,100 --> 00:01:38,100
So let's scroll down to have a look
at the arguments and let's see

48
00:01:38,100 --> 00:01:39,466
what they are.

49
00:01:39,466 --> 00:01:40,666
So as we can see, we have

50
00:01:40,666 --> 00:01:44,100
only two main arguments
and then some additional arguments,

51
00:01:44,333 --> 00:01:46,800
but we will not focus on those.

52
00:01:46,800 --> 00:01:49,400
Instead, we will focus on the two
main arguments here,

53
00:01:49,400 --> 00:01:52,400
which are object and newdata.

54
00:01:52,466 --> 00:01:54,100
So the first thing we can see

55
00:01:54,100 --> 00:01:56,966
is that there is no type argument.

56
00:01:56,966 --> 00:01:57,900
So here, simply,

57
00:01:57,900 --> 00:02:00,000
we will remove this type =

58
00:02:00,000 --> 00:02:02,233
'response' argument and input,

59
00:02:02,233 --> 00:02:05,033
because we actually don't need it.

60
00:02:05,033 --> 00:02:05,466
All right.

61
00:02:05,466 --> 00:02:07,133
And now we are left with the

62
00:02:07,133 --> 00:02:09,533
two arguments we are required to input.

63
00:02:09,533 --> 00:02:12,533
That is, the object,
which is our classifier here,

64
00:02:12,533 --> 00:02:13,300
that is, the ANN

65
00:02:13,300 --> 00:02:15,700
model itself that we have just built

66
00:02:15,700 --> 00:02:17,000
on the training set.

67
00:02:17,000 --> 00:02:19,300
And then the second argument, newdata.

68
00:02:19,300 --> 00:02:21,333
And this newdata argument is expecting,

69
00:02:21,333 --> 00:02:22,400
of course, the

70
00:02:22,400 --> 00:02:24,600
observations on which it has to make the

71
00:02:24,600 --> 00:02:25,800
predictions.

72
00:02:25,800 --> 00:02:27,833
All right. So that's exactly our test set.

73
00:02:27,833 --> 00:02:29,200
And here we remove

74
00:02:29,200 --> 00:02:30,433
the dependent variable

75
00:02:30,433 --> 00:02:33,433
column thanks to this minus three here.

76
00:02:33,566 --> 00:02:36,000
But we need to replace this three, because

77
00:02:36,000 --> 00:02:37,733
this number three here corresponds

78
00:02:37,733 --> 00:02:38,966
to the index of the

79
00:02:38,966 --> 00:02:39,333
dependent

80
00:02:39,333 --> 00:02:42,766
variable of the dataset
that we worked with in Part 3,

81
00:02:42,766 --> 00:02:43,900
Classification.

82
00:02:43,900 --> 00:02:46,900
And here, of course, the index of our
dependent variable is not three

83
00:02:47,133 --> 00:02:49,066
but 11.

84
00:02:49,066 --> 00:02:50,500
Remember, we already replaced

85
00:02:50,500 --> 00:02:52,266
the index three here in this

86
00:02:52,266 --> 00:02:53,766
feature scaling part.

87
00:02:53,766 --> 00:02:58,400
So we replaced the four occurrences of index 3
that were here with 11.

88
00:02:58,400 --> 00:03:00,066
And so here we need to do the same.

89
00:03:00,066 --> 00:03:03,100
We will replace this index three here with

90
00:03:03,333 --> 00:03:05,200
index 11.

91
00:03:05,200 --> 00:03:05,733
All right.

92
00:03:05,733 --> 00:03:06,900
So now this

93
00:03:06,900 --> 00:03:09,400
is taking the test set observations
as newdata.

94
00:03:09,400 --> 00:03:13,666
That is, it will predict the probabilities
that the dependent variable

95
00:03:13,666 --> 00:03:15,066
Exited equals one

96
00:03:15,066 --> 00:03:17,100
for the observations in the test set.

97
00:03:17,100 --> 00:03:17,833
And therefore,

98
00:03:17,833 --> 00:03:19,766
it will predict, for each customer
in the test

99
00:03:19,766 --> 00:03:22,800
set, the probability
that this customer leaves the

100
00:03:22,800 --> 00:03:24,166
bank. And since we

101
00:03:24,166 --> 00:03:28,800
have the real results of whether the
customers of the test set left or stayed

102
00:03:28,800 --> 00:03:32,100
in the bank, well,
we will compare our predictions to

103
00:03:32,100 --> 00:03:36,400
these real results, these actual results,
and that's how we'll get the accuracy,

104
00:03:36,600 --> 00:03:38,166
by computing the number of correct

105
00:03:38,166 --> 00:03:41,733
predictions divided by the total number
of observations in the test set,

106
00:03:42,000 --> 00:03:43,800
that is, 2000.

107
00:03:43,800 --> 00:03:46,000
And then if we get a good accuracy,

108
00:03:46,000 --> 00:03:48,266
then maybe we'll get a good
and powerful model.

109
00:03:48,266 --> 00:03:50,433
And if that's the case,
we will give it to

110
00:03:50,433 --> 00:03:54,233
the bank on a plate and tell the bank:
okay, now you can rank

111
00:03:54,533 --> 00:03:57,000
all your customers,
all the customers in the bank,

112
00:03:57,000 --> 00:03:59,300
by their probability of leaving the bank.

113
00:03:59,300 --> 00:04:01,200
That is, for each of your customers,

114
00:04:01,200 --> 00:04:03,800
you can predict with a good accuracy,
and we will be

115
00:04:03,800 --> 00:04:06,566
able to tell them precisely
what this accuracy is,

116
00:04:06,566 --> 00:04:07,533
you'll be able to predict

117
00:04:07,533 --> 00:04:11,100
with a good accuracy the probability
that the customer leaves the bank.

118
00:04:11,333 --> 00:04:12,566
And then you can add:

119
00:04:12,566 --> 00:04:14,766
therefore, I can give you a ranking of

120
00:04:14,766 --> 00:04:18,300
all your customers, ranked
by their probability of leaving the bank,

121
00:04:18,300 --> 00:04:18,600
from

122
00:04:18,600 --> 00:04:21,566
the highest probability
to the lowest probability.

123
00:04:21,566 --> 00:04:25,466
And therefore you can do some customer
segmentation and consider, for example,

124
00:04:25,466 --> 00:04:27,233
the top 10% of probabilities

125
00:04:27,233 --> 00:04:29,100
that the customers leave the bank.

126
00:04:29,100 --> 00:04:31,333
And in this segment, you can analyze

127
00:04:31,333 --> 00:04:34,366
more deeply
the factors that lead the customers

128
00:04:34,366 --> 00:04:35,400
to leave the bank,

129
00:04:35,400 --> 00:04:37,266
by using some data mining techniques,

130
00:04:37,266 --> 00:04:40,266
like, for example,
doing a chi-square test or

131
00:04:40,266 --> 00:04:43,400
applying the summary function
on your independent variables

132
00:04:43,400 --> 00:04:46,533
to understand which independent variables
have the most impact

133
00:04:46,766 --> 00:04:47,766
on the dependent variable.

134
00:04:47,766 --> 00:04:48,833
That is, which

135
00:04:48,833 --> 00:04:51,600
independent variable explains the most

136
00:04:51,600 --> 00:04:53,400
why customers are leaving.

137
00:04:53,400 --> 00:04:55,033
Well, you know how to do that.

138
00:04:55,033 --> 00:04:57,500
That's exactly what we did in Parts
2 and 3,

139
00:04:57,500 --> 00:04:59,700
when we used this summary function to get

140
00:04:59,700 --> 00:05:00,166
the p-values

141
00:05:00,166 --> 00:05:03,033
and statistical significance levels,

142
00:05:03,033 --> 00:05:05,700
to see which independent variables
are the most

143
00:05:05,700 --> 00:05:08,866
statistically significant and therefore
best explain the dependent

144
00:05:08,866 --> 00:05:11,533
variable,
that is, why customers are leaving.

145
00:05:11,533 --> 00:05:12,966
So that's the purpose

146
00:05:12,966 --> 00:05:15,300
behind
making these predictions on the test set.

147
00:05:15,300 --> 00:05:16,400
It's just to get the

148
00:05:16,400 --> 00:05:19,466
accuracy on new observations
to validate the model, so

149
00:05:19,466 --> 00:05:21,600
that we can give this model to the bank.

150
00:05:21,600 --> 00:05:21,900
All right.

151
00:05:21,900 --> 00:05:24,133
So now let's make the predictions.

152
00:05:24,133 --> 00:05:27,133
So we are almost done here.

153
00:05:27,166 --> 00:05:29,333
We just need to add one more thing,

154
00:05:29,333 --> 00:05:30,233
which is

155
00:05:30,233 --> 00:05:31,200
again related

156
00:05:31,200 --> 00:05:34,066
to the fact
that we are using the H2O package.

157
00:05:34,066 --> 00:05:35,600
And as you can see in

158
00:05:35,600 --> 00:05:40,033
this newdata argument, well, this new data
is of course the test set.

159
00:05:40,366 --> 00:05:42,300
But this test set is expected to

160
00:05:42,300 --> 00:05:44,333
be an H2O frame.

161
00:05:44,333 --> 00:05:46,233
Right now it is a standard data frame.

162
00:05:46,233 --> 00:05:48,066
But our H2O

163
00:05:48,066 --> 00:05:51,066
predict function
is expecting an H2O frame.

164
00:05:51,466 --> 00:05:53,500
So how can we convert this test set data

165
00:05:53,500 --> 00:05:56,000
frame into an H2O frame?

166
00:05:56,000 --> 00:05:59,533
Well,
by doing exactly the same as what we did

167
00:05:59,700 --> 00:06:00,800
to convert

168
00:06:00,800 --> 00:06:01,766
this training

169
00:06:01,766 --> 00:06:04,800
set data frame into this H2O frame,

170
00:06:05,100 --> 00:06:06,666
that is, by applying to the test

171
00:06:06,666 --> 00:06:10,666
set the as.h2o

172
00:06:11,166 --> 00:06:12,433
function.

173
00:06:12,433 --> 00:06:12,800
All right.

174
00:06:12,800 --> 00:06:15,333
So I'm putting the test
set in the function,

175
00:06:15,333 --> 00:06:16,700
like that.

176
00:06:16,700 --> 00:06:18,700
And here we go. Now I think

177
00:06:18,700 --> 00:06:19,866
everything is ready.

178
00:06:19,866 --> 00:06:24,166
We are ready to make the predictions,
which so far will be the

179
00:06:24,166 --> 00:06:27,066
predictions of the probabilities
that the class equals one.

180
00:06:27,066 --> 00:06:30,066
That is, the probabilities
that the customers leave the bank.

181
00:06:30,366 --> 00:06:33,266
So let's select this

182
00:06:33,266 --> 00:06:36,033
and get the predicted probabilities.

183
00:06:37,066 --> 00:06:38,266
And here we go.

184
00:06:38,266 --> 00:06:41,133
We now have the prob_pred vector

185
00:06:41,133 --> 00:06:43,533
containing all the
predicted probabilities

186
00:06:43,533 --> 00:06:45,966
in the form of an environment.

187
00:06:45,966 --> 00:06:46,633
So that's good.

188
00:06:46,633 --> 00:06:47,666
But we cannot have

189
00:06:47,666 --> 00:06:50,000
a look at these predicted
probabilities yet.

190
00:06:50,000 --> 00:06:51,500
We will need to convert it

191
00:06:51,500 --> 00:06:53,700
back into a standard vector.

192
00:06:53,700 --> 00:06:54,833
But before we do that,

193
00:06:54,833 --> 00:06:56,633
before we convert it into a vector,

194
00:06:56,633 --> 00:07:01,100
well, we need to apply this line as well,
which will, you know,

195
00:07:01,266 --> 00:07:02,266
transform the

196
00:07:02,266 --> 00:07:04,433
probabilities into the

197
00:07:04,433 --> 00:07:07,033
predictions in the form one or zero,

198
00:07:07,033 --> 00:07:07,966
that is, exactly the

199
00:07:07,966 --> 00:07:09,233
predictions of the

200
00:07:09,233 --> 00:07:11,133
dependent variable Exited.

201
00:07:11,133 --> 00:07:13,700
And to do this
we're using this ifelse function.

202
00:07:13,700 --> 00:07:17,700
And basically what we do is
we choose a threshold such that

203
00:07:17,700 --> 00:07:22,300
if the predicted probability is above
the threshold, then we predict one.

204
00:07:22,500 --> 00:07:27,400
And if the predicted probability is below
the threshold, then we predict zero.

205
00:07:27,866 --> 00:07:29,000
So that's a natural

206
00:07:29,000 --> 00:07:31,200
threshold to take
when we get our predictions

207
00:07:31,200 --> 00:07:32,833
in terms of probabilities.

208
00:07:32,833 --> 00:07:33,933
Note that it is not

209
00:07:33,933 --> 00:07:36,933
necessarily always 50 percent, 0.5.

210
00:07:37,000 --> 00:07:37,766
That's the case,

211
00:07:37,766 --> 00:07:40,800
for example, in medicine,
when we have to predict some sensitive

212
00:07:40,800 --> 00:07:44,300
information, like, for example, predicting
whether a tumor is malignant.

213
00:07:44,433 --> 00:07:46,000
Well, that's more sensitive.

214
00:07:46,000 --> 00:07:47,033
So in that case we'd

215
00:07:47,033 --> 00:07:48,900
better be sure of our predictions,

216
00:07:48,900 --> 00:07:52,733
and therefore we would choose
a higher threshold, like, for example, 80%.

217
00:07:53,266 --> 00:07:54,033
But here we are

218
00:07:54,033 --> 00:07:55,833
predicting whether a customer leaves the bank,

219
00:07:55,833 --> 00:07:58,466
so we are fine with the 50% threshold.

220
00:07:58,466 --> 00:07:59,500
So that's okay.

221
00:07:59,500 --> 00:08:03,666
And by the way, there is a simpler way
to get these

222
00:08:04,033 --> 00:08:06,133
predictions in the form 0 or 1

223
00:08:06,133 --> 00:08:07,200
without using

224
00:08:07,200 --> 00:08:10,133
this ifelse function. It's by simply

225
00:08:10,133 --> 00:08:12,900
removing this one and zero here, and

226
00:08:12,900 --> 00:08:14,366
removing this ifelse,

227
00:08:15,666 --> 00:08:17,033
and by using this

228
00:08:17,033 --> 00:08:19,100
prob_pred larger than 0.5,

229
00:08:19,100 --> 00:08:24,133
because this will return a boolean,
which will be TRUE if prob_pred is

230
00:08:24,133 --> 00:08:25,500
larger than 0.5

231
00:08:25,500 --> 00:08:28,466
and FALSE if prob_pred is below

232
00:08:28,466 --> 00:08:32,266
0.5. And y_pred in the form
of this boolean, TRUE and FALSE,

233
00:08:32,366 --> 00:08:33,366
will be accepted

234
00:08:33,366 --> 00:08:35,266
in this confusion matrix here.

235
00:08:35,266 --> 00:08:37,100
So that's simpler. And now

236
00:08:37,100 --> 00:08:38,933
let's get these

237
00:08:38,933 --> 00:08:41,233
predictions in the form of booleans.

238
00:08:41,233 --> 00:08:42,566
All right. So I'm going to

239
00:08:42,566 --> 00:08:45,066
select this line and execute it.

240
00:08:45,066 --> 00:08:45,433
All right.

241
00:08:45,433 --> 00:08:48,266
So now we have our y_pred
in the form of booleans.

242
00:08:48,266 --> 00:08:50,066
But it is still

243
00:08:50,066 --> 00:08:52,866
an H2O object, because it is the result,

244
00:08:52,866 --> 00:08:53,500
in the first

245
00:08:53,500 --> 00:08:56,266
place, of this h2o.predict function.

246
00:08:56,266 --> 00:08:58,200
So it is still an H2O object.

247
00:08:58,200 --> 00:09:00,533
And therefore, now what we have to do is

248
00:09:00,533 --> 00:09:01,500
to convert

249
00:09:01,500 --> 00:09:04,500
this H2O object back into a vector,

250
00:09:04,600 --> 00:09:05,466
because this table

251
00:09:05,466 --> 00:09:08,066
function here will only accept a vector,

252
00:09:08,066 --> 00:09:09,566
a standard vector,

253
00:09:09,566 --> 00:09:12,600
and of course will never accept this
H2O object.

254
00:09:13,033 --> 00:09:15,233
So let's convert it back into a vector.

255
00:09:15,233 --> 00:09:16,800
And that's actually really simple.

256
00:09:16,800 --> 00:09:21,166
It's actually kind of the same
as converting a data frame into an H2O

257
00:09:21,166 --> 00:09:21,766
frame.

258
00:09:21,766 --> 00:09:25,000
But instead of using h2o here,
we will use vector.

259
00:09:25,400 --> 00:09:28,066
So here we simply need to type y_pred

260
00:09:29,566 --> 00:09:32,633
equals as.vector,

261
00:09:33,233 --> 00:09:33,533
and in

262
00:09:33,533 --> 00:09:36,533
parentheses, of course, y_pred.

263
00:09:37,100 --> 00:09:38,733
All right. So let's check it out.

264
00:09:38,733 --> 00:09:41,733
I'm going to select this line
and execute.

265
00:09:42,000 --> 00:09:43,066
And now, as you can

266
00:09:43,066 --> 00:09:44,266
see, y_pred

267
00:09:44,266 --> 00:09:48,100
became this vector of integers
containing 2000 elements.

268
00:09:48,400 --> 00:09:49,433
And that's the standard

269
00:09:49,433 --> 00:09:52,100
R vector we are used to working with.

270
00:09:52,100 --> 00:09:54,166
So we can actually have a look at the

271
00:09:54,166 --> 00:09:56,166
predictions of the test

272
00:09:56,166 --> 00:09:59,000
observations
by typing here in the console

273
00:09:59,000 --> 00:10:00,433
y_pred.

274
00:10:00,433 --> 00:10:01,566
Here we go. That's all

275
00:10:01,566 --> 00:10:02,866
the predictions of the test set

276
00:10:02,866 --> 00:10:05,666
observations: 2000 predictions.

277
00:10:05,666 --> 00:10:06,533
So here we go.

278
00:10:06,533 --> 00:10:08,900
According to the model,
the first customer stayed

279
00:10:08,900 --> 00:10:11,700
in the bank.
The second customer stayed in the bank.

280
00:10:11,700 --> 00:10:14,533
The third customer left the bank.

281
00:10:14,533 --> 00:10:16,866
The fourth one stayed, the fifth one
stayed.

282
00:10:16,866 --> 00:10:17,900
Etc.

283
00:10:17,900 --> 00:10:20,633
So if you want, you can actually compare
these predictions with the

284
00:10:20,633 --> 00:10:24,400
real results
that are in the last column of the test set,

285
00:10:24,900 --> 00:10:26,033
this column here.

286
00:10:26,033 --> 00:10:30,466
So for example, 0, 0, 1, 0, 0, 0 are the real

287
00:10:30,466 --> 00:10:31,800
outcomes of the

288
00:10:31,800 --> 00:10:33,000
first customers.

289
00:10:33,000 --> 00:10:35,233
And if
we compare that with the predictions,

290
00:10:35,233 --> 00:10:36,800
well, we see that the

291
00:10:36,800 --> 00:10:37,466
predictions

292
00:10:37,466 --> 00:10:42,066
are quite correct,
because here we get as well zero, zero, one, zero,

293
00:10:42,066 --> 00:10:42,966
zero, zero.

294
00:10:42,966 --> 00:10:45,633
So the first five
predictions are correct.

295
00:10:45,633 --> 00:10:47,766
So that smells pretty good
for the accuracy that

296
00:10:47,766 --> 00:10:49,066
we are about to compute,

297
00:10:49,066 --> 00:10:51,866
because when we look
at the first observations, we can only see

298
00:10:51,866 --> 00:10:53,033
correct predictions.

299
00:10:53,033 --> 00:10:55,500
So now, actually,
I can't wait to see the accuracy.

300
00:10:55,500 --> 00:10:57,300
So let's compute it right now.

301
00:10:57,300 --> 00:10:59,300
We will start
by making the confusion matrix.

302
00:10:59,300 --> 00:11:01,666
And of course, here we need to replace

303
00:11:01,666 --> 00:11:02,700
this index

304
00:11:02,700 --> 00:11:04,033
three here with 11,

305
00:11:04,033 --> 00:11:07,033
because this corresponds
to the index of the dependent variable.

306
00:11:07,066 --> 00:11:09,033
And so now we are ready to

307
00:11:09,033 --> 00:11:11,200
make this confusion matrix.

308
00:11:11,200 --> 00:11:14,266
So I'm going to select this line
and execute.

309
00:11:14,700 --> 00:11:17,100
Here we go. Confusion matrix created.

310
00:11:17,100 --> 00:11:18,600
So now let's have a look.

311
00:11:18,600 --> 00:11:19,066
I'm going

312
00:11:19,066 --> 00:11:22,533
to type
cm here in the console and press Enter.

313
00:11:23,100 --> 00:11:25,133
That's our confusion matrix.

314
00:11:25,133 --> 00:11:27,133
We can see a lot of correct predictions.

315
00:11:27,133 --> 00:11:28,033
That's good.

316
00:11:28,033 --> 00:11:31,033
1536 correct

317
00:11:31,033 --> 00:11:33,066
predictions of customers who stayed

318
00:11:33,066 --> 00:11:37,000
in the bank,
and 195 correct predictions of

319
00:11:37,000 --> 00:11:39,033
customers who left the bank.

320
00:11:39,033 --> 00:11:41,066
And then we have 212 plus

321
00:11:41,066 --> 00:11:43,300
57 incorrect predictions

322
00:11:43,300 --> 00:11:45,966
of customers who either left or stayed

323
00:11:45,966 --> 00:11:47,133
in the bank.

324
00:11:47,133 --> 00:11:48,533
So this looks pretty good.

325
00:11:48,533 --> 00:11:50,466
And now let's not wait anymore.

326
00:11:50,466 --> 00:11:52,200
Let's compute the accuracy.

327
00:11:52,200 --> 00:11:54,533
So the accuracy is the total number of

328
00:11:54,533 --> 00:11:55,633
correct predictions,

329
00:11:55,633 --> 00:11:57,566
that is, 1536

330
00:11:57,566 --> 00:11:59,966
plus

331
00:11:59,966 --> 00:12:02,400
195, divided

332
00:12:02,400 --> 00:12:05,233
by the total number of observations
in the test set,

333
00:12:05,233 --> 00:12:07,766
that is, the total number of predictions,
actually,

334
00:12:07,766 --> 00:12:10,666
which is 2000.

335
00:12:10,666 --> 00:12:12,966
All right. So let's check it out.

336
00:12:12,966 --> 00:12:15,966
Let's see if we can offer
this model to the bank.

337
00:12:15,966 --> 00:12:17,866
Let's see if we'll get the bonus.

338
00:12:17,866 --> 00:12:19,000
Let's find out about this

339
00:12:19,000 --> 00:12:21,566
accuracy. And three, two,

340
00:12:21,566 --> 00:12:24,966
one, go: 86.5

341
00:12:24,966 --> 00:12:25,800
percent.

342
00:12:25,800 --> 00:12:29,100
That's actually not bad at all: 86.5%.

343
00:12:29,100 --> 00:12:29,900
Well, let's say 87%.

344
00:12:29,900 --> 00:12:34,133
87% means that out of 100 observations,

345
00:12:34,366 --> 00:12:36,566
87 predictions should be correct.

346
00:12:36,566 --> 00:12:37,766
So this is pretty good.

347
00:12:37,766 --> 00:12:40,533
And besides, we
haven't done any parameter tuning.

348
00:12:40,533 --> 00:12:42,500
And you will see that
by doing some parameter

349
00:12:42,500 --> 00:12:45,966
tuning, using some techniques
like k-fold cross-validation,

350
00:12:45,966 --> 00:12:48,900
well, we can get an even better
accuracy score.

351
00:12:48,900 --> 00:12:51,133
No worries, we will do that in Part 10.

352
00:12:51,133 --> 00:12:52,900
You can actually already practice

353
00:12:52,900 --> 00:12:55,633
improving this accuracy score.

354
00:12:55,633 --> 00:12:58,066
And please
let me know if you get an awesome one.

355
00:12:58,066 --> 00:12:59,633
And now just one last thing.

356
00:12:59,633 --> 00:13:00,966
Since we were connected to this

357
00:13:00,966 --> 00:13:02,466
H2O instance,

358
00:13:02,466 --> 00:13:04,566
it's better to disconnect from it now.

359
00:13:04,566 --> 00:13:05,800
And to do this, we

360
00:13:05,800 --> 00:13:07,000
only need to

361
00:13:07,000 --> 00:13:08,100
apply a last

362
00:13:08,100 --> 00:13:08,766
function of

363
00:13:08,766 --> 00:13:14,933
H2O, which is h2o.shutdown.

364
00:13:14,933 --> 00:13:17,466
Here it is, with no arguments inside.

365
00:13:17,466 --> 00:13:19,600
You just need to select this,

366
00:13:19,600 --> 00:13:21,866
and this will disconnect you
from the server.

367
00:13:21,866 --> 00:13:23,500
So let's execute.

368
00:13:23,500 --> 00:13:24,200
Are you sure

369
00:13:24,200 --> 00:13:27,033
you want to shut down the H2O instance
running at this

370
00:13:27,033 --> 00:13:27,900
address?

371
00:13:27,900 --> 00:13:29,966
Then you just need to type here capital

372
00:13:29,966 --> 00:13:32,000
Y and then enter.

373
00:13:32,000 --> 00:13:33,766
And now we are disconnected.

374
00:13:33,766 --> 00:13:36,500
TRUE means yes, we did disconnect.

375
00:13:36,500 --> 00:13:37,800
So congratulations.

376
00:13:37,800 --> 00:13:39,933
You have built your first artificial

377
00:13:39,933 --> 00:13:43,000
neural network
in R using the H2O package.

378
00:13:43,366 --> 00:13:47,000
I was very happy to build this first deep
learning model with you, and we are

379
00:13:47,000 --> 00:13:49,233
getting to the end of this section.

380
00:13:49,233 --> 00:13:53,866
The next section will be about convolutional
neural networks, another branch of machine

381
00:13:53,866 --> 00:13:56,966
learning specialized for computer vision,
because it will consider the

382
00:13:57,000 --> 00:13:58,100
spatial structure in the

383
00:13:58,100 --> 00:14:01,100
data,
exactly as is the case for images,

384
00:14:01,200 --> 00:14:03,500
where the
position of the pixels matters.

385
00:14:03,500 --> 00:14:04,066
So we will

386
00:14:04,066 --> 00:14:05,400
see that in the next section.

387
00:14:05,400 --> 00:14:07,200
And until then, enjoy machine learning.