1
00:00:00,566 --> 00:00:00,900
All right.

2
00:00:00,900 --> 00:00:03,500
So that's
the only thing we want to input here.

3
00:00:03,500 --> 00:00:05,733
For the rest,
we're going to keep the default values.

4
00:00:05,733 --> 00:00:08,066
Feel free to read them
if you need to learn more.

5
00:00:08,066 --> 00:00:09,966
But that's the main parameter.

6
00:00:09,966 --> 00:00:12,533
That's mostly what we have to select here.

7
00:00:12,533 --> 00:00:16,600
And then remember that we will also add
a random state parameter

8
00:00:16,633 --> 00:00:20,666
to make sure that we have the same results
displayed in our notebook.

9
00:00:20,933 --> 00:00:22,200
All right. So let's do this.

10
00:00:22,200 --> 00:00:26,533
Criterion equals, in quotes, entropy.

11
00:00:27,200 --> 00:00:28,000
Perfect.

12
00:00:28,000 --> 00:00:34,300
And then the second one, the random
state parameter, which we set equal to zero.

13
00:00:34,600 --> 00:00:35,233
Great.

14
00:00:35,233 --> 00:00:38,233
And now final step
you know exactly what to do.

15
00:00:38,266 --> 00:00:40,533
We take our classifier.

16
00:00:40,533 --> 00:00:44,066
And from this classifier
we call the fit method

17
00:00:44,300 --> 00:00:47,166
to train our decision tree classifier.

18
00:00:47,166 --> 00:00:51,066
On the training set that is composed
as is expected by the fit method

19
00:00:51,433 --> 00:00:54,566
of x train and y

20
00:00:55,000 --> 00:00:58,066
train, exactly the same as before.
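[Editor's sketch] The steps just described could look roughly like the code below. This is a minimal sketch, not the notebook from the video: the synthetic data from make_classification and the test_size value are assumptions standing in for the real dataset and split, while criterion, random_state, X_train and y_train follow what is said above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic two-feature data stands in for the real dataset.
X, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# Assumption: a 75/25 split; the split itself is not described in this part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Only criterion and random_state are set; everything else keeps its default.
classifier = DecisionTreeClassifier(criterion="entropy", random_state=0)

# Train the decision tree on the training set.
classifier.fit(X_train, y_train)
```

Fixing random_state makes the tree reproducible from one run to the next, which is the reason given above for setting it.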

21
00:00:58,266 --> 00:01:01,266
And now once again, we're done
very efficiently

22
00:01:01,366 --> 00:01:05,033
with this implementation,
so I can't wait to see the results.

23
00:01:05,033 --> 00:01:08,700
I don't think we will beat
the accuracy record, but let's see.

24
00:01:08,700 --> 00:01:09,600
We never know.

25
00:01:09,600 --> 00:01:12,066
So let's click this folder button here.

26
00:01:12,066 --> 00:01:15,533
And then you know right now
it is connecting to a runtime to enable

27
00:01:15,533 --> 00:01:19,833
file browsing so that, you know,
we can access your files on your machine.

28
00:01:20,100 --> 00:01:23,033
And in a second we should be able
to get the upload button.

29
00:01:23,033 --> 00:01:26,033
There we go. As usual upload.

30
00:01:26,133 --> 00:01:27,833
And so that's the right data set.

31
00:01:27,833 --> 00:01:29,500
Let me show you the path again.

32
00:01:29,500 --> 00:01:31,200
That's the whole Machine Learning A-Z folder.

33
00:01:31,200 --> 00:01:32,833
Please find it on your machine.

34
00:01:32,833 --> 00:01:35,100
And then we're going to go to part
three classification.

35
00:01:35,100 --> 00:01:36,900
Then decision tree classification.

36
00:01:36,900 --> 00:01:40,733
Then Python
and then Social_Network_Ads.csv.

37
00:01:41,500 --> 00:01:43,000
All right let's press okay.

38
00:01:43,000 --> 00:01:44,166
And now here we go.

39
00:01:44,166 --> 00:01:49,066
We are ready to run all the cells
by clicking this runtime button.

40
00:01:49,066 --> 00:01:51,900
And then Run all. All right.

41
00:01:51,900 --> 00:01:54,133
And now it is training the decision tree
classification model.

42
00:01:54,133 --> 00:01:55,200
Here we go.

43
00:01:55,200 --> 00:01:58,933
We have it now you know with all the
default values of the parameters

44
00:01:58,933 --> 00:02:01,933
except criterion
which we set equal to entropy.

45
00:02:02,366 --> 00:02:04,400
Then what about that new result. Great.

46
00:02:04,400 --> 00:02:05,600
We got the right prediction.

47
00:02:05,600 --> 00:02:11,666
Remember that customer of age 30
and estimated salary $87,000 didn't buy

48
00:02:11,666 --> 00:02:15,000
in reality, and was predicted
not to buy either.

49
00:02:15,266 --> 00:02:16,333
So perfect.

50
00:02:16,333 --> 00:02:20,666
Then when predicting the test results,
we indeed get a lot of good predictions

51
00:02:21,100 --> 00:02:23,600
except some incorrect ones
here, for example.

52
00:02:23,600 --> 00:02:26,333
And then, well,
it looks actually pretty good.

53
00:02:26,333 --> 00:02:28,200
Maybe, you know, we will beat the accuracy.

54
00:02:28,200 --> 00:02:31,200
That's another one.
All right. Another one.

55
00:02:31,800 --> 00:02:33,800
And okay let's see okay.

56
00:02:33,800 --> 00:02:38,300
Because you know there's actually also
when you scroll up, some more predictions.

57
00:02:38,300 --> 00:02:39,933
But let's see I'm very curious actually.

58
00:02:39,933 --> 00:02:41,600
Maybe I spoke too fast.

59
00:02:41,600 --> 00:02:44,700
We're about to find out right now
with the confusion matrix.

60
00:02:44,700 --> 00:02:45,666
Are you ready?

61
00:02:45,666 --> 00:02:48,900
The accuracy of the decision tree
classification model

62
00:02:49,200 --> 00:02:52,700
is 91%. Wow.
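[Editor's sketch] As a reminder of how an accuracy figure follows from a confusion matrix, here is a minimal sketch in plain Python. The helper name accuracy_from_confusion_matrix and the counts in cm are hypothetical and are not the values shown in the video.

```python
def accuracy_from_confusion_matrix(cm):
    """Return correct predictions (the diagonal) divided by all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

# Hypothetical counts: rows are actual classes, columns are predicted classes.
cm = [[50, 5],
      [5, 40]]

print(accuracy_from_confusion_matrix(cm))  # 0.9
```

The off-diagonal entries are the incorrect predictions, so fewer of them means a higher accuracy.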

63
00:02:52,700 --> 00:02:57,166
Okay, so it's actually on the podium,
you know, right after K-NN

64
00:02:57,166 --> 00:03:02,300
and a kernel SVM
which got the best accuracy of 93%. Wow.

65
00:03:02,300 --> 00:03:03,000
So that's really good.

66
00:03:03,000 --> 00:03:07,266
Actually this is really a good sign
for Random Forest because random forest

67
00:03:07,266 --> 00:03:10,900
is basically a team of decision trees
making the predictions.

68
00:03:10,900 --> 00:03:14,133
And you know how team spirit
always improves the results.

69
00:03:14,300 --> 00:03:18,966
So we might have a chance to beat
the record accuracy with Random Forest.

70
00:03:19,433 --> 00:03:20,733
So that's pretty exciting.

71
00:03:20,733 --> 00:03:23,733
And now when visualizing the training
set results which we already got,

72
00:03:23,833 --> 00:03:25,600
you know, the execution was not too long.

73
00:03:25,600 --> 00:03:27,100
Let's see what it looks like.

74
00:03:27,100 --> 00:03:30,400
Wow. Okay,
so that's pretty different from before.

75
00:03:30,400 --> 00:03:33,466
And no wonder why
it got a pretty good accuracy

76
00:03:33,900 --> 00:03:34,500
because indeed it

77
00:03:34,500 --> 00:03:38,333
looks like it was able to catch,
you know, the little observation points

78
00:03:38,333 --> 00:03:41,400
that were really hard to catch
by either a straight line, you know,

79
00:03:41,400 --> 00:03:46,833
with linear classifiers or a nice curve
like with kernel SVM or Naive Bayes.

80
00:03:47,133 --> 00:03:51,333
Here we actually split this whole grid
into smaller subgrids.

81
00:03:51,566 --> 00:03:53,866
And that's because, you know,
we have all these splits

82
00:03:53,866 --> 00:03:56,866
in the decision tree
classification algorithm.

83
00:03:56,866 --> 00:04:00,800
So no wonder we get all these subgrids
and therefore we get separate

84
00:04:00,900 --> 00:04:03,100
prediction
regions. It's really interesting.
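[Editor's sketch] To see why splits produce rectangular regions, here is a tiny hand-written tree in plain Python. The thresholds are hypothetical and chosen only for illustration; they are not the splits learned by the model in the video.

```python
def predict(age, salary):
    """Toy decision tree: each test compares one feature to one threshold."""
    # Hypothetical thresholds, not taken from the trained model.
    if age <= 42:
        if salary <= 90000:
            return 0  # region: younger, lower salary -> predicted not to buy
        return 1      # region: younger, higher salary -> predicted to buy
    return 1          # region: older -> predicted to buy

# Every point in the same rectangle of the (age, salary) grid gets the same label.
print(predict(25, 40000), predict(35, 120000), predict(50, 30000))  # 0 1 1
```

Because every test looks at a single feature, each boundary is a horizontal or vertical line, which is what carves the grid into subgrids.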

85
00:04:03,100 --> 00:04:06,300
That captures
very well the observation points.

86
00:04:06,600 --> 00:04:10,966
So it catches all the red customers here
who didn't buy in reality the SUV.

87
00:04:11,266 --> 00:04:15,700
It catches also all these green customers
who bought in reality the SUV

88
00:04:16,033 --> 00:04:20,066
and it catches you know these very hard
to catch customers here

89
00:04:20,266 --> 00:04:25,333
by creating these subgrids of the grid
with the right predicted regions.

90
00:04:25,333 --> 00:04:27,766
So you see how it got that good accuracy.

91
00:04:27,766 --> 00:04:31,233
It really tried to catch everything,
even for example, these green points

92
00:04:31,233 --> 00:04:33,900
that were caught among all these red points
okay.

93
00:04:33,900 --> 00:04:35,800
These red customers okay.

94
00:04:35,800 --> 00:04:37,400
But let's be careful.

95
00:04:37,400 --> 00:04:40,933
This is the training set,
you know, on which the model was trained.

96
00:04:41,200 --> 00:04:43,000
Let's see what happens with the test set.

97
00:04:43,000 --> 00:04:45,033
And we already know
that we will get good results

98
00:04:45,033 --> 00:04:47,900
because we already know
that the accuracy on the test set is 90%.

99
00:04:47,900 --> 00:04:50,000
But still, let's see what we get

100
00:04:50,000 --> 00:04:53,300
with new observations
on which the model wasn't trained.

101
00:04:53,933 --> 00:04:54,433
All right.

102
00:04:54,433 --> 00:04:55,466
This is what we get.

103
00:04:55,466 --> 00:04:58,133
And actually here
we see things more clearly.

104
00:04:58,133 --> 00:05:01,566
This is the prediction region
you know which funnily was a good fit

105
00:05:01,566 --> 00:05:04,033
for the training set.
But here it is not catching anything.

106
00:05:04,033 --> 00:05:04,833
You know, neither

107
00:05:04,833 --> 00:05:09,600
red customers nor green customers.
Here there seem to be two incorrect predictions.

108
00:05:09,600 --> 00:05:11,500
You know,
because they fall in the green region.

109
00:05:12,466 --> 00:05:12,966
Then here

110
00:05:12,966 --> 00:05:16,200
that's all good, you know,
that's all the customers of small age

111
00:05:16,200 --> 00:05:17,533
and small estimated salary,

112
00:05:17,533 --> 00:05:21,666
which therefore won't be likely
to buy the SUV, as is the case here.

113
00:05:21,933 --> 00:05:24,933
And then all these green points
are correctly predicted.

114
00:05:25,033 --> 00:05:26,800
This one is incorrectly predicted.

115
00:05:26,800 --> 00:05:30,300
So indeed we have our ten incorrect
predictions in all this.

116
00:05:30,766 --> 00:05:32,066
But there you go.

117
00:05:32,066 --> 00:05:32,366
You know,

118
00:05:32,366 --> 00:05:34,033
if I didn't see the accuracy first,

119
00:05:34,033 --> 00:05:36,800
I would be afraid
that we have some overfitting here.

120
00:05:36,800 --> 00:05:39,033
But no, it doesn't seem to be the case.

121
00:05:39,033 --> 00:05:41,500
Even with new observations
of the test set.

122
00:05:41,500 --> 00:05:44,500
You know, we get great predictions.

123
00:05:44,566 --> 00:05:46,966
But now what I really want to see

124
00:05:46,966 --> 00:05:51,366
is the final accuracy
of our final classification model.

125
00:05:51,600 --> 00:05:54,600
Let's find out about this
in the next practical activity.

126
00:05:54,833 --> 00:05:56,566
And until then, enjoy machine learning.