1
00:00:00,133 --> 00:00:01,700
Let's do it.

2
00:00:01,700 --> 00:00:02,733
So it's ready to be built.

3
00:00:02,733 --> 00:00:06,933
And that means all
our code is ready to be executed.

4
00:00:06,933 --> 00:00:10,166
Because we don't need
to change anything more.

5
00:00:10,566 --> 00:00:13,500
So let's execute these sections
one by one,

6
00:00:13,500 --> 00:00:16,500
and let's see what happens with decision
tree regression.

7
00:00:16,766 --> 00:00:19,766
So I'm going to select the first section.

8
00:00:19,800 --> 00:00:22,533
Execute. The dataset is imported.

9
00:00:22,533 --> 00:00:23,966
Here it is.

10
00:00:23,966 --> 00:00:24,533
All right.

11
00:00:24,533 --> 00:00:28,900
So then no need to split the data
set into a training set and a test set.

12
00:00:28,900 --> 00:00:31,900
Because as you can see
this is a very small data set.

13
00:00:32,100 --> 00:00:35,133
Then no need for feature scaling
because for decision trees

14
00:00:35,266 --> 00:00:37,000
we don't need to do any feature scaling.

15
00:00:37,000 --> 00:00:40,066
Because the way this model is built
is based on conditions

16
00:00:40,066 --> 00:00:43,533
on the independent variable
and not on Euclidean distances.

17
00:00:43,900 --> 00:00:45,333
So we're fine with that.

18
00:00:45,333 --> 00:00:49,433
We definitely don't need to apply
feature scaling, and we can move on

19
00:00:49,433 --> 00:00:52,433
to the next step
which is to create our model.

20
00:00:52,433 --> 00:00:55,566
So let's create it. Executing. All right.

21
00:00:55,566 --> 00:00:58,100
Perfect, the regressor is created.

22
00:00:58,100 --> 00:01:01,100
And now let's get our final verdict.

23
00:01:01,233 --> 00:01:01,600
Okay.

24
00:01:01,600 --> 00:01:04,766
So 160 K according to this person.

25
00:01:04,766 --> 00:01:08,366
And now let's see the predicted salary
according to our model.

26
00:01:08,900 --> 00:01:14,433
So, executing this, and we get $249,000.

27
00:01:14,433 --> 00:01:18,666
Well, much higher than the salary mentioned
by this person.

28
00:01:18,733 --> 00:01:20,666
But let's not draw

29
00:01:20,666 --> 00:01:24,100
hasty conclusions and let's
see what's happening on the graph here.

30
00:01:24,566 --> 00:01:27,200
So I'm going to select all this

31
00:01:27,200 --> 00:01:30,600
and let's see what's happening
with the decision tree regression results.

32
00:01:32,233 --> 00:01:32,733
All right.

33
00:01:32,733 --> 00:01:34,400
That's what I thought. Okay.

34
00:01:34,400 --> 00:01:37,666
So we don't need to zoom in to clearly see
what's happening here.

35
00:01:37,900 --> 00:01:40,700
It's plotting a straight horizontal line.

36
00:01:40,700 --> 00:01:44,433
Exactly
like we saw with SVR in Python.

37
00:01:44,433 --> 00:01:45,100
For those of you

38
00:01:45,100 --> 00:01:47,100
who didn't follow the Python tutorial,

39
00:01:47,100 --> 00:01:49,400
note that we already encountered
this situation

40
00:01:49,400 --> 00:01:52,400
where we get a straight horizontal line.

41
00:01:52,500 --> 00:01:54,533
And actually, in SVR,

42
00:01:54,533 --> 00:01:59,766
this was due to the fact that we didn't
apply feature scaling to our dataset.

43
00:02:00,466 --> 00:02:03,466
So what do you think the problem is here?

44
00:02:03,533 --> 00:02:06,466
Do you think it's due to the fact
that we didn't apply feature scaling

45
00:02:06,466 --> 00:02:10,766
like for SVR, and we need to apply feature
scaling to get a model that properly fits

46
00:02:10,766 --> 00:02:12,133
the dataset?

47
00:02:12,133 --> 00:02:15,133
Well, as I mentioned
at the beginning of this tutorial,

48
00:02:15,133 --> 00:02:18,366
we definitely don't need to apply feature
scaling for decision trees because

49
00:02:18,366 --> 00:02:22,933
decision tree regression models are based
on conditions on the independent variable,

50
00:02:22,933 --> 00:02:25,700
which has nothing to do
with Euclidean distances.

51
00:02:25,700 --> 00:02:27,966
And you know, when we need to apply
feature scaling, it's

52
00:02:27,966 --> 00:02:31,566
because the machine learning models
are based on Euclidean distances,

53
00:02:31,566 --> 00:02:35,500
and we need to put all the independent
variables on the same scale so that one

54
00:02:35,500 --> 00:02:38,500
independent
variable is not dominating another one.
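
Editor's note: the point about one variable dominating another can be made concrete with a small sketch. The rows, the two features, and the helper names (`first_feature_share`, `standardise`) are invented for illustration and are not the tutorial's dataset. The sketch measures how much of a squared Euclidean distance comes from the first feature, before and after standardising.

```python
import math

def first_feature_share(p, q):
    """Fraction of the squared Euclidean distance contributed by feature 0."""
    squares = [(a - b) ** 2 for a, b in zip(p, q)]
    return squares[0] / sum(squares)

def standardise(rows):
    """Rescale every column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(r, means, sds)) for r in rows]

# Hypothetical rows: (years of experience, annual salary in dollars)
rows = [(1.0, 50_000.0), (10.0, 51_000.0), (2.0, 90_000.0)]
scaled = standardise(rows)

# Rows 0 and 1 differ a lot in experience and little in salary.
raw_share = first_feature_share(rows[0], rows[1])
scaled_share = first_feature_share(scaled[0], scaled[1])
print(f"experience share of squared distance, raw:    {raw_share:.6f}")
print(f"experience share of squared distance, scaled: {scaled_share:.6f}")
```

On the raw values the dollar amounts swamp the distance; after standardising, the large difference in experience is what the distance reflects.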

55
00:02:38,533 --> 00:02:40,600
But this is not the problem here.

56
00:02:40,600 --> 00:02:42,666
This is not about feature scaling.

57
00:02:42,666 --> 00:02:46,200
You can try to apply feature
scaling here and re-execute this,

58
00:02:46,466 --> 00:02:49,466
but you'll get the same problem
with a straight horizontal line.
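
Editor's note: a minimal sketch of why feature scaling leaves a tree unchanged. It is not rpart's algorithm, only a single least-squares split on one feature, and the data and function names (`sse`, `best_split`) are hypothetical. Because a split only asks which rows fall below a threshold, rescaling the feature keeps the same rows on each side.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys):
    """Return the set of row indices sent left by the best single split."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best_cost, best_left = None, None
    for k in range(1, len(order)):
        left, right = order[:k], order[k:]
        cost = sse([ys[i] for i in left]) + sse([ys[i] for i in right])
        if best_cost is None or cost < best_cost:
            best_cost, best_left = cost, frozenset(left)
    return best_left

# Hypothetical data, not the tutorial's dataset: level -> salary (thousands)
levels = [1, 2, 3, 4, 5, 6]
salaries = [40, 45, 50, 200, 210, 220]

# Standardise the feature (zero mean, unit standard deviation).
mu = mean(levels)
sd = (sum((x - mu) ** 2 for x in levels) / len(levels)) ** 0.5
scaled_levels = [(x - mu) / sd for x in levels]

raw_left = best_split(levels, salaries)
scaled_left = best_split(scaled_levels, salaries)
print(raw_left == scaled_left)
```

The threshold value itself changes with the scale, but the partition of the rows, and therefore every prediction, stays the same.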

59
00:02:49,466 --> 00:02:52,433
And of course
this is actually the decision tree model.

60
00:02:52,433 --> 00:02:55,200
This is actually one model
of decision tree.

61
00:02:55,200 --> 00:02:56,566
But this is of course

62
00:02:56,566 --> 00:02:59,866
not the best version of decision tree
regression we want to get.

63
00:03:00,300 --> 00:03:03,000
So, can you start to see
what the problem is here?

64
00:03:03,000 --> 00:03:07,366
And especially after watching
the intuition tutorial made by Kirill,

65
00:03:07,400 --> 00:03:08,500
can you spot the problem?

66
00:03:09,533 --> 00:03:10,966
Okay, I'm going to tell you:

67
00:03:10,966 --> 00:03:14,400
this problem is related
to the number of splits.

68
00:03:14,700 --> 00:03:18,266
Because you know, the way
the decision tree regression model is made

69
00:03:18,533 --> 00:03:21,833
is that it's making some splits
based on different conditions.

70
00:03:21,833 --> 00:03:23,166
So the more conditions you have

71
00:03:23,166 --> 00:03:25,466
in your independent variables,
the more you have splits.

72
00:03:25,466 --> 00:03:28,633
And here we clearly have no split
because, you know,

73
00:03:28,800 --> 00:03:32,666
all the predictions are equal to $250,000.

74
00:03:32,700 --> 00:03:34,766
So, you know,
it took all the different salaries

75
00:03:34,766 --> 00:03:37,766
for the ten
different levels here, took the average,

76
00:03:37,800 --> 00:03:40,533
and just gave the average
for all the levels.

77
00:03:40,533 --> 00:03:43,300
So no conditions here, no splits.

78
00:03:43,300 --> 00:03:46,366
And therefore
that's absolutely not interesting,

79
00:03:46,666 --> 00:03:49,766
especially for the potential
the decision tree can have.

80
00:03:50,000 --> 00:03:51,133
So what we'll do now

81
00:03:51,133 --> 00:03:55,666
is add a parameter here
that will set a condition on the splits.

82
00:03:55,900 --> 00:03:57,333
You know that's what I was telling you.

83
00:03:57,333 --> 00:03:59,900
We have several parameters
in this rpart library.

84
00:03:59,900 --> 00:04:04,366
And we can use these optional parameters
to improve our model

85
00:04:04,366 --> 00:04:05,700
and make it more robust.

86
00:04:05,700 --> 00:04:07,833
Well this is exactly
what we're going to do now.

87
00:04:07,833 --> 00:04:10,600
We are going to get back to rpart.

88
00:04:10,600 --> 00:04:12,866
So I'm going to press F1 here.

89
00:04:12,866 --> 00:04:15,500
Oh, actually, this time
rpart is showing up. Okay.

90
00:04:15,500 --> 00:04:17,566
So great, rpart is here.

91
00:04:17,566 --> 00:04:21,200
And as I mentioned,
rpart has several parameters

92
00:04:21,200 --> 00:04:24,200
that we can use
to make our model more robust.

93
00:04:24,300 --> 00:04:26,900
And the one we're interested in right now
is one parameter

94
00:04:26,900 --> 00:04:29,900
that will correct this problem
we had with the splits.

95
00:04:30,300 --> 00:04:33,733
So this parameter
is actually the control parameter.

96
00:04:33,733 --> 00:04:37,800
And right now I'm going to give you
a little trick to solve this problem

97
00:04:37,800 --> 00:04:39,933
with the splits we just observed here.

98
00:04:39,933 --> 00:04:43,500
So I'm going to add
this third optional argument,

99
00:04:43,500 --> 00:04:45,266
you know, to improve our model.

100
00:04:45,266 --> 00:04:47,900
Right now we're doing some model performance
improvement.

101
00:04:47,900 --> 00:04:51,666
So that's a thing that machine learning
scientists do very often in their job.

102
00:04:51,666 --> 00:04:53,833
So don't worry we'll get more advanced
sections on it,

103
00:04:53,833 --> 00:04:57,800
especially when we cover cross-validation
to find the best models,

104
00:04:57,800 --> 00:04:59,700
selecting the best parameters.

105
00:04:59,700 --> 00:05:02,666
But here we'll just do some simple model
performance

106
00:05:02,666 --> 00:05:05,700
improvement
and we'll just add this control parameter.

107
00:05:06,066 --> 00:05:08,066
And then I'm going to give you
this little trick.

108
00:05:08,066 --> 00:05:13,000
So this little trick
is to take the rpart library again.

109
00:05:13,733 --> 00:05:16,733
So we're taking rpart.control here,
which is a function.

110
00:05:16,900 --> 00:05:19,200
And in this function
we're going to add an argument.

111
00:05:19,200 --> 00:05:22,433
As you can see here
on this yellow rectangle here

112
00:05:22,500 --> 00:05:24,766
we have the first argument,
which is minsplit.

113
00:05:24,766 --> 00:05:26,666
And that's exactly
what we're interested in.

114
00:05:26,666 --> 00:05:29,033
And that's what will solve our problem.

115
00:05:29,033 --> 00:05:30,466
Because you know

116
00:05:30,466 --> 00:05:33,466
actually we didn't have any split here
because it just took the average.

117
00:05:33,600 --> 00:05:37,300
So it's like we had no conditions on
the independent variables and no splits.

118
00:05:37,666 --> 00:05:40,966
So to make sure we have some splits
and some conditions

119
00:05:40,966 --> 00:05:44,000
on the independent variables,
we will actually set

120
00:05:44,300 --> 00:05:48,733
minsplit to 1
and that will solve the problem.
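
Editor's note: a rough sketch of what a minimum-split setting does. This is not rpart itself. It is a tiny one-feature regression tree in Python, with hypothetical names (`fit`, `predict`, `min_split`) and invented salary figures. It ignores rpart's other controls, such as the complexity parameter. rpart documents a default `minsplit` of 20, which is larger than the ten rows discussed here, so the root node is never split and every prediction is the overall mean.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def fit(xs, ys, min_split):
    """Grow a tree; a node is split only if it holds at least `min_split` rows."""
    if len(ys) < min_split or len(set(xs)) < 2:
        return mean(ys)  # leaf: predict the average of the rows in this node
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot place a threshold between equal feature values
        cost = sse([y for _, y in pairs[:k]]) + sse([y for _, y in pairs[k:]])
        if best is None or cost < best[0]:
            best = (cost, k)
    k = best[1]
    threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
    left_x, left_y = zip(*pairs[:k])
    right_x, right_y = zip(*pairs[k:])
    return (threshold,
            fit(left_x, left_y, min_split),
            fit(right_x, right_y, min_split))

def predict(tree, x):
    """Follow the split conditions down to a leaf value."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree

# Hypothetical data: ten levels and made-up salaries in thousands.
levels = list(range(1, 11))
salaries = [40, 45, 55, 70, 95, 130, 180, 260, 400, 800]

flat_tree = fit(levels, salaries, min_split=20)   # root too small to split
grown_tree = fit(levels, salaries, min_split=1)   # splits are allowed

print([predict(flat_tree, x) for x in levels])
print([predict(grown_tree, x) for x in levels])
```

With `min_split=20` the sketch reproduces the flat horizontal line described above; with `min_split=1` the tree splits until each level gets its own prediction.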