1
00:00:00,266 --> 00:00:03,300
So now let's move on to the next code
template.

2
00:00:03,500 --> 00:00:04,933
Polynomial regression.

3
00:00:04,933 --> 00:00:08,800
Well here that's the same,
you know the same data preprocessing phase

4
00:00:08,800 --> 00:00:11,833
with first importing the libraries,
then importing the data set

5
00:00:11,833 --> 00:00:14,833
where you only have to enter
the name of your data set here,

6
00:00:14,866 --> 00:00:17,866
and then splitting the data
set into the training set and test set.

7
00:00:18,200 --> 00:00:22,066
Then of course, we train the polynomial
regression model on the training set.

8
00:00:22,533 --> 00:00:25,466
So that's exactly
like what we did in Part 2.

9
00:00:25,466 --> 00:00:28,200
You know, when we built it.
You recognize degree equals four.

10
00:00:28,200 --> 00:00:30,266
You know that's exactly the same code.

11
00:00:30,266 --> 00:00:34,766
Then we predict some test results
just to compare our predictions

12
00:00:34,766 --> 00:00:36,000
and the real results.

13
00:00:36,000 --> 00:00:39,000
And finally
we will evaluate the model performance.

14
00:00:39,200 --> 00:00:41,833
And I will reveal very soon
how to do that.

15
00:00:41,833 --> 00:00:42,166
Okay.

16
00:00:42,166 --> 00:00:44,133
So that's for polynomial regression.

17
00:00:44,133 --> 00:00:45,933
Once again very generic.

18
00:00:45,933 --> 00:00:48,600
You just have to enter here
the name of your data set.

19
00:00:48,600 --> 00:00:51,566
And then this code
template is ready to be deployed.
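
The polynomial regression steps just described can be sketched as follows. This is a minimal sketch, not the course's actual template: the synthetic data stands in for the data set you would normally load by name, and the `random_state` and `test_size` values are assumptions. Only `degree=4` comes from the walkthrough.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the data set (assumption; the template loads a file instead).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(200, 1))
y = X[:, 0] ** 4 - 2.0 * X[:, 0] ** 2 + rng.normal(0.0, 0.1, size=200)

# Split the data set into the training set and test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train the polynomial regression model on the training set (degree = 4).
poly = PolynomialFeatures(degree=4)
X_poly_train = poly.fit_transform(X_train)
regressor = LinearRegression()
regressor.fit(X_poly_train, y_train)

# Predict the test results and put them next to the real results.
y_pred = regressor.predict(poly.transform(X_test))
comparison = np.column_stack((y_pred, y_test))
print(comparison[:5])
```

Note that the test features go through `poly.transform`, not `fit_transform`, so the expansion fitted on the training set is reused.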

20
00:00:51,566 --> 00:00:54,100
All right then support vector regression.

21
00:00:54,100 --> 00:00:55,533
So here that's the same.

22
00:00:55,533 --> 00:00:59,433
First the data preprocessing phase
where we import the libraries.

23
00:00:59,433 --> 00:01:01,066
Then we import the data set.

24
00:01:01,066 --> 00:01:02,933
But then remember we have to reshape

25
00:01:02,933 --> 00:01:06,000
our dependent variable vector y
because we have to feature scale it, right?

26
00:01:06,000 --> 00:01:07,966
Because we are doing regression.

27
00:01:07,966 --> 00:01:11,733
So the dependent variable vector
has continuous numerical values.

28
00:01:11,933 --> 00:01:15,600
And therefore for SVR we need to scale
the dependent variable vector.

29
00:01:15,900 --> 00:01:19,666
That's exactly the same as what we saw
together when building the SVR model.

30
00:01:20,100 --> 00:01:21,400
Then I added this.

31
00:01:21,400 --> 00:01:24,833
Of course, in order to split the data
set into the training set and test set

32
00:01:24,866 --> 00:01:26,533
so that we can indeed evaluate

33
00:01:26,533 --> 00:01:30,300
the performance of SVR
and compare it to the other models,

34
00:01:30,800 --> 00:01:34,800
then of course, we have feature
scaling compulsory for the SVR

35
00:01:35,033 --> 00:01:37,200
with, remember, our two scalers, one

36
00:01:37,200 --> 00:01:40,200
for the matrix of features
and one for the dependent variable vector.

37
00:01:40,366 --> 00:01:43,966
Then we train, of course
the SVR model on the training set.

38
00:01:44,133 --> 00:01:46,566
You know this very well.
We did it together.

39
00:01:46,566 --> 00:01:49,766
Then we predict the test results
just to compare and have an idea

40
00:01:49,766 --> 00:01:52,766
of how good the predictions are
for new observations.

41
00:01:52,933 --> 00:01:57,533
And finally we will evaluate
the model performance with r squared.

42
00:01:58,033 --> 00:01:58,666
No worries.

43
00:01:58,666 --> 00:02:00,633
We'll get to that very very soon.
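
The SVR steps above, in the order they were listed, might look like the sketch below. It assumes synthetic data in place of a real data set, an RBF kernel, and hypothetical variable names such as `sc_X` and `sc_y`; none of these are confirmed by the walkthrough itself.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the data set (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 2))
y = 3.0 * X[:, 0] + 5.0 * np.sin(X[:, 1]) + rng.normal(0.0, 0.2, size=200)

# Reshape y into a 2D column, because StandardScaler expects 2D input.
y = y.reshape(len(y), 1)

# Split the data set into the training set and test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Feature scaling with two scalers: one for X, one for y.
sc_X = StandardScaler()
sc_y = StandardScaler()
X_train_scaled = sc_X.fit_transform(X_train)
y_train_scaled = sc_y.fit_transform(y_train)

# Train the SVR model on the scaled training set (RBF kernel is an assumption).
regressor = SVR(kernel="rbf")
regressor.fit(X_train_scaled, y_train_scaled.ravel())

# Predict on the scaled test features, then map back to the original scale of y.
y_pred_scaled = regressor.predict(sc_X.transform(X_test))
y_pred = sc_y.inverse_transform(y_pred_scaled.reshape(-1, 1))
print(np.column_stack((y_pred, y_test))[:5])
```

Both scalers are fitted on the training set only, and the predictions are passed through `sc_y.inverse_transform` so they can be compared with the unscaled `y_test`.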

44
00:02:00,633 --> 00:02:04,166
So that's for the SVR
then for decision tree regression.

45
00:02:04,166 --> 00:02:05,700
Well exactly the same.

46
00:02:05,700 --> 00:02:09,966
You know the data preprocessing phase
first with no feature scaling right.

47
00:02:09,966 --> 00:02:12,700
Remember we don't need feature
scaling for decision trees.

48
00:02:12,700 --> 00:02:15,800
So once again we only have to change
the name of the data set here.

49
00:02:15,800 --> 00:02:18,800
Then we split the data
set into the training set and test set.

50
00:02:18,866 --> 00:02:21,866
Then we train the decision tree regression
model on the training set,

51
00:02:22,000 --> 00:02:24,933
exactly the same as we did
in our implementation.

52
00:02:24,933 --> 00:02:26,433
When we built it together.

53
00:02:26,433 --> 00:02:28,200
Then we predict the test result

54
00:02:28,200 --> 00:02:32,200
in order to compare our predictions
to the real results in y_test.

55
00:02:32,200 --> 00:02:35,100
And that's in order to have a first idea
of the performance.

56
00:02:35,100 --> 00:02:38,700
And then of course, we will evaluate
the model performance with R squared.
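
As a rough sketch of the decision tree steps, assuming synthetic data in place of a real data set and an arbitrary `random_state`, with no feature scaling step:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the data set (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(0.0, 0.5, size=200)

# Split the data set into the training set and test set (no feature scaling).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train the decision tree regression model on the training set.
regressor = DecisionTreeRegressor(random_state=0)
regressor.fit(X_train, y_train)

# Predict the test results and compare them with y_test.
y_pred = regressor.predict(X_test)
print(np.column_stack((y_pred, y_test))[:5])
```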

57
00:02:39,000 --> 00:02:42,300
And finally we have the exact same data
preprocessing

58
00:02:42,300 --> 00:02:45,300
phase where you only have to enter
the name of your data set here.

59
00:02:45,366 --> 00:02:48,633
And then we train the random forest
regression model on the training

60
00:02:48,633 --> 00:02:52,400
set with the exact same implementation
as how we did it together.

61
00:02:52,633 --> 00:02:56,366
Then we predict the test result in order
to get a first idea of the performance.

62
00:02:56,533 --> 00:02:59,533
And finally
we evaluate the model performance.
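
The random forest version differs from the decision tree one only in the model class. In this sketch the synthetic data, `n_estimators=10`, and `random_state=0` are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the data set (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(0.0, 0.5, size=200)

# Split the data set into the training set and test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train the random forest regression model on the training set.
regressor = RandomForestRegressor(n_estimators=10, random_state=0)
regressor.fit(X_train, y_train)

# Predict the test results to get a first idea of the performance.
y_pred = regressor.predict(X_test)
print(np.column_stack((y_pred, y_test))[:5])
```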

63
00:02:59,766 --> 00:03:04,300
All right, so as I told you,
you have purely generic code templates

64
00:03:04,300 --> 00:03:07,233
which you can deploy
for any of your future data sets

65
00:03:07,233 --> 00:03:10,666
as long as they have the features first
and, last, the dependent variable.

66
00:03:10,800 --> 00:03:13,800
And as long as they don't
have missing data or categorical data,

67
00:03:13,800 --> 00:03:15,133
in which case it's still fine.

68
00:03:15,133 --> 00:03:18,633
You can use your data
preprocessing toolkit, but there you go.

69
00:03:18,666 --> 00:03:22,400
You have this code template, and now I'm
going to show you how to evaluate

70
00:03:22,400 --> 00:03:25,566
your regression
models using the R-squared coefficient.
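
For reference, the R-squared coefficient is one minus the ratio of the residual sum of squares to the total sum of squares. The plain Python sketch below uses a hypothetical helper `r_squared` and made-up numbers purely for illustration; scikit-learn's `sklearn.metrics.r2_score` computes the same quantity.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared gaps between real and predicted values.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared gaps between real values and their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up real and predicted values (illustration only).
y_test = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 8.5]
score = r_squared(y_test, y_pred)
print(score)
```

A score of 1 means the predictions match the real values exactly, and a model that always predicts the mean of `y_test` scores 0.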

71
00:03:26,300 --> 00:03:28,033
All right. So let's start with r squared.

72
00:03:28,033 --> 00:03:29,133
You know that

73
00:03:29,133 --> 00:03:33,600
final cell in each of the implementations,
evaluating the model performance.

74
00:03:33,833 --> 00:03:36,266
Let's see how we're going to do this.

75
00:03:36,266 --> 00:03:41,300
Well as I also want to train you on
how to be independent in machine learning.

76
00:03:41,433 --> 00:03:45,033
We're going to pretend once again
that I actually have no idea on how

77
00:03:45,033 --> 00:03:48,800
to evaluate the model performance
of regression models, and therefore that

78
00:03:48,800 --> 00:03:52,833
I have to go to the documentation online
to figure out how to do it.

79
00:03:52,833 --> 00:03:53,500
All right.

80
00:03:53,500 --> 00:03:57,900
I'm just training you to be independent
and quickly find information

81
00:03:57,900 --> 00:03:58,966
whenever you need it.