1 00:00:00,066 --> 00:00:02,500 Hello and welcome to this R tutorial. 2 00:00:02,500 --> 00:00:05,866 In the previous sections, we already introduced two nonlinear regression models: 3 00:00:05,866 --> 00:00:09,066 the polynomial regression model and the SVR regression model. 4 00:00:09,366 --> 00:00:12,700 And today we're going to introduce a new nonlinear regression model, 5 00:00:12,700 --> 00:00:15,700 which is the decision tree regression model. 6 00:00:16,033 --> 00:00:18,433 So far the best model we had for our data set 7 00:00:18,433 --> 00:00:21,500 and our specific problem was the polynomial regression model. 8 00:00:21,700 --> 00:00:25,366 Let's see how decision tree regression will do and compare it 9 00:00:25,366 --> 00:00:26,833 to the previous ones. 10 00:00:26,833 --> 00:00:28,633 So let's start with the basics. 11 00:00:28,633 --> 00:00:30,566 Let's set the right folder as working directory. 12 00:00:30,566 --> 00:00:33,266 So right now I'm going to the regression folder. 13 00:00:33,266 --> 00:00:36,266 And we are now in decision tree regression. 14 00:00:36,266 --> 00:00:38,033 I'm just getting to the end. 15 00:00:38,033 --> 00:00:41,933 So this is the right folder containing the position salary CSV file. 16 00:00:41,933 --> 00:00:46,333 Make sure of that, then click on this More button here and Set As Working Directory. 17 00:00:46,666 --> 00:00:47,333 All good. 18 00:00:47,333 --> 00:00:51,600 And now let's take our regression template to build this model efficiently. 19 00:00:51,933 --> 00:00:54,533 So we take everything from here to here 20 00:00:55,533 --> 00:00:58,200 and paste it here. Right. 21 00:00:58,200 --> 00:01:00,266 And now we just need to change a few things. 22 00:01:00,266 --> 00:01:01,900 So let's change the basics. 23 00:01:01,900 --> 00:01:05,400 Let's replace Regression Model by Decision Tree 24 00:01:07,466 --> 00:01:09,766 Regression. 25 00:01:09,766 --> 00:01:10,766 All right. 
26 00:01:10,766 --> 00:01:15,300 We're going to copy that and put that same title here. 27 00:01:15,500 --> 00:01:15,966 All right. 28 00:01:15,966 --> 00:01:18,500 Visualizing the decision tree regression results. 29 00:01:18,500 --> 00:01:20,333 And same here. 30 00:01:20,333 --> 00:01:21,000 All right. 31 00:01:21,000 --> 00:01:22,066 And now let's change 32 00:01:22,066 --> 00:01:25,966 the most important thing, which is the part where we create our regressor. 33 00:01:25,966 --> 00:01:28,633 So I'm going to remove that comment line here. 34 00:01:28,633 --> 00:01:31,533 And now let's build our decision tree regression model. 35 00:01:31,533 --> 00:01:36,000 So it's going to be the same thing as every time: we're going to import a package 36 00:01:36,200 --> 00:01:39,600 and then use a function from this package to build our regressor. 37 00:01:40,000 --> 00:01:42,000 So this package for decision tree regression 38 00:01:42,000 --> 00:01:44,666 in R will be the rpart package. 39 00:01:44,666 --> 00:01:49,400 For those of you who don't have the rpart package in your packages here, 40 00:01:49,700 --> 00:01:52,866 I will type this command line so that you can install it. 41 00:01:53,200 --> 00:01:55,466 So as you can see, mine is already here: rpart. 42 00:01:55,466 --> 00:01:56,533 Just check if you have it, 43 00:01:56,533 --> 00:02:00,800 and if you don't have it, just type here install.packages. 44 00:02:01,200 --> 00:02:05,133 And then in the parentheses, in quotes, you type the name of the package, which is 45 00:02:05,400 --> 00:02:05,966 rpart. 46 00:02:07,033 --> 00:02:07,500 All right. 47 00:02:07,500 --> 00:02:09,900 And then you just select this line and execute. 48 00:02:09,900 --> 00:02:11,966 And this will install the package. 49 00:02:11,966 --> 00:02:12,300 All right. 50 00:02:12,300 --> 00:02:15,300 But I'm going to put this line as comment. 51 00:02:15,733 --> 00:02:17,933 Just press Command + Shift + C. 
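The install step just described can be sketched as follows. The `requireNamespace` check is an addition for illustration and is not something typed in the video, where `install.packages('rpart')` is simply run once and then commented out.

```r
# One-time installation of the rpart package.
# Assumption: the guard below is added here so the script can be re-run safely;
# the tutorial instead executes install.packages('rpart') once and comments it out.
if (!requireNamespace("rpart", quietly = TRUE)) {
  install.packages("rpart")
}
```

With the guard in place, the line does not need to be commented out after the first run, because nothing is reinstalled when the package is already present.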
52 00:02:17,933 --> 00:02:21,266 And now let's also add this line: library, 53 00:02:21,900 --> 00:02:25,333 and in parentheses the name of the package we want to select. 54 00:02:25,333 --> 00:02:28,333 We'll import rpart. Right. 55 00:02:28,500 --> 00:02:31,333 And as you can see, rpart is not selected here. 56 00:02:31,333 --> 00:02:33,733 And when this line of code is executed, 57 00:02:33,733 --> 00:02:35,766 then this will be selected here. 58 00:02:35,766 --> 00:02:36,300 All right. 59 00:02:36,300 --> 00:02:39,100 And now we are ready to start building the model. 60 00:02:39,100 --> 00:02:43,266 So as I just said, we're going to take a function from this rpart library. 61 00:02:43,733 --> 00:02:46,200 And this function is actually called rpart as well. 62 00:02:46,200 --> 00:02:47,733 So let's use this function. 63 00:02:47,733 --> 00:02:50,633 And as usual we're going to call our regressor regressor. 64 00:02:50,633 --> 00:02:52,300 So here it is. 65 00:02:52,300 --> 00:02:55,066 So that's our decision tree regression regressor. 66 00:02:55,066 --> 00:02:58,066 And let's now use the rpart function. 67 00:02:58,433 --> 00:02:59,133 All right. 68 00:02:59,133 --> 00:03:01,666 So now let's see what arguments we need to input. 69 00:03:01,666 --> 00:03:05,966 The best way to do that is to press F1 here 70 00:03:06,266 --> 00:03:09,000 so that we get the info about the rpart library. 71 00:03:09,000 --> 00:03:10,633 So usually we have it here. 72 00:03:10,633 --> 00:03:13,433 But then we just need to click on this link here. 73 00:03:13,433 --> 00:03:16,800 And that will give us info about this rpart library. Okay. 74 00:03:16,800 --> 00:03:18,300 So let's see what we have. 75 00:03:18,300 --> 00:03:19,733 The first argument is formula. 76 00:03:19,733 --> 00:03:21,200 So you know what to do here. 77 00:03:21,200 --> 00:03:25,000 We just need to write formula equals the dependent variable. 78 00:03:25,000 --> 00:03:25,800 Then a tilde. 
79 00:03:25,800 --> 00:03:29,266 And then a dot that represents all your independent variables. 80 00:03:29,500 --> 00:03:32,200 All right. But that we know perfectly. Then data. 81 00:03:32,200 --> 00:03:34,733 So data is the data set on which we want to build 82 00:03:34,733 --> 00:03:36,600 our decision tree regression model. 83 00:03:36,600 --> 00:03:40,966 So since we did not build any training set, data is going to be the data 84 00:03:40,966 --> 00:03:44,700 set, exactly like polynomial regression and SVR regression. 85 00:03:45,233 --> 00:03:48,066 All right. Then weights is an optional argument. 86 00:03:48,066 --> 00:03:51,300 You can add some weights to make your model more advanced. 87 00:03:51,300 --> 00:03:52,833 But you know this is more advanced. 88 00:03:52,833 --> 00:03:55,000 So we will not cover that right now. 89 00:03:55,000 --> 00:03:58,566 And then you have some other arguments that are optional that can help you 90 00:03:58,600 --> 00:04:02,533 make your model even more robust or, you know, include some regularization 91 00:04:02,533 --> 00:04:06,733 techniques, regularization to prevent overfitting, that sort of thing. 92 00:04:06,733 --> 00:04:09,766 But right now we just want to build a simple decision 93 00:04:09,766 --> 00:04:12,766 tree regression model because we have a simple data set. 94 00:04:12,766 --> 00:04:15,166 And so we only need formula and data. 95 00:04:15,166 --> 00:04:19,600 So let's go back to R and let's input these arguments. 96 00:04:19,600 --> 00:04:21,600 So the first argument was formula. 97 00:04:22,966 --> 00:04:24,433 And then as I mentioned, 98 00:04:24,433 --> 00:04:28,100 it's formula equals Salary because it's our dependent variable. 99 00:04:28,866 --> 00:04:31,333 Then tilde, just pressing Alt 100 00:04:31,333 --> 00:04:32,866 and here, 101 00:04:32,866 --> 00:04:35,866 and then a dot, or actually Level. 
102 00:04:36,366 --> 00:04:39,666 But you know, I want to make this a script that can also be 103 00:04:39,666 --> 00:04:42,300 applied to your data sets. So I'll use a dot. 104 00:04:42,300 --> 00:04:42,966 All right. 105 00:04:42,966 --> 00:04:44,666 So that's the first argument. 106 00:04:44,666 --> 00:04:46,166 And now let's give the second one. 107 00:04:46,166 --> 00:04:49,166 The second one was data. 108 00:04:49,333 --> 00:04:52,433 So data equals dataset. 109 00:04:53,100 --> 00:04:56,100 And now our regressor is ready to be built.
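Put together, the call dictated above looks roughly like the sketch below. The data frame is a hypothetical stand-in with made-up values, used only so the snippet runs on its own; in the tutorial, `dataset` comes from the CSV file loaded by the regression template. The column names `Level` and `Salary` follow the names mentioned in the narration.

```r
library(rpart)

# Hypothetical stand-in for the tutorial's data set (illustrative values only,
# not the contents of the actual CSV file).
dataset <- data.frame(
  Level  = 1:10,
  Salary = c(40, 45, 55, 70, 95, 130, 180, 260, 400, 800) * 1000
)

# Build the decision tree regressor with the two arguments used in the tutorial:
# the formula (dependent variable ~ all other columns) and the data.
regressor <- rpart(formula = Salary ~ ., data = dataset)

# Predictions on the same rows, one value per observation.
y_pred <- predict(regressor, dataset)
```

Writing `Salary ~ .` rather than `Salary ~ Level` keeps the script reusable on data sets with different or additional independent variables, which is the reason given in the narration for using the dot.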