1 00:00:00,133 --> 00:00:01,700 Let's do it. 2 00:00:01,700 --> 00:00:02,733 So it's ready to be built. 3 00:00:02,733 --> 00:00:06,933 And that means all our code is ready to be executed, 4 00:00:06,933 --> 00:00:10,166 because we don't need to change anything more. 5 00:00:10,566 --> 00:00:13,500 So let's execute these sections one by one, 6 00:00:13,500 --> 00:00:16,500 and let's see what happens with decision tree regression. 7 00:00:16,766 --> 00:00:19,766 So I'm going to select the first section. 8 00:00:19,800 --> 00:00:22,533 Execute. The data set is imported. 9 00:00:22,533 --> 00:00:23,966 Here it is. 10 00:00:23,966 --> 00:00:24,533 All right. 11 00:00:24,533 --> 00:00:28,900 So then no need to split the data set into a training set and a test set, 12 00:00:28,900 --> 00:00:31,900 because, as you can see, this is a very small data set. 13 00:00:32,100 --> 00:00:35,133 Then no need for feature scaling because for decision trees 14 00:00:35,266 --> 00:00:37,000 we don't need to do any feature scaling, 15 00:00:37,000 --> 00:00:40,066 because the way this model is built is based on conditions 16 00:00:40,066 --> 00:00:43,533 on the independent variable and not on Euclidean distances. 17 00:00:43,900 --> 00:00:45,333 So we're fine with that. 18 00:00:45,333 --> 00:00:49,433 We definitely don't need to apply feature scaling, and we can move on 19 00:00:49,433 --> 00:00:52,433 to the next step, which is to create our model. 20 00:00:52,433 --> 00:00:55,566 So let's create it. Executing. All right. 21 00:00:55,566 --> 00:00:58,100 Perfect, the regressor is created. 22 00:00:58,100 --> 00:01:01,100 And now let's get our final verdict. 23 00:01:01,233 --> 00:01:01,600 Okay. 24 00:01:01,600 --> 00:01:04,766 So 160K according to this person. 25 00:01:04,766 --> 00:01:08,366 And now let's see the predicted salary according to our model. 26 00:01:08,900 --> 00:01:14,433 So executing this, and we get $249,000. 
27 00:01:14,433 --> 00:01:18,666 Well, much higher than the salary mentioned by this person. 28 00:01:18,733 --> 00:01:20,666 But let's not draw 29 00:01:20,666 --> 00:01:24,100 hasty conclusions and let's see what's happening on the graph here. 30 00:01:24,566 --> 00:01:27,200 So I'm going to select all this 31 00:01:27,200 --> 00:01:30,600 and let's see what's happening with the decision tree regression results. 32 00:01:32,233 --> 00:01:32,733 All right. 33 00:01:32,733 --> 00:01:34,400 That's what I thought. Okay. 34 00:01:34,400 --> 00:01:37,666 So we don't need to zoom in to clearly see what's happening here. 35 00:01:37,900 --> 00:01:40,700 It's plotting a straight horizontal line, 36 00:01:40,700 --> 00:01:44,433 exactly like we saw in SVR for Python. 37 00:01:44,433 --> 00:01:45,100 For those of you 38 00:01:45,100 --> 00:01:47,100 who didn't follow the Python tutorial, 39 00:01:47,100 --> 00:01:49,400 note that we already encountered this situation, 40 00:01:49,400 --> 00:01:52,400 where we get a straight horizontal line. 41 00:01:52,500 --> 00:01:54,533 And actually, in SVR, 42 00:01:54,533 --> 00:01:59,766 this was due to the fact that we didn't apply feature scaling to our data set. 43 00:02:00,466 --> 00:02:03,466 So what do you think the problem is here? 44 00:02:03,533 --> 00:02:06,466 Do you think it's due to the fact that we didn't apply feature scaling 45 00:02:06,466 --> 00:02:10,766 like for SVR, and we need to apply feature scaling to get a model properly fitting 46 00:02:10,766 --> 00:02:12,133 the data set? 47 00:02:12,133 --> 00:02:15,133 Well, as I mentioned at the beginning of this tutorial, 48 00:02:15,133 --> 00:02:18,366 we definitely don't need to apply feature scaling for decision trees because 49 00:02:18,366 --> 00:02:22,933 decision tree regression models are based on conditions on the independent variable, 50 00:02:22,933 --> 00:02:25,700 and that has nothing to do with Euclidean distances. 
51 00:02:25,700 --> 00:02:27,966 And you know, when we need to apply feature scaling, it's 52 00:02:27,966 --> 00:02:31,566 because the machine learning models are based on Euclidean distances, 53 00:02:31,566 --> 00:02:35,500 and we need to put all the independent variables on the same scale so that one 54 00:02:35,500 --> 00:02:38,500 independent variable is not dominating another one. 55 00:02:38,533 --> 00:02:40,600 But this is not the problem here. 56 00:02:40,600 --> 00:02:42,666 This is not about feature scaling. 57 00:02:42,666 --> 00:02:46,200 You can try to apply feature scaling here and re-execute this, 58 00:02:46,466 --> 00:02:49,466 but you'll get the same problem with a straight horizontal line. 59 00:02:49,466 --> 00:02:52,433 And of course this is actually a decision tree model. 60 00:02:52,433 --> 00:02:55,200 This is actually one model of decision tree. 61 00:02:55,200 --> 00:02:56,566 But this is of course 62 00:02:56,566 --> 00:02:59,866 not the best version of decision tree regression we want to get. 63 00:03:00,300 --> 00:03:03,000 So can you start seeing what the problem is here? 64 00:03:03,000 --> 00:03:07,366 And especially after watching the intuition tutorial made by Kirill, 65 00:03:07,400 --> 00:03:08,500 can you spot the problem? 66 00:03:09,533 --> 00:03:10,966 Okay, I'm going to tell you: 67 00:03:10,966 --> 00:03:14,400 this problem is related to the number of splits. 68 00:03:14,700 --> 00:03:18,266 Because, you know, the way the decision tree regression model is made 69 00:03:18,533 --> 00:03:21,833 is that it's making some splits based on different conditions. 70 00:03:21,833 --> 00:03:23,166 So the more conditions you have 71 00:03:23,166 --> 00:03:25,466 on your independent variables, the more splits you have. 72 00:03:25,466 --> 00:03:28,633 And here we clearly have no split, because, you know, 73 00:03:28,800 --> 00:03:32,666 all the predictions are equal to $250,000. 
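To make the flat line concrete, here is a minimal sketch in R. It assumes a hypothetical ten-level salary table (the values below are illustrative and are not taken from the transcript) and fits `rpart` with its default settings. With only ten rows, the default `minsplit` of 20 means no split is attempted, so every prediction is the mean salary, and standardizing `Level` does not change that.

```r
library(rpart)

# Hypothetical ten-level salary data (values are assumptions for illustration)
dataset <- data.frame(
  Level  = 1:10,
  Salary = c(45000, 50000, 60000, 80000, 110000,
             150000, 200000, 300000, 500000, 1000000)
)

# Default control: minsplit = 20, but the root node holds only 10 rows,
# so no split is attempted and the tree is a single leaf
regressor <- rpart(Salary ~ Level, data = dataset)
pred_raw <- predict(regressor, dataset)

# Same model fitted on a standardized copy of the feature
scaled <- dataset
scaled$Level <- as.numeric(scale(scaled$Level))
regressor_scaled <- rpart(Salary ~ Level, data = scaled)
pred_scaled <- predict(regressor_scaled, scaled)

# Both trees predict one single value: the mean of Salary
print(unique(pred_raw))
print(unique(pred_scaled))
print(mean(dataset$Salary))
```

The names `dataset`, `regressor`, and `scaled` are placeholders chosen for this sketch; the point is only that scaling leaves the single-leaf tree unchanged.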
74 00:03:32,700 --> 00:03:34,766 So, you know, it took all the different salaries 75 00:03:34,766 --> 00:03:37,766 for the ten different levels here and made an average 76 00:03:37,800 --> 00:03:40,533 and just gave the average for all the levels. 77 00:03:40,533 --> 00:03:43,300 So no conditions here, no splits. 78 00:03:43,300 --> 00:03:46,366 And therefore that's absolutely not interesting, 79 00:03:46,666 --> 00:03:49,766 especially given the potential the decision tree can have. 80 00:03:50,000 --> 00:03:51,133 So what we'll do now 81 00:03:51,133 --> 00:03:55,666 is add a parameter here that will set a condition on the splits. 82 00:03:55,900 --> 00:03:57,333 You know, that's what I was telling you. 83 00:03:57,333 --> 00:03:59,900 We have several parameters in this rpart library. 84 00:03:59,900 --> 00:04:04,366 And we can use these optional parameters to improve our model 85 00:04:04,366 --> 00:04:05,700 and make it more robust. 86 00:04:05,700 --> 00:04:07,833 Well, this is exactly what we're going to do now. 87 00:04:07,833 --> 00:04:10,600 We are going to get back to rpart. 88 00:04:10,600 --> 00:04:12,866 So I'm going to press F1 here. 89 00:04:12,866 --> 00:04:15,500 Oh, actually this time rpart is showing up. Okay. 90 00:04:15,500 --> 00:04:17,566 So great, rpart's here. 91 00:04:17,566 --> 00:04:21,200 And as I mentioned, rpart has several parameters 92 00:04:21,200 --> 00:04:24,200 that we can use to make our model more robust. 93 00:04:24,300 --> 00:04:26,900 And the one we're interested in right now is one parameter 94 00:04:26,900 --> 00:04:29,900 that will correct this problem we had with the splits. 95 00:04:30,300 --> 00:04:33,733 So this parameter is actually the control parameter. 96 00:04:33,733 --> 00:04:37,800 And right now I'm going to give you a little trick to solve this problem 97 00:04:37,800 --> 00:04:39,933 on the splits we have just obtained here. 
98 00:04:39,933 --> 00:04:43,500 So I'm going to add this third optional argument, 99 00:04:43,500 --> 00:04:45,266 you know, to improve our model. 100 00:04:45,266 --> 00:04:47,900 Right now we're doing some model performance improvement. 101 00:04:47,900 --> 00:04:51,666 So that's a thing that machine learning scientists do very often in their job. 102 00:04:51,666 --> 00:04:53,833 So don't worry, we'll get more advanced sections on it, 103 00:04:53,833 --> 00:04:57,800 especially when we cover cross-validation to find the best models, 104 00:04:57,800 --> 00:04:59,700 selecting the best parameters. 105 00:04:59,700 --> 00:05:02,666 But here we'll just do some simple model performance 106 00:05:02,666 --> 00:05:05,700 improvement and we'll just add this control parameter. 107 00:05:06,066 --> 00:05:08,066 And then I'm going to give you this little trick. 108 00:05:08,066 --> 00:05:13,000 So this little trick is to take the rpart library again. 109 00:05:13,733 --> 00:05:16,733 So we're taking rpart.control here, which is a function. 110 00:05:16,900 --> 00:05:19,200 And in this function we're going to add an argument. 111 00:05:19,200 --> 00:05:22,433 As you can see here in this yellow rectangle, 112 00:05:22,500 --> 00:05:24,766 we have the first argument, which is minsplit. 113 00:05:24,766 --> 00:05:26,666 And that's exactly what we're interested in. 114 00:05:26,666 --> 00:05:29,033 And that's what will solve our problem. 115 00:05:29,033 --> 00:05:30,466 Because, you know, 116 00:05:30,466 --> 00:05:33,466 actually we didn't have any split here because it just took the average. 117 00:05:33,600 --> 00:05:37,300 So it's like we had no conditions on the independent variables and no splits. 118 00:05:37,666 --> 00:05:40,966 So to make sure we have some splits and some conditions 119 00:05:40,966 --> 00:05:44,000 on the independent variables, we will actually set 120 00:05:44,300 --> 00:05:48,733 minsplit to one and that will solve the problem.
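The trick described above could be sketched as follows. This is an illustration under assumptions, not the exact code shown on screen: the data frame is a hypothetical ten-level salary table, and the names `dataset`, `regressor`, and `y_pred` are placeholders. The only change from a default fit is the third argument, `control = rpart.control(minsplit = 1)`.

```r
library(rpart)

# Hypothetical ten-level salary data (values are assumptions for illustration)
dataset <- data.frame(
  Level  = 1:10,
  Salary = c(45000, 50000, 60000, 80000, 110000,
             150000, 200000, 300000, 500000, 1000000)
)

# Lower minsplit from its default of 20 so that nodes in this
# ten-row data set are allowed to be split
regressor <- rpart(
  Salary ~ Level,
  data    = dataset,
  control = rpart.control(minsplit = 1)
)

# Predict the salary for level 6.5
y_pred <- predict(regressor, data.frame(Level = 6.5))
print(y_pred)

# The fitted tree now has several leaves, so the predictions
# over the ten levels are no longer one constant value
print(length(unique(predict(regressor, dataset))))
```

Other arguments of `rpart.control`, such as `cp`, still limit how far the tree grows, so the number of leaves depends on the data.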