1 00:00:00,066 --> 00:00:01,733 And now let's check it out. 2 00:00:01,733 --> 00:00:02,866 The model is actually ready. 3 00:00:02,866 --> 00:00:05,900 We can actually replace that with decision tree 4 00:00:05,900 --> 00:00:08,900 regression. 5 00:00:09,866 --> 00:00:12,266 And now let's check it out and you will see 6 00:00:12,266 --> 00:00:16,233 what the real decision tree regression model looks like in 1D. 7 00:00:16,500 --> 00:00:18,000 So let's do it. 8 00:00:18,000 --> 00:00:21,000 Let's select all this and execute. 9 00:00:21,533 --> 00:00:23,266 And here it is. 10 00:00:23,266 --> 00:00:24,500 That's what it looks like. 11 00:00:24,500 --> 00:00:25,200 And actually, 12 00:00:25,200 --> 00:00:28,733 since it's a non-continuous model, we should even have strictly 13 00:00:28,800 --> 00:00:29,933 vertical lines here, 14 00:00:29,933 --> 00:00:32,933 you know, to better represent that non-continuity. 15 00:00:32,933 --> 00:00:35,866 And to do this we just need to increase the resolution. 16 00:00:35,866 --> 00:00:39,000 So I'm going to put 0.01, and you'll see that 17 00:00:39,000 --> 00:00:42,800 we'll get the real representation of the decision tree regression results. 18 00:00:43,366 --> 00:00:45,766 So let's do it. 19 00:00:45,766 --> 00:00:47,700 Here it is, now almost vertical. 20 00:00:47,700 --> 00:00:51,166 And that's a clear representation of the decision tree regression model. 21 00:00:51,466 --> 00:00:53,033 So now let's zoom in on it. 22 00:00:53,033 --> 00:00:57,700 And now it makes much more sense because, as Carol explained, 23 00:00:57,700 --> 00:01:00,800 based on the entropy and the information gain, it splits 24 00:01:00,800 --> 00:01:04,333 the whole range of your independent variable into different intervals. 25 00:01:04,500 --> 00:01:07,166 So here we can clearly see where the intervals are. 26 00:01:07,166 --> 00:01:10,400 The first interval is from 1 to 6.5. 27 00:01:10,666 --> 00:01:14,533 The second interval is from 6.5 to 8.5. 
28 00:01:14,566 --> 00:01:18,866 Then the third interval is from 8.5 to 9.5, 29 00:01:19,033 --> 00:01:22,533 and finally the last interval is from 9.5 to 10. 30 00:01:22,833 --> 00:01:23,700 So there we go. 31 00:01:23,700 --> 00:01:25,633 We can clearly see the intervals now. 32 00:01:25,633 --> 00:01:30,133 And as Carol explained in the intuition tutorial, the decision tree regression 33 00:01:30,133 --> 00:01:33,900 model takes the average of the dependent variable values 34 00:01:34,000 --> 00:01:35,766 in each of the intervals. 35 00:01:35,766 --> 00:01:38,833 This one, this one, this one, and this one. 36 00:01:39,000 --> 00:01:39,733 So for example, 37 00:01:39,733 --> 00:01:44,100 if we consider this interval here, the average of the salaries in this interval, 38 00:01:44,100 --> 00:01:45,100 well, that's very simple. 39 00:01:45,100 --> 00:01:47,966 It's actually 250,000. 40 00:01:47,966 --> 00:01:52,500 And so for each level between 6.5 and 8.5, 41 00:01:52,600 --> 00:01:56,766 the salary will be predicted to be $250,000. 42 00:01:56,833 --> 00:02:02,400 So we already know what our model will predict for our 6.5 level here. 43 00:02:02,400 --> 00:02:05,233 It's going to predict 250K. 44 00:02:05,233 --> 00:02:06,666 So, speaking of this prediction, 45 00:02:06,666 --> 00:02:10,433 and now that we understand the decision tree regression graphical results very well, 46 00:02:10,600 --> 00:02:14,666 let's actually check that the predicted previous salary of this employee 47 00:02:14,666 --> 00:02:19,933 who had a 6.5 level in their previous company is actually 250K. 48 00:02:20,366 --> 00:02:21,600 Let's check it out. 49 00:02:21,600 --> 00:02:24,200 Let's select this line and execute. 50 00:02:24,200 --> 00:02:29,066 And here it is: 250,000, exactly as we predicted, 51 00:02:29,100 --> 00:02:32,000 because we can clearly see that on this plot. 52 00:02:32,000 --> 00:02:34,333 So now we just want to say two things to conclude. 
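The interval averaging described above can be sketched in a few lines of Python. This is only an illustration, not the code run in the lecture: the ten levels and salaries are assumed data (the transcript never lists them), chosen so that the average over the 6.5 to 8.5 interval is the 250,000 quoted. The split points 6.5, 8.5 and 9.5 are the ones read off the plot, and the names `interval_index` and `predict` are hypothetical. Sending a level that sits exactly on a split point into the upper interval is also an assumption, inferred from the 250K prediction for level 6.5.

```python
from bisect import bisect_right

# Assumed data, not given in the transcript: ten position levels and salaries.
levels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
salaries = [45_000, 50_000, 60_000, 80_000, 110_000,
            150_000, 200_000, 300_000, 500_000, 1_000_000]

# Split points quoted in the lecture.
thresholds = [6.5, 8.5, 9.5]


def interval_index(level):
    """Return the index of the interval that contains `level`.

    bisect_right places a level equal to a split point in the upper
    interval, so 6.5 falls in [6.5, 8.5) (an assumed convention).
    """
    return bisect_right(thresholds, level)


# Group the training salaries by interval, then average each group.
groups = [[] for _ in range(len(thresholds) + 1)]
for level, salary in zip(levels, salaries):
    groups[interval_index(level)].append(salary)
interval_means = [sum(group) / len(group) for group in groups]


def predict(level):
    """Predict the mean salary of the interval that `level` falls into."""
    return interval_means[interval_index(level)]


print(interval_means)  # [82500.0, 250000.0, 500000.0, 1000000.0]
print(predict(6.5))    # 250000.0

# Evaluate on a fine grid (step 0.01) from level 1 to level 10.
grid = [1 + i * 0.01 for i in range(901)]
grid_predictions = [predict(x) for x in grid]

# The prediction changes only where the grid crosses a split point.
jumps = sum(a != b for a, b in zip(grid_predictions, grid_predictions[1:]))
print(jumps)  # 3
```

With a 0.01 step, each change in prediction happens between two neighbouring grid points only 0.01 apart, which is why the plotted steps look almost vertical.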
53 00:02:34,333 --> 00:02:38,100 The decision tree regression model is not an interesting model in 1D, 54 00:02:38,333 --> 00:02:42,233 but it can be a very interesting and very powerful model in more dimensions. 55 00:02:42,566 --> 00:02:45,600 So that's why you can use this code here for your data set. 56 00:02:45,700 --> 00:02:47,633 You have the code here that builds the model. 57 00:02:47,633 --> 00:02:49,666 And this one here then makes a prediction. 58 00:02:49,666 --> 00:02:51,433 But then you won't be able to use this code 59 00:02:51,433 --> 00:02:52,800 because you will probably have 60 00:02:52,800 --> 00:02:55,733 a lot of independent variables and therefore a lot of dimensions. 61 00:02:55,733 --> 00:02:58,066 But, you know, Carol gave you the explanation in 2D. 62 00:02:58,066 --> 00:03:01,200 I'm giving you the explanation in 1D so that now 63 00:03:01,200 --> 00:03:03,400 you can perfectly picture in your head 64 00:03:03,400 --> 00:03:06,000 the decision tree regression and how it works. 65 00:03:06,000 --> 00:03:08,800 And now I would like to end this tutorial with an enigma. 66 00:03:08,800 --> 00:03:12,633 In the next section you will see random forests, and a random 67 00:03:12,633 --> 00:03:14,133 forest is actually really simple. 68 00:03:14,133 --> 00:03:17,100 It's just a team of several decision trees. 69 00:03:17,100 --> 00:03:20,100 So knowing that this is the result of one tree, 70 00:03:20,300 --> 00:03:23,500 what do you think we'll get with a team of ten trees 71 00:03:23,833 --> 00:03:26,766 or even 100 trees or 500 trees? 72 00:03:26,766 --> 00:03:29,200 The first question is, do you think we will get this shape 73 00:03:29,200 --> 00:03:30,600 of stairs here? 
74 00:03:30,600 --> 00:03:35,000 And the second question is, do you think we'll get a much more accurate prediction, 75 00:03:35,000 --> 00:03:39,333 like a prediction that gets very close to the 160K 76 00:03:39,333 --> 00:03:41,633 that is supposed to be the real salary? 77 00:03:41,633 --> 00:03:44,633 So these are the two questions. I will let you think about them. 78 00:03:44,733 --> 00:03:47,733 And I look forward to giving you the solution in the next section. 79 00:03:48,200 --> 00:03:49,866 Until then, enjoy machine learning.
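Going back to the concluding point that building the model and making a prediction carry over to several independent variables, here is a minimal pure-Python sketch of a regression tree over more than one feature. It is not the lecture's code. The four data rows are hypothetical, the function names are made up for illustration, and the split criterion is reduction in squared error, which is an assumption of this sketch (the lecture speaks of entropy and information gain).

```python
def squared_error(targets):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not targets:
        return 0.0
    mean = sum(targets) / len(targets)
    return sum((t - mean) ** 2 for t in targets)


def best_split(rows, targets):
    """Return the (feature, threshold) that lowers squared error most, or None."""
    best, best_error = None, squared_error(targets)
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between neighbouring values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [t for r, t in zip(rows, targets) if r[feature] < threshold]
            right = [t for r, t in zip(rows, targets) if r[feature] >= threshold]
            error = squared_error(left) + squared_error(right)
            if error < best_error:
                best, best_error = (feature, threshold), error
    return best


def build_tree(rows, targets, max_depth):
    """Grow a tree: a leaf is a float (mean target), a node is a 4-tuple."""
    split = best_split(rows, targets) if max_depth > 0 else None
    if split is None:
        return sum(targets) / len(targets)
    feature, threshold = split
    left = [(r, t) for r, t in zip(rows, targets) if r[feature] < threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[feature] >= threshold]
    return (
        feature,
        threshold,
        build_tree([r for r, _ in left], [t for _, t in left], max_depth - 1),
        build_tree([r for r, _ in right], [t for _, t in right], max_depth - 1),
    )


def predict(tree, row):
    """Walk down the tree until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if row[feature] < threshold else right
    return tree


# Hypothetical data: (level, years of experience) -> salary in thousands.
rows = [(1, 1), (1, 2), (2, 1), (2, 2)]
targets = [10, 10, 30, 50]

tree = build_tree(rows, targets, max_depth=2)
print(tree)                   # (0, 1.5, 10.0, (1, 1.5, 30.0, 50.0))
print(predict(tree, (2, 2)))  # 50.0
```

Nothing in `build_tree` or `predict` depends on the number of features, which matches the remark that the model and prediction code can be reused as is. A plot of prediction against a single variable, like the 1D one in this tutorial, has no direct equivalent once there are many features.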