1 00:00:00,033 --> 00:00:00,966 Hello, my friends. 2 00:00:00,966 --> 00:00:02,100 All right, let's do this. 3 00:00:02,100 --> 00:00:06,133 Let's build and train the decision tree regression model on the whole data set. 4 00:00:06,366 --> 00:00:09,266 So I actually could ask you to do it yourself. 5 00:00:09,266 --> 00:00:12,866 You know, even without giving you the hint of what class you would use. 6 00:00:13,166 --> 00:00:17,233 Because I also would like you to practice researching the documentation. 7 00:00:17,466 --> 00:00:20,733 So if I were to ask you to try to do this before me, you know, 8 00:00:20,733 --> 00:00:24,500 to build and train this decision tree regression model on this data set, 9 00:00:24,800 --> 00:00:28,200 well, what you would have to do is go to Google or Bing 10 00:00:28,200 --> 00:00:32,666 and then type in the search bar "decision tree regression class of scikit-learn". 11 00:00:32,966 --> 00:00:34,533 Then you would find the name of the class. 12 00:00:34,533 --> 00:00:36,400 It would probably be on the first link. 13 00:00:36,400 --> 00:00:40,500 And then based on all the work we did before, you know, on how to build 14 00:00:40,500 --> 00:00:42,766 and train a machine learning model with scikit-learn, 15 00:00:42,766 --> 00:00:45,766 you would totally be able to do this whole step on your own. 16 00:00:45,866 --> 00:00:46,200 All right. 17 00:00:46,200 --> 00:00:48,966 So if you want to do this, please press pause on this video 18 00:00:48,966 --> 00:00:52,066 and do the exercise, and I'm sure you'll be proud to find the solution. 19 00:00:52,566 --> 00:00:53,100 All right. 20 00:00:53,100 --> 00:00:55,700 And now let's build a solution together, 21 00:00:55,700 --> 00:00:58,500 starting with the creation of a new code cell. 22 00:00:58,500 --> 00:00:59,600 And here we go. 
23 00:00:59,600 --> 00:01:02,966 The name of the class that can build a decision tree regression 24 00:01:02,966 --> 00:01:07,233 model in scikit-learn is DecisionTreeRegressor. 25 00:01:07,633 --> 00:01:12,000 It is a class that belongs to the tree module of scikit-learn. 26 00:01:12,166 --> 00:01:16,133 And therefore here we're going to start from scikit-learn, 27 00:01:16,533 --> 00:01:20,333 from which we're going to call that tree module 28 00:01:20,466 --> 00:01:24,233 and from which we're going to import that decision... 29 00:01:24,833 --> 00:01:27,300 Google Colab will guess it. There we go. 30 00:01:27,300 --> 00:01:30,066 So be careful. It's not DecisionTreeClassifier. 31 00:01:30,066 --> 00:01:34,633 That would be for tree classification, but DecisionTreeRegressor. 32 00:01:34,900 --> 00:01:35,466 All right. 33 00:01:35,466 --> 00:01:37,800 This is based on the same model, decision trees. 34 00:01:37,800 --> 00:01:41,933 But DecisionTreeRegressor will predict a continuous numerical value, 35 00:01:41,933 --> 00:01:44,933 and DecisionTreeClassifier will predict a category. 36 00:01:45,166 --> 00:01:50,100 All right, then the next natural step here is of course to create an object 37 00:01:50,266 --> 00:01:53,466 or an instance of that DecisionTreeRegressor class. 38 00:01:53,700 --> 00:01:56,700 And we're going to call that, as usual, regressor, 39 00:01:57,200 --> 00:02:01,700 which will be equal of course to the call of this class. 40 00:02:01,700 --> 00:02:06,700 So I'm copying this and pasting that right here and adding some parentheses. 41 00:02:07,266 --> 00:02:07,566 All right. 42 00:02:07,566 --> 00:02:10,600 So now the question is, do we have to input anything here in the parentheses? 43 00:02:10,800 --> 00:02:12,133 Well, actually no. You know, 44 00:02:12,133 --> 00:02:16,133 there are not many parameters to tune in the decision tree regression model. 
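As a rough sketch of the import and instantiation described here: the code below imports both classes from the tree module and creates one object of each, only to illustrate the regressor/classifier distinction. The `is_regressor` and `is_classifier` helpers from `sklearn.base` and the `classifier` variable are additions for illustration and are not part of the tutorial.

```python
# Both classes live in scikit-learn's tree module.
from sklearn.base import is_classifier, is_regressor
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# The object the tutorial creates: a regressor, for continuous targets.
regressor = DecisionTreeRegressor()

# Shown only for contrast (not used in the tutorial): predicts categories.
classifier = DecisionTreeClassifier()

print(is_regressor(regressor), is_classifier(regressor))
print(is_regressor(classifier), is_classifier(classifier))
```

The two helper calls simply report which kind of estimator each object is, which is a quick way to confirm the right class was imported.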
45 00:02:16,233 --> 00:02:20,100 I don't recommend spending too much time tuning it; I'd rather just try it 46 00:02:20,200 --> 00:02:22,200 among your other regression models. 47 00:02:22,200 --> 00:02:26,166 But if you really want to tune it, please note that there is 48 00:02:26,166 --> 00:02:30,233 Part 10 of the course, which covers all the techniques of parameter tuning, 49 00:02:30,233 --> 00:02:34,533 which allow you to improve and optimize the performance of a single model. 50 00:02:34,800 --> 00:02:37,566 And so don't worry, you'll know how to deploy these techniques 51 00:02:37,566 --> 00:02:38,800 to enhance your model. 52 00:02:38,800 --> 00:02:39,366 But here 53 00:02:39,366 --> 00:02:43,133 we just want to learn how to build and train the decision tree regression model. 54 00:02:43,133 --> 00:02:43,900 And therefore 55 00:02:43,900 --> 00:02:48,266 we will only input one parameter, which is just for training purposes. 56 00:02:48,300 --> 00:02:51,000 It is, you know, that random_state parameter 57 00:02:51,000 --> 00:02:54,433 which will allow us to get the same result in the end, because indeed 58 00:02:54,666 --> 00:02:58,200 there are some random factors happening when you build 59 00:02:58,200 --> 00:03:00,100 and train your decision tree regressor. 60 00:03:00,100 --> 00:03:03,966 And therefore if we don't fix the seed, we may get slightly different 61 00:03:03,966 --> 00:03:05,033 results each time. 62 00:03:05,033 --> 00:03:05,933 And, you know, it's 63 00:03:05,933 --> 00:03:09,466 nicer to have the same results so that we can all be on the same page. 64 00:03:09,466 --> 00:03:13,366 So we're just going to input the random_state 65 00:03:13,366 --> 00:03:17,633 parameter and set that equal to zero. 66 00:03:17,700 --> 00:03:18,100 Right. 67 00:03:18,100 --> 00:03:22,600 We're fixing the seed here with that zero value for the random_state parameter. 68 00:03:23,133 --> 00:03:24,466 And now, the final step. 
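A minimal sketch of what fixing the seed buys you: two regressors built with `random_state=0` and trained on the same data give identical predictions. The toy arrays `X`, `y` and `X_new` below are made up for illustration and are not the course's data set.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data (not from the course): six samples, two features.
X = np.array([[1, 5], [2, 4], [3, 3], [4, 2], [5, 1], [6, 0]], dtype=float)
y = np.array([10.0, 12.0, 15.0, 21.0, 30.0, 45.0])

# Two separate models, both created with the same fixed seed.
first = DecisionTreeRegressor(random_state=0).fit(X, y)
second = DecisionTreeRegressor(random_state=0).fit(X, y)

# Hypothetical unseen points to compare the two trained trees on.
X_new = np.array([[2.5, 3.5], [4.5, 1.5]])
same = np.array_equal(first.predict(X_new), second.predict(X_new))
print(same)
```

The sketch only shows that a fixed seed is reproducible; it does not claim that leaving the seed unset would change the result on this particular data.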
69 00:03:24,466 --> 00:03:26,966 Well, you totally know how to finish this. 70 00:03:26,966 --> 00:03:32,100 We just need to take our regressor and then call the fit method, 71 00:03:32,400 --> 00:03:37,066 which takes as input of course the matrix of features X, the whole matrix, 72 00:03:37,333 --> 00:03:40,700 and then the dependent variable vector y. 73 00:03:40,700 --> 00:03:40,866 All right. 74 00:03:40,866 --> 00:03:45,133 So this will actually train your decision tree regressor 75 00:03:45,466 --> 00:03:49,200 to understand the correlations between the position levels here 76 00:03:49,400 --> 00:03:53,800 and the salaries, after which you will have a trained model 77 00:03:54,133 --> 00:03:58,333 which you will be able to deploy in production to predict a new result, 78 00:03:58,333 --> 00:04:02,633 and especially that salary of the position level 6.5. 79 00:04:03,000 --> 00:04:03,500 All right. 80 00:04:03,500 --> 00:04:05,033 So that's what we'll do in the next tutorial. 81 00:04:05,033 --> 00:04:08,933 But first let's not forget to run the cell to indeed 82 00:04:09,166 --> 00:04:12,166 build and train that decision tree regression model. 83 00:04:12,300 --> 00:04:14,600 And also I want to say congratulations 84 00:04:14,600 --> 00:04:17,600 if you, you know, pressed pause at the beginning of this tutorial 85 00:04:17,700 --> 00:04:19,833 to code this by yourself first. 86 00:04:19,833 --> 00:04:23,100 And also congratulations to those who tried, because that's really 87 00:04:23,100 --> 00:04:23,600 what matters, 88 00:04:23,600 --> 00:04:26,733 you know, to take action, to at least try and practice. 89 00:04:26,966 --> 00:04:27,633 Okay. 90 00:04:27,633 --> 00:04:28,866 So all good here. 91 00:04:28,866 --> 00:04:31,200 That's great. We have our regression model. 
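Put together, the whole cell might look roughly like the sketch below. The position levels and salary figures are placeholder values invented for this example, since the transcript does not show the actual data; in the tutorial, `X` and `y` come from the course's data set.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Placeholder data (assumed, not the course's values):
# X must be 2D (one column of position levels), y is a 1D vector of salaries.
X = np.arange(1, 11).reshape(-1, 1)
y = np.array([40, 48, 57, 70, 95, 130, 180, 260, 420, 900], dtype=float) * 1000

# Create the regressor with a fixed seed, then train it on the whole data set.
regressor = DecisionTreeRegressor(random_state=0)
regressor.fit(X, y)

# With default settings the tree keeps splitting until each leaf is pure,
# so it reproduces the training salaries exactly.
train_predictions = regressor.predict(X)
print(np.allclose(train_predictions, y))
```

The final check is just a sanity test that training happened; predicting a new position level is left for the next step.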
92 00:04:31,200 --> 00:04:35,800 And now in that same spirit of, you know, taking action to practice 93 00:04:35,800 --> 00:04:39,900 and try to implement things on your own, well, I would like you to try 94 00:04:39,900 --> 00:04:43,733 to predict that salary of the position level number 6.5. 95 00:04:44,066 --> 00:04:47,700 And there is absolutely no trap here and no difficulty. 96 00:04:47,700 --> 00:04:50,700 So I have no doubt you will actually smash this. 97 00:04:50,700 --> 00:04:51,033 All right. 98 00:04:51,033 --> 00:04:55,266 So please try to do this, and we'll implement the easy solution 99 00:04:55,266 --> 00:04:57,400 together in the next tutorial. 100 00:04:57,400 --> 00:04:58,966 Until then, enjoy machine learning.