1 00:00:00,300 --> 00:00:01,133 Are you ready? 2 00:00:01,133 --> 00:00:02,033 Let's do this. 3 00:00:02,033 --> 00:00:06,700 Let's start implementing our multiple linear regression model. 4 00:00:07,000 --> 00:00:08,366 So I just double-clicked on it. 5 00:00:08,366 --> 00:00:11,733 If you like Google Colaboratory, feel free to open this link with me. 6 00:00:11,733 --> 00:00:13,366 Open with Google Colaboratory. 7 00:00:13,366 --> 00:00:15,700 And if you don't like Google Colaboratory, that's totally fine. 8 00:00:15,700 --> 00:00:18,266 You can open this file with Jupyter Notebook, 9 00:00:18,266 --> 00:00:21,266 but from your folder, which you downloaded on your machine. 10 00:00:21,333 --> 00:00:22,866 All right, choose your favorite. 11 00:00:22,866 --> 00:00:24,266 And here we go. 12 00:00:24,266 --> 00:00:27,600 Let's implement our multiple linear regression model. 13 00:00:28,800 --> 00:00:29,333 All right. 14 00:00:29,333 --> 00:00:33,066 So first, remember that this file is in 15 00:00:33,066 --> 00:00:36,466 read-only mode, which means that we can't modify it. 16 00:00:36,566 --> 00:00:41,033 But no worries, we're going to create a copy right away by going to File here. 17 00:00:41,033 --> 00:00:44,033 And then click Save a copy in Drive. 18 00:00:44,166 --> 00:00:48,833 This will, as you can see, create a copy in which we will be able to re-implement 19 00:00:49,000 --> 00:00:52,100 this multiple linear regression model from scratch. 20 00:00:52,266 --> 00:00:56,600 Because indeed, I remind you that this course is an action-based course 21 00:00:56,833 --> 00:00:59,733 in which I want you to take action as much as you can. 22 00:00:59,733 --> 00:01:01,433 And therefore, we are going to re-implement 23 00:01:01,433 --> 00:01:04,400 this whole model from scratch together, step by step. 
24 00:01:04,400 --> 00:01:07,933 And I really want you to code with me at the same time so that, you know, 25 00:01:08,033 --> 00:01:11,466 the practical skill can really be well integrated in your head. 26 00:01:11,666 --> 00:01:13,200 Okay, so let's do this. 27 00:01:13,200 --> 00:01:18,466 Let's first start by removing all the code cells here, and only the code cells, 28 00:01:18,466 --> 00:01:23,100 not the text cells, because I want to keep that well-highlighted structure 29 00:01:23,866 --> 00:01:26,166 for this implementation. 30 00:01:26,166 --> 00:01:29,066 So here I'm just removing all the code cells. 31 00:01:29,066 --> 00:01:30,900 There we go. And perfect. 32 00:01:30,900 --> 00:01:33,100 So this is the whole structure of this implementation. 33 00:01:33,100 --> 00:01:34,666 We can have a look at it here. 34 00:01:34,666 --> 00:01:38,800 And indeed, that's why, you know, I wanted to brainstorm 35 00:01:38,800 --> 00:01:43,033 on the dataset with you first before showing you the structure, 36 00:01:43,166 --> 00:01:46,866 in order to make you think about what we must do in the data preprocessing phase. 37 00:01:46,866 --> 00:01:47,766 And as we said, 38 00:01:47,766 --> 00:01:52,100 we need first to import the libraries, then import the dataset, that's for sure. 39 00:01:52,100 --> 00:01:53,133 And then here we go. 40 00:01:53,133 --> 00:01:55,966 We need to encode the categorical data. 41 00:01:55,966 --> 00:02:00,833 And more specifically that state column, which contains three categories. 42 00:02:01,366 --> 00:02:04,366 And then of course we split the dataset into the training set and test set. 43 00:02:04,566 --> 00:02:05,633 That's a must. 44 00:02:05,633 --> 00:02:08,366 And that will close the data preprocessing phase. 45 00:02:08,366 --> 00:02:12,266 And we will be all ready to start training, well, first building 46 00:02:12,266 --> 00:02:16,200 and then training the multiple linear regression model on the training set. 
47 00:02:16,500 --> 00:02:19,500 And by doing this, our model will understand the correlations 48 00:02:19,500 --> 00:02:24,500 between all these spends of, you know, those startups and their generated profit. 49 00:02:24,700 --> 00:02:25,633 And so there we go. 50 00:02:25,633 --> 00:02:31,200 We will get a smart model which we will be able to use on new observations. 51 00:02:31,366 --> 00:02:35,133 And that's exactly what we'll do as a last step here, to predict the test 52 00:02:35,133 --> 00:02:35,566 set results.
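
The steps listed above (encode the categorical column, split into training and test sets, train the model, predict the test set results) can be sketched end to end as follows. This is only a minimal illustration under stated assumptions, not the notebook's own code: the transcript does not name the libraries used, so the sketch relies on plain NumPy least squares, and the data, the column meanings, and every helper name (`one_hot`, `split_train_test`, `fit_linear_regression`, `predict`) are hypothetical.

```python
import numpy as np


def one_hot(labels):
    """Return (matrix, categories): one 0/1 column per distinct label."""
    categories, codes = np.unique(labels, return_inverse=True)
    return np.eye(len(categories))[codes], categories


def split_train_test(X, y, test_fraction=0.2, seed=0):
    """Shuffle the rows, then hold out `test_fraction` of them as the test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = int(round(len(y) * test_fraction))
    test, train = order[:n_test], order[n_test:]
    return X[train], X[test], y[train], y[test]


def fit_linear_regression(X, y):
    """Least-squares fit; returns [intercept, coefficient_1, ..., coefficient_k]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted intercept and coefficients to new observations."""
    return coef[0] + X @ coef[1:]


# Hypothetical, noise-free toy data: three numeric "spend" columns,
# one categorical "state" column with three categories, and a profit target.
rng = np.random.default_rng(42)
n = 50
spend = rng.uniform(0, 100, size=(n, 3))
state = rng.choice(["A", "B", "C"], size=n)
state_effect = {"A": 0.0, "B": 5.0, "C": -3.0}
profit = (
    10.0
    + spend @ np.array([0.8, 0.1, 0.3])
    + np.array([state_effect[s] for s in state])
)

# Encode the categorical column. The first dummy column is dropped so the
# remaining dummies are not collinear with the intercept.
dummies, categories = one_hot(state)
X = np.column_stack([dummies[:, 1:], spend])

# Split, train on the training set, then predict the test set results.
X_train, X_test, y_train, y_test = split_train_test(X, profit, test_fraction=0.2)
coef = fit_linear_regression(X_train, y_train)
y_pred = predict(coef, X_test)

# Show predicted and actual profit side by side.
print(np.column_stack([y_pred, y_test]).round(2))
```

Because the toy target is an exact linear function of the features, the two printed columns should agree. On real data they would only be close, and the gap between them is what the test set is there to reveal.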