1 00:00:00,960 --> 00:00:04,970 So now let us learn how to run a simple linear regression model in R. 2 00:00:07,960 --> 00:00:08,800 It is very simple. 3 00:00:09,070 --> 00:00:12,540 We just need to write two lines of code and we'll get our result. 4 00:00:13,730 --> 00:00:15,800 For this, we are going to use the lm 5 00:00:15,830 --> 00:00:17,870 function, which stands for linear model. 6 00:00:19,080 --> 00:00:23,820 So we'll assign a new variable, let's call it simple_model. 7 00:00:25,360 --> 00:00:27,600 So we'll create a variable named simple_model. 8 00:00:31,750 --> 00:00:42,520 And this will get the value from the lm function, so it is lm and within brackets, first is the dependent 9 00:00:42,520 --> 00:00:43,550 variable, that is, the 10 00:00:43,550 --> 00:00:43,950 variable 11 00:00:43,960 --> 00:00:46,150 we want to predict, which is house price. 12 00:00:48,340 --> 00:00:49,120 Then a tilde. 13 00:00:51,810 --> 00:00:57,150 And after this, we will write all of the independent variables that we want to include in this linear 14 00:00:57,150 --> 00:00:57,540 model. 15 00:00:58,470 --> 00:01:04,170 So since this is a simple regression and we will take only one variable, we will choose room_num 16 00:01:04,170 --> 00:01:07,980 as the variable that we are going to put in, and we'll write 17 00:01:08,090 --> 00:01:08,670 room_num. 18 00:01:11,990 --> 00:01:13,040 And the data is df. 19 00:01:13,280 --> 00:01:14,740 So, comma, data is going to be df. 20 00:01:17,940 --> 00:01:22,040 So the first is the dependent variable, which is price. 21 00:01:22,900 --> 00:01:28,800 The tilde sign is used to say that on the left side of it will be the dependent variable. 22 00:01:28,890 --> 00:01:31,590 On the right side of it will be the independent variables. 23 00:01:32,280 --> 00:01:34,650 So room_num is the independent variable we are using here. 24 00:01:34,860 --> 00:01:39,180 And data is df, the data that we have created earlier, that data set.
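The call described in these cues can be sketched as follows. This is a minimal sketch, not the lecture's actual session: the data frame here is simulated as a stand-in (506 rows, with made-up coefficients and noise), and the names `simple_model`, `price`, `room_num` and `df` are written as they are heard in the lecture, so the exact spelling is an assumption.

```r
# Hypothetical stand-in data: the lecture's own data set is not reproduced here,
# so these values are simulated purely for illustration.
set.seed(1)
df <- data.frame(room_num = runif(506, min = 3.5, max = 9))
df$price <- -34 + 9 * df$room_num + rnorm(506, sd = 6)

# Dependent variable on the left of the tilde, independent variable on the right.
simple_model <- lm(price ~ room_num, data = df)

# The fitted object holds two coefficients: the intercept and the slope.
print(coef(simple_model))
```

The formula `price ~ room_num` is the part that tells `lm` which column is predicted and which one is the predictor; `data = df` tells it where to look up those column names.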
25 00:01:41,890 --> 00:01:43,650 So here is a new variable. 26 00:01:44,520 --> 00:01:45,760 simple_model has been created. 27 00:01:47,230 --> 00:01:49,110 So let us see what is in simple_model. 28 00:01:49,330 --> 00:01:50,830 So we'll write summary. 29 00:01:53,230 --> 00:01:54,420 And within brackets, write 30 00:01:54,640 --> 00:01:55,080 simple_model. 31 00:02:06,280 --> 00:02:08,940 So here you can see the whole result. 32 00:02:10,710 --> 00:02:13,530 This is the result that we wanted from a simple linear regression. 33 00:02:15,190 --> 00:02:18,490 Here you can see this column is named Estimate. 34 00:02:18,740 --> 00:02:21,460 This is giving us the values of beta, that is, beta zero, beta one, 35 00:02:21,500 --> 00:02:25,020 and any other betas that we will have. Intercept 36 00:02:25,730 --> 00:02:26,960 is the constant value. 37 00:02:27,170 --> 00:02:28,230 So this is beta zero. 38 00:02:29,440 --> 00:02:32,150 room_num, that is our variable. 39 00:02:32,900 --> 00:02:35,060 This is beta one for this variable. 40 00:02:36,410 --> 00:02:40,130 Standard error, as we told you earlier. 41 00:02:40,550 --> 00:02:42,620 These values are the standard errors. 42 00:02:43,930 --> 00:02:51,370 Using this beta and the standard error we can get the t-value. If you remember, t-value is beta minus 43 00:02:51,370 --> 00:02:52,810 zero divided by standard error. 44 00:02:53,410 --> 00:02:54,550 So this is the t-value. 45 00:02:55,830 --> 00:03:00,270 Corresponding to the t-value, the software has given us the probability value. 46 00:03:01,410 --> 00:03:07,800 P-value is actually telling us whether the variable is significantly impacting the y variable, which 47 00:03:07,800 --> 00:03:08,550 is house price.
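The relationship between the Estimate, Std. Error and t value columns can be checked by hand. The sketch below again uses simulated stand-in data (an assumption, not the lecture's data set), so the printed numbers will differ from those on screen in the lecture; only the relationship between the columns carries over.

```r
# Hypothetical stand-in data, simulated for illustration only.
set.seed(1)
df <- data.frame(room_num = runif(506, min = 3.5, max = 9))
df$price <- -34 + 9 * df$room_num + rnorm(506, sd = 6)
simple_model <- lm(price ~ room_num, data = df)

# The coefficient table of the summary has the columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coef_table <- coef(summary(simple_model))

# t-value by hand: (beta - 0) / standard error, one value per coefficient.
t_by_hand <- (coef_table[, "Estimate"] - 0) / coef_table[, "Std. Error"]

# Two-sided p-value by hand from the t distribution with n - 2 degrees of freedom.
p_by_hand <- 2 * pt(abs(t_by_hand), df = df.residual(simple_model), lower.tail = FALSE)

print(cbind(coef_table, t_by_hand, p_by_hand))
```

The hand-computed columns should agree with the `t value` and `Pr(>|t|)` columns that `summary` reports.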
48 00:03:09,300 --> 00:03:17,340 So these three stars, to see what these mean you can look at this index here, three stars mean the zero point zero 49 00:03:17,340 --> 00:03:24,840 zero one level, which means that we are ninety nine point nine percent confident that it is significantly 50 00:03:24,840 --> 00:03:26,550 impacting the house price. 51 00:03:27,520 --> 00:03:34,480 So room_num is a variable which is significantly impacting, and its relationship is given by this beta. 52 00:03:36,110 --> 00:03:46,220 What does beta mean? If you increase the room_num value by one unit, our house price will increase by nine 53 00:03:46,220 --> 00:03:47,480 point zero nine units. 54 00:03:49,690 --> 00:03:53,980 Apart from that, we have discussed the residual standard error, that is, RSE. 55 00:03:55,630 --> 00:04:00,630 It is coming out to six point five nine seven on 504 degrees of freedom. 56 00:04:00,810 --> 00:04:05,790 It is 504 because we took n minus two degrees of freedom. 57 00:04:07,070 --> 00:04:08,780 And n is five hundred six. 58 00:04:09,080 --> 00:04:11,300 So this is 504 degrees of freedom. 59 00:04:13,060 --> 00:04:15,380 This value, R-squared, is coming out 60 00:04:15,420 --> 00:04:16,450 to point four eight, 61 00:04:16,810 --> 00:04:23,500 which means that nearly forty eight percent of the variance in the values of house price 62 00:04:24,880 --> 00:04:27,540 is explained by this simple model. 63 00:04:29,070 --> 00:04:35,820 We can increase this R-squared by introducing new variables which will explain the other parts of 64 00:04:35,820 --> 00:04:38,230 the variance which are not explained by this variable alone. 65 00:04:39,470 --> 00:04:45,370 Adjusted R-squared is something we will discuss when we do multiple linear regression, and the F-statistic 66 00:04:45,460 --> 00:04:47,800 will also be discussed in the coming lectures.
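The degrees of freedom, residual standard error and R-squared mentioned here can also be pulled out of the fitted model and recomputed from the residuals. This is a sketch on simulated stand-in data (an assumption), so only the 504 degrees of freedom will match the lecture; the RSE and R-squared values will not.

```r
# Hypothetical stand-in data, simulated for illustration only.
set.seed(1)
df <- data.frame(room_num = runif(506, min = 3.5, max = 9))
df$price <- -34 + 9 * df$room_num + rnorm(506, sd = 6)
simple_model <- lm(price ~ room_num, data = df)
model_summary <- summary(simple_model)

n <- nrow(df)
residual_df <- df.residual(simple_model)  # n - 2: two betas were estimated

# Residual standard error: sqrt(RSS / (n - 2)).
rss <- sum(residuals(simple_model)^2)
rse_by_hand <- sqrt(rss / (n - 2))

# R-squared: share of the variance in price explained by the model.
tss <- sum((df$price - mean(df$price))^2)
r_squared_by_hand <- 1 - rss / tss

print(c(n = n, residual_df = residual_df))
print(c(rse = model_summary$sigma, rse_by_hand = rse_by_hand))
print(c(r_squared = model_summary$r.squared, r_squared_by_hand = r_squared_by_hand))
```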
67 00:04:49,660 --> 00:04:56,170 So let me summarize it. We ran one simple linear regression model. By simple linear regression, I mean 68 00:04:56,170 --> 00:04:58,060 that we had only one variable. 69 00:04:58,150 --> 00:05:03,070 So price is being predicted using only the room_num variable. 70 00:05:03,910 --> 00:05:09,880 We ran this model using this single line and we saw the summary of this variable that we created, 71 00:05:09,880 --> 00:05:10,270 simple_model. 72 00:05:10,720 --> 00:05:13,300 And in this summary we get all the information that we wanted. 73 00:05:14,140 --> 00:05:15,460 These are the values of the betas. 74 00:05:16,480 --> 00:05:19,270 These are the probability values. Looking at this, 75 00:05:19,300 --> 00:05:23,580 we understand that the variable is significantly impacting the house price. 76 00:05:25,420 --> 00:05:32,540 And this is the relationship of this variable, and these are the RSE and R-squared values for this 77 00:05:32,540 --> 00:05:32,860 model. 78 00:05:34,440 --> 00:05:40,710 Now, if you want to also plot the relationship of these two variables, since these are only two 79 00:05:40,710 --> 00:05:43,770 variables, we can have a two-dimensional graph to represent it. 80 00:05:45,340 --> 00:05:52,660 Let's first plot a scatterplot of these two variables. To plot that, we'll write plot and within brackets 81 00:05:52,890 --> 00:05:56,260 we will write df$room_num, comma, 82 00:06:00,750 --> 00:06:05,790 df$price. This is the x variable and then the y variable. 83 00:06:06,390 --> 00:06:07,140 Let us run this. 84 00:06:08,870 --> 00:06:14,480 You can see in the plot, it is a scatterplot; each point is represented by these small circles. 85 00:06:15,540 --> 00:06:18,310 And it has a linear kind of relationship. 86 00:06:19,940 --> 00:06:21,790 And what is the predicted line? 87 00:06:23,020 --> 00:06:28,950 To draw that, we'll write abline and within brackets we'll write simple_model.
88 00:06:32,890 --> 00:06:36,060 So this will plot that line on the same graph. 89 00:06:36,880 --> 00:06:42,580 And here's that line that we have predicted after running the simple linear regression model, which is 90 00:06:42,610 --> 00:06:44,850 approximately predicting the values of price. 91 00:06:45,250 --> 00:06:49,720 So you can also see the values of the betas we were discussing down here. 92 00:06:50,490 --> 00:06:56,500 Beta one meant that if we increase the value of room_num by one unit, price should increase 93 00:06:56,500 --> 00:06:58,210 by nearly 10 units. 94 00:06:58,750 --> 00:06:59,830 If you look at this here, 95 00:07:00,010 --> 00:07:06,070 if I'm increasing it from five to six, it is increasing from nearly 10 to 20, 96 00:07:08,090 --> 00:07:11,370 so which is nearly nine units. Also, beta 97 00:07:11,570 --> 00:07:15,100 zero is the intercept on the y axis. 98 00:07:15,620 --> 00:07:19,300 So when x is zero, what is the value of y? 99 00:07:20,240 --> 00:07:26,240 If I extrapolate this line, you can probably imagine another three to four units on the x axis; if I meet 100 00:07:26,240 --> 00:07:33,320 it at the zero level of x, it'll probably be coming at nearly minus thirty-four on the y axis. 101 00:07:33,740 --> 00:07:35,060 So that's the y intercept. 102 00:07:36,500 --> 00:07:37,430 So this is how we run a 103 00:07:37,610 --> 00:07:38,960 simple linear regression model.
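The plotting steps and the reading of beta one and beta zero from the line can be sketched together. As before, the data is simulated as a stand-in (an assumption), so the slope and intercept will be close to, but not the same as, the values read off in the lecture.

```r
# Hypothetical stand-in data, simulated for illustration only.
set.seed(1)
df <- data.frame(room_num = runif(506, min = 3.5, max = 9))
df$price <- -34 + 9 * df$room_num + rnorm(506, sd = 6)
simple_model <- lm(price ~ room_num, data = df)

# Scatterplot: x variable first, then y variable.
plot(df$room_num, df$price)

# Add the fitted regression line to the same plot.
abline(simple_model)

# Read the line numerically: predictions at x = 0, 5 and 6.
predicted <- unname(predict(simple_model, newdata = data.frame(room_num = c(0, 5, 6))))
intercept <- unname(coef(simple_model)["(Intercept)"])
slope <- unname(coef(simple_model)["room_num"])

# Moving x from 5 to 6 changes the prediction by exactly the slope (beta one),
# and the prediction at x = 0 is the intercept (beta zero).
print(c(step_5_to_6 = predicted[3] - predicted[2], slope = slope))
print(c(at_zero = predicted[1], intercept = intercept))
```

When run as a script rather than interactively, R typically writes the plot to a file instead of opening a window.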