1 00:00:01,680 --> 00:00:05,550 Now, let's revisit our observations from the EDD. 2 00:00:10,230 --> 00:00:12,490 We have already corrected missing values. 3 00:00:12,930 --> 00:00:15,330 We have already corrected outliers. 4 00:00:17,100 --> 00:00:21,070 Now, let's transform the crime rate variable. 5 00:00:22,960 --> 00:00:28,960 Before starting, let's plot the scatterplot of crime rate and price. 6 00:00:31,030 --> 00:00:31,780 So we will write 7 00:00:32,720 --> 00:00:33,580 sns dot joint 8 00:00:33,760 --> 00:00:34,080 plot. 9 00:00:37,670 --> 00:00:39,560 And then in brackets we'll write 10 00:00:39,870 --> 00:00:41,250 x equal to crime rate, 11 00:00:47,160 --> 00:00:48,780 y equal to price, 12 00:00:58,000 --> 00:01:00,160 and data is df. 13 00:01:03,870 --> 00:01:10,180 If you are reopening the existing notebook, you have to first import the libraries. 14 00:01:10,920 --> 00:01:14,340 Let's import seaborn, NumPy and pandas. 15 00:01:18,220 --> 00:01:19,200 We'll run this again. 16 00:01:21,450 --> 00:01:22,900 Now, let's execute 17 00:01:23,000 --> 00:01:23,420 this. 18 00:01:26,760 --> 00:01:31,650 Now, you can see the relationship seems to be like this. 19 00:01:34,590 --> 00:01:37,160 The curve looks like a logarithmic curve. 20 00:01:37,800 --> 00:01:39,370 And we need to transform this 21 00:01:39,440 --> 00:01:43,180 curve to have a linear relationship between x and y. 22 00:01:45,270 --> 00:01:48,750 One way I can think of is to take the log of this. 23 00:01:50,380 --> 00:01:53,640 If we take the log of our x variable, which is crime rate, 24 00:01:55,320 --> 00:02:01,170 we can get a kind of linear relationship of crime rate with our y variable. But 25 00:02:03,970 --> 00:02:07,240 since most of the values are near zero 26 00:02:08,650 --> 00:02:10,830 and log of zero is not defined, 27 00:02:11,470 --> 00:02:13,840 it tends towards minus infinity.
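The problem with the log of zero can be checked directly. This is a minimal sketch using NumPy; the sample values are made up for illustration and are not taken from the course data.

```python
import numpy as np

# Hypothetical sample values: an exact zero, a value near zero, and one.
values = np.array([0.0, 0.01, 1.0])

# np.log(0) evaluates to -inf and raises a RuntimeWarning,
# so the divide-by-zero warning is silenced for this demonstration only.
with np.errstate(divide="ignore"):
    raw_log = np.log(values)

print(raw_log)  # first entry is -inf, last entry is 0.0
```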
28 00:02:16,540 --> 00:02:21,130 So to remove this, we will add a value of one to our crime rate. 29 00:02:22,500 --> 00:02:24,610 So let us transform this variable. 30 00:02:24,960 --> 00:02:28,070 We'll write df dot crime rate, 31 00:02:31,260 --> 00:02:32,530 equate it to 32 00:02:33,850 --> 00:02:39,010 np dot log, since log is a function from the NumPy library. 33 00:02:39,680 --> 00:02:40,010 Right. 34 00:02:40,270 --> 00:02:42,640 np dot log, and in brackets 35 00:02:42,670 --> 00:02:49,510 we'll mention one plus df dot crime rate. 36 00:02:53,290 --> 00:03:00,940 Since log of zero tends towards minus infinity and log of one is zero, 37 00:03:01,390 --> 00:03:03,640 it's better to start from log of one. 38 00:03:05,720 --> 00:03:07,190 Let's execute this. 39 00:03:08,340 --> 00:03:12,510 Now, let's again plot the joint plot between crime rate and price. 40 00:03:12,990 --> 00:03:13,460 Copy this 41 00:03:13,500 --> 00:03:13,850 command. 42 00:03:16,780 --> 00:03:17,810 Execute this. 43 00:03:19,720 --> 00:03:23,260 You can see now the relationship is looking more linear. 44 00:03:23,640 --> 00:03:27,360 Earlier we were getting a curve-like plot. 45 00:03:28,150 --> 00:03:31,960 But now we are getting a somewhat linear plot. 46 00:03:33,850 --> 00:03:35,020 I suggest you 47 00:03:36,070 --> 00:03:39,490 try some other transformations on your own. 48 00:03:39,670 --> 00:03:43,850 If you find a better transformation, go ahead with that transformation. 49 00:03:45,050 --> 00:03:50,390 But always try to make the relationship between the x and y variables a linear relationship. 50 00:03:52,950 --> 00:03:54,990 And one more thing to note here is 51 00:03:55,530 --> 00:03:59,730 you can see there are no visible outliers here. 52 00:04:00,240 --> 00:04:04,840 There is no point which seems out of place on this graph. 53 00:04:07,810 --> 00:04:12,110 So as I mentioned earlier, this is another way to treat outliers.
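The transformation described in these cues can be sketched as follows. The DataFrame here is a small made-up stand-in for the course data, and the column names `crime_rate` and `price` are assumptions based on how the variables are spoken in the video.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real notebook loads the course data set instead.
df = pd.DataFrame({
    "crime_rate": [0.0, 0.02, 0.5, 3.0, 20.0, 80.0],
    "price": [30.0, 28.0, 24.0, 20.0, 14.0, 9.0],
})

# log(1 + x): a crime rate of zero maps to log(1) = 0 instead of minus infinity.
df["crime_rate"] = np.log(1 + df["crime_rate"])

print(df)
```

NumPy also provides `np.log1p(x)`, which computes the same quantity and is more accurate for very small `x`. Re-plotting the transformed column against price is the visual check used in the video.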
54 00:04:12,830 --> 00:04:19,000 We have just transformed the relationship between our x and y variables, but now the outliers 55 00:04:19,030 --> 00:04:20,140 are already treated. 56 00:04:25,230 --> 00:04:27,280 As discussed in our theory lecture, 57 00:04:27,700 --> 00:04:30,960 we have four variables for distances: 58 00:04:31,150 --> 00:04:33,820 distance 1, distance 2, distance 3 and distance 4. 59 00:04:40,430 --> 00:04:43,500 All these variables are conveying the same information, 60 00:04:44,960 --> 00:04:48,050 which is the distance from the employment hub. 61 00:04:48,830 --> 00:04:57,530 So let's just create an average variable of these four distances to convey the same information in a single 62 00:04:57,530 --> 00:04:57,960 variable. 63 00:05:00,380 --> 00:05:01,640 We'll create a new variable. 64 00:05:01,910 --> 00:05:02,410 We'll write 65 00:05:03,860 --> 00:05:06,850 df, and then in square brackets 66 00:05:06,930 --> 00:05:08,330 we'll mention the new 67 00:05:10,640 --> 00:05:12,990 variable name, which is average distance. 68 00:05:15,120 --> 00:05:21,710 And we want this variable to be the average value of distance 1, distance 2, distance 3 and distance 4. 69 00:05:22,290 --> 00:05:22,580 Right. 70 00:05:28,960 --> 00:05:29,530 Distance 1, 71 00:05:32,140 --> 00:05:33,420 distance 2, distance 3, 72 00:05:34,180 --> 00:05:38,560 distance 4, and then divided by four 73 00:05:38,810 --> 00:05:41,760 to get the average value of these. 74 00:05:44,760 --> 00:05:45,150 Now, 75 00:05:47,430 --> 00:05:49,950 let's run the EDD of this. 76 00:05:50,400 --> 00:05:51,330 We'll write df 77 00:05:52,000 --> 00:05:52,230 dot 78 00:05:52,420 --> 00:05:53,310 describe. 79 00:05:58,890 --> 00:06:00,040 We'll execute this. 80 00:06:04,130 --> 00:06:10,250 You can see there is a new variable, which is average distance. 81 00:06:13,020 --> 00:06:15,720 Now we have the average distance.
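A minimal sketch of the averaging step follows. The column names `dist1` to `dist4` and `avg_dist` are assumptions chosen to match the spoken names, and the numbers are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data with four distance columns.
df = pd.DataFrame({
    "dist1": [4.0, 5.0, 6.5],
    "dist2": [4.2, 4.8, 6.1],
    "dist3": [3.9, 5.3, 6.4],
    "dist4": [4.3, 4.9, 6.6],
})

# Average the four distances row by row into a single new column.
df["avg_dist"] = (df["dist1"] + df["dist2"] + df["dist3"] + df["dist4"]) / 4

# describe() now includes summary statistics for the new column as well.
print(df.describe())
```

The same result can be obtained with `df[["dist1", "dist2", "dist3", "dist4"]].mean(axis=1)`, which scales better if more columns are involved.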
82 00:06:16,740 --> 00:06:23,940 We can also take the minimum of these four or the maximum of these four distances to represent the same information. 83 00:06:24,570 --> 00:06:30,810 But for now, we have taken only the average variable; you can try those variations on your own. 84 00:06:31,940 --> 00:06:38,620 So maybe people in the city look for the nearest employment center instead of the average 85 00:06:38,630 --> 00:06:40,340 distance of the employment centers. 86 00:06:41,730 --> 00:06:48,690 So depending on your business and depending on your business knowledge, try to evaluate all such variations 87 00:06:48,900 --> 00:06:51,220 and choose the appropriate variable. 88 00:06:52,630 --> 00:06:58,200 For now, we are only using the average of these four distances. 89 00:06:59,640 --> 00:07:02,050 We will be using this average distance 90 00:07:02,160 --> 00:07:02,370 variable. 91 00:07:03,600 --> 00:07:09,420 And we will be removing these four variables because average distance is, in a sense, representing 92 00:07:09,480 --> 00:07:10,860 all of these four variables. 93 00:07:13,560 --> 00:07:19,280 So let's delete these four variables: distance 1, distance 2, distance 3 94 00:07:19,550 --> 00:07:21,860 and distance 4. 95 00:07:23,080 --> 00:07:25,120 The command is del. We'll write 96 00:07:25,200 --> 00:07:25,500 df, 97 00:07:28,420 --> 00:07:29,350 and we'll mention 98 00:07:31,100 --> 00:07:32,450 the column name. So 99 00:07:33,160 --> 00:07:34,350 in square brackets 100 00:07:34,870 --> 00:07:35,100 we'll write 101 00:07:35,750 --> 00:07:39,630 distance 1. Now 102 00:07:39,760 --> 00:07:40,390 let's run the 103 00:07:40,390 --> 00:07:41,100 EDD again 104 00:07:42,070 --> 00:07:47,830 to check whether we have actually deleted the variable or not. You can see distance 105 00:07:47,890 --> 00:07:49,060 1 is not here. 106 00:07:49,510 --> 00:07:50,530 We have deleted distance 1.
107 00:07:50,910 --> 00:07:55,830 We'll delete distance 2, distance 3 and distance 4 as well. 108 00:08:12,600 --> 00:08:14,910 Let's run the EDD again. 109 00:08:22,420 --> 00:08:25,380 You can see the four distance variables are gone, 110 00:08:25,990 --> 00:08:29,320 and only average distance is in our data. 111 00:08:31,750 --> 00:08:37,990 Now, if you remember, we had four observations from our EDD. 112 00:08:38,530 --> 00:08:46,000 And one of them was to delete the bus terminal variable since it was only taking one single value. 113 00:08:47,680 --> 00:08:51,900 And it was not providing any useful information to our model. 114 00:08:53,350 --> 00:08:55,280 Let's delete that variable as well. 115 00:09:08,210 --> 00:09:11,290 Let's look at the first five rows of our data. 116 00:09:16,270 --> 00:09:20,160 You can see bus terminal is not a part of our data set.
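The deletion steps can be sketched as below. All column names (`dist1` to `dist4`, `avg_dist`, `bus_ter`, `price`) and values are assumptions for illustration; only the use of `del` on a DataFrame column and `head()` follows the video.

```python
import pandas as pd

# Hypothetical toy data; bus_ter holds one single value, so it carries no information.
df = pd.DataFrame({
    "price": [24.0, 21.6, 34.7],
    "dist1": [4.0, 5.0, 6.5],
    "dist2": [4.2, 4.8, 6.1],
    "dist3": [3.9, 5.3, 6.4],
    "dist4": [4.3, 4.9, 6.6],
    "avg_dist": [4.1, 5.0, 6.4],
    "bus_ter": ["YES", "YES", "YES"],
})

# Remove the four original distance columns, now represented by avg_dist.
for column in ["dist1", "dist2", "dist3", "dist4"]:
    del df[column]

# Remove the constant column.
del df["bus_ter"]

# head() shows the first five rows (here only three exist).
print(df.head())
```

An equivalent one-step form is `df = df.drop(columns=[...])`, which returns a new DataFrame rather than modifying the existing one in place.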