Now, let's continue with our project. In this video, I want to do some scaling, because analyzing the box plots we made is not enough. A problem arises when the predictors have different ranges: the impact on the response variable of a feature having a greater numeric range can be larger than that of a feature having a less dramatic range, and this could in turn impact the prediction accuracy. Our goal is to improve predictive accuracy and not allow a particular feature to dominate the prediction due to a large numeric value range. Hence, we may need to scale the values of the different features so that they fall within a common range. Through this statistical procedure, it becomes possible to compare variables belonging to different distributions, as well as variables expressed in different units. So remember, it's good practice to rescale the data before training a deep learning algorithm. With rescaling, the data units are eliminated, allowing you to easily compare data from different locations. So in this case, we will use min-max normalization, a common feature scaling technique, to get the scaled data in the range from zero to one. Before that, I want to write down the formula: x_scaled equals x minus x_min, divided by x_max minus x_min.
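The formula above can be sketched by hand with NumPy; the array `values` below is a hypothetical stand-in for one feature column, just to show that the result always lands in [0, 1]:

```python
import numpy as np

# Min-max scaling by hand: x_scaled = (x - x_min) / (x_max - x_min).
# "values" is a made-up 1-D array standing in for a single feature.
values = np.array([10.0, 20.0, 35.0, 50.0])

x_min, x_max = values.min(), values.max()
scaled = (values - x_min) / (x_max - x_min)

print(scaled)  # the smallest value maps to 0, the largest to 1
```

The smallest value always maps to exactly 0 and the largest to exactly 1, which is what makes features with very different ranges directly comparable.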
So to perform feature scaling, we can use the sklearn.preprocessing package available in the scikit-learn library. The sklearn.preprocessing package provides several common utility functions and transformer classes to change the raw features into a representation that best suits our needs. So we need to import our function: from sklearn.preprocessing import MinMaxScaler. To scale each feature between a given minimum and maximum value, in our case between zero and one, so that the maximum value of each feature is scaled to unit size, the MinMaxScaler function can be used: scaler = MinMaxScaler(). So now, just to have confirmation of what we are doing, we print the parameters that will be used for the rescaling. So let's print scaler.fit on our data. I got an error: the name is not defined. So I fast forward, because it takes quite a bit of time. Let me run this cell and... we got an error here; the variable is actually called Data.
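A minimal sketch of this step, assuming a small made-up DataFrame in place of the video's dataset (in the video, the variable is called `Data`):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the project's dataset; column names are illustrative.
Data = pd.DataFrame({"a": [1.0, 3.0, 5.0], "b": [10.0, 20.0, 40.0]})

scaler = MinMaxScaler()     # default feature_range is (0, 1)
print(scaler.fit(Data))     # fit() learns the parameters and returns the scaler

# The parameters that will be used for the rescaling:
print(scaler.data_min_)     # per-feature minimum seen during fit
print(scaler.data_max_)     # per-feature maximum seen during fit
```

Calling `fit` does not change the data yet; it only computes and stores the per-feature minimum and maximum that the later transform will use.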
So now, with that resolved, the fit method computes the minimum and maximum to be used for the later scaling. Now we can scale the features. So let me add a new cell of code here, and we write: DataScaled equals scaler.fit_transform, and in here I pass the Data that we loaded. Let me run this cell. And there is no mistake: fit_transform first fits to the data and then transforms it. Since a NumPy array of the same shape is returned, it is advisable to report the results in the starting format, a pandas DataFrame, at least for comparison purposes. So let's do it: we call pd.DataFrame on DataScaled, and in here we pass the columns, so DataScaled, and then columns equal to the list of feature names. So let me run this cell and... I got an error. I mistyped the name of the column list; after correcting it, we don't get any error. So, to verify the transformation was carried out correctly, we need to print the basic statistics that we already calculated. So let's print these statistics. We write summary equals
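The two statements above can be sketched as follows, again with a hypothetical DataFrame standing in for the real dataset (the actual column-name list in the video is not legible, so `Data.columns` is used here instead):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative stand-in for the project's dataset.
Data = pd.DataFrame({"a": [1.0, 3.0, 5.0], "b": [10.0, 20.0, 40.0]})

scaler = MinMaxScaler()
DataScaled = scaler.fit_transform(Data)   # fits, then transforms; returns a NumPy array

# Wrap the array back into a DataFrame so it keeps the original column names.
DataScaled = pd.DataFrame(DataScaled, columns=Data.columns)
print(DataScaled)
```

Wrapping the result back into a DataFrame keeps the column labels, which makes the later side-by-side comparison with the original statistics much easier.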
DataScaled.describe, and then we print the summary. And thank you, we got our results. So the count is one thousand twenty-nine, one thousand twenty-nine, one thousand twenty-nine, and so on. Now, every variable is included in the range between zero and one; all features have values between zero and one. So, as you can see, the minimum value is zero and the maximum value is one. And now we check the results by plotting the variable box plots again, with seaborn: sns.boxplot on DataScaled. So this is our new box plot, and this just makes visual analysis much easier. Now everything is clearer: it is possible to make a comparison between the predictors, and we can see which features have a greater rate of variability in the data. In addition, the possible outliers, the isolated points, are more highlighted. And that is all for this video. I hope you enjoyed it, and I will see you in the next video.
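The verification step can be sketched like this, using the same hypothetical small DataFrame as a stand-in for the real dataset (the seaborn box plot is shown commented out, since it needs a display to render):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative stand-in for the project's dataset.
Data = pd.DataFrame({"a": [1.0, 3.0, 5.0], "b": [10.0, 20.0, 40.0]})
DataScaled = pd.DataFrame(MinMaxScaler().fit_transform(Data), columns=Data.columns)

# Basic statistics: after scaling, every min should be 0 and every max 1.
summary = DataScaled.describe()
print(summary)

# Box plot of the scaled variables (uncomment in a notebook with seaborn installed):
# import seaborn as sns
# sns.boxplot(data=DataScaled)
```

In the `describe` output, the `min` row should read 0 and the `max` row 1 for every column; on the real dataset the `count` row would show the number of observations instead of 3.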