Now, let's continue with our project. In this video, I want to do some scaling, because analyzing the box plots we made is not enough. A problem arises when the predictors have different ranges: the impact on the response variable of a feature having a greater numeric range can be larger than that of a feature having a less dramatic range, and this could in turn impact the prediction accuracy. Our goal is to improve predictive accuracy and not allow a particular feature to dominate the prediction due to a large numeric value range. Hence, we may need to scale the values of the different features so that they fall within a common range. Through this statistical procedure, it becomes possible to compare variables belonging to different distributions, as well as variables expressed in different units. So remember, it's good practice to rescale the data before training a deep learning algorithm. With rescaling, the data units are eliminated, allowing you to easily compare data from different locations. So in this case, we will use min-max normalization, a common feature scaling technique, to get the scaled data in the range from zero to one. Before that, I want to write down the formula: x_scaled equals x minus x_min, divided by x_max minus x_min.
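The formula above can be sketched by hand with NumPy; the array `values` below is a hypothetical stand-in for one feature column, just to show that the result always lands in [0, 1]:

```python
import numpy as np

# Min-max scaling by hand: x_scaled = (x - x_min) / (x_max - x_min).
# "values" is a made-up 1-D array standing in for a single feature.
values = np.array([10.0, 20.0, 35.0, 50.0])

x_min, x_max = values.min(), values.max()
scaled = (values - x_min) / (x_max - x_min)

print(scaled)  # the smallest value maps to 0, the largest to 1
```

The smallest value always maps to exactly 0 and the largest to exactly 1, which is what makes features with very different ranges directly comparable.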
So to perform feature scaling, we can use the sklearn.preprocessing package available in the scikit-learn library. The sklearn.preprocessing package provides several common utility functions and transformer classes to change the raw features into a representation that best suits our needs. So we need to import our function: from sklearn.preprocessing import MinMaxScaler. To scale each feature between a given minimum and maximum value, in our case between zero and one, so that the maximum value of each feature is scaled to unit size, the MinMaxScaler function can be used: scaler = MinMaxScaler(). So now, just to have confirmation of what we are doing, we print the parameters that will be used for the rescaling. So let's print scaler.fit on our data. I got an error: the name is not defined. So I fast forward, because it takes quite a bit of time. Let me run this cell and... we got an error here; the variable is actually called Data.
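A minimal sketch of this step, assuming a small made-up DataFrame in place of the video's dataset (in the video, the variable is called `Data`):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the project's dataset; column names are illustrative.
Data = pd.DataFrame({"a": [1.0, 3.0, 5.0], "b": [10.0, 20.0, 40.0]})

scaler = MinMaxScaler()     # default feature_range is (0, 1)
print(scaler.fit(Data))     # fit() learns the parameters and returns the scaler

# The parameters that will be used for the rescaling:
print(scaler.data_min_)     # per-feature minimum seen during fit
print(scaler.data_max_)     # per-feature maximum seen during fit
```

Calling `fit` does not change the data yet; it only computes and stores the per-feature minimum and maximum that the later transform will use.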
So now, with that resolved, the fit method computes the minimum and maximum to be used for the later scaling. Now we can scale the features. So let me add a new cell of code here, and we write: DataScaled equals scaler.fit_transform, and in here I pass the Data that we loaded. Let me run this cell. And there is no mistake: fit_transform first fits to the data and then transforms it. Since a NumPy array of the same shape is returned, it is advisable to report the results in the starting format, a pandas DataFrame, at least for comparison purposes. So let's do it: we call pd.DataFrame on DataScaled, and in here we pass the columns, so DataScaled, and then columns equal to the list of feature names. So let me run this cell and... I got an error. I mistyped the name of the column list; after correcting it, we don't get any error. So, to verify the transformation was carried out correctly, we need to print the basic statistics that we already calculated. So let's print these statistics. We write summary equals
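The two statements above can be sketched as follows, again with a hypothetical DataFrame standing in for the real dataset (the actual column-name list in the video is not legible, so `Data.columns` is used here instead):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative stand-in for the project's dataset.
Data = pd.DataFrame({"a": [1.0, 3.0, 5.0], "b": [10.0, 20.0, 40.0]})

scaler = MinMaxScaler()
DataScaled = scaler.fit_transform(Data)   # fits, then transforms; returns a NumPy array

# Wrap the array back into a DataFrame so it keeps the original column names.
DataScaled = pd.DataFrame(DataScaled, columns=Data.columns)
print(DataScaled)
```

Wrapping the result back into a DataFrame keeps the column labels, which makes the later side-by-side comparison with the original statistics much easier.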
DataScaled.describe, and then we print the summary. And thank you, we got our results. So the count is one thousand twenty-nine, one thousand twenty-nine, one thousand twenty-nine, and so on. Now, every variable is included in the range between zero and one; all features have values between zero and one. So, as you can see, the minimum value is zero and the maximum value is one. And now we check the results by plotting the variable box plots again, with seaborn: sns.boxplot on DataScaled. So this is our new box plot, and this just makes visual analysis much easier. Now everything is clearer: it is possible to make a comparison between the predictors, and we can see which features have a greater rate of variability in the data. In addition, the possible outliers, the isolated points, are more highlighted. And that is all for this video. I hope you enjoyed it, and I will see you in the next video.
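The verification step can be sketched like this, using the same hypothetical small DataFrame as a stand-in for the real dataset (the seaborn box plot is shown commented out, since it needs a display to render):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative stand-in for the project's dataset.
Data = pd.DataFrame({"a": [1.0, 3.0, 5.0], "b": [10.0, 20.0, 40.0]})
DataScaled = pd.DataFrame(MinMaxScaler().fit_transform(Data), columns=Data.columns)

# Basic statistics: after scaling, every min should be 0 and every max 1.
summary = DataScaled.describe()
print(summary)

# Box plot of the scaled variables (uncomment in a notebook with seaborn installed):
# import seaborn as sns
# sns.boxplot(data=DataScaled)
```

In the `describe` output, the `min` row should read 0 and the `max` row 1 for every column; on the real dataset the `count` row would show the number of observations instead of 3.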