1 00:00:00,600 --> 00:00:04,220 In this section, we are going to cover moving average smoothing. 2 00:00:04,980 --> 00:00:10,680 You got a glimpse of this when we saw the window-type features in the feature engineering lecture. 3 00:00:12,270 --> 00:00:15,360 Now, we will cover this in a more structured format. 4 00:00:17,340 --> 00:00:20,100 So we will cover these three things. 5 00:00:20,610 --> 00:00:24,450 First, what is a moving average and why do we use it? 6 00:00:25,070 --> 00:00:31,020 Then we will see the two types of moving averages: centered window moving average and trailing window 7 00:00:31,020 --> 00:00:31,770 moving average. 8 00:00:33,770 --> 00:00:40,760 Lastly, we will see how moving average is used for feature engineering and how it is used for prediction 9 00:00:40,760 --> 00:00:41,330 purposes. 10 00:00:42,050 --> 00:00:42,770 So let's start. 11 00:00:45,230 --> 00:00:53,330 Moving average smoothing is simply creating a new series where the values are averages of the raw observations 12 00:00:54,080 --> 00:00:55,520 in the original time series. 13 00:00:57,050 --> 00:01:01,230 Let us see an example to understand this. 14 00:01:01,730 --> 00:01:06,380 Here, I have monthly average temperature data. In the month of January, 15 00:01:06,860 --> 00:01:10,030 we had an average temperature of 39 degrees Fahrenheit. 16 00:01:11,450 --> 00:01:14,660 In February, it was 42. In the month of March, 17 00:01:14,930 --> 00:01:16,670 it was 50 degrees Fahrenheit. 18 00:01:18,200 --> 00:01:23,390 Now, in March, we can find the average of the last three months. 19 00:01:24,620 --> 00:01:28,880 The average temperature of the last three months would be about 44 degrees Fahrenheit. 20 00:01:30,890 --> 00:01:33,110 We can do this for other months. 21 00:01:33,350 --> 00:01:35,130 And this works the same way. 22 00:01:35,420 --> 00:01:41,400 If we have April, we can find the average of February, March and April. 
23 00:01:43,250 --> 00:01:49,520 In the month of May, we can average the values of March, April and May. 24 00:01:49,850 --> 00:01:54,050 This value comes out to be 60. In this way, 25 00:01:54,650 --> 00:02:00,680 we are averaging the last three values of temperature and creating a new series. 26 00:02:02,140 --> 00:02:09,410 The three values which we are averaging are moving along the series as we are moving forward in the 27 00:02:09,410 --> 00:02:19,280 series, and the new series that we get by finding the average of the last few months is the new moving average 28 00:02:19,280 --> 00:02:19,790 series. 29 00:02:21,290 --> 00:02:22,670 But why did we do this? 30 00:02:25,040 --> 00:02:29,300 Let's compare a graph of raw observations and moving averages. 31 00:02:31,700 --> 00:02:37,500 Here you can see that the moving average graph of stock prices is smoother 32 00:02:37,820 --> 00:02:38,770 than the raw data. 33 00:02:41,130 --> 00:02:44,790 It removes the fine-grained variations between time steps. 34 00:02:47,010 --> 00:02:47,790 Our aim 35 00:02:47,880 --> 00:02:57,030 in doing so is to remove the noise and find the underlying 36 00:02:57,030 --> 00:02:57,660 process. 37 00:02:59,560 --> 00:03:05,050 So you can see in this graph, by averaging out the values of the stock price, 38 00:03:06,610 --> 00:03:12,220 we have identified this underlying process in these stock prices. 39 00:03:15,450 --> 00:03:24,640 Now, in our monthly temperature data, for example, we were averaging the last three values. These three 40 00:03:24,640 --> 00:03:29,110 values constitute the window in which we do the averaging. 41 00:03:30,280 --> 00:03:31,960 And this window is moving 42 00:03:32,230 --> 00:03:40,240 as we move along the series. This window has two parameters that we need to fix before we start doing 43 00:03:40,780 --> 00:03:41,980 moving average modeling. 44 00:03:43,780 --> 00:03:46,110 The first parameter is the window width. 
45 00:03:46,660 --> 00:03:51,980 That is the number of values that we want to average over. Here, 46 00:03:52,150 --> 00:03:52,690 the window 47 00:03:52,810 --> 00:03:56,170 width is three because we are averaging three values. 48 00:03:57,400 --> 00:04:01,510 If you want to average the last five values, the window width becomes five. 49 00:04:04,110 --> 00:04:07,610 The second parameter is where you want to position this window. 50 00:04:09,990 --> 00:04:17,160 So right now, at the third observation, we position this window so that we average the last three 51 00:04:17,160 --> 00:04:17,730 values. 52 00:04:18,360 --> 00:04:25,020 That is, we are averaging the first, second and third values to get the moving average value for the third 53 00:04:25,020 --> 00:04:25,620 observation. 54 00:04:27,060 --> 00:04:31,470 So for the month of March, we got the moving average value of 44 55 00:04:31,650 --> 00:04:36,210 by averaging temperature values for January, February and March. 56 00:04:38,330 --> 00:04:40,820 This is called trailing moving average. 57 00:04:42,710 --> 00:04:49,730 On the other hand, if you place the window in such a way that the current time step is at its center, 58 00:04:50,750 --> 00:04:53,480 that will be called centered moving average. 59 00:04:55,360 --> 00:05:02,560 So if we find the average of Jan, Feb and March and assign that moving average value to the centered value, 60 00:05:02,830 --> 00:05:07,030 which is February, then this is called centered moving average. 61 00:05:11,520 --> 00:05:16,140 But as you can note, for centered moving average, 62 00:05:16,380 --> 00:05:20,850 you need to know the next value, the future value, 63 00:05:21,120 --> 00:05:21,630 already. 64 00:05:23,270 --> 00:05:27,090 And since in this course we are trying to predict those future values, 65 00:05:27,390 --> 00:05:30,000 we will not be using the centered moving average. 66 00:05:33,190 --> 00:05:36,400 Now, trailing moving average can be used in two ways. 
67 00:05:38,050 --> 00:05:42,580 One is feature engineering and the other is forecasting. 68 00:05:44,770 --> 00:05:46,840 The concept behind both is simple. 69 00:05:47,530 --> 00:05:56,500 If you want to forecast the series value at time t plus one, you can use the average value of the previous 70 00:05:56,500 --> 00:05:57,430 few time steps. 71 00:05:58,660 --> 00:06:05,020 If you use these average values as a new feature, then it is feature engineering. 72 00:06:06,790 --> 00:06:13,870 On the other hand, you can simply assign this average value also as the forecasted value for time t 73 00:06:13,870 --> 00:06:14,350 plus one. 74 00:06:16,330 --> 00:06:23,830 So, for example, if you want to forecast for the month of June, you can simply use this moving average 75 00:06:23,830 --> 00:06:24,310 value. 76 00:06:24,760 --> 00:06:28,690 That is, 60 as the forecasted value for the month of June. 77 00:06:31,450 --> 00:06:38,860 Of course, this is a very simple and naive method of forecasting, and it assumes that there is no 78 00:06:38,860 --> 00:06:40,930 trend or seasonality in the data. 79 00:06:42,340 --> 00:06:46,720 So in terms of accuracy, this method does not perform very well. 80 00:06:47,620 --> 00:06:54,580 But because of its simplicity and ease of implementation, it is often preferred to get a rough 81 00:06:54,580 --> 00:06:58,210 estimate of future values using the moving average method. 82 00:07:00,640 --> 00:07:04,290 So that's all the theory we need to know in this particular session. 83 00:07:04,390 --> 00:07:10,810 We will find out how to get trailing values and how we use those trailing values as a new feature.
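To make the ideas from this lecture concrete, here is a minimal sketch in plain Python of the trailing and centered moving averages with a window width of three, and of the two uses of the trailing average (as a feature and as a naive forecast). The January to March temperatures are the ones quoted in the lecture. The April and May values are hypothetical, picked only so that the March to May average comes out to the 60 mentioned in the lecture. The function names trailing_ma and centered_ma are illustrative, not from any library.

```python
# Jan-Mar values come from the lecture. Apr and May are HYPOTHETICAL,
# chosen only so that the Mar-May average equals the 60 quoted in the lecture.
temps = {"Jan": 39, "Feb": 42, "Mar": 50, "Apr": 60, "May": 70}


def trailing_ma(values, width):
    """Average of the current value and the (width - 1) values before it."""
    out = []
    for i in range(len(values)):
        if i + 1 < width:
            out.append(None)  # not enough history yet
        else:
            window = values[i + 1 - width : i + 1]
            out.append(sum(window) / width)
    return out


def centered_ma(values, width):
    """Average of a window with the current value in the middle (odd width)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            out.append(None)  # window would run past either end of the series
        else:
            window = values[i - half : i + half + 1]
            out.append(sum(window) / width)
    return out


months = list(temps)
values = list(temps.values())

trailing = trailing_ma(values, 3)  # March gets the Jan-Mar average
centered = centered_ma(values, 3)  # February gets the Jan-Mar average

# Feature engineering: the feature for a month should use only earlier months,
# so the trailing average is shifted forward by one time step.
ma_feature = [None] + trailing[:-1]

# Naive forecast for June (time t + 1): the latest trailing average.
june_forecast = trailing[-1]

for month, raw, tr, ce, feat in zip(months, values, trailing, centered, ma_feature):
    print(month, raw, tr, ce, feat)
print("June forecast:", june_forecast)
```

Note that the centered average for February and the trailing average for March are the same number. Only the time step it is assigned to differs, which is why the centered version needs a future value and is not suitable for forecasting.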