1 00:00:00,420 --> 00:00:06,330 In this video, we're going to discuss autocorrelation and partial autocorrelation. 2 00:00:08,260 --> 00:00:14,790 Earlier, we had discussed the autoregression model, where we tried to find the relationship between the 3 00:00:14,790 --> 00:00:18,910 present value of the variable and the historical values of the variable. 4 00:00:21,250 --> 00:00:29,260 But we also had a question: how do we know how many lag values or historical values should 5 00:00:29,260 --> 00:00:29,770 we use? 6 00:00:31,060 --> 00:00:35,380 This is where autocorrelation and partial autocorrelation will help us. 7 00:00:37,630 --> 00:00:44,470 Correlation between two variables is basically a relationship or a connection between the two. 8 00:00:45,610 --> 00:00:52,180 It is measured by Pearson's correlation coefficient, which ranges from minus one to one. 9 00:00:52,960 --> 00:01:00,670 If the value of this number is one, this means that the relationship between the two variables is positive. 10 00:01:02,410 --> 00:01:08,170 Positive means that if X increases, Y also increases. 11 00:01:09,340 --> 00:01:12,670 And if X decreases, Y also decreases. 12 00:01:14,930 --> 00:01:15,890 The other extreme 13 00:01:16,950 --> 00:01:19,080 is if the value is minus one. 14 00:01:20,520 --> 00:01:29,790 It means that if X increases, Y decreases, and if X decreases, Y increases; that is, both will move in 15 00:01:29,790 --> 00:01:30,900 opposite directions. 16 00:01:31,950 --> 00:01:33,330 Both are still correlated. 17 00:01:33,930 --> 00:01:38,190 There is a relationship between the two, but the correlation is negative. 18 00:01:39,910 --> 00:01:47,500 But when the correlation coefficient is zero, this means that there is no linear relationship between these 19 00:01:47,560 --> 00:01:48,430 two variables. 20 00:01:49,770 --> 00:01:57,840 So we cannot say what will happen if X increases: Y may increase or it may decrease. 
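The two extremes described here can be checked with a small sketch. The function name `pearson` and the sample lists are made up for illustration; this is a plain-Python version of the textbook formula, not any particular library's implementation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data (assumed, not from the video).
x = [1, 2, 3, 4, 5]
up = [2, 4, 6, 8, 10]      # rises whenever x rises
down = [10, 8, 6, 4, 2]    # falls whenever x rises

r_up = pearson(x, up)      # close to +1
r_down = pearson(x, down)  # close to -1
print(r_up, r_down)
```

A coefficient near zero would only say that there is no straight-line relationship in the sample; it does not rule out other kinds of dependence.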
21 00:01:59,780 --> 00:02:03,230 So this is what correlation is between two variables. 22 00:02:04,400 --> 00:02:11,270 But when it comes to time series, we are trying to find the correlation of the variable with its own 23 00:02:11,420 --> 00:02:19,940 lag values, which is why, like autoregression, this correlation is called autocorrelation. 24 00:02:20,750 --> 00:02:25,100 That is, correlation with itself, its own lag values. 25 00:02:26,270 --> 00:02:27,470 Now, how do we use it? 26 00:02:28,720 --> 00:02:33,160 To use it, we first find the correlation with all the lag values. 27 00:02:33,400 --> 00:02:35,230 Suppose this is our data. 28 00:02:35,800 --> 00:02:39,320 The number of bullets per day. In the first table, 29 00:02:39,430 --> 00:02:40,990 we have the lag 1 values. 30 00:02:42,850 --> 00:02:47,950 We find the correlation between the original values and the lag 1 values. 31 00:02:49,600 --> 00:02:53,710 This correlation is called autocorrelation of lag one. 32 00:02:55,130 --> 00:02:57,200 This is coming out to 0.8. 33 00:02:58,730 --> 00:03:00,770 Then we find lag 2 values. 34 00:03:02,120 --> 00:03:04,910 This is the additional column with lag 2 values. 35 00:03:06,900 --> 00:03:11,580 And then we find the autocorrelation of original values with lag 2 values. 36 00:03:12,150 --> 00:03:14,760 And that is the autocorrelation of lag 2. 37 00:03:16,400 --> 00:03:18,830 And so on as far as we can go. 38 00:03:19,640 --> 00:03:23,660 So we find autocorrelation values for all the lag values. 39 00:03:25,730 --> 00:03:32,240 Then we plot all of these autocorrelation values against the corresponding lag values. 40 00:03:34,390 --> 00:03:37,990 This is what we get when we plot all of these autocorrelation values. 41 00:03:38,890 --> 00:03:42,430 This is called the autocorrelation function plot. 42 00:03:44,140 --> 00:03:47,290 So it is the ACF plot. In this graph, 43 00:03:47,550 --> 00:03:50,590 the lag values are on the x axis. 
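The "add a lagged column, then correlate" procedure can be sketched as follows. The series `daily_values` is invented for illustration, and `lag_autocorrelation` is a hypothetical helper name. Library routines commonly use a slightly different estimator (one overall mean and variance for every lag), so their numbers may not match this sketch exactly.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def lag_autocorrelation(series, k):
    """Correlate the series with a copy of itself shifted back by k steps."""
    if k == 0:
        return 1.0            # a series is perfectly correlated with itself
    original = series[k:]     # values that have a partner k steps earlier
    lagged = series[:-k]      # those partners (the "lag k" column)
    return pearson(original, lagged)

# Made-up daily counts, only to exercise the function.
daily_values = [12, 15, 14, 18, 21, 19, 24, 27, 25, 30, 33, 31]

# One autocorrelation value per lag; these are the heights plotted in an ACF plot.
acf_values = [lag_autocorrelation(daily_values, k) for k in range(0, 5)]
for lag, value in enumerate(acf_values):
    print(lag, round(value, 3))
```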
44 00:03:51,620 --> 00:03:54,010 So this first line is at zero. 45 00:03:54,170 --> 00:03:56,180 That is, it is for lag zero. 46 00:03:57,170 --> 00:04:00,170 The second line is for lag one and so on. 47 00:04:02,670 --> 00:04:05,870 On the Y axis, we have the correlation coefficient value. 48 00:04:07,660 --> 00:04:11,140 So with the lag zero value, that is, with itself, 49 00:04:11,380 --> 00:04:18,040 the correlation coefficient is plus one, which is obvious. With the lag one value, 50 00:04:18,430 --> 00:04:19,870 it is nearly 0.8. 51 00:04:21,760 --> 00:04:22,690 With lag 2, 52 00:04:22,840 --> 00:04:24,760 it is nearly 0.6, and so on. 53 00:04:25,510 --> 00:04:29,860 The colored cone at the bottom, this colored cone, 54 00:04:31,180 --> 00:04:34,970 this is called the 95 percent confidence interval cone. 55 00:04:36,850 --> 00:04:45,640 Basically, a point outside this cone means that we are more than 95 percent confident that there is 56 00:04:45,640 --> 00:04:47,830 a correlation between these variables. 57 00:04:48,820 --> 00:04:52,870 And this correlation is not just a random statistical fluctuation. 58 00:04:55,090 --> 00:04:59,090 So as long as these values are outside this cone, 59 00:04:59,890 --> 00:05:03,580 we can take that many lag values into consideration. 60 00:05:05,320 --> 00:05:07,800 This works well for the moving average part, 61 00:05:08,240 --> 00:05:12,760 where we have to find the relationship between lags of residuals. 62 00:05:13,810 --> 00:05:15,730 So ACF works well there. 63 00:05:17,180 --> 00:05:23,480 But when we are looking at autoregression, there is some ambiguity in the ACF plot. 64 00:05:24,410 --> 00:05:32,800 For example, when we find the lag one autocorrelation, it is 0.8. 65 00:05:33,890 --> 00:05:42,650 Now, when we see the lag 2 autocorrelation, lag 2 already has some impact on the lag 1 variable. 
66 00:05:44,130 --> 00:05:53,070 So the impact of the lag 2 variable was already accommodated to some extent when we saw the lag 1 autocorrelation. 67 00:05:55,100 --> 00:06:03,350 Ideally, to find the correlation between this series and its lag 2 values, we should remove these 68 00:06:03,350 --> 00:06:08,390 direct and indirect impacts of lag 1 values on the original series. 69 00:06:10,300 --> 00:06:17,260 Similarly, when we are finding correlation between the original series and lag 3 values, we should 70 00:06:17,260 --> 00:06:20,260 remove the effect of lag 1 and lag 2 values. 71 00:06:22,510 --> 00:06:28,360 When we remove these effects of intervening observations, then the correlation coefficient is called 72 00:06:28,540 --> 00:06:30,760 the partial autocorrelation coefficient. 73 00:06:34,710 --> 00:06:38,410 And to calculate this also, we do not need to know the maths behind it. 74 00:06:39,570 --> 00:06:45,540 Our software finds out the partial autocorrelation coefficient values for all the lag values. 75 00:06:48,290 --> 00:06:51,200 We can also plot this on a graph to visualize them. 76 00:06:54,420 --> 00:07:02,940 The plot of partial autocorrelation values is called a PACF plot, that is, partial autocorrelation 77 00:07:02,940 --> 00:07:03,660 function plot. 78 00:07:05,140 --> 00:07:08,810 PACF gives a much clearer view of p. 79 00:07:09,100 --> 00:07:13,390 That is, the number of lag values we should use in autoregression. 80 00:07:15,050 --> 00:07:20,660 So when we are trying to find out the number of lagged values which we should use in the autoregression 81 00:07:20,660 --> 00:07:26,670 model, we have to look at the partial autocorrelation function. When the 82 00:07:26,690 --> 00:07:32,430 PACF value goes below the 95 percent confidence interval cone, 83 00:07:33,290 --> 00:07:39,000 that is the point up to which we will use the lagged values for the autoregression model. 
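For anyone curious about the maths that the software hides, one standard way to get partial autocorrelations from ordinary autocorrelations is the Durbin-Levinson recursion. The sketch below is an illustration under assumptions: `pacf_from_acf` is a hypothetical name, and the input is an idealized set of autocorrelations (0.8, 0.64, 0.512) chosen so that lag 1 matches the 0.8 in the example. It is not necessarily the method any given package uses.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations via the Durbin-Levinson recursion.

    rho[k] is the ordinary autocorrelation at lag k, with rho[0] == 1.
    Returns a list of the same length; entry k is the lag-k partial value.
    """
    max_lag = len(rho) - 1
    pacf = [1.0]
    prev = []  # AR coefficients of the order-(k-1) fit; prev[j-1] belongs to lag j
    for k in range(1, max_lag + 1):
        # Part of rho[k] not already explained by lags 1..k-1.
        num = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
        phi_kk = num / den
        # Update the lower-order coefficients for the order-k fit.
        curr = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        curr.append(phi_kk)
        pacf.append(phi_kk)
        prev = curr
    return pacf

# Idealized autocorrelations that decay as 0.8 ** lag (assumed for illustration).
rho = [1.0, 0.8, 0.64, 0.512]
partial = pacf_from_acf(rho)
print([round(v, 6) for v in partial])
```

With these inputs the ACF decays slowly, but the partial values drop to about zero after lag 1, because the lag 2 and lag 3 correlations are fully explained by the lag 1 relationship.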
84 00:07:39,990 --> 00:07:46,280 But when we are trying to find out the lag values of the residuals for moving average, we will 85 00:07:46,280 --> 00:07:47,600 use the ACF function. 86 00:07:47,960 --> 00:07:55,220 We will see the point where the ACF function value goes below the 95 percent confidence 87 00:07:55,220 --> 00:07:55,910 interval level. 88 00:07:56,780 --> 00:08:01,890 And that point will be used as the number of lag values for the moving average model. 89 00:08:02,450 --> 00:08:03,060 So that's all. 90 00:08:03,260 --> 00:08:06,770 This is how ACF and PACF plots are used.
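The reading-off rule for both plots can be sketched in a few lines. Everything here is an assumption for illustration: the helper name `significant_lags`, the ACF and PACF numbers, the sample size of 100, and the flat band of plus or minus 1.96 divided by the square root of the number of observations. Plotting libraries may draw a cone that widens with the lag instead of a flat band.

```python
import math

def significant_lags(values, n_obs, z=1.96):
    """Count lags 1, 2, ... whose value lies outside the approximate 95% band.

    values[0] is lag 0 and is skipped. Counting stops at the first lag
    that falls inside the band of +/- z / sqrt(n_obs).
    """
    bound = z / math.sqrt(n_obs)
    count = 0
    for v in values[1:]:
        if abs(v) <= bound:
            break
        count += 1
    return count

# Hypothetical values read off an ACF plot and a PACF plot of 100 observations.
acf_values = [1.0, 0.8, 0.6, 0.45, 0.1]
pacf_values = [1.0, 0.8, 0.05, -0.02]

ma_lags = significant_lags(acf_values, n_obs=100)   # candidate moving average lags
ar_lags = significant_lags(pacf_values, n_obs=100)  # candidate autoregression lags
print(ma_lags, ar_lags)
```

The counts are starting points for the moving average and autoregression orders; in practice they are usually checked against how well the fitted model performs.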