1
00:00:00,960 --> 00:00:08,720
In this lecture, we are going to learn how to implement walk forward validation for any time

2
00:00:08,730 --> 00:00:09,520
series technique.

3
00:00:10,470 --> 00:00:16,110
So here we will implement walk forward validation for an autoregression model.

4
00:00:18,690 --> 00:00:20,250
This is what we are going to do.

5
00:00:21,330 --> 00:00:25,410
We will run a for loop over all the values of test.

6
00:00:30,160 --> 00:00:32,180
So suppose this is our test set.

7
00:00:32,830 --> 00:00:34,720
We have a test size of five units.

8
00:00:35,530 --> 00:00:41,020
We will create an individual model for each test value prediction.

9
00:00:42,420 --> 00:00:51,480
Then, in the next step of the for loop, we will increase our training dataset to include the new information.

10
00:00:51,780 --> 00:00:59,760
Then we will again create a new model and forecast one time period ahead. In the next loop,

11
00:01:00,060 --> 00:01:06,950
we will again include the additional data in our training data and create a new model.

12
00:01:07,410 --> 00:01:11,670
And then, again, forecast one time period ahead.

13
00:01:12,230 --> 00:01:17,970
We will do this n times, where n is the number of test values.

14
00:01:18,000 --> 00:01:23,880
So if we want to predict five test values, we will run this loop five times.

15
00:01:24,000 --> 00:01:26,070
If we want to predict six values,

16
00:01:26,190 --> 00:01:28,440
we will run this loop six times.

17
00:01:29,610 --> 00:01:34,500
So let's look at our for loop once more.

18
00:01:36,840 --> 00:01:45,810
Before that, I have loaded my data and created a train set and a test set; the test set consists of

19
00:01:45,810 --> 00:01:46,740
seven values.

20
00:01:48,640 --> 00:01:50,440
So this is what we are going to do.

21
00:01:51,460 --> 00:01:54,040
We are creating a new dataset called data.

22
00:01:54,850 --> 00:02:01,690
This is the dataset to which we are going to add our test values for future predictions.

23
00:02:02,290 --> 00:02:10,930
So we are initializing this data as train, and we are also creating a blank list with the name predict.

24
00:02:11,580 --> 00:02:12,400
In this list,

25
00:02:12,430 --> 00:02:16,150
we will store the predicted values of y.

26
00:02:18,550 --> 00:02:25,180
So here I am starting the for loop: we loop over each value in test.

27
00:02:25,900 --> 00:02:33,510
So if my test has seven values, this for loop will run seven times; if my test

28
00:02:33,510 --> 00:02:37,570
has five values, this for loop will run five times.

29
00:02:39,430 --> 00:02:45,240
So let's analyze the first run. For the first run,

30
00:02:45,520 --> 00:02:54,760
my data is equal to train. We will create a model object with autoregression on our data.

31
00:02:55,450 --> 00:02:58,890
Again, since this is the first run, my data is equal to train.

32
00:02:59,620 --> 00:03:03,280
And then we are fitting the model using model.fit.

33
00:03:04,450 --> 00:03:12,510
Then we are predicting y with start equal to the length of data and end equal to the length of train

34
00:03:12,560 --> 00:03:14,440
plus the length of test minus one.

35
00:03:15,700 --> 00:03:19,120
This is similar to what we did for the autoregression model.

36
00:03:21,100 --> 00:03:26,110
So here, in the first run, my y will consist of seven values,

37
00:03:26,170 --> 00:03:31,040
since my start is the start of test and my end is the end of test.

38
00:03:31,560 --> 00:03:36,170
So my y here will consist of seven values in the first run.

39
00:03:37,970 --> 00:03:41,530
Next, we are picking the first value of this

40
00:03:41,750 --> 00:03:42,620
y data frame.

41
00:03:44,410 --> 00:03:51,970
And then we are displaying that value, and we are also appending that value to our predict list.

42
00:03:52,270 --> 00:03:54,700
So this will be my first value.

43
00:03:55,040 --> 00:03:59,530
This is the first run. Then, to the input of our

44
00:03:59,590 --> 00:04:00,320
model,

45
00:04:00,460 --> 00:04:01,390
which is data,

46
00:04:01,750 --> 00:04:04,840
I am appending the value of test data.

47
00:04:06,190 --> 00:04:13,840
So suppose my train data, or data, had a hundred records; at the end of this loop,

48
00:04:14,470 --> 00:04:18,160
my train data, that is, this data, will have

49
00:04:18,190 --> 00:04:19,540
a hundred and one records.

50
00:04:21,040 --> 00:04:26,830
So we are including this additional data

51
00:04:27,050 --> 00:04:27,640
in our

52
00:04:27,640 --> 00:04:28,330
training data.

53
00:04:33,950 --> 00:04:35,590
Now, let's analyze the second loop.

54
00:04:37,420 --> 00:04:43,840
So now my data has all the training data, plus the first value of test data.

55
00:04:44,650 --> 00:04:47,410
Now I'm creating my model on that data.

56
00:04:50,020 --> 00:04:55,670
Now we are predicting y using the length of data as the start and then the endpoint.

57
00:04:56,050 --> 00:05:00,310
So this time my y variable will contain six values.

58
00:05:02,680 --> 00:05:05,700
We have already predicted one y,

59
00:05:06,130 --> 00:05:07,180
for t plus one.

60
00:05:07,330 --> 00:05:13,060
So now, this time, the y will consist of values from t plus two to t plus seven.

61
00:05:14,200 --> 00:05:21,670
Then again, we are picking the first value of this predicted y, and we are also appending this value

62
00:05:21,790 --> 00:05:24,280
to our predict list.

63
00:05:25,270 --> 00:05:32,770
So now my predict list, at the end of the second run of the for loop, will contain the t plus one y value, which

64
00:05:32,770 --> 00:05:38,630
was coming from the first run, and the t plus two y value, which is coming from the second run.

65
00:05:41,350 --> 00:05:46,790
And again, I am appending the test value into my dataset.

66
00:05:48,330 --> 00:05:56,320
We will run this same sequence seven times, and at the end of this we will have a predict list which will

67
00:05:56,320 --> 00:05:59,350
contain the walk forward validation y values.

68
00:06:00,220 --> 00:06:03,370
So let's run this.

69
00:06:10,470 --> 00:06:12,090
You can see we are getting these

70
00:06:12,290 --> 00:06:17,010
values as output, since we have also included the print statement in between.

71
00:06:18,180 --> 00:06:22,440
If we look at the predict list, this will also contain the same values.

72
00:06:22,650 --> 00:06:28,890
You can see we have a list of seven values which we got from the walk forward validation.

73
00:06:30,150 --> 00:06:34,320
Now, let's look at the MSE value of this validation.

74
00:06:34,560 --> 00:06:40,230
So earlier, if you remember, for autoregression we were getting an MSE value of 1.5.

75
00:06:41,040 --> 00:06:42,210
Now, let's see

76
00:06:44,620 --> 00:06:48,520
what value we are going to get from walk forward validation.

77
00:06:49,510 --> 00:06:52,930
You can see the MSE value is 1.45.

78
00:06:53,500 --> 00:06:58,960
So we have decreased our error by using walk forward validation.

79
00:06:59,650 --> 00:07:07,030
If you compare it with the naive forecast, in the naive forecast we were getting an MSE value of more than three.

80
00:07:07,600 --> 00:07:09,700
So this is a significant reduction.

81
00:07:10,030 --> 00:07:17,980
And you can also see that we are getting improved accuracy using walk forward validation as compared

82
00:07:17,980 --> 00:07:21,460
to our normal single model for all the values.

83
00:07:23,270 --> 00:07:25,720
Now let's plot this on a graph as well.

84
00:07:26,260 --> 00:07:32,310
You can see we have actual values in blue and the predicted values in red.

85
00:07:34,470 --> 00:07:42,210
So the MSE value computed from walk forward validation is a much more reliable metric for evaluating model

86
00:07:42,240 --> 00:07:42,930
performance.

87
00:07:45,470 --> 00:07:52,070
And we should always select the model which is giving us better accuracy on walk forward validation instead

88
00:07:52,190 --> 00:07:53,780
of a single model validation.

89
00:07:57,820 --> 00:07:59,350
So that's how we should do

90
00:07:59,470 --> 00:08:00,790
walk forward validation

91
00:08:01,030 --> 00:08:02,200
for time series data.

92
00:08:04,330 --> 00:08:08,740
You can see that we can use this method for other models as well.

93
00:08:09,340 --> 00:08:17,980
So if we are applying moving average or ARIMA, we can just edit this model and model.fit part to implement

94
00:08:18,040 --> 00:08:19,360
walk forward validation.

95
00:08:19,690 --> 00:08:20,100
Thank you.
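
The loop described in this lecture can be sketched roughly as follows. This is a minimal illustration, not the notebook used in the lecture: to keep it runnable with only the standard library, it swaps the library autoregression model for a hand-written least-squares AR(1) fit, and the names fit_ar1, walk_forward, and mse, as well as the sample series, are hypothetical. It also forecasts only one step ahead in each run, which is the single value the lecture keeps from each prediction.

```python
def fit_ar1(history):
    """Least-squares fit of y[t] = a + b * y[t-1]; returns (a, b)."""
    x = history[:-1]          # lagged values
    y = history[1:]           # values one step later
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:            # constant series: fall back to the mean
        return mean_y, 0.0
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return a, b


def walk_forward(train, test):
    """Refit on all data seen so far, forecast one step, then add the actual value."""
    data = list(train)        # starts equal to train, grows by one value per run
    predict = []              # one-step-ahead forecasts, one per test value
    for actual in test:
        a, b = fit_ar1(data)              # new model on the current data
        y_next = a + b * data[-1]         # forecast one time period ahead
        predict.append(y_next)
        data.append(actual)               # include the observed test value
    return predict


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((x - p) ** 2 for x, p in zip(actual, predicted)) / len(actual)


# Hypothetical sample data: a short series split into train and test.
series = [float(2 * i + 1) for i in range(12)]
train, test = series[:-5], series[-5:]

predict = walk_forward(train, test)
print(predict)
print(mse(test, predict))
```

To use a different model, such as moving average or ARIMA, only the two lines inside the loop that fit the model and produce the one-step forecast would need to change.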