1
00:00:01,350 --> 00:00:08,610
In this lecture, we will learn how to create the simplest model, the naive forecast model, in Python.

2
00:00:10,860 --> 00:00:18,980
In a naive forecast, we simply assume that the last period's value is the forecast for this period.

3
00:00:22,350 --> 00:00:27,690
So let's create this model on our daily minimum temperature data set.

4
00:00:28,620 --> 00:00:31,060
So first, let's import that dataset.

5
00:00:31,380 --> 00:00:34,140
We are creating a data frame, df.

6
00:00:35,040 --> 00:00:37,900
And let's look at the first five values of this df.

7
00:00:39,820 --> 00:00:41,770
You can see that we have two columns.

8
00:00:42,010 --> 00:00:42,940
The first one is the date.

9
00:00:43,060 --> 00:00:44,590
The second one is the temperature.

10
00:00:46,690 --> 00:00:49,030
We have the date values in the date column.

11
00:00:49,240 --> 00:00:52,360
And we have the temperature values in the temperature column.

12
00:00:54,640 --> 00:01:02,020
Now, since we are creating a naive forecast model, that means that for the second of January our

13
00:01:02,020 --> 00:01:03,930
forecast is the previous day's value.

14
00:01:04,000 --> 00:01:06,410
So twenty point seven.

15
00:01:06,910 --> 00:01:09,580
But the actual value is seventeen point nine.

16
00:01:10,120 --> 00:01:15,340
Similarly, the forecast for the third of Jan is seventeen point nine.

17
00:01:15,460 --> 00:01:18,370
And the actual value is eighteen point eight.

18
00:01:19,930 --> 00:01:27,700
So to create the forecasted value, we can simply create a lag value of this temperature data.

19
00:01:29,380 --> 00:01:34,150
So let's do just that. We will use the .shift method to do it.

20
00:01:37,550 --> 00:01:42,470
And we are creating a new column, and we are naming it t.

21
00:01:42,980 --> 00:01:48,710
And this temperature value will act as the actual value at t plus one.
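
The lag step described above can be sketched roughly as follows. This is a minimal illustration, not the lecture's exact notebook: the column names `Date`, `t` and `t+1` and the placeholder dates are assumptions, and only the five temperatures read out in the lecture are used.

```python
import pandas as pd

# Five temperatures quoted in the lecture; the dates are placeholders (assumed).
df = pd.DataFrame({
    "Date": pd.date_range("2000-01-01", periods=5, freq="D"),
    "t+1": [20.7, 17.9, 18.8, 14.6, 15.8],  # actual value for each day
})

# Lag the series by one step: each row's "t" is the previous day's actual value,
# which is exactly the naive forecast for that row.
df["t"] = df["t+1"].shift(1)

print(df.head())
```

The first row has no previous day, so its `t` value is NaN, which is why that row is dropped later.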

22
00:01:50,480 --> 00:02:00,490
Let's run this and again take a look at the head values. You can see that for the second of Jan

23
00:02:00,510 --> 00:02:07,100
the forecasted value is twenty point seven and the actual value is seventeen point nine. For the fifth of Jan,

24
00:02:07,700 --> 00:02:10,160
the forecasted value is fourteen point six.

25
00:02:10,400 --> 00:02:13,940
And the actual value is fifteen point eight.

26
00:02:14,630 --> 00:02:23,720
So you can say that this t plus one is our y variable and this t variable is our x variable, and

27
00:02:24,500 --> 00:02:29,330
the predicted value, or the forecasted value, is the same as the x value.

28
00:02:31,370 --> 00:02:34,100
So let's divide this data into test and train.

29
00:02:35,600 --> 00:02:38,000
We are creating two different data frames.

30
00:02:39,320 --> 00:02:40,290
The first one is the train.

31
00:02:40,850 --> 00:02:51,800
And the second one is the test. We are taking the last seven values as the test data and the remaining

32
00:02:51,800 --> 00:02:53,390
values as the train data.

33
00:02:56,330 --> 00:03:03,470
So here we are selecting from index one to df.shape[0] minus seven.

34
00:03:05,090 --> 00:03:09,070
So we are taking all the values other

35
00:03:09,330 --> 00:03:18,930
than this first value and the last seven values as train data, and the last seven values as test data.

36
00:03:21,020 --> 00:03:25,710
We are ignoring the first value because we have a NaN in the first record.

37
00:03:26,690 --> 00:03:27,950
So let's run this.

38
00:03:28,820 --> 00:03:33,410
Let's look at the first five values of our train dataset.

39
00:03:38,070 --> 00:03:44,230
You can see we have ignored the first row and we have the remaining values.

40
00:03:45,930 --> 00:03:50,670
Again, this is our y variable and this is our x variable. Now let's divide

41
00:03:50,940 --> 00:03:53,460
this train and test data into train X,

42
00:03:53,730 --> 00:03:56,160
train y, and test X,

43
00:03:56,300 --> 00:03:56,940
test y.

44
00:03:59,850 --> 00:04:03,510
For train X, we want this data.

45
00:04:04,320 --> 00:04:12,620
So this is the t column, and for y we want this column, that is the t plus one column.

46
00:04:14,760 --> 00:04:18,420
Similarly, we will do the same thing for the test dataset as well.

47
00:04:22,590 --> 00:04:25,500
So now we have divided our data into four parts.

48
00:04:26,190 --> 00:04:28,340
Train X, train y, test X,

49
00:04:28,560 --> 00:04:29,060
test y.
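
The split into four parts can be sketched like this. It is only an illustration under stated assumptions: the first five temperatures are the ones quoted in the lecture, the remaining values are illustrative placeholders, and the column names `t` and `t+1` are assumed.

```python
import pandas as pd

# First five values are from the lecture; the rest are illustrative placeholders.
temps = [20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8, 17.4, 21.8, 20.0, 16.2, 13.3]
df = pd.DataFrame({"t+1": temps})
df["t"] = df["t+1"].shift(1)  # previous day's value (NaN in the first row)

# Skip row 0 (NaN lag) and hold out the last seven rows as the test set.
train = df[1:df.shape[0] - 7]
test = df[df.shape[0] - 7:]

# x is the lagged value, y is the actual value.
train_X, train_y = train["t"], train["t+1"]
test_X, test_y = test["t"], test["t+1"]

print(len(train), len(test))
```

Because the rows are in time order, slicing by position keeps the test set strictly after the train set.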

50
00:04:32,190 --> 00:04:36,010
Now, usually we train a model on this train data.

51
00:04:36,390 --> 00:04:40,920
And we use the test data to evaluate the performance of that model.

52
00:04:42,240 --> 00:04:49,530
Now, since we are building a naive forecast model, there is no need to create another model; we

53
00:04:49,530 --> 00:04:53,370
can just use the X values as our forecasted values.

54
00:04:55,620 --> 00:05:00,440
So let's create another data frame that is predictions.

55
00:05:01,920 --> 00:05:05,550
And this will contain a copy of the test X dataset.

56
00:05:07,990 --> 00:05:13,640
So these t values will become the prediction values for the test data.

57
00:05:13,750 --> 00:05:15,580
So let's run this.

58
00:05:16,780 --> 00:05:21,650
Now let's look at the predicted values and the actual y values.

59
00:05:26,260 --> 00:05:32,890
So you can see first we have the predicted values and here we have the actual values.

60
00:05:33,100 --> 00:05:41,740
So here also you can see that the actual value for index 3643 is the predicted value

61
00:05:42,070 --> 00:05:44,380
for the record 3644.

62
00:05:44,890 --> 00:05:49,780
So you can see that we have a naive forecast model for our data.

63
00:05:51,010 --> 00:05:53,410
Now, let's look at the error in our predictions.

64
00:05:54,340 --> 00:06:03,340
We will use mean squared error. Mean squared error is just the average of the squared differences

65
00:06:03,340 --> 00:06:05,620
between the predicted values and the y values.

66
00:06:08,260 --> 00:06:13,330
And we are going to use mean_squared_error from sklearn.metrics.

67
00:06:15,280 --> 00:06:19,410
And we are saving this value to a variable named mse.

68
00:06:20,680 --> 00:06:21,580
Let's run this.

69
00:06:23,080 --> 00:06:26,320
Also, you can notice that we have provided two series of data.

70
00:06:26,620 --> 00:06:29,440
First is the actual values and then the predicted values.

71
00:06:30,460 --> 00:06:34,210
So the MSE for this data set is three point four two.
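
The prediction and scoring steps can be sketched as below. This is a rough illustration, assuming the same placeholder data and column names as before (only the first five temperatures come from the lecture), so the printed MSE will not match the value quoted in the lecture. `mean_squared_error` from `sklearn.metrics` takes the actual values first and the predictions second; the manual line shows it is the mean of the squared differences.

```python
import pandas as pd
from sklearn.metrics import mean_squared_error

# First five values are from the lecture; the rest are illustrative placeholders.
temps = [20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8, 17.4, 21.8, 20.0, 16.2, 13.3]
df = pd.DataFrame({"t+1": temps})
df["t"] = df["t+1"].shift(1)

test = df[df.shape[0] - 7:]
test_X, test_y = test["t"], test["t+1"]

# Naive forecast: the prediction is simply a copy of the lagged values.
predictions = test_X.copy()

# Library call and the equivalent hand computation.
mse = mean_squared_error(test_y, predictions)
manual_mse = ((test_y - predictions) ** 2).mean()

print(mse, manual_mse)
```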

72
00:06:35,440 --> 00:06:39,400
And you can also plot the predictions and y on a graph.

73
00:06:40,450 --> 00:06:43,630
So this blue line is the actual values.

74
00:06:43,720 --> 00:06:46,150
And this red line is the predicted values.

75
00:06:46,300 --> 00:06:50,050
We are just using pyplot.plot to plot this data.

76
00:06:52,510 --> 00:06:59,010
Now, why is this MSE value for the naive forecast important? Because we are going to evaluate our

77
00:06:59,040 --> 00:07:02,820
advanced models using this MSE value.

78
00:07:03,640 --> 00:07:11,530
So if our advanced model is giving us an MSE value greater than this value, then we can say that

79
00:07:11,770 --> 00:07:15,610
our model is not able to extract any information from that data.

80
00:07:16,420 --> 00:07:20,250
And you can consider that time series data as a random walk.

81
00:07:20,500 --> 00:07:26,080
Since we are not able to extract any information better than the naive forecast model.

82
00:07:27,640 --> 00:07:34,210
So that is why the naive forecast's MSE value is important, because it will tell you whether your data

83
00:07:34,600 --> 00:07:36,230
is a random walk or not.

84
00:07:37,510 --> 00:07:45,100
If your advanced models, such as an AR model, an MA model or an ARIMA model, are not able to improve on this MSE

85
00:07:45,100 --> 00:07:50,050
value, then you can say that your time series is a random walk.
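
The baseline comparison described above can be sketched as a simple check. The helper `beats_naive`, the model names and the candidate MSE values are hypothetical; only the naive MSE of 3.42 is the value reported in the lecture.

```python
def beats_naive(model_mse: float, naive_mse: float) -> bool:
    """Return True when the candidate model's MSE is lower than the naive baseline's."""
    return model_mse < naive_mse


naive_mse = 3.42  # naive forecast MSE reported in the lecture

# Hypothetical MSE values for two candidate models, for illustration only.
candidate_mses = {"model_a": 2.9, "model_b": 3.8}

for name, model_mse in candidate_mses.items():
    verdict = "improves on" if beats_naive(model_mse, naive_mse) else "does not improve on"
    print(f"{name}: MSE {model_mse} {verdict} the naive baseline ({naive_mse})")
```

For the comparison to be fair, both MSE values should be computed on the same test rows.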

86
00:07:51,430 --> 00:07:53,380
That's all for this lecture.

87
00:07:53,560 --> 00:07:53,980
Thank you.