1 00:00:00,590 --> 00:00:06,860 Now we're going to discuss the second technique of ensemble methods, which is called random forest. 2 00:00:09,090 --> 00:00:12,440 Random forest provides an improvement over bagged trees 3 00:00:12,600 --> 00:00:14,910 by way of de-correlating the trees. 4 00:00:16,640 --> 00:00:19,580 Let me explain this problem of correlated trees. 5 00:00:21,560 --> 00:00:23,540 Remember when we did bagging? 6 00:00:23,930 --> 00:00:29,720 We created multiple data sets and made multiple trees on them using all our predictor variables. 7 00:00:30,980 --> 00:00:33,170 Suppose that there is one strong predictor 8 00:00:33,290 --> 00:00:37,520 in the data set, along with a number of other moderately strong predictors. 9 00:00:39,260 --> 00:00:46,220 Then, in the collection of bagged trees, most or all of the trees will use this strong predictor in the 10 00:00:46,220 --> 00:00:46,830 top split. 11 00:00:48,860 --> 00:00:53,260 Consequently, all of the bagged trees will look quite similar to each other, 12 00:00:55,230 --> 00:00:59,190 and the predictions of the bagged trees will be highly correlated. 13 00:01:01,180 --> 00:01:07,280 And when the quantities are correlated, averaging them does not lead to any large reduction in variance. 14 00:01:09,460 --> 00:01:13,380 So, due to the problem of correlated outcomes of bagged trees, 15 00:01:14,380 --> 00:01:17,980 bagging does not result in a significant reduction in variance. 16 00:01:19,600 --> 00:01:22,480 The solution to this problem is building a random forest. 17 00:01:24,560 --> 00:01:27,440 The concept is that we want a group of trees, 18 00:01:28,550 --> 00:01:33,680 and these trees should be different so that we get non-correlated outcomes. 19 00:01:34,880 --> 00:01:41,480 So instead of using all the variables, we use a subset of the predictor variables for each tree. 20 00:01:42,820 --> 00:01:48,040 Therefore, a lot of these trees will not even consider this strong predictor.
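The de-correlation idea above can be sketched in a few lines of plain Python. This is an illustrative toy, not a library call: for each tree we randomly pick m of the p predictor variables, so a tree that never sees the strong predictor cannot split on it. The variable names are made up for this sketch.

```python
import random

# The extra randomness a random forest adds on top of bagging:
# before growing each tree, pick m of the p predictor variables at random.
p = 15          # total number of predictor variables
m = 5           # variables made available to each tree
n_trees = 1000  # number of trees in the ensemble

random.seed(42)
subsets = [random.sample(range(p), m) for _ in range(n_trees)]

# Suppose variable 0 is the single strong predictor. Under bagging every
# tree sees it; here only about m/p of the trees do, so the remaining
# trees are forced to build around other variables and become less alike.
trees_seeing_strong = sum(1 for s in subsets if 0 in s)
print(trees_seeing_strong / n_trees)  # roughly m/p = 5/15
```

With m = 5 out of p = 15, about two-thirds of the trees never even consider the strong predictor, which is exactly what breaks the correlation between their outputs.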
21 00:01:49,600 --> 00:01:51,160 So here you can see the process. 22 00:01:51,880 --> 00:01:54,460 It is exactly like the bagging process; 23 00:01:54,640 --> 00:01:56,320 only one step is added. 24 00:01:57,710 --> 00:02:06,830 That step is the selection of m random predictor variables out of p; only these m randomly picked variables will 25 00:02:06,830 --> 00:02:09,140 be used to create that model. 26 00:02:11,030 --> 00:02:18,020 So suppose we had 15 predictor variables. We will randomly select five of the predictor variables 27 00:02:18,110 --> 00:02:20,530 to create model one, randomly select 28 00:02:20,570 --> 00:02:24,050 another set of five variables to create model two, and so on. 29 00:02:27,340 --> 00:02:32,220 You can see that if we select all the variables here, this becomes bagging. 30 00:02:33,400 --> 00:02:35,050 That is, if m is equal to p, 31 00:02:35,230 --> 00:02:36,880 this is exactly the same as bagging. 32 00:02:38,030 --> 00:02:44,040 So we can see that bagging is a special case of random forest in which we use all the predictor variables 33 00:02:44,310 --> 00:02:45,080 to make the splits 34 00:02:45,180 --> 00:02:46,080 and build the trees. 35 00:02:48,910 --> 00:02:55,510 The last point I want to discuss is how many predictor variables we should choose to create these trees. 36 00:02:57,280 --> 00:03:05,620 Usually this number is denoted by m. As a general rule of thumb, you can use m equal to p divided by three 37 00:03:05,770 --> 00:03:06,910 in the case of regression, 38 00:03:07,970 --> 00:03:11,240 where p is the total number of predictor variables in your dataset, 39 00:03:12,570 --> 00:03:16,860 and m equal to the square root of p for classification. 40 00:03:17,800 --> 00:03:26,780 So if you have, suppose, 16 variables, for regression you should be using m equal to nearly five, and 41 00:03:26,790 --> 00:03:28,090 for classification, the square root: 42 00:03:28,090 --> 00:03:29,910 the square root of 16 will come out to be four.
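The rule of thumb above can be written as a tiny helper. The function name `rule_of_thumb_m` is hypothetical, invented just for this sketch; the defaults m = p/3 (regression) and m = √p (classification) are the ones stated in the lecture.

```python
import math

def rule_of_thumb_m(p, task):
    """Suggested number of predictors per tree for p total predictors."""
    if task == "regression":
        return max(1, round(p / 3))       # m = p / 3
    if task == "classification":
        return max(1, round(math.sqrt(p)))  # m = sqrt(p)
    raise ValueError("task must be 'regression' or 'classification'")

print(rule_of_thumb_m(16, "regression"))      # 16 / 3 -> nearly 5
print(rule_of_thumb_m(16, "classification"))  # sqrt(16) -> 4
```

And, as noted in the transcript, choosing m = p in this helper would reduce random forest back to plain bagging.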
43 00:03:30,090 --> 00:03:32,280 So you should use m equal to four. 44 00:03:34,300 --> 00:03:39,460 However, if the predictor variables in your dataset are highly correlated, 45 00:03:40,630 --> 00:03:44,070 in such a scenario you should use even smaller values of m. 46 00:03:45,500 --> 00:03:47,690 So that's all the theory you need to know. 47 00:03:48,020 --> 00:03:54,470 Now we can run random forest in our software package and let us see its performance against a normal tree 48 00:03:54,590 --> 00:03:55,610 and a bagged tree.
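The comparison the lecture moves on to can be sketched as follows, assuming scikit-learn is available (the lecture does not name the software package, so this is one possible setup). The data here is synthetic, generated just for illustration.

```python
# Comparing a single tree, bagged trees, and a random forest on the same task.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data with p = 16 predictor variables.
X, y = make_classification(n_samples=500, n_features=16,
                           n_informative=8, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    # Bagging: each tree may consider all 16 predictors at every split.
    "bagged trees": BaggingClassifier(DecisionTreeClassifier(),
                                      n_estimators=100, random_state=0),
    # Random forest: each split considers only m = sqrt(16) = 4 predictors.
    "random forest": RandomForestClassifier(n_estimators=100,
                                            max_features="sqrt",
                                            random_state=0),
}

scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")
```

Typically the two ensembles outperform the single tree, with the forest's per-split feature restriction (`max_features`) being the only difference from bagging.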