1 00:00:00,590 --> 00:00:06,860 Now we're going to discuss the second technique of ensemble methods, which is called random forest. 2 00:00:09,090 --> 00:00:12,440 Random forest provides an improvement over bagged trees 3 00:00:12,600 --> 00:00:14,910 by way of de-correlating the trees. 4 00:00:16,640 --> 00:00:19,580 Let me explain this problem of correlated trees. 5 00:00:21,560 --> 00:00:23,540 Remember when we did bagging? 6 00:00:23,930 --> 00:00:29,720 We created multiple data sets and made multiple trees on them using all our predictor variables. 7 00:00:30,980 --> 00:00:33,170 Suppose that there is one strong predictor 8 00:00:33,290 --> 00:00:37,520 in the data set, along with a number of other moderately strong predictors. 9 00:00:39,260 --> 00:00:46,220 Then, in the collection of bagged trees, most or all of the trees will use this strong predictor in the 10 00:00:46,220 --> 00:00:46,830 top split. 11 00:00:48,860 --> 00:00:53,260 Consequently, all of the bagged trees will look quite similar to each other, 12 00:00:55,230 --> 00:00:59,190 and the predictions of the bagged trees will be highly correlated. 13 00:01:01,180 --> 00:01:07,280 And when the quantities are correlated, averaging them does not lead to any large reduction in variance. 14 00:01:09,460 --> 00:01:13,380 So, due to the problem of correlated outcomes of bagged trees, 15 00:01:14,380 --> 00:01:17,980 bagging does not result in a significant reduction in variance. 16 00:01:19,600 --> 00:01:22,480 The solution to this problem is building a random forest. 17 00:01:24,560 --> 00:01:27,440 The concept is that we want a group of trees, 18 00:01:28,550 --> 00:01:33,680 and these trees should be different so that we get non-correlated outcomes. 19 00:01:34,880 --> 00:01:41,480 So instead of using all the variables, we use a subset of the predictor variables for each tree. 20 00:01:42,820 --> 00:01:48,040 Therefore, a lot of these trees will not even consider this strong predictor.
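The de-correlation idea above can be sketched in a few lines of plain Python. This is an illustrative toy, not a library call: for each tree we randomly pick m of the p predictor variables, so a tree that never sees the strong predictor cannot split on it. The variable names are made up for this sketch.

```python
import random

# The extra randomness a random forest adds on top of bagging:
# before growing each tree, pick m of the p predictor variables at random.
p = 15          # total number of predictor variables
m = 5           # variables made available to each tree
n_trees = 1000  # number of trees in the ensemble

random.seed(42)
subsets = [random.sample(range(p), m) for _ in range(n_trees)]

# Suppose variable 0 is the single strong predictor. Under bagging every
# tree sees it; here only about m/p of the trees do, so the remaining
# trees are forced to build around other variables and become less alike.
trees_seeing_strong = sum(1 for s in subsets if 0 in s)
print(trees_seeing_strong / n_trees)  # roughly m/p = 5/15
```

With m = 5 out of p = 15, about two-thirds of the trees never even consider the strong predictor, which is exactly what breaks the correlation between their outputs.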
21 00:01:49,600 --> 00:01:51,160 So here you can see the process. 22 00:01:51,880 --> 00:01:54,460 It is exactly like the bagging process; 23 00:01:54,640 --> 00:01:56,320 only one step is added. 24 00:01:57,710 --> 00:02:06,830 That step is the selection of m random predictor variables out of p; only these m randomly picked variables will 25 00:02:06,830 --> 00:02:09,140 be used to create that model. 26 00:02:11,030 --> 00:02:18,020 So suppose we had 15 predictor variables. We will randomly select five of the predictor variables 27 00:02:18,110 --> 00:02:20,530 to create model one, randomly select 28 00:02:20,570 --> 00:02:24,050 another set of five variables to create model two, and so on. 29 00:02:27,340 --> 00:02:32,220 You can see that if we select all the variables here, this becomes bagging. 30 00:02:33,400 --> 00:02:35,050 That is, if m is equal to p, 31 00:02:35,230 --> 00:02:36,880 this is exactly the same as bagging. 32 00:02:38,030 --> 00:02:44,040 So we can see that bagging is a special case of random forest in which we use all the predictor variables 33 00:02:44,310 --> 00:02:45,080 to make the splits 34 00:02:45,180 --> 00:02:46,080 and build the trees. 35 00:02:48,910 --> 00:02:55,510 The last point I want to discuss is how many predictor variables we should choose to create these trees. 36 00:02:57,280 --> 00:03:05,620 Usually this number is denoted by m. As a general rule of thumb, you can use m equal to p divided by three 37 00:03:05,770 --> 00:03:06,910 in the case of regression, 38 00:03:07,970 --> 00:03:11,240 where p is the total number of predictor variables in your dataset, 39 00:03:12,570 --> 00:03:16,860 and m equal to the square root of p for classification. 40 00:03:17,800 --> 00:03:26,780 So if you have, suppose, 16 variables, for regression you should be using m equal to nearly five, and 41 00:03:26,790 --> 00:03:28,090 for classification, the square root: 42 00:03:28,090 --> 00:03:29,910 the square root of 16 will come out to be four.
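The rule of thumb above can be written as a tiny helper. The function name `rule_of_thumb_m` is hypothetical, invented just for this sketch; the defaults m = p/3 (regression) and m = √p (classification) are the ones stated in the lecture.

```python
import math

def rule_of_thumb_m(p, task):
    """Suggested number of predictors per tree for p total predictors."""
    if task == "regression":
        return max(1, round(p / 3))       # m = p / 3
    if task == "classification":
        return max(1, round(math.sqrt(p)))  # m = sqrt(p)
    raise ValueError("task must be 'regression' or 'classification'")

print(rule_of_thumb_m(16, "regression"))      # 16 / 3 -> nearly 5
print(rule_of_thumb_m(16, "classification"))  # sqrt(16) -> 4
```

And, as noted in the transcript, choosing m = p in this helper would reduce random forest back to plain bagging.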
43 00:03:30,090 --> 00:03:32,280 So you should use m equal to four. 44 00:03:34,300 --> 00:03:39,460 However, if the predictor variables in your dataset are highly correlated, 45 00:03:40,630 --> 00:03:44,070 in such a scenario you should use even smaller values of m. 46 00:03:45,500 --> 00:03:47,690 So that's all the theory you need to know. 47 00:03:48,020 --> 00:03:54,470 Now we can run random forest in our software package and let us see its performance against a normal tree 48 00:03:54,590 --> 00:03:55,610 and a bagged tree.
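The comparison the lecture moves on to can be sketched as follows, assuming scikit-learn is available (the lecture does not name the software package, so this is one possible setup). The data here is synthetic, generated just for illustration.

```python
# Comparing a single tree, bagged trees, and a random forest on the same task.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data with p = 16 predictor variables.
X, y = make_classification(n_samples=500, n_features=16,
                           n_informative=8, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    # Bagging: each tree may consider all 16 predictors at every split.
    "bagged trees": BaggingClassifier(DecisionTreeClassifier(),
                                      n_estimators=100, random_state=0),
    # Random forest: each split considers only m = sqrt(16) = 4 predictors.
    "random forest": RandomForestClassifier(n_estimators=100,
                                            max_features="sqrt",
                                            random_state=0),
}

scores = {}
for name, model in models.items():
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")
```

Typically the two ensembles outperform the single tree, with the forest's per-split feature restriction (`max_features`) being the only difference from bagging.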