In this video, we will learn how to create an XGBoost model in Python.

So let's first import xgboost as xgb.

If the xgboost module is not installed on your system, you first have to install it. To install it, you have to write pip install xgboost. So go to your command prompt: use the keyboard shortcut Windows + R, write cmd, and hit OK. And here you can write pip install xgboost.

Now we have installed the required libraries. And if you are using Anaconda, you can also use the Anaconda prompt, and instead of pip you can write conda install xgboost. That will also work.

Let's again import xgboost. So we have imported xgboost, and to look at the documentation of XGBoost, you can click on this link. Here you can get all the information you want about XGBoost.

Let's look at the parameters. There are multiple parameters in XGBoost. We can divide these parameters into three categories. First, there are the general parameters, such as the learning rate (in XGBoost the learning rate is known as eta), etc. Then there are tree-related parameters, such as max_depth and gamma (gamma is similar to your minimum sample split), then subsample, and so on. And then there are the learning task parameters.

If you remember, the only difference between GBM and XGBoost is in the regularisation part. And if you are aware of the L1 and L2 regularisation techniques, which are lasso and ridge in the case of linear regression, there we have shrinkage parameters that penalise the number of variables we use in our model. Those parameters are alpha and lambda.

So there are many parameters, as you can see. Go through the documentation to look at all the parameters that you want to apply. We will only be covering a subset of those parameters, the ones which are important.

Let's first create an XGBoost classifier. We are using a maximum depth of five, the number of trees as 500, and a learning rate of 0.3. Let's run this, and let's fit our data. So our model is ready. Let's get the accuracy score. So this is the accuracy score we are getting.
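For reference, here is a minimal sketch of the steps so far in Python. The split names X_train, X_test, y_train and y_test are assumptions; they stand in for whatever train/test split was prepared earlier in the course.

    # Install XGBoost first if needed (from a command prompt or Anaconda prompt):
    #   pip install xgboost      (or: conda install xgboost)

    import xgboost as xgb
    from sklearn.metrics import accuracy_score

    # Assumed train/test split from earlier lectures:
    # X_train, X_test, y_train, y_test

    # XGBoost classifier: max depth 5, 500 trees, learning rate 0.3
    xgb_clf = xgb.XGBClassifier(max_depth=5, n_estimators=500, learning_rate=0.3)
    xgb_clf.fit(X_train, y_train)

    # Accuracy on the test data
    y_pred = xgb_clf.predict(X_test)
    print(accuracy_score(y_test, y_pred))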
So, as you can see, we are getting the highest score using the XGBoost model. And usually XGBoost gives the best result out of random forest, GBM, or any other classifier method.

With xgboost, we also get another method, which is plot_importance. This method will give us a graph of the relative importance of the variables that we are using for our model. Let's run this.

You can see we have our variables listed on the left-hand side, and we have the feature importance on this axis. So as you can see, the most important feature from our variables is Time_taken. The second most important feature is Twitter_hashtags, then Trailer_views, and so on. And the least important feature is Genre_Comedy.

So this is another advantage of XGBoost: first, the prediction accuracy is very high, and you also get the feature importance graph using xgboost.
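As a quick sketch, the importance plot can be produced like this, assuming the fitted classifier from above is named xgb_clf:

    from xgboost import plot_importance
    import matplotlib.pyplot as plt

    # Bar chart of the relative importance of each feature
    # used by the fitted model (higher score = more important)
    plot_importance(xgb_clf)
    plt.show()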
Now, as we discussed earlier, we can use grid search to provide multiple values of our hyperparameters. So let's use grid search to optimise our XGBoost classifier.

So first, I will create the classifier with the hyperparameters that I do not want to change. So I will write XGBClassifier with the parameters: as n_estimators, let's change it to 250 instead of 500 so that we can quickly train this model, with a learning rate of 0.1 and a random state of 42.

Now we want to tune the remaining parameters. The first one is the maximum depth; this is the depth of our tree, and we want to give values from three to ten with an interval of two. So we want to give three, five, seven, and nine. We can do this either by providing a list, or we can also provide this in the form of the range method: so range(start, stop, step). So if you are not comfortable with range, you can also write three comma five comma seven comma nine.

Then we have the gamma value. This is almost the same as the minimum sample split; you can read the documentation for more detail, but this is almost the same as the minimum sample split. We are giving values of 0.1, 0.2, and 0.3.

Now we have another parameter, which is subsample. So if you remember, in bagging, for each tree we use a subsample of our data. So here we are giving the two values of 80 percent and 90 percent for the subsample value. That means for each tree, use 80 percent (or 90 percent) of the data. This 80 percent is random data, so each time we will select a different portion of our data.

Now the next parameter is colsample_bytree. So this time, instead of the data, use 80 percent of the features to create each tree. So we randomly select 80 percent of the features, and we will create a random tree.

Then we have reg_alpha. This is the regularisation parameter, and we are providing three values: 0.01, 0.1, and 1.

So we are training our model on all these values of the hyperparameters, and after training our models, we will select the best model out of these models. So let's run this.

Again, you can notice that we have created all these hyperparameters in the form of a dictionary, with our hyperparameter names as keys and the range of hyperparameter values as values.

And again, in grid search, we first have to mention our classifier, which is xgb_clf. Then we have to mention the parameters we want to pass in the form of a dictionary; we are using param_test1. And then cross-validation: we want the cross-validation value to be five, and we want the scoring to be accuracy. Let's run this. And now we are fitting our X_train and y_train data to this model.

It will take some time. And remember that we are using all of our models here as classifier models; all of these are available for regression also. So if you have a continuous target variable, you can use the regression version of these techniques.

So we have fitted our grid search. So let's look at the best parameters, using best_params_. So we are getting our best model when we are using colsample_bytree of 0.8, a gamma value of 0.3, a maximum depth of seven, regularisation alpha of 0.1, and a subsample of 0.8.
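Putting the grid search together, here is a minimal sketch under the same assumptions (the X_train/y_train and X_test/y_test split from earlier). The value lists follow the transcript where it states them; the exact list for gamma is an assumption.

    from sklearn.model_selection import GridSearchCV
    from sklearn.metrics import accuracy_score
    import xgboost as xgb

    # Classifier with the hyperparameters we do not want to tune
    xgb_clf = xgb.XGBClassifier(n_estimators=250, learning_rate=0.1,
                                random_state=42)

    # Hyperparameters to tune: names as keys, value ranges as values
    param_test1 = {
        'max_depth': range(3, 10, 2),       # 3, 5, 7, 9
        'gamma': [0.1, 0.2, 0.3],           # assumed list; ~min sample split
        'subsample': [0.8, 0.9],            # fraction of rows per tree
        'colsample_bytree': [0.8, 0.9],     # fraction of features per tree
        'reg_alpha': [0.01, 0.1, 1],        # L1 regularisation strength
    }

    grid_search = GridSearchCV(xgb_clf, param_test1, cv=5, scoring='accuracy')
    grid_search.fit(X_train, y_train)

    print(grid_search.best_params_)             # best combination found
    cv_xgb_clf = grid_search.best_estimator_    # best fitted model
    print(accuracy_score(y_test, cv_xgb_clf.predict(X_test)))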
We'll save this model, which is the best model, into another variable, cv_xgb_clf. Then we can find out the accuracy and see what the accuracy score comes out to be. Since our data is small and we have used a smaller number of trees, we are getting a slight decrease in the accuracy score. But if you have a large amount of data, and if you select a large number of trees, you will definitely get higher accuracy as compared to a single standalone XGBoost model, because here we are picking the best model out of all these models.

That's all for this video. Thank you.