1 00:00:01,200 --> 00:00:02,910 Now, let's build a few other models. 2 00:00:09,660 --> 00:00:18,840 Let's do feature engineering, modify the features of our dataset and build a few other models. Deep 3 00:00:18,840 --> 00:00:24,610 learning neural networks have data processing, feature extraction and feature engineering built in, 4 00:00:25,050 --> 00:00:30,240 so we don't require much feature engineering with deep learning models. 5 00:00:31,870 --> 00:00:38,340 But in order to get good accuracy with our regular machine learning models, we need to do feature engineering, 6 00:00:39,250 --> 00:00:44,050 so let's modify our feature columns. 7 00:00:47,080 --> 00:00:49,030 Let's modify our age column first. 8 00:00:51,400 --> 00:00:55,600 First, we're creating an empty list with the name age. 9 00:00:56,740 --> 00:01:01,210 Here we are creating a for loop and appending the values for age. 10 00:01:02,290 --> 00:01:06,290 So, for i in range of the length of the DataFrame: 11 00:01:08,870 --> 00:01:10,970 we will run the loop 12 00:01:12,010 --> 00:01:13,530 over the length of the whole DataFrame. 13 00:01:14,810 --> 00:01:19,970 If the DataFrame's age at i, which is that particular row, 14 00:01:22,220 --> 00:01:29,640 is less than or equal to 45, then age.append, and here we append the string adult. 15 00:01:32,710 --> 00:01:40,860 So up to age forty-five, we're appending that row to age as adult. 16 00:01:42,350 --> 00:01:51,080 Else, it's senior. So if the age is less than or equal to forty-five, then it's adult; if not, senior. 17 00:01:53,150 --> 00:01:56,180 Let's run this cell. 18 00:01:58,100 --> 00:01:59,980 Now let's modify months employed. 19 00:02:04,150 --> 00:02:10,950 Here we will calculate months employed as months employed plus years employed. 20 00:02:12,190 --> 00:02:17,520 Using this calculation, we are saving that value in one employed column. 21 00:02:19,890 --> 00:02:24,660 For this, we are creating an empty list with the name employed.
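The age step described above can be sketched as follows. This is a minimal illustration, not the instructor's notebook: the DataFrame name `df`, the column name `age` and the sample values are assumptions made so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for the course dataset; only the age column matters here.
df = pd.DataFrame({"age": [23, 45, 46, 61]})

age = []  # will hold one label per row
for i in range(len(df)):
    # Rows aged 45 or younger are labelled "adult", everything else "senior".
    if df["age"][i] <= 45:
        age.append("adult")
    else:
        age.append("senior")

print(age)
```

The loop relies on the default integer index (0, 1, 2, ...), so `df["age"][i]` picks the value in row `i`.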
22 00:02:25,920 --> 00:02:28,050 For i in range of the length of the DataFrame, 23 00:02:28,980 --> 00:02:32,370 we take the DataFrame's months employed 24 00:02:34,520 --> 00:02:44,450 at i, which is that particular row, plus the DataFrame's years employed at i, which is the same row. So suppose 25 00:02:46,210 --> 00:02:47,560 we have this table. 26 00:02:49,140 --> 00:02:50,070 We've got these rows: 27 00:02:52,080 --> 00:02:54,780 one, two, three, four, five, six. 28 00:02:55,290 --> 00:02:57,860 So first, for i in range of the length of the DataFrame. 29 00:02:57,870 --> 00:03:01,320 This is the length of the DataFrame that our loop runs over. 30 00:03:02,490 --> 00:03:04,650 Here, our i will first read this one, 31 00:03:06,400 --> 00:03:14,890 the first row. So it takes the DataFrame's months employed at i, which is the months employed column 32 00:03:16,360 --> 00:03:17,350 for just this row. 33 00:03:21,410 --> 00:03:26,840 We have all the feature columns, so in this row we have the months employed column 34 00:03:30,690 --> 00:03:32,880 plus the years employed column. 35 00:03:35,900 --> 00:03:36,120 Well. 36 00:03:37,210 --> 00:03:42,520 So these two columns, these two columns: 37 00:03:43,950 --> 00:03:44,480 months employed 38 00:03:48,240 --> 00:03:49,560 plus years employed. 39 00:03:50,520 --> 00:03:55,050 So months employed plus years employed. 40 00:03:56,320 --> 00:04:05,460 We are doing this for all our rows, and we are appending it to the employed list. 41 00:04:10,290 --> 00:04:12,780 Let's run this cell. 42 00:04:14,500 --> 00:04:21,270 Here also, for personal account, we are doing the same: personal account month plus personal account 43 00:04:21,280 --> 00:04:26,080 year, and we are appending that to its own list. 44 00:04:27,980 --> 00:04:28,840 Let's run this cell. 45 00:04:32,070 --> 00:04:33,240 Now let's move further.
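The row-by-row sum walked through above might look like the sketch below. The names `df`, `months_employed` and `years_employed` and the sample values are assumptions. The sketch follows the spoken description literally and adds the two columns as they are; whether the years are meant to be converted to months first is not stated in the transcript.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumed, not taken from the source.
df = pd.DataFrame({
    "months_employed": [3, 0, 6],
    "years_employed": [2, 5, 0],
})

employed = []  # one combined value per row
for i in range(len(df)):
    # Add the two employment columns for row i, as described in the transcript.
    employed.append(int(df["months_employed"][i] + df["years_employed"][i]))

# The same result without an explicit loop: pandas adds the columns element-wise.
employed_vectorized = (df["months_employed"] + df["years_employed"]).tolist()

print(employed)
```

The personal account columns can be combined the same way by swapping in their column names.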
46 00:04:36,140 --> 00:04:44,380 Now let's calculate the average of the risk scores. We have four risk scores: risk score two, risk score three, 47 00:04:45,200 --> 00:04:46,880 risk score four and risk score five. 48 00:04:48,130 --> 00:04:50,980 So we are taking the average of these four risk scores. 49 00:04:52,150 --> 00:04:57,670 First, we are creating an empty list with the name average underscore risk, and here we are creating 50 00:04:57,670 --> 00:05:01,300 a loop: for i in range of the length of the DataFrame. 51 00:05:01,870 --> 00:05:07,180 It takes the DataFrame's risk score two at i, which is the first column, 52 00:05:08,380 --> 00:05:16,800 at the first row. So when we are iterating over the first row, we take the 53 00:05:17,770 --> 00:05:23,980 first row of the risk score three column, the first row of the risk score four column, and the first row 54 00:05:25,330 --> 00:05:27,400 of risk score five. 55 00:05:28,550 --> 00:05:32,840 We're adding all of these and dividing by four, so we're taking the average of 56 00:05:33,850 --> 00:05:39,020 these risk scores for the first row and appending it to the average risk list. 57 00:05:41,260 --> 00:05:48,880 When i moves to two in this range, we do the same for row two, and we append that 58 00:05:50,360 --> 00:05:52,130 to the average risk list as well. 59 00:05:53,080 --> 00:05:56,500 Similarly, we are doing it for quality. 60 00:05:57,620 --> 00:05:58,250 Here we have 61 00:06:00,080 --> 00:06:02,030 quality score and quality score two. 62 00:06:02,870 --> 00:06:03,590 So we are 63 00:06:04,730 --> 00:06:13,190 taking the average of the quality scores and appending it to the quality list. Let's run these two cells. 64 00:06:18,580 --> 00:06:24,640 Now that we have created so many feature columns, let's convert all of those modified 65 00:06:24,640 --> 00:06:26,050 features to a DataFrame. 66 00:06:28,170 --> 00:06:33,960 First we'll convert all the lists into DataFrames, so age is pd.DataFrame of age.
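The averaging loop described above can be sketched like this. The DataFrame name `df`, the column names `risk_score_2` to `risk_score_5` and the values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample rows with four risk score columns (names are assumed).
df = pd.DataFrame({
    "risk_score_2": [0.8, 0.4],
    "risk_score_3": [0.6, 0.4],
    "risk_score_4": [0.4, 0.2],
    "risk_score_5": [0.2, 0.2],
})
risk_columns = ["risk_score_2", "risk_score_3", "risk_score_4", "risk_score_5"]

average_risk = []  # one average per row
for i in range(len(df)):
    # Sum the four scores of row i and divide by four.
    total = sum(df[column][i] for column in risk_columns)
    average_risk.append(total / 4)

# Equivalent one-liner: the row-wise mean over the selected columns.
average_risk_vectorized = df[risk_columns].mean(axis=1).tolist()

print(average_risk)
```

The quality scores can be averaged the same way, dividing by the number of quality columns instead of four.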
67 00:06:34,350 --> 00:06:35,820 We have created an age list, 68 00:06:36,120 --> 00:06:38,250 so we are converting that age list into a 69 00:06:38,250 --> 00:06:42,450 DataFrame. Then we are converting the employed list to an employed DataFrame, 70 00:06:44,110 --> 00:06:54,570 and then the personal account list to a DataFrame, average risk to an average risk DataFrame, and quality to a quality 71 00:06:54,580 --> 00:06:54,880 DataFrame. 72 00:06:56,100 --> 00:07:04,540 And then finally, we're creating a DataFrame with the name feature by concatenating all these little DataFrames. 73 00:07:05,160 --> 00:07:11,780 So we have all these features, but we don't have names for the columns, so here we are creating column 74 00:07:11,880 --> 00:07:17,620 names: feature.columns is equal to age, employed, personal account, risk and quality. 75 00:07:18,390 --> 00:07:22,500 These are the column names for our new DataFrame. 76 00:07:23,750 --> 00:07:26,570 Let's visualize our new feature DataFrame. 77 00:07:28,570 --> 00:07:36,760 Here we have age, employed, personal account, risk and quality. The age column is categorical, adult or senior, while all 78 00:07:36,760 --> 00:07:37,990 the other columns are numbers. 79 00:07:38,000 --> 00:07:41,110 So let's convert our age column to numbers. 80 00:07:42,310 --> 00:07:49,150 We're creating a dummy variable: we're taking the age column, getting dummies for it and storing the result. To 81 00:07:49,150 --> 00:07:51,490 visualize this, 82 00:07:53,180 --> 00:07:55,180 we call head on it and run the cell. 83 00:07:56,890 --> 00:07:58,540 So here we have adult. 84 00:07:59,980 --> 00:08:04,400 So now we have stored age as numbers. 85 00:08:05,110 --> 00:08:12,760 Now let's remove our age column, which is in categorical format, senior and adult. 86 00:08:13,090 --> 00:08:18,120 Let's remove this age column from our feature DataFrame and add this dummy column.
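A minimal sketch of the steps above: turning the lists into one feature DataFrame, naming the columns, and replacing the categorical age column with a numeric indicator. The list contents and the exact column names are assumptions. Keeping only the `adult` indicator is also an assumption based on the column the transcript mentions; the arguments actually passed to `get_dummies` are not stated.

```python
import pandas as pd

# Hypothetical engineered lists, standing in for the ones built earlier.
age = ["adult", "senior", "adult"]
employed = [5, 12, 7]
personal_account = [4, 9, 2]
average_risk = [0.5, 0.3, 0.7]
quality = [0.6, 0.4, 0.9]

# Turn each list into a one-column DataFrame and join them side by side.
feature = pd.concat(
    [pd.DataFrame(values) for values in (age, employed, personal_account, average_risk, quality)],
    axis=1,
)
feature.columns = ["age", "employed", "personal_account", "risk", "quality"]

# One indicator column per category; keep only "adult" (assumed) as 0/1 integers.
dummy = pd.get_dummies(feature["age"])[["adult"]].astype(int)

# Swap the categorical column for its numeric indicator.
feature = feature.drop("age", axis=1)
feature = pd.concat([feature, dummy], axis=1)

print(feature.head())
```

With only two categories, one indicator column carries all the information, which is why the second one can be left out.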
87 00:08:19,780 --> 00:08:25,980 First, feature.drop of age: we're dropping the age column. And then feature is equal to 88 00:08:27,240 --> 00:08:34,610 pd.concat of the feature DataFrame and the dummy DataFrame, along the columns. 89 00:08:35,860 --> 00:08:41,650 We join these together and visualize the DataFrame by writing feature.head. Let's run this cell. 90 00:08:45,190 --> 00:08:46,540 So here we have converted 91 00:08:48,630 --> 00:08:51,260 all of the categorical features into numbers. 92 00:08:52,590 --> 00:08:53,740 Now, let's proceed further. 93 00:08:54,540 --> 00:08:55,680 Let's proceed further. 94 00:08:56,130 --> 00:09:01,770 Now let's update our features and labels after doing feature engineering. 95 00:09:04,330 --> 00:09:08,530 Let's create a variable for the dependent variable, 96 00:09:10,030 --> 00:09:12,250 which is a string, e-signed. 97 00:09:12,490 --> 00:09:16,030 So this is a string, and we are saving this string in that variable. 98 00:09:18,260 --> 00:09:25,410 And here we are creating a variable that is equal to the DataFrame's columns, converted to a list. 99 00:09:26,710 --> 00:09:29,920 So here we are storing all the column names 100 00:09:30,970 --> 00:09:32,580 of our DataFrame. 101 00:09:35,720 --> 00:09:36,350 Let's run this cell. 102 00:09:38,190 --> 00:09:40,050 I will show you the value of this variable. 103 00:09:43,050 --> 00:09:44,750 It has created a list. 104 00:09:51,610 --> 00:09:53,880 So it has all these different column names. 105 00:09:55,660 --> 00:10:00,420 Now, from this list, we need to remove e-signed. 106 00:10:01,330 --> 00:10:08,400 For that, we call remove with the dependent variable, which holds the e-signed string, 107 00:10:09,100 --> 00:10:11,360 which will remove the e-signed string. 108 00:10:11,650 --> 00:10:13,920 And we also need to remove one more column. 109 00:10:15,370 --> 00:10:19,420 So we are removing these two column names from this list.
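The column-list step above could look roughly like this. The variable names `response` and `columns`, the label name `e_signed`, and in particular the second removed column `entry_id` are hypothetical: the transcript only says that the label and one other column name are removed from the list.

```python
import pandas as pd

# Hypothetical miniature dataset; all column names are assumptions.
df = pd.DataFrame({
    "entry_id": [101, 102],
    "age": [30, 50],
    "income": [3000, 4200],
    "e_signed": [1, 0],
})

response = "e_signed"          # name of the label column, stored as a string
columns = df.columns.tolist()  # every column name as a plain Python list

# list.remove deletes the first matching entry in place.
columns.remove(response)
columns.remove("entry_id")     # hypothetical second column to exclude

print(columns)
```

After these two calls the list holds only the names of the columns meant to be used as features.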
110 00:10:23,110 --> 00:10:30,700 Now, let's separate our X and y. X is equal to the DataFrame indexed by that list, so we are selecting all these 111 00:10:30,700 --> 00:10:31,480 column names 112 00:10:32,860 --> 00:10:34,090 for our feature columns 113 00:10:35,350 --> 00:10:44,060 in our features DataFrame. And y is equal to the DataFrame indexed by the dependent variable, which is e-signed. So in y, which is our labels, 114 00:10:44,320 --> 00:10:50,210 we are selecting only one column, and in X we have all columns except our label column. 115 00:10:50,650 --> 00:10:51,730 Let's run this cell. 116 00:10:54,220 --> 00:11:03,070 Now, let's concatenate our modified feature columns with our X. Feature is our newly created 117 00:11:03,080 --> 00:11:05,410 DataFrame with all the modified features. 118 00:11:06,590 --> 00:11:09,190 And we visualize this with X.head. 119 00:11:11,700 --> 00:11:12,470 Let's run this cell. 120 00:11:15,180 --> 00:11:21,030 Here, these are the default columns: age, pay schedule, home owner, current address, personal 121 00:11:22,310 --> 00:11:23,570 account, up to risk score five. 122 00:11:25,370 --> 00:11:25,790 So, 123 00:11:28,530 --> 00:11:35,760 up to this inquiries last month, these are the default columns. From employed through to quality are 124 00:11:35,970 --> 00:11:37,410 the columns which we have created. 125 00:11:38,630 --> 00:11:45,290 As we have already modified a few columns and created new columns, let's remove the unnecessary columns. 126 00:11:47,230 --> 00:11:56,800 So X is equal to X.drop, with labels equal to age, months employed, years employed, personal account month, personal account year, 127 00:11:57,550 --> 00:12:01,330 risk score two, three, four and five, 128 00:12:03,100 --> 00:12:12,180 and the two quality score columns. This removes all these columns. Let's run this cell, and 129 00:12:12,820 --> 00:12:14,300 let's view it. 130 00:12:19,770 --> 00:12:22,500 So here we have all our columns.
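The split, concatenation and clean-up described above can be sketched as follows. The tiny `df` and `feature` DataFrames and every column name are assumptions; only the sequence of operations (select X and y, concatenate the engineered features, drop the columns they replace) follows the transcript.

```python
import pandas as pd

# Hypothetical original data and engineered features (names and values assumed).
df = pd.DataFrame({
    "age": [30, 50],
    "months_employed": [3, 0],
    "years_employed": [2, 5],
    "income": [3000, 4200],
    "e_signed": [1, 0],
})
feature = pd.DataFrame({"employed": [5, 5], "adult": [1, 0]})

response = "e_signed"
columns = [name for name in df.columns if name != response]

X = df[columns]   # every column except the label
y = df[response]  # only the label column

# Attach the engineered columns; rows are aligned on the shared index.
X = pd.concat([X, feature], axis=1)

# Drop the raw columns that the engineered features replace.
X = X.drop(labels=["age", "months_employed", "years_employed"], axis=1)

print(X.head())
```

Because `pd.concat` with `axis=1` aligns on the index, `X` and `feature` need the same row index for the rows to line up.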
131 00:12:23,920 --> 00:12:29,460 All the columns are numbers, but only this particular column is categorical, which is in strings. 132 00:12:29,470 --> 00:12:32,270 So we need to convert this column to numbers. 133 00:12:33,320 --> 00:12:38,990 Let's create a dummy variable with the name dummy two 134 00:12:40,320 --> 00:12:41,520 for the pay schedule column. 135 00:12:43,600 --> 00:12:51,580 We call get_dummies on the pay schedule column and store it in dummy two, and from dummy two, let's drop 136 00:12:51,580 --> 00:12:52,330 bi-weekly. 137 00:12:54,440 --> 00:12:57,390 To visualize this, we call head on dummy two. 138 00:12:59,570 --> 00:13:03,330 After dropping bi-weekly, we have monthly, semi-monthly and weekly. 139 00:13:04,040 --> 00:13:12,160 Now let's remove pay schedule from our X and add this dummy two. 140 00:13:14,680 --> 00:13:22,300 First, we are dropping the pay schedule column from X, and we are concatenating our X with our newly 141 00:13:22,300 --> 00:13:23,620 created DataFrame, dummy two, 142 00:13:24,580 --> 00:13:26,780 which has monthly, semi-monthly and weekly. 143 00:13:28,530 --> 00:13:32,290 And we visualize this with X.head. Let's run this cell. 144 00:13:34,160 --> 00:13:36,080 So this is our final DataFrame, 145 00:13:39,850 --> 00:13:41,470 which is all numerical values. 146 00:13:43,150 --> 00:13:46,150 Now, let's scale all our values with StandardScaler. 147 00:13:46,630 --> 00:13:51,160 So first we are importing: from sklearn.preprocessing import.