1 00:00:01,180 --> 00:00:07,980 As discussed earlier, we need to change the non-numerical values in the categorical variables 2 00:00:08,210 --> 00:00:10,750 to numerical values by creating dummy variables. 3 00:00:12,540 --> 00:00:17,340 Creating dummy variables in a software package like R is very easy. 4 00:00:18,780 --> 00:00:22,020 So we have two variables that are categorical in our dataset. 5 00:00:22,950 --> 00:00:25,440 One is airport, which has two categories, 6 00:00:25,740 --> 00:00:26,520 yes and no. 7 00:00:27,150 --> 00:00:28,490 Let us open the dataset again. 8 00:00:30,750 --> 00:00:33,090 One is this airport, which has two categories. 9 00:00:33,720 --> 00:00:41,040 The other is waterbody, which has four categories: Lake, River, Lake and River, and None. We will now create 10 00:00:41,040 --> 00:00:45,090 dummy variables with numerical values for the whole dataset, 11 00:00:46,230 --> 00:00:49,710 that is, for all the categorical variables of our dataset 12 00:00:49,800 --> 00:00:55,650 in one go. We will install a package called dummies. To install a package, 13 00:00:56,370 --> 00:01:04,620 we can write install.packages and, within brackets and double quotation marks, write dummies. 14 00:01:07,620 --> 00:01:14,250 Now this package is installed, and you can look at this package if you go to the Packages tab 15 00:01:14,730 --> 00:01:15,300 on the right. 16 00:01:16,350 --> 00:01:19,390 There is this dummies package. To load it, 17 00:01:19,500 --> 00:01:21,900 you just need to click on this checkbox. 18 00:01:23,020 --> 00:01:26,620 So basically, this package makes dummy variable creation easy. 19 00:01:28,090 --> 00:01:39,290 Next, we need to write one single line of code to get the dummy variables. Write df gets dummy dot 20 00:01:39,300 --> 00:01:39,790 data 21 00:01:42,430 --> 00:01:45,710 dot frame, and within brackets 22 00:01:45,760 --> 00:01:52,860 we just need to specify our dataset, which is df, that is, df <- dummy.data.frame(df), and we'll run this.
23 00:01:52,910 --> 00:01:55,710 Let us click this variable to view it. 24 00:01:58,570 --> 00:02:02,300 So if you now scroll to the right, you can see that for airport 25 00:02:02,380 --> 00:02:04,000 we now have two columns: 26 00:02:05,290 --> 00:02:08,500 one where the airport variable had the value 27 00:02:08,500 --> 00:02:08,900 yes, 28 00:02:09,170 --> 00:02:10,150 and the other for no. 29 00:02:11,260 --> 00:02:13,300 So basically, in this yes 30 00:02:13,300 --> 00:02:13,870 variable, 31 00:02:14,470 --> 00:02:17,980 the column contains 1 when the actual value of the airport variable was 32 00:02:18,070 --> 00:02:18,580 yes. 33 00:02:19,840 --> 00:02:25,600 And this other column contains 1 if the actual value of the airport variable was no. 34 00:02:26,980 --> 00:02:30,250 Similarly, in the position of the waterbody variable, 35 00:02:31,900 --> 00:02:35,700 we have four new variables. Where there was Lake, 36 00:02:36,430 --> 00:02:40,710 we have 1 in this Lake column; where there was River, 37 00:02:40,780 --> 00:02:44,380 we have 1 in this River column, and so on. 38 00:02:46,300 --> 00:02:52,270 But if you remember, I told you that the number of dummy variables is actually one less than the number of 39 00:02:52,270 --> 00:02:52,960 categories. 40 00:02:54,070 --> 00:02:57,700 So as the airport variable has two categories, yes and no, 41 00:02:57,970 --> 00:02:59,980 we need only one dummy variable for this. 42 00:03:01,480 --> 00:03:07,990 So this airport yes variable can alone serve the purpose, as 1 will represent 43 00:03:08,230 --> 00:03:08,480 yes 44 00:03:08,530 --> 00:03:10,030 and 0 will represent no. 45 00:03:11,350 --> 00:03:12,810 Similarly, for waterbody, 46 00:03:13,600 --> 00:03:19,690 we can keep these three variables, which are waterbody Lake, waterbody Lake and River, and 47 00:03:19,690 --> 00:03:22,800 waterbody River, and we will not need this 48 00:03:23,460 --> 00:03:23,960 variable,
49 00:03:23,960 --> 00:03:24,970 the variable waterbody None. 50 00:03:26,200 --> 00:03:28,840 So now we need to delete these two variables: 51 00:03:29,680 --> 00:03:30,250 waterbody 52 00:03:30,250 --> 00:03:30,640 None 53 00:03:30,820 --> 00:03:32,890 and airport 54 00:03:32,920 --> 00:03:33,310 no. 55 00:03:34,360 --> 00:03:35,740 To delete these two variables, 56 00:03:36,490 --> 00:03:38,530 we need to get the positions of these two columns. 57 00:03:39,190 --> 00:03:44,460 So airport no is at the ninth column. 58 00:03:45,640 --> 00:03:50,970 You can look at this box, which we get when I hold my mouse over this column name. 59 00:03:52,170 --> 00:03:54,280 Airport no is at the ninth column, 60 00:03:54,790 --> 00:03:55,810 and the other one is 61 00:03:57,930 --> 00:03:59,290 at the fifteenth column. 62 00:04:00,250 --> 00:04:02,150 So let's go and delete these two variables. 63 00:04:04,310 --> 00:04:10,080 df gets df, square brackets, comma, 64 00:04:11,780 --> 00:04:12,460 minus nine. 65 00:04:14,230 --> 00:04:14,810 Run this. 66 00:04:17,890 --> 00:04:19,430 Now this one has been deleted. 67 00:04:20,500 --> 00:04:24,760 The other one was at the 15th position, which will now be the 14th position. 68 00:04:25,600 --> 00:04:26,740 You can check again. 69 00:04:26,900 --> 00:04:28,840 We'll go to the waterbody None column. 70 00:04:29,770 --> 00:04:31,310 You can see it is at the 14th position. 71 00:04:31,480 --> 00:04:33,880 So we just change 9 to 14 72 00:04:41,980 --> 00:04:42,720 and we'll run it. 73 00:04:43,560 --> 00:04:46,980 You can see that now we have 17 variables. 74 00:04:48,540 --> 00:04:53,400 So we have created dummy variables for these two categorical variables in our dataset. 75 00:04:54,180 --> 00:04:59,640 All the variables in our dataset now have numerical values and we can run the regression 76 00:04:59,710 --> 00:05:00,350 model on this.
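
The workflow described in this lecture can be sketched in R as follows. This is a minimal sketch and not the code typed in the video: it swaps in base R's model.matrix() for the dummies package so that it runs without installing anything. The toy data frame, its values, and the names dummy_cols and df_num are assumptions made for illustration only.

```r
# Toy data: the column names mirror the lecture, the values are made up.
df <- data.frame(
  price     = c(24.0, 21.6, 34.7, 33.4),
  airport   = factor(c("YES", "NO", "NO", "YES")),
  waterbody = factor(c("River", "Lake", "None", "Lake and River"))
)

# Make "None" the reference level so that it is the category left out,
# matching the lecture's choice of which waterbody column to delete.
df$waterbody <- relevel(df$waterbody, ref = "None")

# model.matrix() expands each factor into 0/1 columns and, because the
# formula has an intercept, omits the reference level of every factor.
# That yields k - 1 dummy columns for a factor with k categories.
# The first column is the intercept itself, so it is dropped here.
dummy_cols <- model.matrix(~ airport + waterbody, data = df)[, -1, drop = FALSE]

# Keep the numeric column and attach the dummy columns in place of the
# original categorical ones.
df_num <- cbind(df["price"], as.data.frame(dummy_cols))

str(df_num)
```

Because the reference levels are left out when the dummies are built, no columns have to be deleted by position afterwards. Position-based deletion, as done in the lecture with df[, -9] followed by df[, -14], depends on the column order, which shifts after the first deletion.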