1 00:00:00,150 --> 00:00:06,180 Hello. Before diving deep into this session, let's walk through what we have done 2 00:00:06,180 --> 00:00:07,790 in all our previous sessions. 3 00:00:07,980 --> 00:00:13,770 So, from importing the data to data cleaning and data preprocessing, we have dealt with all 4 00:00:13,770 --> 00:00:14,420 these features. 5 00:00:14,430 --> 00:00:15,470 We have dealt with date of 6 00:00:15,480 --> 00:00:22,800 journey, arrival time, and departure time, and we have also dealt with our duration column. 7 00:00:23,070 --> 00:00:25,460 So in this session, we have to fetch, 8 00:00:25,770 --> 00:00:26,220 yeah, 9 00:00:26,340 --> 00:00:31,520 what the hours are and what the minutes are for this particular duration. 10 00:00:31,530 --> 00:00:38,140 So we have to fetch this data and insert it as separate features in our data frame. 11 00:00:38,370 --> 00:00:45,330 So let's say I copy a value, let me show you something, and paste it over 12 00:00:45,330 --> 00:00:52,680 here, and let's say I call split on it, on the basis of a space, if you have to split 13 00:00:52,680 --> 00:00:52,860 it. 14 00:00:53,100 --> 00:01:00,900 It returns a list of the hour and the minute. Let's say I have to access this hour. You 15 00:01:00,900 --> 00:01:09,570 can access this hour by passing an index to your list: just pass zero as the index and you will see 16 00:01:09,780 --> 00:01:16,740 you have access to the hour. And if you want to access just the two, because two is exactly 17 00:01:16,740 --> 00:01:18,010 your hour over here, 18 00:01:18,240 --> 00:01:25,380 in that case you have to say that you want to slice from 19 00:01:25,380 --> 00:01:27,370 zero to minus one.
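The split-and-slice steps described so far can be sketched in plain Python. This is only an illustration: the sample string "2h 50m" is an assumption, since the transcript does not show the exact value that was copied.

```python
# Assumed sample value; the lecture's actual cell content is not visible in the transcript.
duration = "2h 50m"

# Split on a single space to get the hour token and the minute token.
parts = duration.split(" ")

# Index 0 holds the hour token, still carrying its trailing "h".
hour_token = parts[0]

# Slicing from 0 to -1 drops the last character, leaving only the digits.
hour = hour_token[0:-1]

print(parts, hour_token, hour)
```

Note that the result is still a string, not a number.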
20 00:01:27,390 --> 00:01:34,740 So if I execute it, you will see over here that you have two as the hour. That's exactly what you 21 00:01:34,740 --> 00:01:35,070 need. 22 00:01:35,310 --> 00:01:37,960 Similarly, if you need your minute, 23 00:01:38,100 --> 00:01:41,210 in that case you can pass one over here. 24 00:01:41,400 --> 00:01:43,530 So here you will get the minute value. 25 00:01:43,740 --> 00:01:45,240 That's what I'm trying to show you. 26 00:01:45,690 --> 00:01:51,720 So what I'm going to do now is define a function, let's say hour, and whatever x, whatever 27 00:01:51,720 --> 00:01:53,700 duration, I pass to this function, 28 00:01:53,700 --> 00:01:57,080 it will return exactly the hour of that particular duration. 29 00:01:57,420 --> 00:02:02,310 So to access your hour, I'm just going to copy this. 30 00:02:02,310 --> 00:02:04,300 And here I just have to pass zero. 31 00:02:04,890 --> 00:02:12,060 So on x, I'm going to call split, and here you have to pass zero to 32 00:02:12,060 --> 00:02:21,180 access the hour, and basically you have to return all of this to access your hour. Similarly, to access 33 00:02:21,180 --> 00:02:22,200 your minute, 34 00:02:22,620 --> 00:02:26,280 I'm just going to copy it and paste it over here. 35 00:02:26,280 --> 00:02:30,100 And this time I have to make some modifications in my function. 36 00:02:30,360 --> 00:02:35,700 So here my function name is nothing but minute, and this time, to access the minute, 37 00:02:35,700 --> 00:02:39,600 you have to pass one, and you simply return it. 38 00:02:39,780 --> 00:02:46,110 I'm just going to execute it, and now you have to apply this function on your duration column. 39 00:02:46,140 --> 00:02:52,500 So now I'm going to say train underscore data of duration dot apply. 40 00:02:52,500 --> 00:02:55,090 To apply a function, you have to use this apply.
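The two helper functions described above might look roughly like this. The names hour and minute follow the narration; the exact bodies are a reconstruction from the spoken steps, and the sample input is an assumption.

```python
def hour(x):
    """Return the hour digits of a duration string such as '2h 50m' (as a string)."""
    # Token 0 is the hour part; [0:-1] strips the trailing 'h'.
    return x.split(" ")[0][0:-1]


def minute(x):
    """Return the minute digits of a duration string such as '2h 50m' (as a string)."""
    # Token 1 is the minute part; [0:-1] strips the trailing 'm'.
    return x.split(" ")[1][0:-1]


print(hour("2h 50m"), minute("2h 50m"))
```

Both functions assume every value contains an hour part and a minute part separated by one space. A value with only one part would raise an IndexError in minute, so this sketch relies on the duration column having been normalized beforehand.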
41 00:02:55,470 --> 00:02:58,090 And what I have to apply is this hour function. 42 00:02:58,380 --> 00:03:03,980 Similarly, I'm just going to copy it and paste it over here. 43 00:03:03,990 --> 00:03:11,600 This time I have to apply my minute function, and whatever hour it is, I have to store it somewhere. 44 00:03:11,610 --> 00:03:12,140 That's it. 45 00:03:12,510 --> 00:03:14,730 I have to store it in a new feature. 46 00:03:14,760 --> 00:03:22,020 So I have to define that feature name, and my feature name is nothing but duration underscore hours. 47 00:03:22,230 --> 00:03:29,790 Similarly, this time my feature name is nothing but duration underscore, let's say, mins, and what 48 00:03:29,790 --> 00:03:32,240 I have to do is just execute it. 49 00:03:32,400 --> 00:03:35,390 Now you will see all of this gets executed. 50 00:03:35,550 --> 00:03:42,810 And this time, if I call head, you will see two new columns have been added 51 00:03:42,810 --> 00:03:44,250 to your data frame, which are nothing 52 00:03:44,250 --> 00:03:48,720 but duration_hours and duration_mins. Now what do we have to do? 53 00:03:48,990 --> 00:03:52,860 We have to simply drop this duration column. For this, 54 00:03:52,860 --> 00:03:58,490 what you can do is use the function that we defined earlier. 55 00:03:58,620 --> 00:04:03,990 And here, if you press Shift plus Tab, you will get all the parameters that I have defined. 56 00:04:04,200 --> 00:04:05,160 The very first one 57 00:04:05,160 --> 00:04:10,200 is the data frame, and the second one is the column name. 58 00:04:10,210 --> 00:04:17,730 So I pass all these things, just execute it, and this column gets removed from your data frame.
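A minimal sketch of the apply-and-drop steps, assuming pandas. The sample rows, the column casing (Duration, Duration_hours, Duration_mins), and the helper name drop_column are all assumptions: the transcript only says the earlier helper takes a data frame and a column name, and never shows its name or body.

```python
import pandas as pd

# Assumed sample rows standing in for the course's training data.
train_data = pd.DataFrame({"Duration": ["2h 50m", "7h 25m", "19h 0m"]})


def hour(x):
    # Hour token without its trailing 'h'.
    return x.split(" ")[0][0:-1]


def minute(x):
    # Minute token without its trailing 'm'.
    return x.split(" ")[1][0:-1]


# Series.apply calls the function on every value of the column.
train_data["Duration_hours"] = train_data["Duration"].apply(hour)
train_data["Duration_mins"] = train_data["Duration"].apply(minute)


def drop_column(df, col):
    # Hypothetical stand-in for the helper defined in an earlier session.
    df.drop(col, axis=1, inplace=True)


drop_column(train_data, "Duration")
print(train_data.head())
```

At this point the two new columns still hold strings, because the helper functions return slices of the original text.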
59 00:04:17,880 --> 00:04:25,590 And if I check what exactly the data type of each and every feature is right now, you will 60 00:04:25,590 --> 00:04:28,890 see that duration_hours and 61 00:04:28,890 --> 00:04:34,530 duration_mins are of object type, but we know they are really in numeric format. 62 00:04:34,530 --> 00:04:36,230 It means you have to deal with that. 63 00:04:36,240 --> 00:04:42,240 It means you have to convert the data type of duration_hours as well as 64 00:04:42,240 --> 00:04:42,720 duration_mins. 65 00:04:43,020 --> 00:04:43,770 So for this, 66 00:04:43,770 --> 00:04:48,990 what I'm going to do first is take train_data, and here I am going to access my 67 00:04:48,990 --> 00:04:50,640 duration_hours. On this, 68 00:04:50,640 --> 00:04:52,890 I'm going to call the astype function. 69 00:04:52,890 --> 00:04:58,000 And here I have to mention that I'm going to change it into integer format. 70 00:04:58,020 --> 00:04:59,750 So now I have to assign 71 00:04:59,880 --> 00:05:01,730 this to my duration 72 00:05:01,790 --> 00:05:06,930 underscore hours, because I have to update the column as well. After doing all this, I 73 00:05:06,930 --> 00:05:11,070 also have to perform it for my duration underscore mins. 74 00:05:11,250 --> 00:05:17,910 And on this, I simply call astype over here, and here 75 00:05:17,940 --> 00:05:21,650 I have to update this entire column as well. 76 00:05:21,990 --> 00:05:24,110 So I'm just going to paste it. After that, 77 00:05:24,150 --> 00:05:26,460 I just have to execute it. Now 78 00:05:26,520 --> 00:05:33,180 the cell got successfully executed, and if I cross-check, basically just to make sure the data set is 79 00:05:33,180 --> 00:05:36,220 accurate, if I call dtypes, 80 00:05:36,240 --> 00:05:40,620 you will see both columns get converted into integer form.
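The type conversion described above can be sketched as follows, assuming pandas. The sample values and the column casing are assumptions chosen for illustration.

```python
import pandas as pd

# Assumed sample data: digits stored as strings, as left by the split-based helpers.
train_data = pd.DataFrame({
    "Duration_hours": ["2", "7", "19"],
    "Duration_mins": ["50", "25", "0"],
})

# astype returns a new Series, so assign it back to update the column.
train_data["Duration_hours"] = train_data["Duration_hours"].astype(int)
train_data["Duration_mins"] = train_data["Duration_mins"].astype(int)

# Cross-check: both columns should now report an integer dtype.
print(train_data.dtypes)
```

The assignment back to the column matters: calling astype without it leaves the data frame unchanged.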
81 00:05:40,620 --> 00:05:42,190 And that's what I'm trying to show you. 82 00:05:42,360 --> 00:05:48,320 Let's say from this training data, which is exactly my data frame, you have to extract, 83 00:05:48,540 --> 00:05:48,990 yeah, 84 00:05:49,110 --> 00:05:54,840 which columns are my categorical data and which are my numerical data, that is, my continuous features. 85 00:05:55,140 --> 00:06:00,000 So what I am going to do first is iterate over this data frame. 86 00:06:00,000 --> 00:06:06,510 I'm going to iterate over each and every column in my data frame, and whichever has the data type object, I'm 87 00:06:06,510 --> 00:06:10,440 going to consider that column as categorical data. 88 00:06:10,590 --> 00:06:16,830 So for this, I have to say for column in train_data dot columns. 89 00:06:16,860 --> 00:06:18,720 So this is basically my iteration. 90 00:06:18,930 --> 00:06:28,720 And here I have to put a condition: if train_data of column dot dtype equals object. 91 00:06:28,740 --> 00:06:34,950 So only if this condition is satisfied am I going to include that column in my list. 92 00:06:35,340 --> 00:06:39,960 So this is exactly the code of a list comprehension in Python. 93 00:06:40,200 --> 00:06:48,000 So let's say I'm going to store all these columns in a list of categorical columns, and if you have to print this, 94 00:06:48,000 --> 00:06:56,010 you can simply print it, and you will see it has all the column names that hold categorical data. 95 00:06:56,370 --> 00:07:01,440 Let's say you have to fetch your continuous features. In that case, 96 00:07:01,440 --> 00:07:05,350 you can say wherever this condition is not satisfied, 97 00:07:05,710 --> 00:07:10,090 all those features are exactly my continuous columns. 98 00:07:10,440 --> 00:07:15,450 Similarly, you can print it as well; just execute it.
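The two list comprehensions described above might be written like this, assuming pandas. The column names, the sample values, and the list names cat_col and cont_col are illustrative assumptions; the transcript does not make the actual list names recoverable.

```python
import pandas as pd

# Assumed miniature data frame. dtype=object is set explicitly on the text
# columns so the check below does not depend on how pandas infers strings.
train_data = pd.DataFrame({
    "Airline": pd.Series(["A", "B"], dtype=object),
    "Source": pd.Series(["X", "Y"], dtype=object),
    "Duration_hours": [2, 7],
    "Price": [3897, 7662],
})

# Columns whose dtype is object are treated as categorical.
cat_col = [col for col in train_data.columns if train_data[col].dtype == object]

# Every column that fails that condition is treated as continuous.
cont_col = [col for col in train_data.columns if train_data[col].dtype != object]

print(cat_col)
print(cont_col)
```

Treating every object column as categorical is a convenient rule of thumb rather than a guarantee, which is why the duration columns had to be converted to integers before this step.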
99 00:07:15,630 --> 00:07:23,540 And these are all the continuous features of my data that we will work with in all of our 100 00:07:23,550 --> 00:07:24,780 upcoming sessions. 101 00:07:24,800 --> 00:07:29,100 So in the upcoming session, we are basically going to deal with this categorical data. 102 00:07:29,100 --> 00:07:35,040 We are going to encode this data, because these are exactly my categorical data, and machine learning 103 00:07:35,040 --> 00:07:36,990 only understands numeric data. 104 00:07:37,020 --> 00:07:43,920 It means you have to convert this categorical data into some integer format or some float format, 105 00:07:43,920 --> 00:07:47,670 because machine learning only works on numeric data. 106 00:07:47,700 --> 00:07:53,620 That's why you have to do all these modifications, all this feature encoding, on your data. 107 00:07:53,940 --> 00:07:55,800 So I hope you loved the session. 108 00:07:55,800 --> 00:07:58,410 And this session will definitely be very helpful for you. 109 00:07:58,440 --> 00:07:59,160 Thank you. 110 00:07:59,340 --> 00:08:00,360 Have a nice day. 111 00:08:00,390 --> 00:08:01,350 Keep learning. 112 00:08:01,350 --> 00:08:02,160 Keep growing. 113 00:08:02,680 --> 00:08:03,660 Keep practicing.