1 00:00:00,150 --> 00:00:06,480 Hello all, before diving deep into this session, let's have a quick recap of what we did in the 2 00:00:06,480 --> 00:00:07,280 previous session. 3 00:00:07,620 --> 00:00:09,830 So basically we imported our data. 4 00:00:09,840 --> 00:00:15,990 We did a lot of data processing on our data, and we also dealt with the missing values. 5 00:00:16,200 --> 00:00:20,760 We simply dropped all the missing values because there were only a few of them, and after that we 6 00:00:20,760 --> 00:00:23,600 defined a function to change the data type. 7 00:00:23,700 --> 00:00:28,740 Whatever the type of the variable is, it just converts it into a datetime format. After doing all 8 00:00:28,740 --> 00:00:32,820 these things we extracted the journey day and journey month from our column. 9 00:00:32,820 --> 00:00:36,650 And then we dropped that column because, after the extraction, it serves no further purpose. 10 00:00:36,960 --> 00:00:43,140 And in this session, we have to deal with this arrival time feature as well as this departure time feature, 11 00:00:43,140 --> 00:00:47,550 because whenever you pass these features to a machine learning model, your model 12 00:00:47,550 --> 00:00:54,510 isn't able to understand what exactly that particular departure time is, or what exactly that particular arrival time is, 13 00:00:54,780 --> 00:00:56,850 because you have to tell the machine learning model: 14 00:00:56,850 --> 00:00:57,190 Yeah. 15 00:00:57,510 --> 00:01:04,530 My arrival hour is this much, my arrival minute is this much, my departure hour is this much, my departure minute is this much. 16 00:01:04,770 --> 00:01:10,110 And what do we have to do to perform all these things? 17 00:01:10,110 --> 00:01:15,740 We have to fetch the minute and the hour from our data and send them to that machine learning model. 18 00:01:15,780 --> 00:01:17,120 That's what I want to do.
19 00:01:17,550 --> 00:01:25,240 So to extract the hour from this data, to extract the hour from this column, I will define a function. 20 00:01:25,290 --> 00:01:34,250 Let's say its name is extract_hour, and the data frame I have to apply it on 21 00:01:34,260 --> 00:01:36,870 is the very first parameter. 22 00:01:37,140 --> 00:01:40,430 The second parameter is the column I have to apply it on. 23 00:01:41,100 --> 00:01:42,750 So now, what do I have to do? 24 00:01:42,930 --> 00:01:47,650 Basically I'm going to say df of col, dot dt, dot hour. 25 00:01:47,700 --> 00:01:50,350 So it will return exactly the hour. 26 00:01:50,460 --> 00:01:52,560 So now I have to store it somewhere. 27 00:01:52,810 --> 00:01:58,280 Let's say I store it in a column named after whatever column I pass to my function; 28 00:01:58,290 --> 00:02:02,080 you just have to concatenate underscore hour to it. 29 00:02:02,280 --> 00:02:04,770 Similarly, you have to write a function, 30 00:02:04,770 --> 00:02:09,660 a function to extract your minute as well. 31 00:02:09,670 --> 00:02:12,370 So I'm just going to write that function over here. 32 00:02:12,390 --> 00:02:17,460 For this I have to use proper indentation, otherwise it will give me an error. 33 00:02:17,700 --> 00:02:20,790 So this time I have to say, this time my function 34 00:02:20,790 --> 00:02:23,700 name is, say, extract_minute. 35 00:02:23,730 --> 00:02:31,320 And here I have to extract the minute, so I use dot minute, and here I'm going to say the column name will be 36 00:02:31,320 --> 00:02:31,890 nothing but 37 00:02:31,890 --> 00:02:33,510 my column plus underscore minute. 38 00:02:33,870 --> 00:02:39,590 And that's all. After doing all these things, you have to drop the original column, because 39 00:02:39,610 --> 00:02:41,850 that column is no longer needed. 40 00:02:42,070 --> 00:02:45,000 So I will define, let's say, one more function.
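The two helpers described above can be sketched roughly as follows. This is a minimal sketch, not the instructor's exact code: it assumes the column already holds pandas datetime values, and the sample frame and the column name `Dep_Time` are hypothetical stand-ins, since the transcript does not show the exact spelling.

```python
import pandas as pd


def extract_hour(df: pd.DataFrame, col: str) -> None:
    """Store the hour of a datetime column in a new '<col>_hour' column."""
    df[col + "_hour"] = df[col].dt.hour


def extract_minute(df: pd.DataFrame, col: str) -> None:
    """Store the minute of a datetime column in a new '<col>_minute' column."""
    df[col + "_minute"] = df[col].dt.minute


# Hypothetical sample data; "Dep_Time" is an assumed column name.
sample = pd.DataFrame(
    {"Dep_Time": pd.to_datetime(["2019-03-24 22:20", "2019-05-01 05:50"])}
)
extract_hour(sample, "Dep_Time")
extract_minute(sample, "Dep_Time")
print(sample)
```

Both functions modify the data frame they receive rather than returning a copy, which matches the way the session calls them without assigning a result.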
41 00:02:45,210 --> 00:02:50,910 Let's say my function name is nothing but drop_column, to make it more user friendly. 42 00:02:50,910 --> 00:02:55,170 And after that, I have to pass two parameters, my data frame and my column. 43 00:02:55,380 --> 00:02:57,640 Then I have to say what I have to drop. 44 00:02:57,720 --> 00:02:59,580 So I'm going to say df dot drop, 45 00:02:59,610 --> 00:03:02,550 and this time I have to pass my column, 46 00:03:02,550 --> 00:03:08,430 then the axis parameter, and I have to pass my inplace parameter as well, because I have to update my data frame 47 00:03:09,000 --> 00:03:09,470 as well. 48 00:03:09,750 --> 00:03:14,130 And after doing all this, what do I have to do? I just have to execute it. 49 00:03:14,130 --> 00:03:16,800 And let's say, very first, what am I going to do? 50 00:03:16,920 --> 00:03:22,070 I have to apply all three of these functions on this departure time. 51 00:03:22,350 --> 00:03:27,230 So let's say, first, I will apply all these functions on my departure_time column. 52 00:03:27,240 --> 00:03:33,480 So first I have to say which function I have to apply. Let's say I have to extract 53 00:03:33,510 --> 00:03:37,620 the hour, so I have to call this extract_hour function. 54 00:03:37,800 --> 00:03:43,410 And here I'm going to say my data frame name is nothing but my training data. 55 00:03:43,410 --> 00:03:49,450 And this time my column name is nothing but departure_time. 56 00:03:49,490 --> 00:03:53,170 Make sure you don't have any case-sensitivity issues, otherwise it will give you an error. 57 00:03:53,430 --> 00:03:58,290 Similarly, if you have to extract the minute, you can call the extract_minute function as well. 58 00:03:58,500 --> 00:04:03,660 And here, first you have to pass your data frame, and as the second one you have to pass 59 00:04:03,720 --> 00:04:05,970 what exactly your column name is.
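The drop helper described above might look like the sketch below. It is an illustration under stated assumptions: the sample frame and the column names `Dep_Time`, `Dep_Time_hour` and `Dep_Time_minute` are hypothetical, standing in for the departure time column after its hour and minute have already been extracted.

```python
import pandas as pd


def drop_column(df: pd.DataFrame, col: str) -> None:
    """Remove one column from the data frame, modifying it in place."""
    # axis=1 selects columns; inplace=True updates df instead of returning a copy.
    df.drop(col, axis=1, inplace=True)


# Hypothetical frame in the state reached after the hour and minute were extracted.
train_sample = pd.DataFrame(
    {
        "Dep_Time": pd.to_datetime(["2019-03-24 22:20"]),
        "Dep_Time_hour": [22],
        "Dep_Time_minute": [20],
    }
)
drop_column(train_sample, "Dep_Time")
print(list(train_sample.columns))
```

Column labels are case sensitive, so passing a name whose capitalization does not match raises a `KeyError`, which is the error the session warns about.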
60 00:04:06,120 --> 00:04:10,800 So I'm going to pass all this, and after doing all these things you have to drop that particular 61 00:04:10,800 --> 00:04:11,190 column. 62 00:04:11,490 --> 00:04:16,460 So I'm going to say my function name is nothing but drop_column, my data frame is 63 00:04:16,480 --> 00:04:22,170 the training data, and my feature is nothing but departure_time. 64 00:04:22,350 --> 00:04:25,350 And after that, we just have to execute it. 65 00:04:25,350 --> 00:04:31,950 And if this time I call head on my data, we will see that we no longer have any departure_time 66 00:04:31,950 --> 00:04:32,460 column. 67 00:04:32,670 --> 00:04:39,690 And you have two additional columns over here, named departure_time_hour and 68 00:04:39,690 --> 00:04:40,250 departure_time_minute. 69 00:04:40,560 --> 00:04:42,240 So that's the power of a function. 70 00:04:42,240 --> 00:04:48,660 Whenever you have to do a task multiple times, just write that block of code in a function, and whenever 71 00:04:48,660 --> 00:04:52,020 you need that function, just call it. That simple. 72 00:04:52,170 --> 00:04:58,650 Similarly, you have to perform all these steps for your arrival time feature as well, 73 00:04:58,650 --> 00:04:59,940 which is this one. 74 00:05:00,210 --> 00:05:07,320 So I'm just going to copy this and paste it over here, and this time I'm going to say 75 00:05:07,620 --> 00:05:10,950 my feature is nothing but arrival_time. 76 00:05:11,220 --> 00:05:14,940 So I just paste it here, 77 00:05:15,330 --> 00:05:16,820 and paste it here. 78 00:05:17,070 --> 00:05:23,820 And if you execute it again and call head on your data now, we will 79 00:05:23,820 --> 00:05:28,700 again see that there are two new columns, arrival_time_hour 80 00:05:28,710 --> 00:05:30,990 and arrival_time_minute.
81 00:05:30,990 --> 00:05:37,330 Both have been added to your data, and you no longer have a column for the original arrival 82 00:05:37,410 --> 00:05:37,700 time. 83 00:05:38,010 --> 00:05:39,960 So that's the power of automation. 84 00:05:39,960 --> 00:05:45,390 That's the power of creating a function in Python, as well as in any programming language. 85 00:05:45,570 --> 00:05:52,770 Now, what we have to do is pre-process this duration column. 86 00:05:53,010 --> 00:05:56,700 So we will observe over here, in this duration column, 87 00:05:57,000 --> 00:06:02,120 that the duration is in the form of hours and minutes. 88 00:06:02,280 --> 00:06:07,230 So whenever you pass this feature to your machine learning model as, yeah, this is my 89 00:06:07,230 --> 00:06:12,780 duration, the machine learning model isn't able to understand it, because you have to tell the machine learning 90 00:06:12,780 --> 00:06:13,210 model: 91 00:06:13,390 --> 00:06:17,720 Yeah, I have this many hours in the duration and this many minutes in the duration. 92 00:06:18,060 --> 00:06:26,100 But one thing you will notice over here is that at some places you don't have a minute, or you don't 93 00:06:26,100 --> 00:06:27,030 have an hour. 94 00:06:27,030 --> 00:06:29,830 It means you have to pre-process this column as well. 95 00:06:30,150 --> 00:06:37,950 So what am I going to do over here? First, let's say I have to iterate over each and every 96 00:06:37,950 --> 00:06:42,330 duration, and wherever I don't have any minutes, 97 00:06:42,540 --> 00:06:48,300 I'm simply going to append zero minutes, and wherever I don't have any hour, in that case I can 98 00:06:48,360 --> 00:06:49,650 add zero hours. 99 00:06:50,040 --> 00:06:53,070 So for this, I'm just going to iterate.
100 00:06:53,070 --> 00:07:02,100 I write for i in range of whatever the length of my duration column is. So first I'm going to say 101 00:07:02,310 --> 00:07:08,080 duration is nothing but, well, I'm going to create a list of my durations. 102 00:07:08,370 --> 00:07:13,040 So just create a list. First you have to access the data frame, 103 00:07:13,230 --> 00:07:15,510 and here you have to access your duration column. 104 00:07:15,720 --> 00:07:24,990 And after that you have to iterate from zero to whatever the length of your duration list is, whatever 105 00:07:25,170 --> 00:07:26,670 the length of my duration list is. 106 00:07:27,280 --> 00:07:30,710 And once you have all these things, I'm going to say... 107 00:07:30,930 --> 00:07:32,370 So let me think. 108 00:07:32,380 --> 00:07:34,410 Let me show you something. 109 00:07:34,800 --> 00:07:40,710 Let's say I copy this value, and let's say I put it into a string. 110 00:07:40,710 --> 00:07:48,330 And if I call split on it, let's say I have to split it on the basis of the space 111 00:07:48,360 --> 00:07:48,940 separator, 112 00:07:49,500 --> 00:07:52,420 so remove the extra spaces. 113 00:07:52,590 --> 00:08:00,780 So if I execute it now, we will see it returns exactly a list where I have this hour and 114 00:08:00,990 --> 00:08:01,720 this minute. 115 00:08:01,890 --> 00:08:04,290 So my logic will be as simple as that. 116 00:08:04,500 --> 00:08:12,430 Wherever the count, the length of that list, is two, 117 00:08:12,570 --> 00:08:18,000 I don't have to do any manipulation of my data, but wherever its length is one, 118 00:08:18,180 --> 00:08:21,540 I have to make some modifications, some manipulations, in my data.
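The split check described above can be illustrated with plain strings. The two sample values are assumptions chosen to show the two cases; the real values come from the duration column.

```python
# A duration with both parts splits into two pieces on the space.
with_both = "2h 50m".split(" ")

# A duration with only hours (or only minutes) yields a single piece.
hours_only = "19h".split(" ")

print(with_both, len(with_both))
print(hours_only, len(hours_only))
```

A length of two means the value is already complete, while a length of one marks a value that needs a missing part filled in.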
119 00:08:21,720 --> 00:08:34,710 So here I'm going to say if length of duration of i dot split, because you have to split 120 00:08:34,710 --> 00:08:38,220 it, and you have to split it on the basis of the space separator. 121 00:08:38,430 --> 00:08:43,110 And wherever this length is equal to two, 122 00:08:43,110 --> 00:08:45,870 in that case, what do you have to do? You simply have to skip it. 123 00:08:45,870 --> 00:08:52,830 So to skip, you can use this pass keyword, and here you have the else block. 124 00:08:52,830 --> 00:08:57,170 And in this block, what do you have to do? You have to check for a condition. 125 00:08:57,480 --> 00:09:00,990 So wherever you have h, it means that you have the hour. 126 00:09:01,200 --> 00:09:08,190 So wherever you have this h in duration of i, it means, in each and every duration, if you have 127 00:09:08,190 --> 00:09:10,910 this hour, if you have this h, 128 00:09:11,220 --> 00:09:16,800 in that case you have to append the minute; you have to append zero minutes to your data. 129 00:09:16,920 --> 00:09:27,080 So here I'm going to say duration of i plus space zero m, and you have to update this duration as well. 130 00:09:27,330 --> 00:09:36,030 So I'm going to just paste it here. Otherwise, if you don't have any h, it means maybe you have only minutes 131 00:09:36,030 --> 00:09:36,690 in your data. 132 00:09:36,700 --> 00:09:44,470 So in that case, what do you have to do? Basically, you have to add zero hours to your data. 133 00:09:44,490 --> 00:09:49,570 So in that case, you have to add zero hours to your data, so here 134 00:09:49,590 --> 00:09:54,610 I'm basically going to add this zero h to my data. 135 00:09:55,050 --> 00:09:59,790 So now I just have to execute it. All of this gets executed, and
136 00:10:00,280 --> 00:10:07,480 if I call head on my data, you would expect to see all the modifications, but you will find, 137 00:10:07,480 --> 00:10:15,160 no, nothing has happened, because you have not assigned this duration list back to your data frame, 138 00:10:15,370 --> 00:10:16,720 so it is not updated. 139 00:10:16,750 --> 00:10:20,300 You also have to update your data. 140 00:10:20,410 --> 00:10:25,620 You have to insert this duration list at this place. 141 00:10:25,750 --> 00:10:31,750 So for this, what I am going to do is access my data frame, then access the duration column, 142 00:10:31,900 --> 00:10:36,880 and here I have to assign this duration list, so you don't have to copy anything. 143 00:10:36,910 --> 00:10:43,690 So I just have to execute it, and if I again call head, you will see that a manipulation 144 00:10:43,690 --> 00:10:45,360 has happened in your data. 145 00:10:45,370 --> 00:10:48,560 You will see that zero minutes have been added to your data. 146 00:10:48,820 --> 00:10:50,380 That's what I'm trying to show you. 147 00:10:50,410 --> 00:10:52,070 So that's all about this session. 148 00:10:52,090 --> 00:10:54,010 Hopefully you loved this session very much. 149 00:10:54,340 --> 00:10:55,240 Thank you, guys. 150 00:10:55,270 --> 00:10:56,170 Have a nice day. 151 00:10:56,320 --> 00:10:57,160 Keep learning. 152 00:10:57,160 --> 00:10:58,030 Keep growing. 153 00:10:58,330 --> 00:10:59,230 Keep practicing.
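Putting the duration steps together, a minimal sketch of the loop and the final assignment might look like this. The sample frame, the column name `Duration`, and the exact padding strings `" 0m"` and `"0h "` are assumptions for illustration; the session only says that zero minutes or zero hours are added.

```python
import pandas as pd

# Hypothetical sample: one complete value, one without minutes, one without hours.
train_sample = pd.DataFrame({"Duration": ["2h 50m", "19h", "45m"]})

# Work on a plain Python list copied from the column.
duration = list(train_sample["Duration"])

for i in range(len(duration)):
    if len(duration[i].split(" ")) == 2:
        pass  # both hours and minutes are present, nothing to change
    elif "h" in duration[i]:
        duration[i] = duration[i] + " 0m"  # only hours: add zero minutes
    else:
        duration[i] = "0h " + duration[i]  # only minutes: add zero hours

# Changing the list does not change the data frame, so assign it back.
train_sample["Duration"] = duration
print(train_sample)
```

The last assignment is the step the session points out as easy to forget: `list(...)` creates a separate copy, so the column stays unchanged until the list is written back.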