1 00:00:01,400 --> 00:00:07,070 So pandas is a software library written for the Python programming language for data manipulation and 2 00:00:07,070 --> 00:00:08,000 analysis. 3 00:00:08,000 --> 00:00:14,230 This is specifically for data manipulation and analysis. 4 00:00:14,240 --> 00:00:16,490 First we need to import pandas, 5 00:00:19,550 --> 00:00:28,560 so we'll write import pandas as pd. If you are using Anaconda, 6 00:00:28,790 --> 00:00:33,670 then Anaconda will have automatically installed pandas on your system, so you won't need to 7 00:00:33,680 --> 00:00:35,070 install it separately. 8 00:00:35,080 --> 00:00:40,470 You just have to import it into your workspace. For pandas, 9 00:00:40,480 --> 00:00:47,030 we will be using our customer data CSV file. You can find this file in the resources section of 10 00:00:47,050 --> 00:00:48,550 this video. 11 00:00:48,550 --> 00:00:53,140 So go on, download this file and put it in your folder. 12 00:00:55,050 --> 00:01:06,300 We will start by importing the customer CSV file. So we'll write data1, this is our variable, then 13 00:01:06,300 --> 00:01:16,220 we'll write the pandas function to import a CSV, that is pd.read_csv, then we provide 14 00:01:16,220 --> 00:01:17,950 the location of our CSV file. 15 00:01:20,850 --> 00:01:24,240 Remember to change the backslashes into forward slashes, 16 00:01:30,740 --> 00:01:34,060 then the file name, Customer.csv, 17 00:01:37,720 --> 00:01:42,400 and then header equal to 0, since our first row has the header.
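The import step just described can be sketched roughly as follows. The real customer file is not reproduced here, so this sketch reads a small in-memory stand-in through io.StringIO. The three rows and their values are invented for illustration, the column names are an assumption based on the columns mentioned in the video, and the file path shown in the comment is hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for the customer CSV file; every row is invented.
csv_text = """Customer ID,Customer Name,Segment,Age,Country,City,State,Postal Code,Region
ID-001,Customer One,Consumer,34,United States,Henderson,Kentucky,42420,South
ID-002,Customer Two,Corporate,45,United States,Los Angeles,California,90036,West
ID-003,Customer Three,Consumer,29,United States,Seattle,Washington,98103,West
"""

# With a real file you would pass its location instead, written with
# forward slashes, for example (hypothetical path):
#   data1 = pd.read_csv("C:/my_folder/Customer.csv", header=0)
# header=0 tells pandas that the first row holds the column names.
data1 = pd.read_csv(io.StringIO(csv_text), header=0)

print(data1.shape)
print(list(data1.columns))
```

Reading from a string keeps the sketch runnable without the download; swapping in the file location is the only change needed for the real data.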
18 00:01:45,030 --> 00:01:56,270 Run this and we will get our data1 variable. Now data1.head() will get the first five rows 19 00:01:57,830 --> 00:01:58,640 of our data. 20 00:02:04,500 --> 00:02:12,360 You can see we have Customer ID, Customer Name, Segment, Age, Country, City, State, Postal Code and Region 21 00:02:12,520 --> 00:02:20,970 as our columns. Then we have multiple customer detail rows. First is of course Customer ID; this is 22 00:02:20,970 --> 00:02:27,690 a unique ID for each customer. The second column is Customer Name; here we have the customer's full name. 23 00:02:28,350 --> 00:02:35,610 Then there is the segment which the customer belongs to, the Consumer segment or the Corporate segment. Then 24 00:02:35,610 --> 00:02:42,390 we have a column for Age, that is the age of the customer, then the country, city, state, postal code and region 25 00:02:42,420 --> 00:02:50,250 of that customer. If you want to grab more rows you can provide the number in this bracket. By default 26 00:02:50,250 --> 00:02:55,290 it is 5. If you write 10, it will give you 10 rows. 27 00:02:58,070 --> 00:03:01,750 Now here you are seeing this 0 1 2 3 4. 28 00:03:02,540 --> 00:03:13,260 These are the index of this table. For example, the zeroth row will be this row. If you want 29 00:03:13,420 --> 00:03:21,180 Customer ID as the index instead, we can set it as the index. We'll read this CSV 30 00:03:21,180 --> 00:03:28,020 file into another data frame, that is data2. We'll copy the same command 31 00:03:33,420 --> 00:03:41,460 and we will write another parameter, that is index_col. We are providing the location of the index 32 00:03:41,460 --> 00:03:50,870 column, and for our data it is Customer ID, which is the 0th column of our data. Since this is 33 00:03:50,880 --> 00:03:57,950 the first column, its index is 0. That's why we are providing 0, similar to what we provided for the header.
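As a rough sketch of the head and index_col steps described here, the snippet below again uses an invented in-memory stand-in for the customer file. The rows and ID values are made up, and the column names are assumed from the ones read out in the video.

```python
import io

import pandas as pd

# Hypothetical stand-in for the customer CSV file; every row is invented.
csv_text = """Customer ID,Customer Name,Segment,Age,Country,City,State,Postal Code,Region
ID-001,Customer One,Consumer,34,United States,Henderson,Kentucky,42420,South
ID-002,Customer Two,Corporate,45,United States,Los Angeles,California,90036,West
ID-003,Customer Three,Consumer,29,United States,Seattle,Washington,98103,West
ID-004,Customer Four,Corporate,52,United States,Houston,Texas,77095,Central
ID-005,Customer Five,Consumer,41,United States,Chicago,Illinois,60610,Central
ID-006,Customer Six,Consumer,38,United States,Columbus,Ohio,43229,East
"""

# No index column given: pandas numbers the rows 0, 1, 2, ...
data1 = pd.read_csv(io.StringIO(csv_text), header=0)
print(data1.head())    # first five rows (the default)
print(data1.head(2))   # first two rows

# index_col=0 uses the first column (Customer ID) as the row index.
data2 = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)
print(data2.head())
```

In data2 the Customer ID values replace the 0 1 2 3 4 row labels, and Customer ID no longer appears among the ordinary columns.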
34 00:03:58,110 --> 00:04:06,240 So our header was in the first row, which is at location 0, and our index column 35 00:04:06,240 --> 00:04:13,050 is also at location 0. If we run this and again run head, 36 00:04:17,510 --> 00:04:24,020 you can see now the 0 1 2 3 4 indexes are gone and now these are our indexes. 37 00:04:28,010 --> 00:04:35,570 These are important and we will discuss why in a short while. Now, the head command is used to view a sample 38 00:04:35,570 --> 00:04:44,240 of your data. If you want to know the statistics of your data, you write data1.describe, 39 00:04:46,850 --> 00:04:53,830 dot describe, again with brackets, and then run this. 40 00:04:57,750 --> 00:05:00,970 So there are only two integer variables in our data. 41 00:05:01,440 --> 00:05:09,190 That's why we are only getting two columns here: the first is Age and the second is Postal Code. 42 00:05:09,350 --> 00:05:12,210 Here you can see the total count of values, 43 00:05:12,250 --> 00:05:17,310 the mean of age, the standard deviation of age, the minimum age, the maximum age. 44 00:05:17,360 --> 00:05:18,810 These are the percentile values. 45 00:05:18,810 --> 00:05:26,690 This is the 25th percentile value, so if you arrange all the ages in ascending order, the value present at the 46 00:05:26,680 --> 00:05:31,100 25th percentile of the data is this value. 47 00:05:31,940 --> 00:05:39,380 Similarly this is the 50th percentile, also known as the median value, this is the 75th percentile value of age 48 00:05:39,750 --> 00:05:44,210 and this is the maximum value of age. 49 00:05:44,300 --> 00:05:51,500 This will be discussed in univariate analysis, which we will be covering in the later part of this course. 50 00:05:52,850 --> 00:05:56,830 There are two ways to index our data frame. 51 00:05:57,740 --> 00:06:04,400 As we discussed a little while ago, while importing this data we can provide an index column for our data.
52 00:06:04,400 --> 00:06:11,490 For data1 we did not provide any index column, and for data2 our index column is Customer ID. 53 00:06:12,140 --> 00:06:20,660 So if you want to view the first row of our data, we have to use the loc or iloc keyword. 54 00:06:20,660 --> 00:06:32,900 If we use data1.iloc and then we provide 0, what iloc will do is it will grab the data that 55 00:06:32,900 --> 00:06:43,620 is present at the 0th position of our data frame, so our output is the same as the first row of our 56 00:06:43,620 --> 00:06:45,170 data frame. 57 00:06:45,360 --> 00:06:52,820 If we want to use the index column which we defined while creating data2, we have to use loc, 58 00:06:57,770 --> 00:07:06,110 and in the bracket we write the customer ID, C D 1 2 5 0 0. 59 00:07:09,060 --> 00:07:14,210 In data2 we defined our index column as Customer ID. 60 00:07:14,250 --> 00:07:22,860 So now we can use the loc keyword to get the data of this customer ID. Run this. 61 00:07:23,100 --> 00:07:28,560 You can see we are getting all the details of our customer except the Customer ID, since 62 00:07:28,710 --> 00:07:30,500 this is the index column. 63 00:07:31,060 --> 00:07:39,900 Similarly, if I don't know this ID and I just want to grab the first customer, I can use iloc, 64 00:07:51,350 --> 00:07:58,410 and so I am getting the same. Here I was using iloc. With iloc you have to use the serial number, 65 00:07:58,680 --> 00:08:00,750 0 and so on. 66 00:08:00,750 --> 00:08:05,910 With loc you can use the index column that you chose. 67 00:08:06,180 --> 00:08:14,820 So if you know the position you can use iloc, and if you know the value you can use loc. 68 00:08:17,420 --> 00:08:23,540 Just like with lists, in a data frame you can also mention multiple values using the colon operator. 69 00:08:23,570 --> 00:08:36,190 So for example, I write data2.iloc 0 colon 5.
70 00:08:36,570 --> 00:08:44,730 This will give me the data of the first five rows, where the index positions are 0 1 2 3 and 4. 71 00:08:44,760 --> 00:08:47,850 Remember, 5 is excluded from this result. 72 00:08:48,370 --> 00:09:00,740 So I'm getting the data of these five customers. You can use steps as well. If I try to run this I'm getting 73 00:09:00,740 --> 00:09:04,940 only three rows since I'm using the steps. 74 00:09:05,870 --> 00:09:14,340 That's all on pandas. We will be using pandas a lot more while doing our work and will discuss new topics then 75 00:09:14,370 --> 00:09:14,780 and there.
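The describe, iloc, loc and slicing steps from this video can be sketched together as below. This is only an illustration under stated assumptions: the data is an invented in-memory stand-in for the customer file, the ID values such as ID-001 are hypothetical, and the step value of 2 is an assumption, since the step used in the video is not stated.

```python
import io

import pandas as pd

# Hypothetical stand-in for the customer CSV file; every row is invented.
csv_text = """Customer ID,Customer Name,Segment,Age,Country,City,State,Postal Code,Region
ID-001,Customer One,Consumer,34,United States,Henderson,Kentucky,42420,South
ID-002,Customer Two,Corporate,45,United States,Los Angeles,California,90036,West
ID-003,Customer Three,Consumer,29,United States,Seattle,Washington,98103,West
ID-004,Customer Four,Corporate,52,United States,Houston,Texas,77095,Central
ID-005,Customer Five,Consumer,41,United States,Chicago,Illinois,60610,Central
ID-006,Customer Six,Consumer,38,United States,Columbus,Ohio,43229,East
"""

data1 = pd.read_csv(io.StringIO(csv_text), header=0)
data2 = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

# describe() summarises only the numeric columns by default:
# count, mean, std, min, 25%, 50% (median), 75% and max.
stats = data1.describe()
print(stats)

# iloc selects by position; loc selects by index label.
first_by_position = data2.iloc[0]
first_by_label = data2.loc["ID-001"]  # hypothetical ID from the stand-in data

# Slices with iloc exclude the end position, as with Python lists.
first_five = data2.iloc[0:5]     # positions 0, 1, 2, 3, 4
every_other = data2.iloc[0:5:2]  # positions 0, 2, 4 (assumed step of 2)
print(first_five)
print(every_other)
```

Both first_by_position and first_by_label return the same customer here, which mirrors the point made in the video: use iloc when you know the position and loc when you know the index value.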