1 00:00:05,850 --> 00:00:06,360 Here. 2 00:00:06,710 --> 00:00:09,430 So now we have the data uploaded here. 3 00:00:10,530 --> 00:00:16,360 And if you try to print these files, let's go for the largest one, that is projects. 4 00:00:16,540 --> 00:00:23,050 And if I press Shift+Enter here, we will get this file, and if you scroll down, there you will 5 00:00:23,050 --> 00:00:33,910 find its size: this one has maybe one million, roughly eleven lakh (about 1.1 million) 6 00:00:33,950 --> 00:00:37,770 rows, we can see, and 18 columns. 7 00:00:37,770 --> 00:00:39,850 So you get an idea of how many there are. 8 00:00:40,030 --> 00:00:47,170 And this one is about 1 million, not 11 million, but it is still a large amount of data, and we are going to analyze 9 00:00:47,170 --> 00:00:48,640 all this data here. 10 00:00:48,670 --> 00:00:50,020 You can check any of these files. 11 00:00:50,020 --> 00:00:51,400 Not all of them are like this one. 12 00:00:52,030 --> 00:01:03,550 So first of all, if you print projects.shape here, you will get the number of rows 13 00:01:03,640 --> 00:01:09,070 and the number of columns present in that particular data file, and you will notice that 14 00:01:09,070 --> 00:01:13,280 this is the same number that we saw in that particular file. 15 00:01:13,300 --> 00:01:19,900 Now we will print the shape of all these files first, to describe and show the data and the columns it has: 16 00:01:20,560 --> 00:01:26,380 we will go for printing the shape, then we will print their head, and then we will describe the data. 17 00:01:27,620 --> 00:01:29,840 So now let's begin with printing that one. 18 00:01:30,080 --> 00:01:39,560 So first of all I will write print and then "Shape of" for the first one, 19 00:01:39,560 --> 00:01:51,970 that is donations, so "Shape of donations dataframe is", and then close the quotes. 
20 00:01:52,350 --> 00:01:59,700 After that, just pass the name donations and then .shape, the same attribute we used to print the 21 00:01:59,700 --> 00:02:00,140 shape. 22 00:02:00,450 --> 00:02:05,990 If you run that one, you will get "Shape of donations dataframe is" four lakh sixty-eight thousand, no, 23 00:02:05,990 --> 00:02:10,530 forty-six lakh eighty-seven thousand eight hundred and eighty-four, that is 4,687,884 rows. 24 00:02:10,530 --> 00:02:19,840 That is more than four million, roughly 4.7 million. Then just copy this line and paste it 25 00:02:19,840 --> 00:02:28,250 six times, for all of these, and this one will be donors, then we have projects. 26 00:02:28,350 --> 00:02:29,110 So first, 27 00:02:29,110 --> 00:02:32,620 donors, then this one, projects. 28 00:02:32,820 --> 00:02:36,440 So, projects. 29 00:02:36,600 --> 00:02:38,100 After that we have resources 30 00:02:40,760 --> 00:02:42,700 and then schools 31 00:02:45,680 --> 00:02:47,400 and at last teachers. 32 00:02:52,450 --> 00:03:00,970 Also change the names here: this one is going to be donors, this one is going to be projects, 33 00:03:04,680 --> 00:03:11,570 and this one is going to be resources, and this one is schools. 34 00:03:12,080 --> 00:03:19,120 So here we have schools, then teachers, and after that we just press 35 00:03:21,970 --> 00:03:22,660 Shift+Enter. 36 00:03:22,660 --> 00:03:26,410 We get the number of rows and columns for all of these. 37 00:03:26,410 --> 00:03:32,080 If you multiply the number of rows by the number of columns, you will find how many data values are available 38 00:03:32,080 --> 00:03:33,340 in that particular data frame. 39 00:03:34,090 --> 00:03:41,320 And if you sum all of these, you will get that about 68 million data values are present. And now we will print the head 40 00:03:41,320 --> 00:03:44,080 of these so that we can get a general idea of the columns. 41 00:03:45,700 --> 00:03:53,670 So first of all, donations.head(), Shift+Enter, and there we go, we get the data frame. 
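The shape printing and the rows-times-columns count described above can be sketched roughly as follows. This is a minimal illustration, not the exact notebook from the video: the two small frames and their values are made-up stand-ins for the real CSV files, and the names `frames` and `total_values` are assumptions introduced here.

```python
import pandas as pd

# Hypothetical stand-in frames; the video loads the real CSV files instead.
frames = {
    "donations": pd.DataFrame(
        {"Donation ID": ["a", "b", "c"], "Donation Amount": [10.0, 25.0, 5.0]}
    ),
    "donors": pd.DataFrame(
        {"Donor ID": ["d1", "d2"], "Donor City": ["Austin", "Boston"]}
    ),
}

total_values = 0
for name, df in frames.items():
    # .shape is an attribute (no parentheses) holding (rows, columns).
    rows, cols = df.shape
    print(f"Shape of {name} dataframe is: {df.shape}")
    # rows * cols is the number of individual values in this frame.
    total_values += rows * cols

print("Total number of values:", total_values)
```

Looping over a dictionary of frames avoids copying the same print line six times, which is the only design change compared with the approach narrated in the video.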
42 00:03:53,770 --> 00:04:03,350 After that we have donors.head(). There we go with that one; let me print this one also. 43 00:04:03,780 --> 00:04:14,490 Then we have projects, resources, teachers... so next is projects, so projects.head(). 44 00:04:15,300 --> 00:04:19,110 Then we have resources.head(), 45 00:04:21,950 --> 00:04:22,400 and after that 46 00:04:22,400 --> 00:04:31,600 we have schools.head(), and at last we have teachers 47 00:04:31,640 --> 00:04:37,380 .head(). Now, if you look at these data frames, 48 00:04:37,430 --> 00:04:39,270 the first one is donations. 49 00:04:39,530 --> 00:04:41,680 This contains a Project ID. 50 00:04:42,200 --> 00:04:48,260 And if you notice this one, that is projects, it also has a Project ID. 51 00:04:48,470 --> 00:04:54,550 It also has a School ID and a Teacher ID, and this one has a Donation ID and a Donor ID. 52 00:04:54,650 --> 00:05:02,960 So a few of them share the same IDs: this one also has a Donor ID, but it doesn't have the Donation and Project 53 00:05:02,960 --> 00:05:03,210 IDs. 54 00:05:03,210 --> 00:05:07,290 This one has a School ID; similarly this one has a Teacher ID. 55 00:05:10,410 --> 00:05:17,820 And after these particular IDs, we have a few things here, like Donation Included, then Donation 56 00:05:17,910 --> 00:05:18,660 Amount, 57 00:05:19,170 --> 00:05:22,260 then Donor Cart Sequence and Donation Received Date. 58 00:05:22,800 --> 00:05:29,560 So we have to work with the date, and we also have to work with the amount and these IDs. 59 00:05:29,610 --> 00:05:35,790 After that, if you notice, first of all donations is just about the donation received, the amount, the number of 60 00:05:35,790 --> 00:05:37,370 donations and the received date. 61 00:05:37,500 --> 00:05:43,410 If we talk about donors, then it's simple: we are going to have their IDs, their cities and 62 00:05:43,530 --> 00:05:46,010 similar details. 63 00:05:46,080 --> 00:05:46,850 We have donors. 
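The observation above, that several data frames share ID columns, can be checked in code instead of by eye. The following is a small sketch under stated assumptions: the one-row frames are hypothetical stand-ins that keep only the columns named in the video, and `shared_columns` is a helper introduced here for illustration, not something from the original notebook.

```python
import pandas as pd

# Hypothetical one-row stand-ins with only the columns mentioned in the video.
donations = pd.DataFrame(
    {"Project ID": ["p1"], "Donation ID": ["dn1"], "Donor ID": ["d1"]}
)
donors = pd.DataFrame({"Donor ID": ["d1"], "Donor City": ["Austin"]})
projects = pd.DataFrame(
    {"Project ID": ["p1"], "School ID": ["s1"], "Teacher ID": ["t1"]}
)

# head() returns the first rows (5 by default) for a quick look at the columns.
print(donations.head())


def shared_columns(left, right):
    """Return the sorted column names that appear in both data frames."""
    return sorted(set(left.columns) & set(right.columns))


print(shared_columns(donations, donors))
print(shared_columns(donations, projects))
```

Columns that appear in two frames are the natural candidates for joining those frames later on.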
64 00:05:47,280 --> 00:05:50,900 We can find whether the donor is a teacher or not. 65 00:05:51,000 --> 00:05:56,940 Then we have projects. Projects is going to have things like the project type and their names. 66 00:05:56,940 --> 00:06:07,190 So here we have things like Project Type and so on; here we have the cost, and their funding 67 00:06:07,340 --> 00:06:09,330 and expiry date. 68 00:06:09,350 --> 00:06:11,450 Here we have Project Fully Funded Date. 69 00:06:11,520 --> 00:06:17,720 Here we have Project Posted Date. So remember these columns that I am focusing on here, so that while we 70 00:06:17,720 --> 00:06:19,360 are working on them you will recognize 71 00:06:19,640 --> 00:06:25,620 that yes, these are those columns. Then in resources.head() 72 00:06:25,710 --> 00:06:33,890 we have the resource item name, the quantity and, as well, you can check their prices. In schools 73 00:06:33,890 --> 00:06:34,950 we have the type. 74 00:06:34,970 --> 00:06:41,130 We are going to have the school state and so on, and we are going to have the county here. 75 00:06:42,480 --> 00:06:44,410 After that we have teachers. 76 00:06:44,730 --> 00:06:52,460 We just have the Teacher ID, their prefix and the date when their first project was posted. 77 00:06:52,470 --> 00:06:56,510 So this is about the data frames that we have. 78 00:06:56,570 --> 00:07:01,610 Now we are going to describe these data frames so we can have a general idea of the things available 79 00:07:01,650 --> 00:07:02,220 in them. 80 00:07:02,290 --> 00:07:04,420 First of all we have donations. 81 00:07:04,560 --> 00:07:11,550 So we just call the donations.describe() method here, put the parentheses, and press Shift+Enter. 82 00:07:11,820 --> 00:07:18,810 It will take a little time because it performs some calculations there and provides things like, here, 83 00:07:18,840 --> 00:07:23,550 we have the count of the donation amount. 84 00:07:23,790 --> 00:07:25,950 Then we have the mean of the donation amount. 
85 00:07:25,950 --> 00:07:33,440 The standard deviation of that one, the minimum of that, then the 25 percent, 50 percent and 75 percent values. 86 00:07:33,450 --> 00:07:35,910 Here we have the maximum value. 87 00:07:35,910 --> 00:07:41,380 We have Donor Cart Sequence, and if you notice in donations.head(), 88 00:07:41,460 --> 00:07:46,870 we have just these two columns that actually hold integer or floating-point values. 89 00:07:47,010 --> 00:07:52,470 So we are going to get these statistics only for the particular columns that are numeric, 90 00:07:53,430 --> 00:08:01,200 because you cannot find the 25 percent value, the standard deviation or the mean of a string. Then we have donors, and then 91 00:08:01,200 --> 00:08:02,250 we will call describe. 92 00:08:06,220 --> 00:08:12,330 This will give us a basic idea of what is in that particular data frame. 93 00:08:12,520 --> 00:08:16,120 This one is again processing, and now we have projects 94 00:08:18,550 --> 00:08:20,570 .describe(). 95 00:08:20,740 --> 00:08:23,170 There we go with that one; it has finished. 96 00:08:23,260 --> 00:08:27,950 So here we have the count, unique, top and frequency for donors. 97 00:08:29,410 --> 00:08:34,240 So that one is something different from the previous one; it depends on the type of values available there. 98 00:08:35,820 --> 00:08:43,170 After that, in projects we again have the standard deviation, mean, and the 25, 50 and 75 percent values for that particular file. 99 00:08:44,840 --> 00:08:47,130 After this we have resources 100 00:08:47,190 --> 00:08:56,200 .describe(). We go with that one and we get an AttributeError because I have misspelled describe. 
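The two kinds of describe output mentioned above (mean, standard deviation and percentiles for numeric columns; count, unique, top and frequency for text columns) can be reproduced on a small example. This is only a sketch: the four rows below are invented for illustration, and only the column names come from the video.

```python
import pandas as pd

# Hypothetical rows; the real donations file has millions of rows.
donations = pd.DataFrame(
    {
        "Donor ID": ["d1", "d2", "d1", "d3"],
        "Donation Amount": [10.0, 20.0, 30.0, 40.0],
        "Donor Cart Sequence": [1, 1, 2, 1],
    }
)

# With mixed column types, describe() summarizes only the numeric columns:
# count, mean, std, min, 25%, 50%, 75%, max.
numeric_summary = donations.describe()
print(numeric_summary)

# On a text-only selection, describe() reports count, unique, top and freq.
text_summary = donations[["Donor ID"]].describe()
print(text_summary)
```

This is why the donors and teachers summaries look different from the donations summary: the statistics reported depend on the type of the columns being described.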
101 00:08:56,200 --> 00:09:08,110 There we go. Then we have schools.describe(), and have a look at this one: again we have the minimum, standard deviation, 102 00:09:08,110 --> 00:09:14,260 mean, the 25, 50 and 75 percent values, the maximum and the other values here. If you 103 00:09:14,260 --> 00:09:22,720 notice, these values are written like 1e+0 something; the e means a power of ten, and the exponent is something 104 00:09:22,720 --> 00:09:32,500 like this 03 here. And in teachers we have just these things, so let us run this one 105 00:09:32,520 --> 00:09:33,630 on teachers. 106 00:09:33,790 --> 00:09:40,780 So if you write teachers.describe() here, with the parentheses, 107 00:09:41,350 --> 00:09:49,750 and press Shift+Enter on that one, you will again find the same things that were available for donors, because 108 00:09:49,770 --> 00:09:56,640 in donors we also had something like that. So it has counted the Teacher Prefix, the Teacher 109 00:09:56,700 --> 00:09:59,190 IDs, the unique ones, 110 00:09:59,220 --> 00:10:02,670 how many of them are unique, and their dates. 111 00:10:02,670 --> 00:10:08,910 And it has not calculated particular things like the standard 112 00:10:08,910 --> 00:10:15,380 deviation of any amount or any value; it has just counted the number of IDs or prefixes available 113 00:10:15,420 --> 00:10:24,740 there, or the unique values available. So that's about this one, and so far we have not done anything 114 00:10:25,460 --> 00:10:33,780 very important; we are just describing the data, importing the files and printing the head. From now on we are going 115 00:10:33,780 --> 00:10:37,830 to do some particular analyses, starting from the very next video. 116 00:10:38,430 --> 00:10:39,400 So thanks for watching. 117 00:10:39,430 --> 00:10:40,340 Tune in to the next one.