1 00:00:01,370 --> 00:00:05,900 Hello, guys, and welcome back to another class of our course, the complete introduction to data 2 00:00:05,900 --> 00:00:06,250 science. 3 00:00:06,770 --> 00:00:09,200 So we saw many things in the past few classes. 4 00:00:09,200 --> 00:00:15,910 And today, of course, we are going to talk about data analysis and about pandas. 5 00:00:16,520 --> 00:00:20,160 So basically, let's get into this class. 6 00:00:20,180 --> 00:00:25,940 More specifically, we are going to have a complete introduction to pandas and talk about what it 7 00:00:25,940 --> 00:00:31,460 is, what data analysis is, the concept of data analysis, and make a quick recap of what we can do 8 00:00:31,490 --> 00:00:32,600 with Python. 9 00:00:33,980 --> 00:00:41,810 And in this part of the course, not in this class, but in the next few classes, 10 00:00:42,050 --> 00:00:46,400 we're going to learn how to use pandas, like we did for Python, 11 00:00:46,790 --> 00:00:48,340 well, 12 00:00:48,410 --> 00:00:49,070 for the Python 13 00:00:49,070 --> 00:00:51,410 programming that we learned a little bit before. 14 00:00:51,950 --> 00:00:52,220 All right. 15 00:00:52,280 --> 00:00:52,940 So let's start. 16 00:00:53,570 --> 00:01:00,620 So basically, it's important to understand the basics of Python before jumping into pandas. 17 00:01:00,860 --> 00:01:05,790 And for that, we will make a quick recap of what we saw at the beginning of the course. 18 00:01:06,020 --> 00:01:12,230 So basically, Python can have many applications, such as web development, simple programming for 19 00:01:12,230 --> 00:01:13,460 different programs, for example. 20 00:01:13,460 --> 00:01:15,710 It could be for machine learning. 21 00:01:15,890 --> 00:01:23,480 It could be, for example, for, I don't know, many fields: it could be in security, it could be in finance. 
22 00:01:23,480 --> 00:01:29,750 It could be in many different fields, but it can also be in research and analysis. 23 00:01:29,760 --> 00:01:36,920 So basically in data analysis, and this is exactly what we are going to talk about in this part of the 24 00:01:36,920 --> 00:01:37,310 course. 25 00:01:37,460 --> 00:01:45,920 And we are going to learn some tools that can be really, really useful for this work, to 26 00:01:45,920 --> 00:01:47,480 be able to do this part of the course. 27 00:01:47,570 --> 00:01:49,680 Basically, to be able to do data analysis. 28 00:01:49,940 --> 00:01:54,500 So here are some things that you guys can do with Python. 29 00:01:54,500 --> 00:02:00,440 So as you can see, development, research and analysis, data science and many others. There is no 30 00:02:00,440 --> 00:02:01,030 security here. 31 00:02:01,040 --> 00:02:03,470 So there are many other things that you can do with Python. 32 00:02:04,400 --> 00:02:04,760 All right. 33 00:02:05,390 --> 00:02:07,970 So, understanding data analysis. 34 00:02:07,980 --> 00:02:12,080 So basically, the first thing that we need to talk about when we are talking about data analysis is 35 00:02:12,080 --> 00:02:17,810 the data lifecycle, since data analysis is part of the data lifecycle. 36 00:02:19,100 --> 00:02:24,500 So what you will see if, for example, you guys are looking for the data lifecycle is that you will 37 00:02:24,500 --> 00:02:31,280 see plenty of different diagrams like this that will show you different data lifecycles. They're all 38 00:02:31,280 --> 00:02:31,720 correct. 39 00:02:31,730 --> 00:02:37,450 So there is no correct or incorrect way to represent the data lifecycle. 40 00:02:37,760 --> 00:02:40,620 Some of them will have seven steps, like this one right here. 41 00:02:40,790 --> 00:02:41,890 Others will have six. 42 00:02:41,890 --> 00:02:42,920 Some of them will have five. 
43 00:02:43,340 --> 00:02:47,970 But at the end of the day, it's pretty much the same thing, just with different names and different 44 00:02:49,070 --> 00:02:50,520 things inside of it. 45 00:02:50,540 --> 00:02:52,250 So basically it's just different names. 46 00:02:52,520 --> 00:02:54,140 But at the end of the day, it's the same thing. 47 00:02:54,860 --> 00:02:59,900 A word that will always come up inside of those representations will be the analyzing part, since the 48 00:02:59,900 --> 00:03:04,010 analysis of the data is one of the most important things that you guys can do. 49 00:03:04,520 --> 00:03:07,820 Because when you analyze the data, you understand it, you transform that data. 50 00:03:07,850 --> 00:03:13,630 So you take raw data and you will transform it, because the analysis part is where the data is transformed. 51 00:03:14,270 --> 00:03:15,650 So you take everything. 52 00:03:15,650 --> 00:03:22,970 Well, you are able to take all the raw data and transform it into data that you can use 53 00:03:22,970 --> 00:03:30,050 later to answer different questions, or for research purposes, or for any other purposes that you 54 00:03:31,070 --> 00:03:32,010 want, basically. 55 00:03:32,030 --> 00:03:33,700 So it's pretty much explained here. 56 00:03:34,010 --> 00:03:39,140 So you have your data, you do your analysis, and at the end you are able to answer your basic 57 00:03:39,140 --> 00:03:39,580 questions. 58 00:03:39,590 --> 00:03:46,190 So your first question, or simply make some discoveries. Even if it doesn't answer your question, you 59 00:03:46,190 --> 00:03:49,480 are able to make some discoveries just by analyzing this data. 60 00:03:51,080 --> 00:03:54,040 So basically, what is data analysis? 61 00:03:54,050 --> 00:03:58,510 So the goal is to be able to perform data analysis with Python. 
62 00:03:58,520 --> 00:04:04,670 As I explained data analysis a little bit before, now it's important 63 00:04:04,670 --> 00:04:09,320 to understand how we can perform it with Python. 64 00:04:09,800 --> 00:04:12,080 So, to be able to perform data analysis 65 00:04:12,080 --> 00:04:19,730 with Python, we will use a special module called pandas, and this is exactly what we'll talk about 66 00:04:19,730 --> 00:04:20,160 right now. 67 00:04:20,180 --> 00:04:26,900 So basically, pandas is one of the many modules that exist for Python. 68 00:04:27,320 --> 00:04:31,490 And basically it's a software library that is used for data 69 00:04:31,490 --> 00:04:33,230 manipulation and analysis. 70 00:04:33,500 --> 00:04:41,380 It's built on top of NumPy and works together with other packages such as SciPy and Matplotlib. 71 00:04:42,110 --> 00:04:47,330 So as I explained to you guys a little bit before, Matplotlib is used for data visualization. 72 00:04:47,600 --> 00:04:51,950 So basically to represent charts and graphs and to represent many other things. 73 00:04:51,950 --> 00:04:54,080 So basically to represent your data. 74 00:04:54,590 --> 00:05:00,440 And NumPy and SciPy are the fundamental packages 75 00:05:00,680 --> 00:05:06,280 for scientific manipulation, so scientific computing, sorry. 76 00:05:07,280 --> 00:05:14,810 So basically it's important to understand that pandas really works with all those packages, 77 00:05:15,440 --> 00:05:18,300 and there are plenty of benefits of using pandas. 78 00:05:18,320 --> 00:05:22,480 So basically, here you can find some of them. 79 00:05:22,820 --> 00:05:29,190 And as I said, really, there are a lot of different things that you guys can do by using pandas. 80 00:05:29,510 --> 00:05:34,060 So the first one right here, as you can see, will be more done for less coding.
81 00:05:34,070 --> 00:05:41,340 So basically, you don't have to write that much code with pandas to be able to do the exact same thing. 82 00:05:41,360 --> 00:05:43,970 So basically, there is less coding. 83 00:05:44,510 --> 00:05:46,510 It's really created to work with Python. 84 00:05:46,520 --> 00:05:50,520 So it's not only created to work with Python, it's made for Python. 85 00:05:50,900 --> 00:05:53,510 So this is another really good advantage of it. 86 00:05:54,110 --> 00:05:57,620 There is an extensive set of features. 87 00:05:57,650 --> 00:06:03,950 So basically it's really powerful and it will provide you with a huge set of commands and features that 88 00:06:03,950 --> 00:06:09,020 can be used to easily analyze your data, which is pretty amazing. 89 00:06:09,800 --> 00:06:14,690 It can also handle really large data, and very efficiently. 90 00:06:16,160 --> 00:06:23,540 So basically, when you guys work with really huge datasets, this is the perfect tool to work with 91 00:06:23,540 --> 00:06:23,670 them. 92 00:06:24,080 --> 00:06:27,310 And finally, there is great data representation. 93 00:06:27,320 --> 00:06:31,840 So basically it will represent the data in a really great way. 94 00:06:32,780 --> 00:06:40,820 So it will offer you really streamlined forms of data representation, and there are many other 95 00:06:40,820 --> 00:06:44,460 advantages of using pandas. 96 00:06:44,470 --> 00:06:46,070 I just didn't write them here. 97 00:06:46,700 --> 00:06:46,910 All right. 98 00:06:46,950 --> 00:06:51,400 So right now, guys, you have a small introduction to pandas. 99 00:06:51,410 --> 00:06:55,880 So basically, you know that pandas is a tool that works, well, 100 00:06:55,880 --> 00:06:59,770 basically, it's a software library that works with Python. 101 00:06:59,780 --> 00:07:05,800 So basically it's a major library within Python and it's really used for data analysis.
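To make the "more done for less coding" benefit a bit more concrete, here is a minimal sketch, not something shown in the class itself. It assumes a small made-up sales table (the `city` and `amount` columns and all the values are hypothetical) and computes the average amount per city twice: once by hand in plain Python, and once with pandas. It also checks that a pandas column can be handed over as a NumPy array, which is roughly what "built on top of NumPy" means in practice.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data, made up for illustration: one row per sale.
raw = {
    "city": ["Paris", "Lyon", "Paris", "Lyon", "Nice"],
    "amount": [120.0, 80.0, 100.0, 60.0, 50.0],
}

# Plain Python: accumulate totals and counts by hand, then divide.
totals, counts = {}, {}
for city, amount in zip(raw["city"], raw["amount"]):
    totals[city] = totals.get(city, 0.0) + amount
    counts[city] = counts.get(city, 0) + 1
manual_means = {city: totals[city] / counts[city] for city in totals}

# pandas: load the same data into a DataFrame and get the
# average amount per city in a single expression.
df = pd.DataFrame(raw)
pandas_means = df.groupby("city")["amount"].mean()

# A DataFrame column can be exported as a NumPy array.
amounts = df["amount"].to_numpy()

print(df)            # tabular view of the raw data
print(pandas_means)  # average amount per city
```

Both approaches give the same averages. The pandas version simply replaces the loop and the two helper dictionaries with one `groupby` call, which is the kind of saving the slide is pointing at.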
102 00:07:06,020 --> 00:07:09,590 And in this part of the course, we are going to learn how to use it. 103 00:07:09,620 --> 00:07:15,290 So this week, you guys will have an introduction and will be able to work with pandas in the future. 104 00:07:15,650 --> 00:07:16,360 So that's it for this class. 105 00:07:16,360 --> 00:07:18,980 Guys, see you in our next class.