1 00:00:00,720 --> 00:00:06,040 Welcome back to another class of our course, the complete introduction to data science with Python. 2 00:00:06,450 --> 00:00:12,240 So in this class, we are still going to talk about Seaborn, and this will be our last class about 3 00:00:12,240 --> 00:00:14,550 this amazing tool. So up until now, 4 00:00:14,550 --> 00:00:17,830 you understood that this is a really great visualization tool. 5 00:00:18,150 --> 00:00:18,470 All right. 6 00:00:18,480 --> 00:00:20,720 So what we are going to learn today is pretty simple. 7 00:00:20,730 --> 00:00:25,130 We are going to learn to work with multiple plots or multiple graphs. 8 00:00:25,950 --> 00:00:28,500 So basically we are going to work with more than one graph. 9 00:00:28,530 --> 00:00:29,360 This is what it means. 10 00:00:30,170 --> 00:00:33,570 So to be able to understand better what type of analysis you guys can do, 11 00:00:33,840 --> 00:00:41,760 I decided to use another dataset from our, well, from our databases. 12 00:00:42,000 --> 00:00:43,930 So we'll use the tips database. 13 00:00:44,370 --> 00:00:50,440 So the first thing that we are going to do is import everything. 14 00:00:50,700 --> 00:00:51,360 So pretty simple. 15 00:00:51,360 --> 00:00:54,630 We are going to create a variable that will be called database. 16 00:00:55,800 --> 00:01:00,090 And as always, we are going to import our dataset. 17 00:01:00,240 --> 00:01:02,250 So load_dataset. 18 00:01:04,060 --> 00:01:10,810 And what exactly we want to import is the tips database, so you have everything right 19 00:01:10,810 --> 00:01:16,960 here, so we can, well, print everything first. 20 00:01:20,130 --> 00:01:20,970 All right, here we go. 21 00:01:21,480 --> 00:01:26,400 All right, so we can run everything right now to see if everything works. So as you can see, the 22 00:01:26,400 --> 00:01:27,810 database is generated.
23 00:01:27,810 --> 00:01:29,310 So we have everything right here. 24 00:01:29,340 --> 00:01:31,840 This means that everything works just fine. 25 00:01:32,790 --> 00:01:33,160 All right. 26 00:01:33,180 --> 00:01:40,870 The next thing we want to do right now is create our, well, our two graphs. 27 00:01:40,890 --> 00:01:44,760 So the first thing that we'll do is create another variable. 28 00:01:44,790 --> 00:01:48,090 This one, let's call it graph, because we are creating a graph. 29 00:01:48,930 --> 00:01:54,120 And what we want to do right now is use another function that is called FacetGrid. 30 00:01:54,870 --> 00:01:57,270 So once again, this is a Seaborn function. 31 00:01:57,270 --> 00:02:01,670 So we are going to start with sb.FacetGrid. 32 00:02:02,130 --> 00:02:03,060 So pretty important: 33 00:02:03,060 --> 00:02:07,470 the F has to be capitalized, because if it's not, it's not going to work. 34 00:02:07,830 --> 00:02:08,100 All right. 35 00:02:08,130 --> 00:02:13,050 The next thing that we need to write down, the first argument actually, will be from where we are going 36 00:02:13,050 --> 00:02:14,140 to take our data. 37 00:02:14,280 --> 00:02:17,270 So we are going to take our data from the variable database. 38 00:02:17,280 --> 00:02:18,590 So this is what we'll write down. 39 00:02:18,600 --> 00:02:24,940 So database right here. Then what will be our columns? 40 00:02:25,170 --> 00:02:29,510 So our col. In this case, we want to know if the person is a male or a female. 41 00:02:29,550 --> 00:02:33,000 So inside of our columns, we will have the sex of the person. 42 00:02:33,550 --> 00:02:39,300 And finally, the hue, which in our case will be whether the person is a smoker or not. 43 00:02:40,260 --> 00:02:41,740 So this is what we are looking for. 44 00:02:42,600 --> 00:02:43,080 All right. 45 00:02:43,500 --> 00:02:47,980 The next thing that we are going to use is another function, which is map.
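A minimal sketch of the FacetGrid call described so far, not the exact code from the class. It assumes Seaborn is imported under the alias `sb`, as it is spoken in the class. To run without downloading anything, it builds `database` from a few made-up rows that only copy the column names of the tips dataset; in the class, `database` is loaded with `load_dataset` instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this line for interactive use
import pandas as pd
import seaborn as sb

# Made-up rows for illustration only. They mimic the column names of the tips
# dataset, which the class loads with sb.load_dataset("tips").
database = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0, 15.0, 25.0, 35.0, 45.0],
    "tip": [1.5, 3.0, 4.0, 6.0, 2.0, 3.5, 5.0, 5.5],
    "sex": ["Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female"],
    "smoker": ["Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"],
})

# One plot per value of "sex" (the columns of the grid);
# "smoker" decides the color of the points inside each plot.
graph = sb.FacetGrid(database, col="sex", hue="smoker")

print(graph.axes.shape)  # rows and columns of the grid of plots
print(graph.col_names)   # the values of "sex" that get their own plot
print(graph.hue_names)   # the values of "smoker" that get their own color
```

At this point the grid is still empty: it only reserves one set of axes per value of `sex`. The points are drawn in the following step with `map`.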
46 00:02:48,000 --> 00:02:51,640 So basically we want to create two graphs. 47 00:02:51,640 --> 00:02:53,430 So basically we are going to use the map function. 48 00:02:54,510 --> 00:02:57,890 So once again, graph.map. 49 00:02:58,350 --> 00:03:04,350 So basically the graph variable, and right now what type of graphs we want, or what type of plot 50 00:03:04,350 --> 00:03:04,990 we want. 51 00:03:05,670 --> 00:03:06,360 Pretty simple. 52 00:03:06,360 --> 00:03:12,720 We want scatter plots because, in my opinion, this is the most representative of the, well, the situation 53 00:03:12,720 --> 00:03:14,250 that we are trying to study. 54 00:03:15,480 --> 00:03:15,800 All right. 55 00:03:15,810 --> 00:03:21,080 So right now, the next thing that we want is what we want inside of our scatter plot. 56 00:03:21,480 --> 00:03:23,160 So we want the total bill. 57 00:03:27,330 --> 00:03:28,200 And the tip. 58 00:03:32,550 --> 00:03:36,650 So right now, we have everything that we need, so the next thing we'll do is simply show. 59 00:03:36,740 --> 00:03:38,250 So we need to show everything 60 00:03:38,260 --> 00:03:43,260 so you see the graph, so plt.show, and then pretty simple, we just run everything. 61 00:03:44,720 --> 00:03:45,070 All right. 62 00:03:45,090 --> 00:03:48,070 So right now we receive this type of graph. 63 00:03:48,570 --> 00:03:54,210 So basically, what can we understand if we want to make a quick analysis of this graph? It's pretty simple. 64 00:03:54,570 --> 00:03:57,130 So this graph is for males. 65 00:03:57,130 --> 00:04:01,840 So basically all those right here are guys and all the persons right here are girls. 66 00:04:02,430 --> 00:04:05,070 So basically, we can see how many smokers. 67 00:04:05,250 --> 00:04:07,350 So we have smokers and non-smokers.
68 00:04:07,350 --> 00:04:14,490 So basically the ones in the other color are smokers and the blue ones are non-smokers, and we see the tip they left based on 69 00:04:14,490 --> 00:04:15,400 the total bill. 70 00:04:15,660 --> 00:04:18,740 So basically, we are studying right now four variables. 71 00:04:18,790 --> 00:04:24,100 We are able to study basically four variables just with one graph. 72 00:04:24,120 --> 00:04:28,840 So basically, depending on the color, we are able to know how much the person has left. 73 00:04:29,340 --> 00:04:34,910 And we can also know, for example, if a smoker left more than a non-smoker. 74 00:04:35,970 --> 00:04:40,790 And in this case, we can see that there is no clear difference between smokers and non-smokers. 75 00:04:40,800 --> 00:04:46,770 So basically, whether a smoker left more than a non-smoker, we can see they left pretty much the same 76 00:04:46,770 --> 00:04:47,150 thing. 77 00:04:47,610 --> 00:04:54,780 But what we can see in this graph, we can easily see that men leave more tips, like, for example, 78 00:04:54,780 --> 00:04:58,520 at the extremes, they leave more tips than women. 79 00:04:58,530 --> 00:04:59,590 So, for example, based, 80 00:04:59,660 --> 00:05:05,040 so once again, based on the graph right here, and if we look at the average, it's pretty much the 81 00:05:05,040 --> 00:05:05,560 same thing. 82 00:05:06,150 --> 00:05:13,150 So men leave more than women based on the data right here, once again. 83 00:05:13,170 --> 00:05:16,230 So the extremes are higher for men. 84 00:05:16,260 --> 00:05:17,050 This is what we see. 85 00:05:18,540 --> 00:05:23,140 So once again, this is what we can conclude with the analysis right here. 86 00:05:23,160 --> 00:05:27,600 So basically, as you can see, we are able to make an analysis with four variables. 87 00:05:27,780 --> 00:05:29,810 So we have the total bill, we have the tip. 88 00:05:30,210 --> 00:05:34,060 We have the sex of the person, if the person is a male or a female.
89 00:05:34,290 --> 00:05:41,910 And finally, whether the person is a smoker or a non-smoker. As you can see, we can make a complete analysis 90 00:05:42,300 --> 00:05:48,960 with the database that we have right here, so we can take a simple database that is right here and create 91 00:05:48,960 --> 00:05:55,620 something really advanced and have really interesting data about, well, actually all the people 92 00:05:55,620 --> 00:05:57,970 who ate at that restaurant. 93 00:05:58,440 --> 00:06:02,560 So you can see there are plenty of really interesting things that you guys can do with Seaborn. 94 00:06:02,850 --> 00:06:05,810 So what I showed you right now are just the basics. 95 00:06:05,850 --> 00:06:12,810 Basically, we have learned how to create some plots or some graphs and the basics of analyzing those 96 00:06:12,810 --> 00:06:13,440 graphs. 97 00:06:14,150 --> 00:06:14,730 All right. 98 00:06:14,740 --> 00:06:18,380 Now, if you guys want to learn a little bit more, you can import your own databases. 99 00:06:18,690 --> 00:06:25,260 So basically, you simply load them as datasets, and you can import your own datasets and try to 100 00:06:25,260 --> 00:06:31,260 study them to understand a little bit more of the concepts about, well, all the data science that is 101 00:06:31,260 --> 00:06:33,730 around those databases. 102 00:06:33,780 --> 00:06:36,030 There are a lot of really interesting things to learn. 103 00:06:36,660 --> 00:06:42,390 Besides that, I hope you guys understood the power of Seaborn and how it could be really powerful 104 00:06:42,390 --> 00:06:43,750 in analyzing data. 105 00:06:44,040 --> 00:06:50,400 So this is really an amazing way to analyze data, and since it works with matplotlib, you know that 106 00:06:50,400 --> 00:06:54,930 it's really high-quality data visualization that you guys will have. 107 00:06:55,590 --> 00:07:00,660 So that's it for the Seaborn part of this class.
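One possible way to follow the suggestion of importing your own dataset and studying it, sketched under assumptions. The class does not say how to load your own data; reading a CSV with pandas is an assumption made here. The CSV contents below are made up for illustration, and with a real file you would pass its path to `pd.read_csv` instead of the in-memory text.

```python
import io

import pandas as pd

# Hypothetical CSV contents, made up for illustration only.
# With a real file, call pd.read_csv on the file path instead.
csv_text = """total_bill,tip,sex,smoker
10.0,1.5,Male,Yes
20.0,3.0,Male,No
30.0,4.0,Male,Yes
40.0,6.0,Male,No
15.0,2.0,Female,Yes
25.0,3.5,Female,No
35.0,5.0,Female,Yes
45.0,5.5,Female,No
"""

# Load the data into a DataFrame, the same kind of object the class calls database.
database = pd.read_csv(io.StringIO(csv_text))

# Put numbers next to what the scatter plots show: average tip per sex and smoker group.
average_tip = database.groupby(["sex", "smoker"])["tip"].mean()
print(average_tip)
```

A table of group averages like this is a quick way to check a visual impression, such as whether one group really tips more than another, before drawing conclusions from the plot alone.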
108 00:07:00,990 --> 00:07:09,570 And see you in my other class, well, in the other class, where we are going to talk about 109 00:07:09,570 --> 00:07:11,910 some other topics about data science.