1 00:00:00,120 --> 00:00:05,280 ‫All right, so what is partitioning, guys, so let's say we have a beautiful customer stable here, 2 00:00:05,310 --> 00:00:09,780 ‫this is like, I don't know, one million roll of ten million is not that bad guys, by the way. 3 00:00:09,960 --> 00:00:16,740 ‫But if I want to pick the name of a customer with this particular 701. 4 00:00:16,950 --> 00:00:17,280 ‫Right. 5 00:00:17,420 --> 00:00:23,480 ‫So you going to select name from customer tables where ID equals seven hundred thousand and one. 6 00:00:23,790 --> 00:00:29,550 ‫So what the database will do is, well, if you have an index on the on that table, we're going to 7 00:00:29,760 --> 00:00:35,460 ‫use their index and then we're going to land on that particular roll on that disk. 8 00:00:35,460 --> 00:00:35,850 ‫Right. 9 00:00:36,360 --> 00:00:41,250 ‫Or if we don't have it, we're going to do a sequential scan, which is that does the worst thing we 10 00:00:41,250 --> 00:00:41,730 ‫can do. 11 00:00:41,730 --> 00:00:47,280 ‫But we have to scan all this rolls until we find the rolls we were looking for, regardless whether 12 00:00:47,280 --> 00:00:53,880 ‫using an index or sequential scan, scanning 100 million row is a lot. 13 00:00:53,880 --> 00:00:54,180 ‫Right. 14 00:00:54,180 --> 00:00:58,380 ‫And multiply that by ten or thousand billion rolls. 15 00:00:58,470 --> 00:00:59,780 ‫That's going to slow things down. 16 00:01:00,000 --> 00:01:05,360 ‫Index large tables are always slow to query regardless. 17 00:01:05,370 --> 00:01:05,610 ‫Right. 18 00:01:06,060 --> 00:01:07,470 ‫So what do we do? 19 00:01:07,590 --> 00:01:16,530 ‫Partitioning is the idea here is break the table down to smaller pieces so we know we only work with 20 00:01:16,530 --> 00:01:19,820 ‫so many rolls and instead of a larger set of rules. 21 00:01:20,010 --> 00:01:21,050 ‫So that's the one idea. 22 00:01:21,270 --> 00:01:22,020 ‫Let's break up this. 23 00:01:22,020 --> 00:01:26,730 ‫One million rolls into five partitions, five tables, partition, nothing but a table. 24 00:01:26,880 --> 00:01:34,340 ‫And we're going to name them customers to 100K customers, 400 customers, 600 customers, 800 customers, 25 00:01:34,350 --> 00:01:35,610 ‫a one one million. 26 00:01:35,850 --> 00:01:39,540 ‫So that means basically these are the customers between one and two hundred. 27 00:01:39,700 --> 00:01:45,660 ‫These are the customers going to 201 and 400 customers, between 400 and one and 600 and so on. 28 00:01:45,660 --> 00:01:46,380 ‫You get the idea. 29 00:01:46,380 --> 00:01:46,600 ‫Right. 30 00:01:47,340 --> 00:01:49,110 ‫So that's just how we partitioned it. 31 00:01:49,380 --> 00:01:56,310 ‫And then the database magically, you're going to attach these partition to the to the to the main table, 32 00:01:56,310 --> 00:01:57,570 ‫which is customer stable. 33 00:01:57,840 --> 00:02:03,780 ‫So customers tables, just almost like a parent or inheritance doesn't have any data that actual data 34 00:02:03,780 --> 00:02:05,040 ‫exists on these tables. 35 00:02:05,310 --> 00:02:11,760 ‫So now what I do is I'm going to do the same exact way, select name from customers table where I'd 36 00:02:11,760 --> 00:02:13,920 ‫go seven hundred and one. 37 00:02:13,920 --> 00:02:17,040 ‫And the first question is, I'm going to send that query. 38 00:02:17,040 --> 00:02:17,780 ‫What do we do? 39 00:02:18,180 --> 00:02:21,540 ‫Where does this idea exist? 40 00:02:21,810 --> 00:02:26,460 ‫And that's a very quick check by the database was like seven hundred and one. 41 00:02:26,760 --> 00:02:30,800 ‫Which table should I query this, this, this, this or this. 42 00:02:31,110 --> 00:02:35,090 ‫It's very easy because what's partition is customer has 700, 800. 43 00:02:35,100 --> 00:02:41,640 ‫On one end, we have the definitions in this partition that is metadata attached it where 079 is between 44 00:02:41,640 --> 00:02:45,890 ‫800 and 1000000, which is which is I mean 600 and 800. 45 00:02:46,020 --> 00:02:48,720 ‫So that's exactly the table we need to query. 46 00:02:49,500 --> 00:02:53,700 ‫So partition customers 800, we hit that table immediately. 47 00:02:53,730 --> 00:02:58,050 ‫So now I'm working only with two hundred thousand rows, which is beautiful. 48 00:02:58,440 --> 00:03:06,780 ‫I always say I tweet that the other day that the quickest way to query a table with a billion rows is 49 00:03:06,780 --> 00:03:09,600 ‫to avoid querying a table with a billion euros. 50 00:03:09,870 --> 00:03:10,980 ‫And that's what the data is. 51 00:03:10,980 --> 00:03:16,830 ‫Do they just try to as much as possible to avoid querying this much data and work with smaller set of 52 00:03:16,830 --> 00:03:17,120 ‫data? 53 00:03:17,310 --> 00:03:18,330 ‫That's the trick here. 54 00:03:18,690 --> 00:03:18,960 ‫All right. 55 00:03:19,050 --> 00:03:23,670 ‫So now once I have this, if I have an index even better, right, I'm going to use the index and then 56 00:03:23,760 --> 00:03:28,080 ‫land on customer 710000 and one which is calm in this case.