1 00:00:04,130 --> 00:00:09,620 OK, let's talk about this, the final thing, I think: the storage costs between Postgres and MySQL. 2 00:00:10,070 --> 00:00:17,580 So, B+tree secondary index values. And I talked about secondary indexes versus a primary index; the 3 00:00:17,590 --> 00:00:18,920 difference is very important to know. 4 00:00:18,920 --> 00:00:21,350 This is the difference between the two. 5 00:00:21,950 --> 00:00:22,310 Right. 6 00:00:23,450 --> 00:00:27,650 They can either point directly to the tuple, 7 00:00:28,430 --> 00:00:31,520 this is an example of Postgres, or to the primary key. 8 00:00:31,550 --> 00:00:35,690 And this is one of the reasons that Uber moved from Postgres to MySQL. 9 00:00:35,750 --> 00:00:36,230 So they... 10 00:00:37,270 --> 00:00:43,750 Postgres points to the tuple; as a result, write amplification comes into play. And I have a write amplification 11 00:00:43,750 --> 00:00:49,900 lecture. Go to the section where you have the various database discussions, you're going to 12 00:00:49,900 --> 00:00:50,990 see the write amplification. 13 00:00:51,010 --> 00:00:52,600 Very critical to understand that. 14 00:00:53,410 --> 00:00:53,730 Right. 15 00:00:54,730 --> 00:01:03,100 So if you have a secondary index and the secondary index points to the tuple, then the tuple pointer size 16 00:01:03,100 --> 00:01:07,290 is really not that large because it's... 17 00:01:07,300 --> 00:01:11,230 I believe it's... I might be wrong. 18 00:01:11,240 --> 00:01:19,960 It might be 64-bit, but that is the pointer, while the MySQL secondary indexes point to the primary 19 00:01:19,960 --> 00:01:20,240 keys. 20 00:01:20,240 --> 00:01:27,220 So if the primary key is large... if it's an integer, you don't have a problem, really, it's just tiny.
21 00:01:27,910 --> 00:01:41,200 But if it's a GUID or UUID, that is a really bad idea, to put a GUID or a UUID as the primary key 22 00:01:41,800 --> 00:01:52,570 in InnoDB in MySQL, because any secondary key, unfortunately, will point to the primary key, and plus 23 00:01:52,570 --> 00:01:53,830 the whole thing is clustered. 24 00:01:53,830 --> 00:02:00,730 So inserts are so slow because of the randomness of the UUID; it's just not worth it at all. 25 00:02:02,050 --> 00:02:10,630 OK, so as a result the secondary indexes will be so large because they have all these values that point 26 00:02:10,630 --> 00:02:17,380 to primary keys, which are effectively UUIDs, which are these large things. 27 00:02:18,220 --> 00:02:26,080 And you can do all sorts of tricks to convert the UUID, instead of using a string, 28 00:02:26,080 --> 00:02:27,790 which is, I forgot. 29 00:02:27,790 --> 00:02:31,630 What's the size of the UUID, how many bytes? 30 00:02:31,630 --> 00:02:31,960 Right. 31 00:02:32,590 --> 00:02:33,700 128, I believe. 32 00:02:34,030 --> 00:02:41,960 But you can trick it to use 64 bytes or less than that using the binary representation of the UUID. 33 00:02:41,990 --> 00:02:45,700 But still, it is still large. 34 00:02:46,300 --> 00:02:50,200 If it's large in disk space, the space can't fit in memory. 35 00:02:50,680 --> 00:02:50,980 Right. 36 00:02:51,000 --> 00:02:56,980 Your memory is precious. You might say, okay, I'm going to add one terabyte worth of memory for 37 00:02:56,980 --> 00:02:57,980 my private database. 38 00:02:58,780 --> 00:02:59,410 Sure. 39 00:02:59,410 --> 00:03:02,560 But you have to really think about this. 40 00:03:02,560 --> 00:03:02,840 Right. 41 00:03:03,580 --> 00:03:10,630 Scaling and database engineering is not something that you take lightly.
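As a rough illustration of the sizes discussed here, the sketch below compares the text and binary forms of a UUID and estimates how many bytes go only to primary key copies inside secondary indexes. The row count, the number of secondary indexes, and the 4-byte integer key are assumptions chosen for the example, and real index entries carry extra per-entry and per-page overhead that is ignored here.

```python
import uuid

# Assumed numbers for illustration only; they do not come from the lecture.
INT_PK_BYTES = 4         # a 32-bit integer primary key
ROWS = 10_000_000        # assumed number of rows in the table
SECONDARY_INDEXES = 3    # assumed number of secondary indexes

key = uuid.uuid4()
text_form = str(key)     # hyphenated hexadecimal text form
binary_form = key.bytes  # raw 128-bit value


def pk_copy_bytes(pk_bytes: int) -> int:
    """Bytes spent only on primary key copies across all secondary indexes."""
    return pk_bytes * ROWS * SECONDARY_INDEXES


print("text form length:", len(text_form))
print("binary form length:", len(binary_form))
for label, size in [
    ("integer key", INT_PK_BYTES),
    ("UUID as binary", len(binary_form)),
    ("UUID as text", len(text_form)),
]:
    print(f"{label}: {pk_copy_bytes(size) / 1e6:.0f} MB of key copies")
```

The absolute figures matter less than the ratio: every secondary index entry repeats the primary key, so a wider key multiplies across every row and every index.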
42 00:03:10,630 --> 00:03:16,120 You have to think about all this stuff, and it all makes sense when you understand these basic fundamentals. 43 00:03:16,570 --> 00:03:19,540 That is what I want to convey in this lecture. 44 00:03:20,380 --> 00:03:23,920 The B-trees are not something you do math on. 45 00:03:24,610 --> 00:03:29,590 Every decision you make costs you, right? 46 00:03:29,590 --> 00:03:37,840 And it is important to understand how these different DBMSs make these design choices, because every 47 00:03:37,840 --> 00:03:42,850 design choice can lead to a completely different outcome. 48 00:03:43,750 --> 00:03:49,690 If a primary key data type is expensive, this can cause bloat in all the secondary indexes. 49 00:03:49,690 --> 00:03:57,010 As I talked about, the leaf nodes in MySQL InnoDB contain the full row, since it's an index-organized 50 00:03:57,010 --> 00:03:59,710 table, or a clustered index, too. 51 00:04:00,550 --> 00:04:01,760 So that's another thing, right? 52 00:04:02,470 --> 00:04:04,750 Clustered indexes in general: SQL Server 53 00:04:04,750 --> 00:04:12,250 also has this idea of clustered indexes. Clustered indexes or clustered tables, sometimes called index-organized 54 00:04:12,250 --> 00:04:13,060 tables. 55 00:04:13,270 --> 00:04:14,230 So this is it. 56 00:04:14,630 --> 00:04:19,600 And this is an index where the index is the table. 57 00:04:19,630 --> 00:04:22,780 So if you think about it, this is just the whole thing. 58 00:04:22,960 --> 00:04:34,000 The leaf node has the whole row and all the columns in it, so that everything is really clustered 59 00:04:34,000 --> 00:04:34,860 nicely. 60 00:04:34,900 --> 00:04:39,450 It has disadvantages and advantages that I am just not going to go through. 61 00:04:39,460 --> 00:04:41,860 But it's very important to understand that.
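The two designs described in this lecture can be sketched with plain Python dictionaries. This is a toy model, not how either engine is implemented: dictionaries stand in for B+trees, and the table contents, the `(page, slot)` tuple pointers, and the function names are all hypothetical.

```python
# Heap-organized table (the Postgres-style design): rows live in a heap and
# are addressed by a tuple pointer, modeled here as a (page, slot) pair.
heap = {
    (0, 1): {"id": 7, "email": "a@example.com"},
    (0, 2): {"id": 9, "email": "b@example.com"},
}
# The secondary index stores the tuple pointer, whatever the primary key is.
heap_email_index = {"a@example.com": (0, 1), "b@example.com": (0, 2)}


def heap_lookup(email: str) -> dict:
    tuple_pointer = heap_email_index[email]  # secondary index -> tuple pointer
    return heap[tuple_pointer]               # tuple pointer -> row


# Index-organized table (the InnoDB-style design): the primary key index is
# the table, so its leaf entries hold the full rows.
clustered = {
    7: {"id": 7, "email": "a@example.com"},
    9: {"id": 9, "email": "b@example.com"},
}
# The secondary index stores the primary key value, so a wide key widens it.
clustered_email_index = {"a@example.com": 7, "b@example.com": 9}


def clustered_lookup(email: str) -> dict:
    primary_key = clustered_email_index[email]  # secondary index -> primary key
    return clustered[primary_key]               # primary key index -> row


print(heap_lookup("a@example.com"))
print(clustered_lookup("b@example.com"))
```

In the second model every secondary index entry carries a copy of the primary key, which is why an expensive primary key type bloats all the secondary indexes.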