1 00:00:00,120 --> 00:00:00,480 Okay. 2 00:00:00,570 --> 00:00:04,790 So we talked about the three kind of common relationship classifications. 3 00:00:04,800 --> 00:00:09,770 And as I mentioned, we're focusing on one to many first, and we're doing that for a couple of reasons. 4 00:00:09,780 --> 00:00:15,120 One is that compared to many to many, it's a little bit easier to get through shorter. 5 00:00:15,120 --> 00:00:20,070 But also it's more common and it's more essential, in my opinion, to most of the development that 6 00:00:20,070 --> 00:00:25,650 that I've done and that I see other people doing not to discount many to many, but we're going to get 7 00:00:25,650 --> 00:00:25,800 there. 8 00:00:25,800 --> 00:00:27,000 We're starting with one too many. 9 00:00:27,270 --> 00:00:32,159 So we're going to start with a classic example, customers and orders. 10 00:00:32,159 --> 00:00:37,680 Or this could be users and orders or users in carts, whatever the transaction is, but we're going 11 00:00:37,680 --> 00:00:39,480 to call it customers and orders. 12 00:00:39,480 --> 00:00:44,550 So two tables and this is a one to many relationship, as we talked about earlier. 13 00:00:44,550 --> 00:00:50,010 So let's say we have a customer, President Bill Clinton, that's our customer. 14 00:00:50,010 --> 00:00:52,920 And he can place as many orders as he'd like. 15 00:00:52,920 --> 00:01:00,450 He can have many orders, but each of those orders is associated with exactly one user, Bill Clinton. 16 00:01:00,900 --> 00:01:01,860 President Bill Clinton. 17 00:01:01,860 --> 00:01:02,520 Excuse me. 18 00:01:02,820 --> 00:01:07,710 There is no way to have two people associated with one order. 19 00:01:08,380 --> 00:01:12,160 So it's a one to many customers have many orders. 20 00:01:12,160 --> 00:01:15,340 Orders have one customer associated with them. 21 00:01:15,790 --> 00:01:19,540 So let's talk about for our case what data we want to store. 22 00:01:19,540 --> 00:01:21,010 And it's going to be very simple. 23 00:01:21,340 --> 00:01:25,060 I don't want to clutter it with a whole bunch of other pieces of information. 24 00:01:25,060 --> 00:01:27,090 We're just going to focus on the basics. 25 00:01:27,100 --> 00:01:31,250 So a first and last name for customers and an email. 26 00:01:31,270 --> 00:01:34,180 Now, in a real web app, we'd want a lot more information. 27 00:01:34,180 --> 00:01:40,420 Probably we want to create a registration date when a customer signs up, maybe a last active date that 28 00:01:40,420 --> 00:01:44,280 we're storing our date time password, somehow we need to store that. 29 00:01:44,290 --> 00:01:49,750 There's I don't know, there's so many other things phone number if we're trying to do two factor authentication, 30 00:01:49,750 --> 00:01:50,590 lots of things. 31 00:01:50,740 --> 00:01:53,080 So first and last name and email. 32 00:01:53,080 --> 00:01:54,880 So three things per customer. 33 00:01:55,510 --> 00:02:00,970 But then we also for orders, we need to store the date that an order was made. 34 00:02:01,150 --> 00:02:05,950 So it doesn't matter for our example if it's date or date time, but we'll keep it simple with date. 35 00:02:05,950 --> 00:02:09,820 So 2017, slash 11, slash 12 or something. 36 00:02:10,180 --> 00:02:18,460 And then the price of the order, the total amount, let's say 99.99 or 20.50. 37 00:02:18,580 --> 00:02:20,410 So that's going to be a decimal. 38 00:02:21,100 --> 00:02:26,980 So we're going to have first and last name an email, which should probably be our chars, the date 39 00:02:26,980 --> 00:02:31,030 of the purchase, which will be a date time and the price of the order which will be a decimal. 40 00:02:31,360 --> 00:02:32,680 That's the easy part. 41 00:02:32,710 --> 00:02:34,930 The next question is how do we store that? 42 00:02:34,930 --> 00:02:36,160 How do we represent it? 43 00:02:36,460 --> 00:02:40,300 And we could do it this way, which I want to make clear. 44 00:02:40,300 --> 00:02:44,920 I'm not saying this is a good way of doing it, but you could technically get away with doing it in 45 00:02:44,920 --> 00:02:46,090 one giant table. 46 00:02:46,690 --> 00:02:47,560 But there's problems. 47 00:02:47,560 --> 00:02:49,510 But let's first just walk through how it works. 48 00:02:49,510 --> 00:02:54,670 So we have first name and last name and email for customers and then order, date and amount. 49 00:02:55,540 --> 00:03:00,520 So we have Boy George whose email is George at gmail.com. 50 00:03:00,520 --> 00:03:02,800 He was a very early user of Gmail. 51 00:03:03,400 --> 00:03:10,780 And then we have the order date 2016 210 and that order was $99.99. 52 00:03:10,900 --> 00:03:13,930 So this is capturing information for one order. 53 00:03:14,920 --> 00:03:21,730 But then George or Boy George placed another order and we have duplicated information because we're 54 00:03:21,730 --> 00:03:27,130 storing it's Boy George, which I know is not his first name, but just let's go with it. 55 00:03:27,130 --> 00:03:30,760 And then George at Gmail again and then a different date. 56 00:03:30,760 --> 00:03:35,650 This one was later in 2017 and then a different price or a different amount. 57 00:03:36,010 --> 00:03:38,530 So this stores the information that we talked about. 58 00:03:38,530 --> 00:03:42,790 We needed first last email and then data, the purchase and price of the order. 59 00:03:43,480 --> 00:03:44,710 And so that's working. 60 00:03:45,760 --> 00:03:54,160 Same thing with George Michael who has GM at gmail.com, bought something in 2014 that was $800.67. 61 00:03:54,160 --> 00:03:59,920 And then George Michael, same email about something in 2015 that was $12.50. 62 00:04:00,850 --> 00:04:05,410 So already you should be noticing we have some a lot of duplication, right. 63 00:04:05,410 --> 00:04:10,780 Even though we are storing the information that we're supposed to store, there's a lot of duplicated 64 00:04:10,780 --> 00:04:14,170 data and then we run into these two. 65 00:04:14,560 --> 00:04:17,050 David Bowie and Blue Steal My Cat. 66 00:04:17,560 --> 00:04:20,529 There's problems because they haven't placed orders yet. 67 00:04:20,529 --> 00:04:22,360 So we have their customer information. 68 00:04:22,360 --> 00:04:28,930 We have David Bowie, David at gmail.com, also early adopter of Gmail and we have blue, steel, blue 69 00:04:28,930 --> 00:04:35,710 at gmail.com and they haven't placed orders, but they still have order, date and amount as columns 70 00:04:35,710 --> 00:04:37,060 and they just have no value. 71 00:04:37,060 --> 00:04:37,780 It's null. 72 00:04:38,260 --> 00:04:44,620 So this technically represents that fact that boy George and George Michael have two orders each and 73 00:04:44,620 --> 00:04:46,750 David Bowie and Blue still have no orders. 74 00:04:46,750 --> 00:04:48,910 So that's really not a great way of doing it. 75 00:04:48,910 --> 00:04:50,080 Yes, it works. 76 00:04:50,080 --> 00:04:53,050 But we're not only duplicating a lot of things, right? 77 00:04:53,050 --> 00:04:55,030 Boy, George, George, Michael. 78 00:04:55,030 --> 00:05:01,660 And if somebody shopped a lot, we would have tons of duplicated information, which is just a waste. 79 00:05:01,660 --> 00:05:07,480 But then also we run into problems where we have users who maybe haven't ordered anything yet and they 80 00:05:07,480 --> 00:05:08,830 just signed up for the site. 81 00:05:08,830 --> 00:05:13,090 There's no reason to have anything to do with orders until they actually place an order. 82 00:05:13,180 --> 00:05:19,630 So it's much better to keep our data separated because if we ever need to work with just customers on 83 00:05:19,630 --> 00:05:24,820 their own, which is happens, there are times on a site where you don't care about orders, you just 84 00:05:24,820 --> 00:05:25,660 care about customers. 85 00:05:25,660 --> 00:05:31,120 Let's say when a customer is signing in, all that you need is to take their email and check it and 86 00:05:31,120 --> 00:05:33,670 take a password and check it against the password field. 87 00:05:33,820 --> 00:05:35,470 That has nothing to do with orders. 88 00:05:35,470 --> 00:05:39,520 So why would we need to go through a giant customer orders table? 89 00:05:39,850 --> 00:05:41,380 And that's just a really silly example. 90 00:05:41,380 --> 00:05:45,220 But what I'm trying to show is that it's not just about the duplication. 91 00:05:45,220 --> 00:05:51,100 It's also that sometimes it's better to keep your data separate almost every time unless you're always 92 00:05:51,100 --> 00:05:53,770 accessing your data together. 93 00:05:53,770 --> 00:06:00,430 Like if the only time we're ever talking about customers is when we're talking about orders, then maybe 94 00:06:00,430 --> 00:06:01,570 you could put them together. 95 00:06:01,570 --> 00:06:07,060 But then we still have this problem where we have people who haven't placed orders or we have duplicated 96 00:06:07,060 --> 00:06:07,480 data. 97 00:06:07,730 --> 00:06:09,350 So this is a bad idea. 98 00:06:09,620 --> 00:06:12,530 Not a good idea, aka bad idea. 99 00:06:12,980 --> 00:06:14,030 So what do we do? 100 00:06:14,900 --> 00:06:17,180 Well, here is the simplest approach. 101 00:06:17,180 --> 00:06:20,270 This is how we express a one to many relationship. 102 00:06:20,480 --> 00:06:21,590 And it looks like this. 103 00:06:21,590 --> 00:06:27,740 So we have two tables, customers and orders, and customers have an ID field and then a first name, 104 00:06:27,740 --> 00:06:34,010 last name and email and then orders have an ID field and then the date, the amount. 105 00:06:34,010 --> 00:06:37,220 And then this is the important this is the crux of everything. 106 00:06:37,220 --> 00:06:42,650 They have a field called it doesn't matter necessarily what it's called, but ours is called customer 107 00:06:42,650 --> 00:06:46,460 ID and it's a reference to the customers table. 108 00:06:46,460 --> 00:06:53,600 So whatever customer ID is in a given order is going to correspond to an actual customer who placed 109 00:06:53,600 --> 00:06:53,960 it. 110 00:06:54,860 --> 00:06:56,870 So let's dive into it a bit more. 111 00:06:56,870 --> 00:06:57,950 Here's an example. 112 00:06:58,370 --> 00:07:05,210 So I took the exact same data we had for orders and for customers from back here, and I split it up 113 00:07:05,210 --> 00:07:06,380 into separate tables. 114 00:07:07,220 --> 00:07:09,590 And you can see let's just talk about customers first. 115 00:07:09,620 --> 00:07:14,240 We have a customers table with an ID, a first name, last name and an email. 116 00:07:14,660 --> 00:07:15,890 So we have Boy George. 117 00:07:15,890 --> 00:07:24,650 George at Gmail ID one, George Michael at Gmail ID two, David Bowie, David at gmail.com, ID three 118 00:07:24,650 --> 00:07:27,890 and then blue steal blue at gmail.com, ID of four. 119 00:07:28,430 --> 00:07:32,990 These are all unique, corresponding to exactly one user or one customer. 120 00:07:33,530 --> 00:07:35,630 And then we have our orders table. 121 00:07:36,170 --> 00:07:39,140 So there's an ID there, order ID. 122 00:07:39,170 --> 00:07:41,330 And then the date that an order was placed. 123 00:07:41,840 --> 00:07:43,130 And then the amount. 124 00:07:43,580 --> 00:07:45,670 And then the most important part. 125 00:07:45,680 --> 00:07:52,190 Well, arguably the most important part, definitely the crux of the relationship is customer ID. 126 00:07:53,120 --> 00:07:59,600 So in this case, it's one which is pointing to this customer with idea of one meaning. 127 00:07:59,780 --> 00:08:00,710 Boy George. 128 00:08:01,130 --> 00:08:06,260 So Boy George placed an order in 2016 for $99.99. 129 00:08:07,320 --> 00:08:14,610 Here's another customer ID one telling us that this entire row here, this order is related to Boy George 130 00:08:14,610 --> 00:08:14,910 again. 131 00:08:14,910 --> 00:08:18,750 So he placed an order for 3550 on 2017. 132 00:08:19,980 --> 00:08:21,630 And then we have George Michael. 133 00:08:21,660 --> 00:08:30,300 Customer ID two matches this ID here, and he bought something for $800.67 in 2014 and something in 134 00:08:30,300 --> 00:08:35,159 2015 for $12.50, which is the exact same data we had represented here. 135 00:08:36,030 --> 00:08:41,760 And then notice that we have users like David Bowie or customers excuse me, David Bowie and Blue Steel 136 00:08:41,789 --> 00:08:43,140 who haven't bought anything. 137 00:08:43,140 --> 00:08:46,110 And there's no extra data, there's no NOLs. 138 00:08:46,140 --> 00:08:48,800 We have the minimum information that we need. 139 00:08:48,810 --> 00:08:51,420 We're not storing anything that we don't need. 140 00:08:51,420 --> 00:08:56,100 So we don't have extra orders that are created because they haven't created orders. 141 00:08:56,100 --> 00:08:57,240 We have separate tables. 142 00:08:57,780 --> 00:09:01,530 So this is the classic way of structuring a one to many relationship. 143 00:09:01,770 --> 00:09:06,570 And along with that, there are two terms we need to point out, which we have briefly discussed. 144 00:09:06,840 --> 00:09:08,880 And the first one is primary key. 145 00:09:09,090 --> 00:09:14,340 So remember this keyword we can type when we're defining a schema, we can say and what we've been saying 146 00:09:14,340 --> 00:09:21,270 a lot is ID space, INT space, auto increment and primary key. 147 00:09:21,810 --> 00:09:27,310 And the effect of that, what a primary key means is that some particular column in our case, this 148 00:09:27,330 --> 00:09:30,330 customer ID is always unique. 149 00:09:30,870 --> 00:09:34,740 So we only have one customer with ID of one. 150 00:09:35,530 --> 00:09:37,940 And only one customer, the idea of two and three and four. 151 00:09:37,960 --> 00:09:43,570 And that's really important because if we're referencing it somewhere else, like over here, if there 152 00:09:43,570 --> 00:09:48,730 are two different customers that have ID of one, then this is useless because which one does it refer 153 00:09:48,730 --> 00:09:49,210 to? 154 00:09:49,420 --> 00:09:51,010 So it has to be unique. 155 00:09:51,010 --> 00:09:53,650 And then that's where the auto increment comes in, right? 156 00:09:53,650 --> 00:09:55,660 Where it will make it unique. 157 00:09:55,660 --> 00:10:00,550 Every time we insert a new customer, it will automatically increment this to five and to six and to 158 00:10:00,550 --> 00:10:00,880 seven. 159 00:10:00,880 --> 00:10:04,780 So there is no possible way we could have duplicate customer IDs. 160 00:10:04,780 --> 00:10:09,360 And then also here is order ID, which is also a primary key. 161 00:10:09,370 --> 00:10:13,300 It is the way, that unique way of referencing an order. 162 00:10:13,480 --> 00:10:21,160 Now, this is a bit extreme, but it's possible that we would have two duplicate rows here where basically 163 00:10:21,160 --> 00:10:26,830 an order was placed for the same amount at the same day by the same person. 164 00:10:26,830 --> 00:10:32,470 It's possible, and we would need to refer to them separately, though we don't want our database just 165 00:10:32,470 --> 00:10:33,430 to combine them. 166 00:10:33,430 --> 00:10:36,940 Maybe a user purposefully bought the same thing twice. 167 00:10:36,940 --> 00:10:39,710 They bought it once and then they realized, oh shoot, I really should have bought two. 168 00:10:39,730 --> 00:10:40,930 So then they buy it again. 169 00:10:40,930 --> 00:10:43,930 Same amount, same day, same customer ID. 170 00:10:43,960 --> 00:10:48,280 Well, the only thing that keeps those orders unique is our order ID. 171 00:10:48,730 --> 00:10:51,130 So again, these two are primary keys. 172 00:10:51,130 --> 00:10:55,630 They are the primary way of referring to rows in this table. 173 00:10:55,630 --> 00:11:00,430 It's what makes them or it's not the only thing that makes them unique, but it is guaranteed to be 174 00:11:00,430 --> 00:11:01,030 unique. 175 00:11:01,540 --> 00:11:03,970 And then that brings us to foreign keys. 176 00:11:04,210 --> 00:11:09,910 And foreign keys are references to another table within a given table. 177 00:11:10,450 --> 00:11:19,120 So in this case, in our orders table, customer ID is a foreign key because it referring to this ID 178 00:11:19,120 --> 00:11:20,260 of the customers. 179 00:11:20,800 --> 00:11:27,430 So primary key primary key foreign key customers does not have a foreign key in it. 180 00:11:27,430 --> 00:11:33,340 It's not referring to any external table, but orders has a foreign key and that terminology is important. 181 00:11:33,340 --> 00:11:36,430 We'll come back to it a lot, primary key and foreign key. 182 00:11:36,430 --> 00:11:42,220 And we'll also show you how you actually say in your schema definition that something is a foreign key. 183 00:11:42,340 --> 00:11:49,570 And the reason you would do that is to enforce that whatever customer ID is in here is an existing customer 184 00:11:49,570 --> 00:11:50,530 ID over here. 185 00:11:51,010 --> 00:11:52,480 Because we might. 186 00:11:52,510 --> 00:11:58,810 What if we inserted something an order in an amount with customer ID of 20 and we don't have a customer 187 00:11:58,810 --> 00:12:00,070 with ID of 20? 188 00:12:00,220 --> 00:12:00,930 We may not. 189 00:12:00,940 --> 00:12:03,490 I mean, most likely we don't want that to happen, right? 190 00:12:03,490 --> 00:12:06,430 We would want to bounce that back and say, hey, this is a problem. 191 00:12:06,430 --> 00:12:12,940 There's not a valid customer with that ID and that's what foreign key, when we use that in our schema 192 00:12:12,940 --> 00:12:14,710 definition, it will enforce that. 193 00:12:14,860 --> 00:12:21,010 Otherwise, if we don't say that it's a foreign key explicitly, we could have customer IDs all over 194 00:12:21,010 --> 00:12:22,720 the place that don't actually exist. 195 00:12:23,590 --> 00:12:29,200 So this is the schema that we've defined well, this isn't the actual schema, but this is the basics 196 00:12:29,200 --> 00:12:31,180 of what we want our schema to look like. 197 00:12:31,180 --> 00:12:33,400 Now we're going to implement it in the next video.