1 00:00:00,150 --> 00:00:05,250 Next up, we'll see another clause we can use with group by called with Rollup. 2 00:00:05,490 --> 00:00:08,850 Now, this one is a little tricky to put into words. 3 00:00:09,150 --> 00:00:10,680 I'll show you what the docs say. 4 00:00:10,920 --> 00:00:17,160 It says that the group by clause permits a width rollup modifier that causes summary output to include 5 00:00:17,160 --> 00:00:19,470 extra rows that represent higher level. 6 00:00:19,500 --> 00:00:22,860 That is super aggregate summary operations. 7 00:00:22,980 --> 00:00:27,990 That sounds a little intense, but if you see an example, I think it helps. 8 00:00:28,320 --> 00:00:33,630 It's not exactly self explanatory, but it helps to explain it quite a bit compared to just reading 9 00:00:33,630 --> 00:00:34,320 the docs. 10 00:00:34,770 --> 00:00:39,600 So let's work again with our what do we have? 11 00:00:39,750 --> 00:00:41,610 Select start from full reviews. 12 00:00:41,610 --> 00:00:48,900 This view that we made this joined three tables join together and let's do a simple thing like use one 13 00:00:48,900 --> 00:00:52,650 of the aggregate functions average or count. 14 00:00:53,250 --> 00:01:00,360 So we know that we can do things like select average rating from full reviews. 15 00:01:00,390 --> 00:01:03,810 Now this gives me one average for the whole table. 16 00:01:04,470 --> 00:01:08,820 But we also know we can do things like select average rating. 17 00:01:09,210 --> 00:01:13,440 If we group by let's do title title. 18 00:01:14,480 --> 00:01:20,900 Average rating from full reviews grouped by and then title. 19 00:01:22,040 --> 00:01:27,560 So it's the the exact same average function, but it's operating at different levels. 20 00:01:27,560 --> 00:01:32,960 In this level, it's operating on groups of rows grouped by title. 21 00:01:33,290 --> 00:01:37,340 At this level, it's operating on all of the rows for the whole table. 22 00:01:37,430 --> 00:01:43,970 Now, when I take this second query that we just ran, this one here, select title and average rating 23 00:01:43,970 --> 00:01:50,750 from full reviews grouped by title and I add with roll up at the end of the group VI. 24 00:01:50,780 --> 00:01:54,560 This only works with group by group by title with rollup. 25 00:01:54,950 --> 00:01:59,210 The results look very similar except for this last row. 26 00:01:59,570 --> 00:02:03,920 This last row here is a summary statistic. 27 00:02:03,920 --> 00:02:07,640 It's a summary average for the entire table. 28 00:02:07,670 --> 00:02:10,580 So we have the average for each group. 29 00:02:10,610 --> 00:02:16,430 Archer Group average is 8.1 to Bob's Burgers, seven five to Stranger Things. 30 00:02:16,430 --> 00:02:19,790 And then this has no title, hence the null. 31 00:02:20,060 --> 00:02:23,570 It is the average for the entire table. 32 00:02:23,570 --> 00:02:26,570 All of the rows 8.02553. 33 00:02:27,230 --> 00:02:29,750 So we could do the same thing for count. 34 00:02:30,020 --> 00:02:31,670 Let me do it in a file though. 35 00:02:32,000 --> 00:02:38,390 So we have title average rating from full reviews grouped by title with Roll up. 36 00:02:38,600 --> 00:02:39,860 Let's do another example. 37 00:02:39,860 --> 00:02:42,230 This time I'll do count. 38 00:02:42,800 --> 00:02:45,380 How many ratings does each one have? 39 00:02:46,560 --> 00:02:48,490 And once again, we're grouping by title. 40 00:02:48,510 --> 00:02:50,220 We see Archer has five. 41 00:02:51,240 --> 00:02:57,030 Fargo has two reviews, Stranger Things has three, and then Noel has 47. 42 00:02:57,030 --> 00:03:02,040 That's all of our counts or the count of all of the ratings in the table. 43 00:03:02,820 --> 00:03:05,820 So this is one level of width rollup. 44 00:03:05,820 --> 00:03:15,120 But what gets more confusing to explain is if we group by multiple columns with rollup also will give 45 00:03:15,120 --> 00:03:18,510 us data at different levels of that hierarchy. 46 00:03:18,960 --> 00:03:21,060 So let me just show an example. 47 00:03:21,540 --> 00:03:26,250 Instead of grouping by title and looking at the title and the count. 48 00:03:26,280 --> 00:03:31,700 Let's do something like Select, Let's group by the year and look at the average rating by year. 49 00:03:31,710 --> 00:03:38,430 So select released year and average rating by year. 50 00:03:38,430 --> 00:03:40,800 So from full reviews. 51 00:03:42,810 --> 00:03:46,350 Group by animal group by the released year. 52 00:03:47,010 --> 00:03:49,170 And I won't do a roll up just yet. 53 00:03:49,800 --> 00:03:54,240 Okay, So we get the average rating for all of the reviews in a given year. 54 00:03:54,240 --> 00:03:55,380 We grouped by year. 55 00:03:55,950 --> 00:03:57,990 Now, if I put that with roll up in there. 56 00:04:00,350 --> 00:04:04,400 Where are we going to get the overall average for all of the ratings? 57 00:04:04,970 --> 00:04:06,050 We've seen this before. 58 00:04:06,080 --> 00:04:07,790 8.2553. 59 00:04:08,240 --> 00:04:14,090 But now, instead of just grouping by release here, I'm going to add on genre. 60 00:04:14,150 --> 00:04:16,160 So let's also display genre. 61 00:04:19,529 --> 00:04:20,370 There we are. 62 00:04:20,490 --> 00:04:22,079 And then the average rating. 63 00:04:22,630 --> 00:04:25,990 With roll up and the results. 64 00:04:25,990 --> 00:04:26,880 It's a little different. 65 00:04:26,890 --> 00:04:28,750 Let me do it without the roll up first. 66 00:04:28,750 --> 00:04:31,750 Just so you see what our end result looks like. 67 00:04:32,260 --> 00:04:39,280 So we're looking at in 2009, all of the animations released animation movies had this average rating 68 00:04:39,640 --> 00:04:42,640 in 1999 comedy. 69 00:04:42,910 --> 00:04:47,020 Okay, so I'm going to run this and we get something quite a bit different. 70 00:04:47,530 --> 00:04:50,800 What we see here is a lot more NOLs. 71 00:04:50,920 --> 00:04:56,710 Each one of these NOLs represents one of those summary statistics, a summary average. 72 00:04:56,710 --> 00:05:01,930 So down at the bottom we have no NOL 8.2553. 73 00:05:01,930 --> 00:05:06,340 That once again is the average for the entire data set, all of our ratings. 74 00:05:06,700 --> 00:05:15,130 But if we start at the top, we have the average rating for dramas released in 1963 and then below that 75 00:05:15,130 --> 00:05:23,710 we have the average rating from all genres in 1963 where there is no genre, it's not grouped by genre, 76 00:05:23,710 --> 00:05:24,910 hence the null. 77 00:05:25,060 --> 00:05:28,450 So we get this second level of summary statistics. 78 00:05:28,750 --> 00:05:37,150 Now, our data is not robust enough to really have a lot of examples until we get down to 2014 right 79 00:05:37,150 --> 00:05:39,190 here, what do we have? 80 00:05:39,190 --> 00:05:49,780 We have animation in 2014, the average review 7.9 for drama in 2014, the average review 9.56. 81 00:05:49,780 --> 00:05:55,300 And then we have the summary for all of 2014, 8.55. 82 00:05:55,300 --> 00:05:59,890 And I mean that tracks rate 9.9 versus 9.5. 83 00:05:59,920 --> 00:06:02,080 You average them, it's going to be around 8.5. 84 00:06:02,590 --> 00:06:05,080 So this shows us how it's working. 85 00:06:05,080 --> 00:06:08,770 It's hard to understand without an example to look at. 86 00:06:08,770 --> 00:06:11,590 But we have our group by Right. 87 00:06:11,590 --> 00:06:17,710 Look at the query one more time here, grouped by release year and genre with Roll up the With Roll 88 00:06:17,710 --> 00:06:24,400 Up gives us not just the average rating for each release year genre combination, but also for each 89 00:06:24,400 --> 00:06:26,920 release year and for all rows. 90 00:06:27,370 --> 00:06:30,700 So that is a brief intro to with Roll Up. 91 00:06:30,700 --> 00:06:32,200 There's more that you can do with it. 92 00:06:32,200 --> 00:06:38,200 There's some operations where you could replace the null and change that to be something else. 93 00:06:38,200 --> 00:06:44,080 There's a special function called grouping, but I just wanted to keep it brief, show you the concept 94 00:06:44,080 --> 00:06:50,110 of with roll up, it helps us generate higher level summary statistics and we're not limited to just 95 00:06:50,110 --> 00:06:52,150 getting one summary statistic. 96 00:06:52,270 --> 00:06:56,110 Every column that we add in here, we don't really have anything else. 97 00:06:56,110 --> 00:07:00,880 I mean, I guess I guess I could do. 98 00:07:03,330 --> 00:07:08,460 I don't know first name of the reviewer, but this makes no sense to do. 99 00:07:08,460 --> 00:07:10,800 But I just want to show that if I did that. 100 00:07:13,400 --> 00:07:15,410 We get a lot more NOLs. 101 00:07:15,530 --> 00:07:18,710 And now we have summary statistics. 102 00:07:18,830 --> 00:07:20,630 We have the average for Colts. 103 00:07:20,630 --> 00:07:23,860 In 1963, drama movies was a 4.5. 104 00:07:23,870 --> 00:07:32,960 But then here we have a summary for all users in 1963 and I said five, but it's 1963 dramas. 105 00:07:33,080 --> 00:07:38,570 And then we have all reviews in 1963, regardless of genre. 106 00:07:38,930 --> 00:07:41,870 And then we could keep going and just see more examples of this. 107 00:07:41,900 --> 00:07:46,220 Look at 2014, and then we have the average for Colt and animation. 108 00:07:46,220 --> 00:07:53,540 Kimbra And Animation, but we also have the average for Colt in drama in 2014 and Wyatt in Drama 2014. 109 00:07:53,540 --> 00:07:59,240 And then we also have the average for all reviewers for drama in 2014. 110 00:07:59,240 --> 00:08:07,250 And then finally, the average for all reviews just in 2014, regardless of genre 8.55. 111 00:08:07,250 --> 00:08:13,010 And then down at the very bottom, we have the average for all reviews, regardless of who left it, 112 00:08:13,010 --> 00:08:15,950 what year and what genre, just all of them. 113 00:08:16,730 --> 00:08:17,780 So that's with Rollup.