1 00:00:00,810 --> 00:00:03,270 Congratulations, you have made it 2 00:00:03,270 --> 00:00:05,290 to the Section Recap. 3 00:00:05,290 --> 00:00:07,530 In this lesson, we are going to be looking back 4 00:00:07,530 --> 00:00:10,500 at the lessons in section 2. 5 00:00:10,500 --> 00:00:11,920 Some things to keep in mind. 6 00:00:11,920 --> 00:00:15,020 First, hey, it's all a review. 7 00:00:15,020 --> 00:00:16,560 If you are having struggles 8 00:00:16,560 --> 00:00:18,880 as we go through one of these concepts, 9 00:00:18,880 --> 00:00:21,380 go back and re-watch the video. 10 00:00:21,380 --> 00:00:24,490 Keep in mind that for this section, 11 00:00:24,490 --> 00:00:27,450 there's not a whole lot of depth, and that's okay. 12 00:00:27,450 --> 00:00:30,970 This is really around focusing on the DP-203, 13 00:00:30,970 --> 00:00:35,390 which for this, means understanding what the services are, 14 00:00:35,390 --> 00:00:38,463 where they live, and some high-level looks at what they do. 15 00:00:39,830 --> 00:00:42,440 If you don't know something, review. 16 00:00:42,440 --> 00:00:44,920 Go back, and check it out again. 17 00:00:44,920 --> 00:00:48,500 All right, let's dive in and start reviewing. 18 00:00:48,500 --> 00:00:51,260 First, we talked about data engineering, 19 00:00:51,260 --> 00:00:52,170 and if you remember, 20 00:00:52,170 --> 00:00:54,850 everything begins with Sally, the shopper, 21 00:00:54,850 --> 00:00:58,150 and the piece of data that she creates. 22 00:00:58,150 --> 00:01:01,440 That data then needs to move somewhere. 23 00:01:01,440 --> 00:01:04,450 And that data movement creates data engineering. 24 00:01:04,450 --> 00:01:08,150 Data engineering is really about storing, moving, 25 00:01:08,150 --> 00:01:11,063 and pulling meaningful insights out of data. 
26 00:01:12,690 --> 00:01:15,070 Next, we talked about data transformation, 27 00:01:15,070 --> 00:01:16,770 and for the DP-203, 28 00:01:16,770 --> 00:01:18,410 that's primarily going to be 29 00:01:18,410 --> 00:01:21,240 Synapse Analytics and Databricks. 30 00:01:21,240 --> 00:01:23,870 And remember that data transformation 31 00:01:23,870 --> 00:01:28,010 consists of pulling data from various sources 32 00:01:28,010 --> 00:01:29,430 and transforming it 33 00:01:29,430 --> 00:01:33,010 so that it looks like one uniform data source. 34 00:01:33,010 --> 00:01:35,800 So that could be correcting, that could be removing nulls. 35 00:01:35,800 --> 00:01:37,710 It could be quite a few different things, 36 00:01:37,710 --> 00:01:42,180 but the process of transforming and curating data 37 00:01:42,180 --> 00:01:43,973 is data transformation. 38 00:01:45,820 --> 00:01:48,970 We also talked about structured versus unstructured. 39 00:01:48,970 --> 00:01:51,290 Recall SQL versus NoSQL. 40 00:01:51,290 --> 00:01:53,540 SQL being that relational database 41 00:01:53,540 --> 00:01:55,810 with a fixed schema and complex queries, 42 00:01:55,810 --> 00:02:00,510 or the ability to process complex queries reasonably fast. 43 00:02:00,510 --> 00:02:03,160 And then, we talked about that vertical scaling 44 00:02:03,160 --> 00:02:06,360 versus the horizontal scaling of NoSQL. 45 00:02:06,360 --> 00:02:08,500 This one's actually really important. 46 00:02:08,500 --> 00:02:10,260 So for the DP-203, 47 00:02:10,260 --> 00:02:11,850 make sure you understand the differences 48 00:02:11,850 --> 00:02:13,860 between structured and unstructured 49 00:02:13,860 --> 00:02:15,370 and then the types of services 50 00:02:15,370 --> 00:02:17,620 that would live in either one. 51 00:02:17,620 --> 00:02:19,530 I find it helpful just to kind of create 52 00:02:19,530 --> 00:02:22,900 a little notebook of the services and then just a brief, 53 00:02:22,900 --> 00:02:24,890 hey, this is what this one does.
54 00:02:24,890 --> 00:02:26,370 So when we talk about SQL, 55 00:02:26,370 --> 00:02:29,603 we would be talking about which database type? 56 00:02:31,020 --> 00:02:35,260 If you said Synapse Analytics, you would be correct. 57 00:02:35,260 --> 00:02:38,640 NoSQL would be more like Cosmos DB. 58 00:02:38,640 --> 00:02:40,980 Cosmos DB, at least at this point, 59 00:02:40,980 --> 00:02:44,480 does not appear on the DP-203, 60 00:02:44,480 --> 00:02:47,253 but that's going to be more of an example of NoSQL. 61 00:02:49,540 --> 00:02:51,250 We talked about Data Factory. 62 00:02:51,250 --> 00:02:53,670 Data Factory is all about those pipelines. 63 00:02:53,670 --> 00:02:55,320 So, creating pipelines, 64 00:02:55,320 --> 00:02:58,040 which are logical groupings of activities 65 00:02:58,040 --> 00:03:01,070 to create our cloud projects. 66 00:03:01,070 --> 00:03:04,880 Those activities are going to be data movement, 67 00:03:04,880 --> 00:03:08,110 data transformation, and data control. 68 00:03:08,110 --> 00:03:11,340 That's also very important for the DP-203. 69 00:03:11,340 --> 00:03:13,000 Understanding not only the pipelines, 70 00:03:13,000 --> 00:03:16,370 but the types of activities that you're going to see. 71 00:03:16,370 --> 00:03:18,810 And then, as we look at datasets, 72 00:03:18,810 --> 00:03:21,320 datasets, remember, are the data structures 73 00:03:21,320 --> 00:03:23,360 that we have within those data stores. 74 00:03:23,360 --> 00:03:27,880 So basically, it's where the data we need lives. 75 00:03:27,880 --> 00:03:29,270 And then we have the linked services, 76 00:03:29,270 --> 00:03:31,010 which are just those connection strings 77 00:03:31,010 --> 00:03:33,210 that point to the data that we need to get to. 78 00:03:36,290 --> 00:03:40,020 We also talked about Azure Synapse Analytics.
79 00:03:40,020 --> 00:03:43,730 So it's going to be that combination of our Data Lake, 80 00:03:43,730 --> 00:03:47,710 Data Factory, and then our big data warehouse as well. 81 00:03:47,710 --> 00:03:49,253 So just keep that in mind. 82 00:03:52,510 --> 00:03:54,880 Stream Analytics, remember Stream Analytics 83 00:03:54,880 --> 00:03:57,920 deals with streaming data, of course. 84 00:03:57,920 --> 00:04:01,570 And as an important concept for Stream Analytics, 85 00:04:01,570 --> 00:04:06,570 remember it all comes down to that input, query, and output. 86 00:04:06,670 --> 00:04:08,010 And when we talk about input, 87 00:04:08,010 --> 00:04:09,970 the 3 pieces you need to remember, 88 00:04:09,970 --> 00:04:13,900 Event Hubs, IoT Hub, and Blob storage. 89 00:04:13,900 --> 00:04:15,630 And then finally, windowing. 90 00:04:15,630 --> 00:04:17,610 And again, we'll dive into windowing 91 00:04:17,610 --> 00:04:19,630 at a much deeper level later. 92 00:04:19,630 --> 00:04:22,670 Just remember that windowing helps us 93 00:04:22,670 --> 00:04:25,270 to determine how data comes through 94 00:04:25,270 --> 00:04:27,860 and how we're going to group or arrange it together 95 00:04:27,860 --> 00:04:31,083 to get the views that we need to make business decisions. 96 00:04:32,750 --> 00:04:35,480 Finally, we talked about Azure Databricks. 97 00:04:35,480 --> 00:04:37,850 And we talked about data engineering 98 00:04:37,850 --> 00:04:40,420 being the main component that you really care about.
99 00:04:40,420 --> 00:04:41,300 And then we talked about 100 00:04:41,300 --> 00:04:45,570 the transformational aspect of Databricks to clean up, 101 00:04:45,570 --> 00:04:48,760 correct, curate, or process our data 102 00:04:48,760 --> 00:04:50,320 so that it all looks the same 103 00:04:50,320 --> 00:04:52,530 so that when we pull insights out of it, 104 00:04:52,530 --> 00:04:55,510 or we do machine learning or whatever the next step is, 105 00:04:55,510 --> 00:04:57,200 we have a clean set of data 106 00:04:57,200 --> 00:04:58,973 that's all going to look the same. 107 00:05:01,810 --> 00:05:05,580 So, in summary, let me point out just a few things. 108 00:05:05,580 --> 00:05:07,450 First, this is the foundation. 109 00:05:07,450 --> 00:05:09,950 We're going to integrate data engineering concepts 110 00:05:09,950 --> 00:05:11,460 as we move forward. 111 00:05:11,460 --> 00:05:14,340 But understanding these key services 112 00:05:14,340 --> 00:05:16,560 and where they fit within Azure, 113 00:05:16,560 --> 00:05:18,810 and some of the basics about data, 114 00:05:18,810 --> 00:05:20,060 is going to help you 115 00:05:20,060 --> 00:05:23,040 as we introduce data engineering concepts. 116 00:05:23,040 --> 00:05:25,960 Next, focus on the DP-203. 117 00:05:25,960 --> 00:05:27,140 So make sure that you are 118 00:05:27,140 --> 00:05:30,140 reviewing the Microsoft exam requirements, 119 00:05:30,140 --> 00:05:32,800 and you will actually see those right here. 120 00:05:32,800 --> 00:05:36,050 So if you go to Google and just search for DP-203, 121 00:05:36,050 --> 00:05:40,500 you can actually find the Microsoft DP-203 exam. 122 00:05:40,500 --> 00:05:41,570 Scroll down a bit, 123 00:05:41,570 --> 00:05:45,180 and you will see download exam skills outline. 124 00:05:45,180 --> 00:05:47,230 Make sure that you are referencing this. 
125 00:05:47,230 --> 00:05:48,220 Now keep in mind, of course, 126 00:05:48,220 --> 00:05:50,960 we've referenced this as well, very heavily. 127 00:05:50,960 --> 00:05:53,210 But these are the topics and the skills 128 00:05:53,210 --> 00:05:56,140 that you are going to be measured on for the exam. 129 00:05:56,140 --> 00:05:58,050 So make sure that you're looking at that 130 00:05:58,050 --> 00:06:01,380 and that you understand the concepts that fall in there. 131 00:06:01,380 --> 00:06:02,710 And what you can also do 132 00:06:02,710 --> 00:06:06,370 is you can also go in and look at Microsoft Docs. 133 00:06:06,370 --> 00:06:09,693 So, for instance, if I type in Azure Data Lake, 134 00:06:11,870 --> 00:06:12,850 there we go. You will see 135 00:06:12,850 --> 00:06:15,310 this is our introduction to Azure Data Lake, 136 00:06:15,310 --> 00:06:17,740 docs.microsoft.com. 137 00:06:17,740 --> 00:06:20,360 If I click on that one, it's going to pull this up. 138 00:06:20,360 --> 00:06:23,220 And I would suggest that you actually go through 139 00:06:23,220 --> 00:06:25,690 and you skim through the concepts. 140 00:06:25,690 --> 00:06:27,330 We've covered all of this material, 141 00:06:27,330 --> 00:06:31,100 but skim through the concepts, go through the overview. 142 00:06:31,100 --> 00:06:34,200 That's going to help give you a feel for how Microsoft 143 00:06:34,200 --> 00:06:38,010 writes. And that's going to help you a lot on the exam. 144 00:06:38,010 --> 00:06:41,660 So make sure that you're doing that as well. 145 00:06:41,660 --> 00:06:43,700 Finally, don't forget the labs. 146 00:06:43,700 --> 00:06:45,420 Each of the sections as you go through, 147 00:06:45,420 --> 00:06:47,100 as you have those labs, 148 00:06:47,100 --> 00:06:49,370 make sure that you not only have completed the lab 149 00:06:49,370 --> 00:06:50,780 but that you've completed it in a way 150 00:06:50,780 --> 00:06:53,320 that you know how to do that on your own. 
151 00:06:53,320 --> 00:06:55,510 That's going to help you both on the exam 152 00:06:55,510 --> 00:06:57,630 and help you in your career. 153 00:06:57,630 --> 00:06:58,950 And that's true whether Microsoft 154 00:06:58,950 --> 00:07:01,180 has labs on the exam or doesn't. 155 00:07:01,180 --> 00:07:04,080 Building that confidence and doing it for yourself 156 00:07:04,080 --> 00:07:05,100 is really going to give you 157 00:07:05,100 --> 00:07:08,090 a better understanding of how it works. 158 00:07:08,090 --> 00:07:09,980 All right, so this has been a fairly long review, 159 00:07:09,980 --> 00:07:13,430 but we had a lot of stuff in this section to cover, 160 00:07:13,430 --> 00:07:15,640 and it was a very broad section 161 00:07:15,640 --> 00:07:17,700 covering a lot of different services. 162 00:07:17,700 --> 00:07:19,980 So make sure that you have a good foundation 163 00:07:19,980 --> 00:07:21,580 before moving forward. 164 00:07:21,580 --> 00:07:25,520 And lastly, if you would do Landon and me a huge favor 165 00:07:25,520 --> 00:07:29,000 and just leave a thumbs up as you go through these videos, 166 00:07:29,000 --> 00:07:30,330 that really helps me to know 167 00:07:30,330 --> 00:07:32,040 that the concepts and content 168 00:07:32,040 --> 00:07:34,090 that I'm building for you are helpful. 169 00:07:34,090 --> 00:07:37,150 So if you can do that, I would greatly appreciate it. 170 00:07:37,150 --> 00:07:39,910 So, hey, that's it, you have completed section 2. 171 00:07:39,910 --> 00:07:42,150 Landon is going to be picking up in section 3, 172 00:07:42,150 --> 00:07:43,520 so he'll see you in the next section, 173 00:07:43,520 --> 00:07:46,090 and I'll see you in a couple of sections down the road. 174 00:07:46,090 --> 00:07:46,923 All right.