1 00:00:00,960 --> 00:00:10,270 ‫One of the big promises of containers is that we can easily deploy our apps like we were a platform 2 00:00:10,270 --> 00:00:15,550 ‫service, you know, like Heroku or something. We can do that on anyone's hardware, whether it's on our 3 00:00:15,550 --> 00:00:20,920 ‫hardware, it's a cloud provider, whether it's virtual or physical. With Docker, our apps can run 4 00:00:20,920 --> 00:00:28,240 ‫the same whether they're on Amazon, Azure, DigitalOcean, Linode, Rackspace, Google Cloud or whatever. Without 5 00:00:28,240 --> 00:00:34,900 ‫all those platform features, how do we easily deploy and maintain our dozens, or hundreds, or even thousands 6 00:00:34,900 --> 00:00:40,510 ‫of containers across many servers or instances? Or nodes or droplets or whatever. 7 00:00:40,570 --> 00:00:47,580 ‫That brings to bear some really new problems that weren't previously problems for small organizations. 8 00:00:47,590 --> 00:00:53,290 ‫We all know about Netflix and big organizations like that that scale to thousands and thousands 9 00:00:53,290 --> 00:00:56,620 ‫of nodes, and tens of thousands of nodes, and they've got lots of engineers. 10 00:00:56,620 --> 00:01:02,190 ‫But, if you have just a couple of people on your team, or if you're just a solo developer, how do you take 11 00:01:02,200 --> 00:01:08,410 ‫your containers and scale them and deal with their entire life cycle? From deploying them, to starting 12 00:01:08,410 --> 00:01:13,600 ‫them, to restarting them, to recreating them, to deleting them and updating them and all that. 13 00:01:13,600 --> 00:01:20,470 ‫We start to ask questions around how exactly does Docker do that or does it even do that at all? 14 00:01:20,530 --> 00:01:25,480 ‫Some of the questions we ask ourselves are things like, how do we scale out? How do we scale up? How do 15 00:01:25,480 --> 00:01:28,620 ‫we ensure that our containers restart if they fail? 
16 00:01:28,630 --> 00:01:34,300 ‫How do we replace those containers when we actually have an update for them? Which is something called 17 00:01:34,300 --> 00:01:40,540 ‫a blue-green deploy nowadays, which means that you have zero downtime and you take servers out of the pool 18 00:01:40,600 --> 00:01:45,970 ‫in order to bring new ones in. You end up with always something available so that you're never really 19 00:01:45,970 --> 00:01:52,270 ‫down when you update. How do we know if we've got dozens of nodes or even just three nodes where we 20 00:01:52,270 --> 00:01:58,030 ‫started those containers? Which node is our container on? How do we talk across those different nodes 21 00:01:58,060 --> 00:02:01,930 ‫or servers with our networking inside the containers? 22 00:02:01,930 --> 00:02:07,210 ‫When it comes to security, how can we be sure that our containers are only running on the machines 23 00:02:07,210 --> 00:02:09,410 ‫that we intended for them to run on? 24 00:02:09,670 --> 00:02:15,070 ‫Now that we're moving things around and deploying them dynamically on the fly, how do we store the 25 00:02:15,070 --> 00:02:19,570 ‫private information we need for our containers, like secrets or passwords? 26 00:02:19,570 --> 00:02:25,310 ‫So this brings us to a major evolution in the scope of what Docker tries to solve. 27 00:02:25,360 --> 00:02:30,850 ‫When people talk to me and they think that Docker is just a container runtime, that's when I actually 28 00:02:30,850 --> 00:02:37,720 ‫just mention Swarm Mode, which is a brand new feature in 2016 that brings together years of understanding 29 00:02:37,810 --> 00:02:42,060 ‫the needs of containers and how to actually run them live in production. 
30 00:02:42,070 --> 00:02:48,940 ‫So at its core, Swarm is actually a server clustering solution that brings together different operating 31 00:02:48,940 --> 00:02:50,590 ‫systems or hosts, 32 00:02:50,620 --> 00:02:57,670 ‫or nodes, or whatever you want to call them, into a single manageable unit that you can then orchestrate the lifecycle 33 00:02:57,700 --> 00:03:02,750 ‫of your containers in. Just to be clear, we're not actually talking about Swarm, which I'm going to 34 00:03:02,750 --> 00:03:12,250 ‫call Swarm Classic, which was an add-on component to Docker before 1.12 came out. It was really a container 35 00:03:12,340 --> 00:03:14,500 ‫that would run inside of Docker, 36 00:03:14,500 --> 00:03:19,250 ‫that really just took your docker run commands and repeated them out to multiple servers. 37 00:03:19,270 --> 00:03:25,660 ‫It did solve a few problems, but it wasn't really at the scale that we needed to truly solve 80% 38 00:03:25,660 --> 00:03:28,060 ‫of the cases for how you're going to run your containers. 39 00:03:29,320 --> 00:03:30,780 ‫In the summer of 2016, 40 00:03:30,790 --> 00:03:37,930 ‫at DockerCon 2016 actually, Docker announced SwarmKit, which was a set of libraries or a toolkit built 41 00:03:37,930 --> 00:03:39,820 ‫around a whole bunch of new Swarm features. 42 00:03:39,820 --> 00:03:45,760 ‫Then they stuck that right in the Docker server. All along in this course, you've actually had the 43 00:03:45,760 --> 00:03:49,090 ‫features available to you in the CLI and on your server. 44 00:03:49,150 --> 00:03:51,200 ‫Now we finally get to dive into them. 45 00:03:52,120 --> 00:03:57,240 ‫In January of 2017, 1.13 came out, and of course because Swarm Mode was new 46 00:03:57,240 --> 00:04:01,270 ‫in 2016, they're going to continue to make it better in the years to come. 
47 00:04:01,270 --> 00:04:07,120 ‫In January, they added additional features called Stacks and Secrets, as well as other bonuses 48 00:04:07,120 --> 00:04:12,740 ‫we'll talk about later. It's important to note that Swarm is not actually enabled out of the box. 49 00:04:12,760 --> 00:04:17,680 ‫In fact, on your machine, you couldn't use these commands listed here right now. That would actually give 50 00:04:17,680 --> 00:04:20,480 ‫you an error because Swarm has to be enabled. 51 00:04:20,620 --> 00:04:25,660 ‫One of the design goals was that none of the Swarm code would affect the existing Docker 52 00:04:25,840 --> 00:04:31,510 ‫daemon, and that all the tools and systems out there that were already relying on Docker, or maybe they 53 00:04:31,510 --> 00:04:37,420 ‫had their own orchestration on top of Docker, would continue to function and not be interfered with by 54 00:04:37,450 --> 00:04:37,950 ‫Swarm 55 00:04:37,960 --> 00:04:44,230 ‫now being a part of Docker. Some really basic concepts before we start diving in, is that these blue 56 00:04:44,230 --> 00:04:49,240 ‫boxes you see over the top are what we call Manager Nodes, and they actually have a database locally 57 00:04:49,240 --> 00:04:52,120 ‫on them known as the Raft Database. 58 00:04:52,120 --> 00:04:57,520 ‫It stores their configuration and gives them all the information they need to have to be the authority 59 00:04:57,640 --> 00:04:58,920 ‫inside a swarm. 60 00:04:58,930 --> 00:05:03,760 ‫So what we have here is three different managers that have all been added to the swarm, and they 61 00:05:03,760 --> 00:05:10,720 ‫all keep a copy of that database and encrypt their traffic in order to ensure integrity and guarantee 62 00:05:10,720 --> 00:05:14,440 ‫the trust that they're able to manage this swarm securely. 63 00:05:14,830 --> 00:05:20,290 ‫Below in the green, we actually have worker nodes. 
Now you can see in the concept of Swarm, we 64 00:05:20,290 --> 00:05:22,600 ‫now have managers and workers. 65 00:05:22,600 --> 00:05:28,660 ‫Each one of these would be a virtual machine, or a physical host, running some distribution of Linux or 66 00:05:28,660 --> 00:05:30,190 ‫Windows Server. 67 00:05:30,220 --> 00:05:35,320 ‫This is showing how they're actually all communicating over what we call the Control Plane, which is 68 00:05:35,320 --> 00:05:42,710 ‫how orders get sent around the swarm for taking actions. In a little bit more complicated view, 69 00:05:42,790 --> 00:05:48,290 ‫we have this Raft consensus database I mentioned, that is replicated again amongst all the nodes. 70 00:05:48,460 --> 00:05:50,540 ‫They issue orders down to the workers. 71 00:05:50,560 --> 00:05:56,170 ‫The managers themselves can also be workers. Of course, you can demote and promote workers and 72 00:05:56,170 --> 00:05:58,650 ‫managers into the two different roles. 73 00:05:58,660 --> 00:06:06,080 ‫When you think of a manager, typically think of a worker with permissions to control the swarm. 74 00:06:06,240 --> 00:06:11,550 ‫Again, the only requirements for each one of these servers is that they're running the same Docker 75 00:06:11,640 --> 00:06:19,750 ‫that you're already using now. With this concept of a swarm, and these managers, we now have new concepts 76 00:06:19,780 --> 00:06:21,670 ‫of what our containers look like. 77 00:06:21,730 --> 00:06:26,680 ‫So, with the docker run command, we could only really deploy one container. 78 00:06:26,680 --> 00:06:28,180 ‫It would just create a container. 79 00:06:28,180 --> 00:06:32,920 ‫It was always on whatever machine the Docker CLI was talking to. 80 00:06:32,920 --> 00:06:38,170 ‫That's usually your local machine, or maybe a server you're logged into. That docker run command didn't 81 00:06:38,170 --> 00:06:42,280 ‫have concepts around how to scale out or scale up. 
82 00:06:42,340 --> 00:06:45,560 ‫So we needed new commands to deal with that. 83 00:06:45,610 --> 00:06:49,920 ‫That's where the docker service command comes from. In a swarm, 84 00:06:49,920 --> 00:06:55,980 ‫it replaces the docker run command, and allows us to add extra features to our container when we run 85 00:06:55,980 --> 00:06:59,930 ‫it, such as replicas, to tell it how many of those we want to run. 86 00:06:59,970 --> 00:07:02,020 ‫Those are known as tasks. 87 00:07:02,340 --> 00:07:08,190 ‫A single service can have multiple tasks, and each one of those tasks will launch a container. 88 00:07:08,190 --> 00:07:14,820 ‫In this example, we've created a service using docker service create to spin up an Nginx service 89 00:07:14,820 --> 00:07:17,600 ‫using the Nginx image like we've done several times before. 90 00:07:17,670 --> 00:07:20,010 ‫But we've told it that we'd like three replicas. 91 00:07:20,010 --> 00:07:28,350 ‫So it will use the manager nodes to decide where in the swarm to place those. By default, it tries 92 00:07:28,350 --> 00:07:29,470 ‫to spread them out. 93 00:07:29,520 --> 00:07:36,610 ‫Each node would get its own copy of the Nginx container up to the three replicas that we told 94 00:07:36,610 --> 00:07:38,220 ‫it we needed. 95 00:07:38,250 --> 00:07:44,170 ‫This is a quick and basic understanding of how the managers work and what they're doing in the background. 96 00:07:44,170 --> 00:07:46,410 ‫All these features are new. 97 00:07:46,410 --> 00:07:51,550 ‫It's not simply just taking your command and running it on an API like we would experience with a docker 98 00:07:51,560 --> 00:07:52,650 ‫run command. 
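To make the service, task, and replica idea above a bit more concrete, here is a rough Python sketch. It is only a toy model and an assumption about how "spreading out" could look: the names `Task` and `place_replicas` and the node names are made up for illustration, and Docker's real scheduler weighs much more than a simple task count.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One replica of a service; in a swarm each task launches one container."""
    service: str
    slot: int
    node: str


def place_replicas(service: str, replicas: int, nodes: list[str]) -> list[Task]:
    """Assign each replica to whichever node currently holds the fewest tasks.

    A toy stand-in for the "spread them out" default described above,
    not Docker's actual scheduling code.
    """
    if not nodes:
        raise ValueError("a swarm needs at least one node")
    load = Counter({node: 0 for node in nodes})
    tasks = []
    for slot in range(1, replicas + 1):
        # min() returns the first least-loaded node, so ties follow list order.
        target = min(nodes, key=lambda n: load[n])
        load[target] += 1
        tasks.append(Task(service, slot, target))
    return tasks


# Three replicas of an nginx service on a hypothetical three-node swarm.
tasks = place_replicas("nginx", replicas=3, nodes=["node1", "node2", "node3"])
for task in tasks:
    print(f"{task.service}.{task.slot} -> {task.node}")
```

With three replicas and three nodes, each node ends up with one task. With more replicas than nodes, the extra tasks would wrap around to the least-loaded nodes.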
99 00:07:52,680 --> 00:08:01,350 ‫There's actually a totally new Swarm API here that has a bunch of background services, such as the scheduler, 100 00:08:01,380 --> 00:08:07,680 ‫and the dispatcher, and the allocator and orchestrator, that help make decisions around what the workers 101 00:08:07,770 --> 00:08:10,180 ‫should be executing at any given moment. 102 00:08:10,200 --> 00:08:14,460 ‫So the workers are constantly reporting in to the managers and asking for new work. 103 00:08:14,460 --> 00:08:20,520 ‫The managers are constantly doling out new work and evaluating what you've told them to do against what 104 00:08:20,520 --> 00:08:22,030 ‫they're actually doing. 105 00:08:22,050 --> 00:08:27,750 ‫Then if there's any reconciliation to happen, they will make those changes, such as maybe you told it 106 00:08:27,750 --> 00:08:31,600 ‫to spin up three more replica tasks in that service. 107 00:08:31,620 --> 00:08:37,540 ‫So the orchestrator will realize that and then issue orders down to the workers and so on. 108 00:08:38,720 --> 00:08:46,160 ‫With this Swarm Mode, we actually get a feature-packed set of capabilities out of the box that allow 109 00:08:46,160 --> 00:08:51,740 ‫us to already use the existing Docker skills we have in order to deploy our containers to the Internet 110 00:08:51,830 --> 00:08:55,150 ‫in a reliable fashion, and solve a lot of problems that we would have 111 00:08:55,220 --> 00:08:56,090 ‫once we go to production.
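The reconciliation idea described above, comparing what you asked for against what is actually running, can be sketched in a few lines of Python. This is a simplified illustration under assumptions, not how the Swarm orchestrator is actually written: the function name `reconcile` and the idea of tracking tasks by slot number are hypothetical.

```python
def reconcile(desired: int, running: set[int]) -> tuple[list[int], list[int]]:
    """Compare the desired replica count with the task slots actually running.

    Returns (slots_to_start, slots_to_stop). Hypothetical helper for
    illustration only; the real orchestrator handles far more cases.
    """
    wanted = set(range(1, desired + 1))
    to_start = sorted(wanted - running)  # missing tasks: failed or never started
    to_stop = sorted(running - wanted)   # surplus tasks after scaling down
    return to_start, to_stop


# The service wants 3 replicas, but the task in slot 2 has failed.
print(reconcile(desired=3, running={1, 3}))

# We told it to spin up three more replica tasks: 3 running, 6 desired.
print(reconcile(desired=6, running={1, 2, 3}))

# Scaling back down to 2 leaves one surplus task to stop.
print(reconcile(desired=2, running={1, 2, 3}))
```

In the first call the only order would be to start slot 2 again, which is the "restart if they fail" behavior. In the second, slots 4 through 6 get started. The point is that you state the target, and the managers keep working out the difference.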