So that little bit of magic wasn't actually magic. It was the Routing Mesh. The Routing Mesh is an incoming, or ingress, network that distributes packets for our service to the tasks for that service, because we can have more than one task. It actually spans all the nodes, and it's using kernel primitives that have been around a while, called IPVS. So we're not talking about some fancy new service; we're really just talking about core features of the Linux kernel. What it's really doing here is load balancing across all the nodes and listening on all the nodes for traffic.

There are a couple of ways we can talk about how this works. The first example is from one container to another. If our backend system, say the databases, were increased to two replicas, the frontends talking to the backends wouldn't actually talk directly to their IP addresses. They would talk to something called a VIP, or Virtual IP, that Swarm puts in front of all services. This is a private IP inside the virtual networking of Swarm, and it ensures that the load is distributed amongst all the tasks for a service. So if you imagine you had a worker role in your application and it had 10 different containers, you don't have to put a load balancer in front of that; this does that for you whenever traffic from one service inside your virtual network talks to another service inside your virtual network.

The second example of how the routing mesh works is external traffic coming into your swarm, which can hit any of the nodes in your swarm. Any of the worker nodes are going to have that published port open and listening for that container's traffic, and the mesh will then reroute that traffic to the proper container based on its load balancing. What this means is that when you're deploying containers in a swarm, you're not supposed to have to care about which server a container is on, because that might move, right? If a container fails and the task is recreated by the swarm, it might be put on a different node, and you certainly don't want to have to change your firewall or your DNS settings to make that container work again. The routing mesh solves a lot of those problems by allowing our Drupal website on port 80 to be accessible from any node in the swarm.
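As a quick sketch of that container-to-container case (the service and network names here are just illustrative assumptions, not something set up in this lecture), you could peek at the VIP that Swarm assigns to a backend service like this:

    # hypothetical names, just to show where the VIP lives
    docker network create --driver overlay backend
    docker service create --name db --network backend --replicas 2 \
      -e POSTGRES_PASSWORD=example postgres
    # the VirtualIPs field is the private VIP that frontends reach via the "db" DNS name
    docker service inspect db --format '{{json .Endpoint.VirtualIPs}}'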
And in the background, it's taking those packets from that server and then routing them to the container. If the container is on a different node, it'll route it over the virtual networks; if it's on the same node, it'll just reroute it to the port of that container. And we didn't have to do anything to enable this. This was all out of the box.

Let's see a diagram of how this might work. If I created a new service and told it to have three replicas, and it created three tasks, with three containers, on three nodes, then inside that overlay network it's actually creating a virtual IP that's mapped to the DNS name of the service. And by default, the DNS name is the name of the service. In this case, I created a service called my-web, and any other containers I have in my overlay networks that need to talk to that service inside the swarm only have to worry about using the my-web DNS name. The virtual IP properly load balances the traffic amongst all the tasks in that service. This isn't actually DNS Round Robin; that's a slightly different configuration. We could enable that if we wanted to. There's actually an option to use DNS Round Robin. The benefit of VIPs over Round Robin is that a lot of the time the DNS caches inside our apps prevent us from properly distributing the load. Rather than fight with our DNS clients and DNS configuration, we're just relying on the VIP, which is kind of like what you would have if you bought a dedicated hardware load balancer.

The second example shows what it would be like with external traffic coming in. This is similar to what we just did with Drupal: when I created those yellow boxes by creating one service, called my-web, it created two tasks and applied them to two different nodes, and each one of those nodes has a built-in load balancer on the external IP address. For me, because I'm using DigitalOcean, that IP address is the one that DigitalOcean gave me. When I use -p and publish it on a port, in this example 8080, any traffic that comes in to any of these three nodes on port 8080 hits that load balancer. The load balancer decides which container should get the traffic and whether that traffic stays on the local node or needs to be sent over the network to a different node. Again, this all happens in the background without any special effort on our part.

All right.
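For reference, those setups look roughly like this (the network name and images are assumptions for illustration; the diagram in the lecture doesn't come with exact commands):

    # an overlay network for service-to-service traffic
    docker network create --driver overlay mynet
    # default endpoint mode: one VIP, and Swarm load balances behind it
    docker service create --name my-web --network mynet --replicas 3 nginx
    # optional: DNS Round Robin instead of a VIP (one DNS entry per task)
    docker service create --name my-web-rr --network mynet --replicas 3 \
      --endpoint-mode dnsrr nginx
    # external traffic: publish a port and every node in the swarm listens on it
    docker service create --name my-web-public --replicas 2 -p 8080:80 nginx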
Let's see this Routing Mesh in action. We already saw the example with Drupal and how it listens on all three nodes. But what if we had multiple tasks, so we can see that load balancer working? If we do a docker service create, what we're going to do is use an Elasticsearch container. We're going to call it search and give it three replicas, and we want to publish it on port 9200, which is its default port. In this case, we definitely want the :2 tag in there because it's the easiest version to deploy right now. While that's creating, I'll just mention that Elasticsearch is actually a search database that's accessible via a JSON web API, so it's really easy to hit with curl and gives us good examples of how this works.

If I do a docker service ps search, we can see that it's smartly created each task on a different node. If I just do a curl on my localhost on node1, on port 9200, because I published that port, I'll get back the Elasticsearch basic information. Part of that is that it will actually create a random name; that's just a feature of Elasticsearch. If I curl this multiple times, you'll see on this one it's Patch, the Patch server, and then a third time Jane Foster, and then it'll start repeating itself like such. That's actually the virtual IP acting as a load balancer and distributing my load across the three tasks.

A few more notes on the routing mesh. In 17.03, which is the release that I'm showing you this on, the routing mesh and the load balancing are currently a stateless load balancer. If you've ever dealt with state inside of maybe Amazon's classic load balancer, or other load balancer technologies, you know what this is about. This is basically saying that if you have to use session cookies in your application, or it expects a consistent container to be talking to a consistent client, then you may need to add some other things to help solve that problem. Out of the box, every time you hit a service with multiple tasks, it's potentially going to give you a different result.

Also, if you get into the details of this, it's effectively a Layer 4 load balancer, operating at the IP address and port level. It doesn't operate at the DNS or hostname layer. So if you've ever run multiple websites on the same port, on the same server, this isn't going to do that yet.
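The commands being run here look roughly like this (reconstructed from the narration, so treat the exact flags as a sketch):

    docker service create --name search --replicas 3 -p 9200:9200 elasticsearch:2
    docker service ps search   # each task should land on a different node
    curl localhost:9200        # repeat a few times; the random Elasticsearch name
                               # changes as the VIP sends requests to different tasks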
You're still going to need another piece of the puzzle on top of that if you actually want to run multiple websites on the same port, on the same swarm. Luckily, that's a pretty common request, and there are several options to solve both of these problems. One of them is to use Nginx or HAProxy; there are pretty good examples out there of containers that will sit in front of your routing mesh and act as a stateful load balancer, or a Layer 7 load balancer, that can also do caching and lots of other things. If you need that, you might want to check some of those out in the resources of this section. I should also mention that if you were to pay for a subscription to Docker Enterprise Edition, it comes with something called UCP, or Docker Datacenter, which is a web interface, and for your swarm nodes it also includes a built-in Layer 7 web proxy that allows you to just put DNS names in the web config of your swarm services, and everything just works.
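As a minimal sketch of that proxy-in-front idea (the image, network name, and config path are assumptions for illustration, not anything this lecture sets up), you might run a single proxy service published on port 80 that routes by hostname to services on a shared overlay network:

    # assumed setup: an overlay network, plus an nginx.conf with your virtual hosts
    # available at the same path on every node (or baked into a custom image)
    docker network create --driver overlay frontend
    docker service create --name proxy --network frontend -p 80:80 \
      --mount type=bind,source=/srv/nginx.conf,target=/etc/nginx/nginx.conf,readonly \
      nginx
    # each website service joins the same network; the proxy reaches it by service name
    docker service create --name site-a --network frontend nginx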