So that little bit of magic wasn't actually magic. It was the Routing Mesh. The Routing Mesh is an incoming, or ingress, network that distributes packets for our service to the tasks for that service, because we can have more than one task. It actually spans all the nodes, and it's using kernel primitives that have been around a while, called IPVS. So we're not talking about some fancy new service; we're really just talking about core features of the Linux kernel. What it's really doing here is load balancing across all the nodes and listening on all the nodes for traffic.

There are a couple of ways we can talk about how this works. The first example is from one container to another. If our backend system, say the databases, were increased to two replicas, the frontends talking to the backends wouldn't actually talk directly to their IP addresses. They would talk to something called a VIP, or Virtual IP, that Swarm puts in front of all services. This is a private IP inside the virtual networking of Swarm, and it ensures that the load is distributed amongst all the tasks for a service. So if you imagine you had a worker role in your application and it had 10 different containers, you don't have to put a load balancer in front of that; this does that for you whenever traffic from one service inside your virtual network talks to another service inside your virtual network.

The second example of how the routing mesh works is external traffic coming into your swarm, which can hit any of the nodes in your swarm. Any of the worker nodes are going to have that published port open and listening for that container's traffic, and the mesh will then reroute that traffic to the proper container based on its load balancing. What this means is that when you're deploying containers in a swarm, you're not supposed to have to care about which server a container is on, because that might move, right? If a container fails and the task is recreated by the swarm, it might be put on a different node, and you certainly don't want to have to change your firewall or your DNS settings to make that container work again. The routing mesh solves a lot of those problems by allowing our Drupal website on port 80 to be accessible from any node in the swarm.
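As a quick sketch of that container-to-container case (the service and network names here are just illustrative assumptions, not something set up in this lecture), you could peek at the VIP that Swarm assigns to a backend service like this:

    # hypothetical names, just to show where the VIP lives
    docker network create --driver overlay backend
    docker service create --name db --network backend --replicas 2 \
      -e POSTGRES_PASSWORD=example postgres
    # the VirtualIPs field is the private VIP that frontends reach via the "db" DNS name
    docker service inspect db --format '{{json .Endpoint.VirtualIPs}}'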
And in the background, it's taking those packets from that server and then routing them to the container. If the container is on a different node, it'll route it over the virtual networks; if it's on the same node, it'll just reroute it to the port of that container. And we didn't have to do anything to enable this. This was all out of the box.

Let's see a diagram of how this might work. If I created a new service and told it to have three replicas, and it created three tasks, with three containers, on three nodes, then inside that overlay network it's actually creating a virtual IP that's mapped to the DNS name of the service. And by default, the DNS name is the name of the service. In this case, I created a service called my-web, and any other containers I have in my overlay networks that need to talk to that service inside the swarm only have to worry about using the my-web DNS name. The virtual IP properly load balances the traffic amongst all the tasks in that service. This isn't actually DNS Round Robin; that's a slightly different configuration. We could enable that if we wanted to. There's actually an option to use DNS Round Robin. The benefit of VIPs over Round Robin is that a lot of the time the DNS caches inside our apps prevent us from properly distributing the load. Rather than fight with our DNS clients and DNS configuration, we're just relying on the VIP, which is kind of like what you would have if you bought a dedicated hardware load balancer.

The second example shows what it would be like with external traffic coming in. This is similar to what we just did with Drupal: when I created those yellow boxes by creating one service, called my-web, it created two tasks and applied them to two different nodes, and each one of those nodes has a built-in load balancer on the external IP address. For me, because I'm using DigitalOcean, that IP address is the one that DigitalOcean gave me. When I use -p and publish it on a port, in this example 8080, any traffic that comes in to any of these three nodes on port 8080 hits that load balancer. The load balancer decides which container should get the traffic and whether that traffic stays on the local node or needs to be sent over the network to a different node. Again, this all happens in the background without any special effort on our part.

All right.
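For reference, those setups look roughly like this (the network name and images are assumptions for illustration; the diagram in the lecture doesn't come with exact commands):

    # an overlay network for service-to-service traffic
    docker network create --driver overlay mynet
    # default endpoint mode: one VIP, and Swarm load balances behind it
    docker service create --name my-web --network mynet --replicas 3 nginx
    # optional: DNS Round Robin instead of a VIP (one DNS entry per task)
    docker service create --name my-web-rr --network mynet --replicas 3 \
      --endpoint-mode dnsrr nginx
    # external traffic: publish a port and every node in the swarm listens on it
    docker service create --name my-web-public --replicas 2 -p 8080:80 nginx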
Let's see this Routing Mesh in action. We already saw the example with Drupal and how it listens on all three nodes. But what if we had multiple tasks, so we can see that load balancer working? If we do a docker service create, what we're going to do is use an Elasticsearch container. We're going to call it search and give it three replicas, and we want to publish it on port 9200, which is its default port. In this case, we definitely want the :2 tag in there because it's the easiest version to deploy right now. While that's creating, I'll just mention that Elasticsearch is actually a search database that's accessible via a JSON web API, so it's really easy to hit with curl and gives us good examples of how this works.

If I do a docker service ps search, we can see that it's smartly created each task on a different node. If I just do a curl on my localhost on node1, on port 9200, because I published that port, I'll get back the Elasticsearch basic information. Part of that is that it will actually create a random name; that's just a feature of Elasticsearch. If I curl this multiple times, you'll see on this one it's Patch, the Patch server, and then a third time Jane Foster, and then it'll start repeating itself like such. That's actually the virtual IP acting as a load balancer and distributing my load across the three tasks.

A few more notes on the routing mesh. In 17.03, which is the release that I'm showing you this on, the routing mesh and the load balancing are currently a stateless load balancer. If you've ever dealt with state inside of maybe Amazon's classic load balancer, or other load balancer technologies, you know what this is about. This is basically saying that if you have to use session cookies in your application, or it expects a consistent container to be talking to a consistent client, then you may need to add some other things to help solve that problem. Out of the box, every time you hit a service with multiple tasks, it's potentially going to give you a different result.

Also, if you get into the details of this, it's effectively a Layer 4 load balancer, operating at the IP address and port level. It doesn't operate at the DNS or hostname layer. So if you've ever run multiple websites on the same port, on the same server, this isn't going to do that yet.
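The commands being run here look roughly like this (reconstructed from the narration, so treat the exact flags as a sketch):

    docker service create --name search --replicas 3 -p 9200:9200 elasticsearch:2
    docker service ps search   # each task should land on a different node
    curl localhost:9200        # repeat a few times; the random Elasticsearch name
                               # changes as the VIP sends requests to different tasks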
You're still going to need another piece of the puzzle on top of that if you actually want to run multiple websites on the same port, on the same swarm. Luckily, that's a pretty common request, and there are several options to solve both of these problems. One of them is to use Nginx or HAProxy; there are pretty good examples out there of containers that will sit in front of your routing mesh and act as a stateful load balancer, or a Layer 7 load balancer, that can also do caching and lots of other things. If you need that, you might want to check some of those out in the resources of this section. I should also mention that if you were to pay for a subscription to Docker Enterprise Edition, it comes with something called UCP, or Docker Datacenter, which is a web interface, and for your swarm nodes it also includes a built-in Layer 7 web proxy that allows you to just put DNS names in the web config of your swarm services, and everything just works.
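As a minimal sketch of that proxy-in-front idea (the image, network name, and config path are assumptions for illustration, not anything this lecture sets up), you might run a single proxy service published on port 80 that routes by hostname to services on a shared overlay network:

    # assumed setup: an overlay network, plus an nginx.conf with your virtual hosts
    # available at the same path on every node (or baked into a custom image)
    docker network create --driver overlay frontend
    docker service create --name proxy --network frontend -p 80:80 \
      --mount type=bind,source=/srv/nginx.conf,target=/etc/nginx/nginx.conf,readonly \
      nginx
    # each website service joins the same network; the proxy reaches it by service name
    docker service create --name site-a --network frontend nginx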