1 00:00:01,040 --> 00:00:02,470 And it's also a good spot. 2 00:00:02,470 --> 00:00:04,420 I've got a whole string of errors here. 3 00:00:04,420 --> 00:00:05,680 Yikes! 4 00:00:05,680 --> 00:00:07,970 Yeah, well, there are some reasons, frankly, 5 00:00:07,970 --> 00:00:11,080 why I'm generating so many errors, but in fact, 6 00:00:11,080 --> 00:00:15,340 it might wind up with that updating process failing because of it. 7 00:00:15,340 --> 00:00:17,010 I'm a hurting unit here, okay? 8 00:00:17,010 --> 00:00:19,100 Come on now, be gentle. 9 00:00:19,100 --> 00:00:22,180 Okay, so let's take a look while we're waiting for that, 10 00:00:22,180 --> 00:00:26,540 and we'll report on it back in the CAU application in a moment. 11 00:00:26,540 --> 00:00:29,110 Let's take a look at failover and failback. 12 00:00:29,110 --> 00:00:32,610 We recall that we've got the two main roles that I've seen 13 00:00:32,610 --> 00:00:35,820 in business on failover clustering, three, really: 14 00:00:35,820 --> 00:00:40,100 our highly available VMs, then we've got our file shares, 15 00:00:40,100 --> 00:00:42,840 which you can do the traditional highly available, 16 00:00:42,840 --> 00:00:43,900 active‑passive, 17 00:00:43,900 --> 00:00:48,440 or you could do Scale‑Out File Server active‑active with cluster shared volumes, 18 00:00:48,440 --> 00:00:49,220 and the SOFS, 19 00:00:49,220 --> 00:00:55,550 the Scale‑Out File Server, is used to store Hyper‑V VHDs and configurations, 20 00:00:55,550 --> 00:00:56,120 and also, 21 00:00:56,120 --> 00:00:59,200 SQL Server databases when you're doing high availability 22 00:00:59,200 --> 00:01:01,150 in a failover cluster infrastructure. 23 00:01:01,150 --> 00:01:03,140 All right, so far so good. 
24 00:01:03,140 --> 00:01:08,340 So let me select this vm1, and let's actually make a connection to it here, 25 00:01:08,340 --> 00:01:12,170 and we can see that it looks like I haven't yet installed an operating system, 26 00:01:12,170 --> 00:01:14,740 but we're going to go through that process here. 27 00:01:14,740 --> 00:01:20,390 And watch this, we're going to, we see that the Owner Node for vm1 is mem2. 28 00:01:20,390 --> 00:01:23,590 Well, I'm going to mosey on over to mem2, 29 00:01:23,590 --> 00:01:25,870 and I'm not going to Pause and Drain. 30 00:01:25,870 --> 00:01:28,580 This is what we would want to do if we were manually, 31 00:01:28,580 --> 00:01:32,180 gracefully taking a cluster host out of commission. 32 00:01:32,180 --> 00:01:34,220 Instead, I'm going to, well, 33 00:01:34,220 --> 00:01:37,410 I don't want to evict or stop the cluster service either, 34 00:01:37,410 --> 00:01:39,790 so actually, why don't I just do a Pause, 35 00:01:39,790 --> 00:01:43,340 Do Not Drain Roles. 36 00:01:43,340 --> 00:01:44,220 Let's just see. 37 00:01:44,220 --> 00:01:47,950 We still have availability here on the virtual machine. 38 00:01:47,950 --> 00:01:50,860 I am using cluster shared volumes, by the way, 39 00:01:50,860 --> 00:01:53,540 on this virtual machine connection. 40 00:01:53,540 --> 00:01:57,340 You know, the Pause really isn't giving us enough impact here, 41 00:01:57,340 --> 00:02:01,840 but let me do a Resume, Do Not Fail Roles Back, 42 00:02:01,840 --> 00:02:04,020 and let's just swing the heavier hammer here. 43 00:02:04,020 --> 00:02:11,360 Let me go to PowerShell and do an Invoke‑Command where the ComputerName is mem2, 44 00:02:11,360 --> 00:02:18,100 and the ScriptBlock I'm going to pass here is simply Stop‑Computer. 45 00:02:18,100 --> 00:02:22,190 That's going to be the equivalent of the machine having a catastrophic 46 00:02:22,190 --> 00:02:25,640 interruption where it's just going to go dark immediately. 
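The console actions in this part of the demo have rough PowerShell equivalents. The following is a minimal sketch, not the presenter's script: it assumes the FailoverClusters module is installed and that the commands run on a cluster node (or with a `-Cluster` parameter added). The node name `mem2` comes from the demo; the final `Get-ClusterGroup` check is an addition for illustration.

```powershell
# Sketch only: needs the FailoverClusters module and a reachable cluster.
Import-Module FailoverClusters

# Pause mem2 without draining its roles (Pause > Do Not Drain Roles).
Suspend-ClusterNode -Name mem2

# Resume mem2 without failing roles back (Resume > Do Not Fail Roles Back).
Resume-ClusterNode -Name mem2 -Failback NoFailback

# Shut mem2 down remotely, as typed in the demo.
Invoke-Command -ComputerName mem2 -ScriptBlock { Stop-Computer }

# Added for illustration: list each role with its current owner node and state.
Get-ClusterGroup | Select-Object Name, OwnerNode, State
```

For a graceful maintenance pause, `Suspend-ClusterNode -Name mem2 -Drain` corresponds to the Pause and Drain option mentioned earlier.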
47 00:02:25,640 --> 00:02:28,490 And that's more of actually a real‑world case. 48 00:02:28,490 --> 00:02:29,780 We can see Draining. 49 00:02:29,780 --> 00:02:35,990 So almost immediately, the failover cluster is aware that mem2 is in trouble, 50 00:02:35,990 --> 00:02:38,840 and we need to initiate a Role Drain. 51 00:02:38,840 --> 00:02:42,810 So if we go over to Roles, we can see the two things, 52 00:02:42,810 --> 00:02:45,940 actually, it looks like the VM already shifted, 53 00:02:45,940 --> 00:02:49,830 and the SDDC Group, that's our Storage Spaces Direct subsystem, 54 00:02:49,830 --> 00:02:53,360 that one is currently owned by mem2, and that's in a Pending state. 55 00:02:53,360 --> 00:02:55,540 It's taking a little bit longer, 56 00:02:55,540 --> 00:02:59,640 but that's going to come over to mem1 as the Owner Node as well, 57 00:02:59,640 --> 00:03:02,070 and it looks like my installation here. 58 00:03:02,070 --> 00:03:05,940 I've never actually tried a failover while I'm doing an OS installation. 59 00:03:05,940 --> 00:03:08,420 This might be a little bit different than if the 60 00:03:08,420 --> 00:03:10,940 machine were just started fresh. 61 00:03:10,940 --> 00:03:13,540 Maybe if I force a reset. 62 00:03:13,540 --> 00:03:17,200 I just wanted to get something on this VM up and running so we 63 00:03:17,200 --> 00:03:21,340 could see it in process survive a failover event. 64 00:03:21,340 --> 00:03:23,590 Yeah, well, we didn't even need to do the reset. 65 00:03:23,590 --> 00:03:25,260 It looks like it was just waiting. 66 00:03:25,260 --> 00:03:27,930 Oh, and then the reset kicked in. 67 00:03:27,930 --> 00:03:29,600 Okay, there we go. 68 00:03:29,600 --> 00:03:30,230 Well, that's fine. 69 00:03:30,230 --> 00:03:33,940 It's bringing us back to where we need to be, as you can see. 70 00:03:33,940 --> 00:03:36,310 I think the concept is proved though. 
71 00:03:36,310 --> 00:03:38,200 We're able to shift that very, 72 00:03:38,200 --> 00:03:44,560 very easily because if we look at the C:\ClusterStorage mount point on each node, 73 00:03:44,560 --> 00:03:47,540 that's where our virtual machine, our vm1, 74 00:03:47,540 --> 00:03:51,550 both the VHD, as well as its configuration snapshots, 75 00:03:51,550 --> 00:03:54,120 those are all available to all cluster nodes, 76 00:03:54,120 --> 00:03:58,540 so it makes it a lot easier to perform that failover and failback. 77 00:03:58,540 --> 00:04:03,320 Now, I had mentioned that if we look at the Properties of a clustered role, 78 00:04:03,320 --> 00:04:06,210 this is where you can specify those Preferred Owners. 79 00:04:06,210 --> 00:04:10,440 Now you can select which one, or ones, it's going to be. 80 00:04:10,440 --> 00:04:12,800 You can adjust the priority. 81 00:04:12,800 --> 00:04:15,280 I want mem1 to be at the top of the list High, 82 00:04:15,280 --> 00:04:20,470 I want 1 and 2 to be my 2 preferred nodes. 83 00:04:20,470 --> 00:04:24,380 Actually, there's just the two of them, but notice that if you select just one, 84 00:04:24,380 --> 00:04:27,430 you can shift the Priority, High, Medium, 85 00:04:27,430 --> 00:04:32,040 Low, No Auto Start, depending upon how important the workload is. 86 00:04:32,040 --> 00:04:35,560 And then for failover, as I said, I'm going to do an Allow failback, 87 00:04:35,560 --> 00:04:36,350 Immediately. 88 00:04:36,350 --> 00:04:39,140 I probably should have had that set originally. 89 00:04:39,140 --> 00:04:41,890 That would be if I want the machine to go back to mem2. 90 00:04:41,890 --> 00:04:45,940 I actually don't, so I'm going to put that on Prevent failback. 91 00:04:45,940 --> 00:04:49,410 Let's balance out the cluster a bit because it looks like we've got some 92 00:04:49,410 --> 00:04:54,540 serious issues going on in terms of mem1 is hosting everything. 
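The role properties set in the dialog above can also be set from PowerShell. This is a hedged sketch rather than what the demo ran: it assumes the FailoverClusters module is available and that the clustered role's group name is `vm1`, matching the VM name shown in the demo.

```powershell
# Sketch only: needs the FailoverClusters module and a reachable cluster.
Import-Module FailoverClusters

# Preferred owners for the vm1 role, listed in order of preference.
Set-ClusterOwnerNode -Group vm1 -Owners mem1, mem2

# Assumption: the cluster group is named vm1, like the VM itself.
$group = Get-ClusterGroup -Name vm1

# Start-up priority: 3000 = High, 2000 = Medium, 1000 = Low, 0 = No Auto Start.
$group.Priority = 3000

# Failback policy: 0 = prevent failback, 1 = allow failback.
$group.AutoFailbackType = 0

# Read the settings back to confirm them.
Get-ClusterOwnerNode -Group vm1
$group | Format-List Name, Priority, AutoFailbackType
```

Setting `AutoFailbackType` to 0 matches the choice made at the end of this segment, where the role is left on mem1 instead of returning to mem2.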
93 00:04:54,540 --> 00:04:55,700 Well, in order to do that, 94 00:04:55,700 --> 00:04:58,500 we're going to need to bring that machine back online, 95 00:04:58,500 --> 00:05:01,730 so why don't I just adjust this? 96 00:05:01,730 --> 00:05:03,960 Actually, I don't think I'll be able to hit the machine. 97 00:05:03,960 --> 00:05:08,040 No, because of the Invoke‑Command, I've got the machine flat on its back. 98 00:05:08,040 --> 00:05:12,200 But what I'm going to need to do actually is go to mem2 and start it. 99 00:05:12,200 --> 00:05:14,600 Through the magic of video editing, I did that. 100 00:05:14,600 --> 00:05:18,800 I can quickly verify, mstsc, mem2. 101 00:05:18,800 --> 00:05:23,990 Oh, I'm not sure I have my firewall configured for Remote Desktop. 102 00:05:23,990 --> 00:05:26,210 Oh, okay. Well, when all else fails, 103 00:05:26,210 --> 00:05:36,340 we can try something like Test‑NetConnection ‑ComputerName mem2.timw.info, 104 00:05:36,340 --> 00:05:40,940 ‑Port 6516, that's the Windows Admin Center port. 105 00:05:40,940 --> 00:05:43,790 Test‑NetConnection is a great alternative to ping 106 00:05:43,790 --> 00:05:45,710 because you can hit on any port. 107 00:05:45,710 --> 00:05:45,950 Again, 108 00:05:45,950 --> 00:05:51,140 this False is probably indicative more than anything else of a firewall block. 109 00:05:51,140 --> 00:05:55,740 So, anyway, I think the best thing to do is to come back up here, 110 00:05:55,740 --> 00:05:59,440 and we can see, very easily, that mem2 is back up. 111 00:05:59,440 --> 00:06:07,000 So if we want to balance this out and shift SDDC Group back, we can do a Move, Select Node.
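The connectivity check and the role move from this segment can be sketched in PowerShell as follows. The host name, port, and role name are taken from the demo. The sketch assumes the FailoverClusters module is installed, and the `Get-ClusterNode` check is an addition standing in for the visual confirmation in the console.

```powershell
# Sketch only: needs the FailoverClusters module and a reachable cluster.
Import-Module FailoverClusters

# TCP reachability test against a specific port, unlike ping.
$result = Test-NetConnection -ComputerName mem2.timw.info -Port 6516

# True if the TCP connection succeeded; False often points to a firewall block.
$result.TcpTestSucceeded

# Added for illustration: confirm that mem2 has rejoined the cluster.
Get-ClusterNode -Name mem2

# Move the SDDC Group role back to mem2 (Move > Select Node in the console).
Move-ClusterGroup -Name 'SDDC Group' -Node mem2
```

A virtual machine role would typically be moved with `Move-ClusterVirtualMachineRole` instead, which supports live migration.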