1
00:00:01,040 --> 00:00:02,470
And it's also a good spot.

2
00:00:02,470 --> 00:00:04,420
I've got a whole string of errors here.

3
00:00:04,420 --> 00:00:05,680
Yikes!

4
00:00:05,680 --> 00:00:07,970
Yeah, well, there are some reasons, frankly,

5
00:00:07,970 --> 00:00:11,080
why I'm generating so many errors, but in fact,

6
00:00:11,080 --> 00:00:15,340
it might wind up with that updating process failing because of it.

7
00:00:15,340 --> 00:00:17,010
I'm a hurting unit here, okay?

8
00:00:17,010 --> 00:00:19,100
Come on now, be gentle.

9
00:00:19,100 --> 00:00:22,180
Okay, so let's take a look while we're waiting for that,

10
00:00:22,180 --> 00:00:26,540
and we'll report on it back in the CAU application in a moment.

11
00:00:26,540 --> 00:00:29,110
Let's take a look at failover and failback.

12
00:00:29,110 --> 00:00:32,610
We recall that we've got the two main roles that I've seen

13
00:00:32,610 --> 00:00:35,820
in business on failover clustering, three, really:

14
00:00:35,820 --> 00:00:40,100
our highly available VMs, then we've got our file shares,

15
00:00:40,100 --> 00:00:42,840
which you can do the traditional highly available,

16
00:00:42,840 --> 00:00:43,900
active‑passive,

17
00:00:43,900 --> 00:00:48,440
or you could do Scale‑Out File Server active‑active with cluster shared volumes,

18
00:00:48,440 --> 00:00:49,220
and the SOFS,

19
00:00:49,220 --> 00:00:55,550
the Scale‑Out File Server is used to store Hyper‑V VHDs and configurations,

20
00:00:55,550 --> 00:00:56,120
and also,

21
00:00:56,120 --> 00:00:59,200
SQL Server databases when you're doing high availability

22
00:00:59,200 --> 00:01:01,150
in a failover cluster infrastructure.

23
00:01:01,150 --> 00:01:03,140
All right, so far so good.

24
00:01:03,140 --> 00:01:08,340
So let me select this vm1, and let's actually make a connection to it here,

25
00:01:08,340 --> 00:01:12,170
and we can see that it looks like I haven't yet installed an operating system,

26
00:01:12,170 --> 00:01:14,740
but we're going to go through that process here.

27
00:01:14,740 --> 00:01:20,390
And watch this, we're going to, we see that the Owner Node for vm1 is mem2.

28
00:01:20,390 --> 00:01:23,590
Well, I'm going to mosey on over to mem2,

29
00:01:23,590 --> 00:01:25,870
and I'm not going to Pause and Drain.

30
00:01:25,870 --> 00:01:28,580
This is what we would want to do if we were manually,

31
00:01:28,580 --> 00:01:32,180
gracefully taking a cluster host out of commission.

32
00:01:32,180 --> 00:01:34,220
Instead, I'm going to, well,

33
00:01:34,220 --> 00:01:37,410
I don't want to evict or stop the cluster service either,

34
00:01:37,410 --> 00:01:39,790
so actually, why don't I just do a Pause,

35
00:01:39,790 --> 00:01:43,340
Do Not Drain Roles.

36
00:01:43,340 --> 00:01:44,220
Let's just see.

37
00:01:44,220 --> 00:01:47,950
We still have availability here on the virtual machine.

38
00:01:47,950 --> 00:01:50,860
I am using cluster shared volumes, by the way,

39
00:01:50,860 --> 00:01:53,540
on this virtual machine connection.

40
00:01:53,540 --> 00:01:57,340
You know, the Pause really isn't giving us enough impact here,

41
00:01:57,340 --> 00:02:01,840
but let me do a Resume, Do Not Fail Roles Back,

42
00:02:01,840 --> 00:02:04,020
and let's just swing the heavier hammer here.

43
00:02:04,020 --> 00:02:11,360
Let me go to PowerShell and do an Invoke‑Command where the ComputerName is mem2,

44
00:02:11,360 --> 00:02:18,100
and the ScriptBlock I'm going to pass here is simply Stop‑Computer.

45
00:02:18,100 --> 00:02:22,190
That's going to be the equivalent of the machine having a catastrophic

46
00:02:22,190 --> 00:02:25,640
interruption where it's just going to go dark immediately.

47
00:02:25,640 --> 00:02:28,490
And that's more of actually a real‑world case.

48
00:02:28,490 --> 00:02:29,780
We can see Draining.

49
00:02:29,780 --> 00:02:35,990
So almost immediately, the failover cluster is aware that mem2 is in trouble,

50
00:02:35,990 --> 00:02:38,840
and we need to initiate a Role Drain.

51
00:02:38,840 --> 00:02:42,810
So if we go over to Roles, we can see the two things,

52
00:02:42,810 --> 00:02:45,940
actually, it looks like the VM already shifted,

53
00:02:45,940 --> 00:02:49,830
and the SDDC Group, that's our Storage Spaces Direct subsystem,

54
00:02:49,830 --> 00:02:53,360
that one is currently owned by mem2, and that's in a Pending state.

55
00:02:53,360 --> 00:02:55,540
It's taking a little bit longer,

56
00:02:55,540 --> 00:02:59,640
but that's going to come over to mem1 as the Owner Node as well,

57
00:02:59,640 --> 00:03:02,070
and it looks like my installation here.

58
00:03:02,070 --> 00:03:05,940
I've never actually tried a failover while I'm doing an OS installation.

59
00:03:05,940 --> 00:03:08,420
This might be a little bit different than if the

60
00:03:08,420 --> 00:03:10,940
machine were just started fresh.

61
00:03:10,940 --> 00:03:13,540
Maybe if I force a reset.

62
00:03:13,540 --> 00:03:17,200
I just wanted to get something on this VM up and running so we

63
00:03:17,200 --> 00:03:21,340
could see it in process survive a failover event.

64
00:03:21,340 --> 00:03:23,590
Yeah, well, we didn't even need to do the reset.

65
00:03:23,590 --> 00:03:25,260
It looks like it was just waiting.

66
00:03:25,260 --> 00:03:27,930
Oh, and then the reset kicked in.

67
00:03:27,930 --> 00:03:29,600
Okay, there we go.

68
00:03:29,600 --> 00:03:30,230
Well, that's fine.

69
00:03:30,230 --> 00:03:33,940
It's bringing us back to where we need to be, as you can see.

70
00:03:33,940 --> 00:03:36,310
I think the concept is proved though.

71
00:03:36,310 --> 00:03:38,200
We're able to shift that very,

72
00:03:38,200 --> 00:03:44,560
very easily because if we look at the C:\ClusterStorage mount point on each node,

73
00:03:44,560 --> 00:03:47,540
that's where our virtual machine, our vm1,

74
00:03:47,540 --> 00:03:51,550
both the VHD, as well as its configuration snapshots,

75
00:03:51,550 --> 00:03:54,120
those are all available to all cluster nodes,

76
00:03:54,120 --> 00:03:58,540
so it makes it a lot easier to perform that failover and failback.

77
00:03:58,540 --> 00:04:03,320
Now, I had mentioned that if we look at the Properties of a clustered role,

78
00:04:03,320 --> 00:04:06,210
this is where you can specify those Preferred Owners.

79
00:04:06,210 --> 00:04:10,440
Now you can select which one, or ones, it's going to be.

80
00:04:10,440 --> 00:04:12,800
You can adjust the priority.

81
00:04:12,800 --> 00:04:15,280
I want mem1 to be at the top of the list High,

82
00:04:15,280 --> 00:04:20,470
I want 1 and 2 to be my two preferred nodes.

83
00:04:20,470 --> 00:04:24,380
Actually, there's just the two of them, but notice that if you select just one,

84
00:04:24,380 --> 00:04:27,430
you can shift the Priority, High, Medium,

85
00:04:27,430 --> 00:04:32,040
Low, No Auto Start, depending upon how important the workload is.

86
00:04:32,040 --> 00:04:35,560
And then for failover, as I said, I'm going to do an Allow failback,

87
00:04:35,560 --> 00:04:36,350
Immediately.

88
00:04:36,350 --> 00:04:39,140
I probably should have had that set originally.

89
00:04:39,140 --> 00:04:41,890
That would be if I want the machine to go back to mem2.

90
00:04:41,890 --> 00:04:45,940
I actually don't, so I'm going to put that on Prevent failback.

91
00:04:45,940 --> 00:04:49,410
Let's balance out the cluster a bit because it looks like we've got some

92
00:04:49,410 --> 00:04:54,540
serious issues going on in terms of mem1 is hosting everything.

93
00:04:54,540 --> 00:04:55,700
Well, in order to do that,

94
00:04:55,700 --> 00:04:58,500
we're going to need to bring that machine back online,

95
00:04:58,500 --> 00:05:01,730
so why don't I just adjust this?

96
00:05:01,730 --> 00:05:03,960
Actually, I don't think I'll be able to hit the machine.

97
00:05:03,960 --> 00:05:08,040
No, because of the Invoke‑Command, I've got the machine flat on its back.

98
00:05:08,040 --> 00:05:12,200
But what I'm going to need to do actually is go to mem2 and start it.

99
00:05:12,200 --> 00:05:14,600
Through the magic of video editing, I did that.

100
00:05:14,600 --> 00:05:18,800
I can quickly verify, mstsc, mem2.

101
00:05:18,800 --> 00:05:23,990
Oh, I'm not sure I have my firewall configured for Remote Desktop.

102
00:05:23,990 --> 00:05:26,210
Oh, okay. Well, when all else fails,

103
00:05:26,210 --> 00:05:36,340
we can try something like Test‑NetConnection ‑ComputerName mem2.timw.info,

104
00:05:36,340 --> 00:05:40,940
‑Port 6516, that's the Windows Admin Center port.

105
00:05:40,940 --> 00:05:43,790
Test‑NetConnection is a great alternative to ping

106
00:05:43,790 --> 00:05:45,710
because you can hit on any port.

107
00:05:45,710 --> 00:05:45,950
Again,

108
00:05:45,950 --> 00:05:51,140
this False is probably indicative more than anything else of a firewall block.

109
00:05:51,140 --> 00:05:55,740
So, anyway, I think the best thing to do is to come back up here,

110
00:05:55,740 --> 00:05:59,440
and we can see, very easily, that mem2 is back up.

111
00:05:59,440 --> 00:06:07,000
So if we want to balance this out and shift SDDC Group back, we can do a Move, Select Node.