1
00:00:01,140 --> 00:00:06,240
What is the ASR Recovery Plan and how might this work specifically?

2
00:00:06,240 --> 00:00:06,500
Well,

3
00:00:06,500 --> 00:00:09,690
we saw in the previous module that when you're configuring

4
00:00:09,690 --> 00:00:12,260
backup with the Recovery Services vault,

5
00:00:12,260 --> 00:00:14,250
you have the backup policy.

6
00:00:14,250 --> 00:00:19,090
There is a similar policy object in the Recovery Services vault for ASR called

7
00:00:19,090 --> 00:00:22,870
the recovery plan and it's a way to do the orchestration.

8
00:00:22,870 --> 00:00:23,560
Okay.

9
00:00:23,560 --> 00:00:25,340
So let's take a look here.

10
00:00:25,340 --> 00:00:28,610
You might configure your failover environment such

11
00:00:28,610 --> 00:00:31,540
that when you initiate that failover,

12
00:00:31,540 --> 00:00:35,730
the recovery plan initiates a shutdown of your source machines.

13
00:00:35,730 --> 00:00:39,270
Hopefully it makes sense that you want those machines down and

14
00:00:39,270 --> 00:00:43,230
deallocated and offline because you want your failover

15
00:00:43,230 --> 00:00:45,640
environment to become the primary one.

16
00:00:45,640 --> 00:00:47,370
Failover is initiated.

17
00:00:47,370 --> 00:00:50,740
And then if this is a multi‑tier workload,

18
00:00:50,740 --> 00:00:53,000
you might want to deploy the data tier first.

19
00:00:53,000 --> 00:00:54,840
Now what's being deployed?

20
00:00:54,840 --> 00:00:57,230
You have to understand that with ASR,

21
00:00:57,230 --> 00:01:00,850
it's the storage that is principally being replicated.

22
00:01:00,850 --> 00:01:03,440
The VM configuration is trivial.

23
00:01:03,440 --> 00:01:06,340
It's the storage really that is the most important,

24
00:01:06,340 --> 00:01:08,910
and in your ASR failover environment,

25
00:01:08,910 --> 00:01:13,740
Azure doesn't actually create the VMs until you do the failover.

26
00:01:13,740 --> 00:01:18,240
Then it will build out those VMs and you determine the order in

27
00:01:18,240 --> 00:01:21,940
which those VMs get built in your recovery plan.

28
00:01:21,940 --> 00:01:25,880
You also can layer in scripts before or after each phase.

29
00:01:25,880 --> 00:01:29,780
So maybe after the database cluster VMs are up,

30
00:01:29,780 --> 00:01:32,970
you can run a PowerShell script to make sure that services

31
00:01:32,970 --> 00:01:36,680
are started and connectivity is working and so on before you

32
00:01:36,680 --> 00:01:39,060
move to the business tier where, again,

33
00:01:39,060 --> 00:01:42,670
you can inject scripts before or after that process,

34
00:01:42,670 --> 00:01:45,230
then maybe lastly, you've got your web front end,

35
00:01:45,230 --> 00:01:47,460
and the idea here is you don't want your business

36
00:01:47,460 --> 00:01:50,040
tier up until your data tier is up,

37
00:01:50,040 --> 00:01:53,790
you don't want your web tier up until your business and data tiers are up.

38
00:01:53,790 --> 00:01:54,710
You see what I mean?

39
00:01:54,710 --> 00:01:59,640
Again, you can inject scripts at any point before or after these processes,

40
00:01:59,640 --> 00:02:01,320
and then at that point,

41
00:02:01,320 --> 00:02:07,190
you're ready to either commit that failover or choose another recovery point.
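
To make that ordering concrete, here is a minimal sketch in Python. It is not the ASR API; the `Group` class, the `run_recovery_plan` function, and the VM names are all hypothetical, made up only to illustrate tiers coming up in order with optional scripts before or after each one.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Group:
    """One tier in a hypothetical recovery plan (not an ASR type)."""
    name: str
    vms: List[str]
    pre_actions: List[Callable[[], None]] = field(default_factory=list)
    post_actions: List[Callable[[], None]] = field(default_factory=list)


def run_recovery_plan(groups: List[Group], log: List[str]) -> List[str]:
    """Shut down the source side, then bring up each group in order."""
    log.append("shut down source machines")
    for group in groups:
        for action in group.pre_actions:    # scripts before the tier
            action()
        for vm in group.vms:                # VMs are built at failover time
            log.append(f"start {vm}")
        for action in group.post_actions:   # scripts after the tier
            action()
    return log


log: List[str] = []
plan = [
    # Data tier first, with a health check once its VMs are up.
    Group("data", ["sql-1", "sql-2"],
          post_actions=[lambda: log.append("check database services")]),
    Group("business", ["app-1"]),
    Group("web", ["web-1"]),
]
run_recovery_plan(plan, log)
print("\n".join(log))
```

The point of the sketch is only the sequencing: the business tier never starts before the data tier's post-action has run, and the web tier comes last.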

42
00:02:07,190 --> 00:02:11,140
You see, part of your setup for ASR, I know there is a lot to it,

43
00:02:11,140 --> 00:02:12,440
there is no question,

44
00:02:12,440 --> 00:02:15,530
part of the setup is determining the frequency that

45
00:02:15,530 --> 00:02:19,400
replication snapshots are kept, and when you do a failover,

46
00:02:19,400 --> 00:02:23,170
if you realize that the most recent snapshot

47
00:02:23,170 --> 00:02:25,930
isn't the best state of your workload,

48
00:02:25,930 --> 00:02:28,590
you can actually do the failover and then choose

49
00:02:28,590 --> 00:02:31,640
another recovery point to fail over again.

50
00:02:31,640 --> 00:02:34,340
It isn't until you issue a commit that you can no

51
00:02:34,340 --> 00:02:37,800
longer choose a recovery point, and you'll find at that time,

52
00:02:37,800 --> 00:02:43,380
hopefully, your disaster recovery workload is up and running and functional.
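
The rule that you can keep changing the recovery point until you commit can be sketched as a tiny state machine. This is an illustration only, assuming a hypothetical `Failover` class and made-up recovery point labels; it does not call any Azure service.

```python
class Failover:
    """Hypothetical model of a failover that can be redone until committed."""

    def __init__(self, recovery_points):
        self.recovery_points = list(recovery_points)
        self.current = None       # recovery point currently failed over to
        self.committed = False

    def fail_over(self, point):
        # After commit, the recovery point can no longer be changed.
        if self.committed:
            raise RuntimeError("failover already committed")
        if point not in self.recovery_points:
            raise ValueError(f"unknown recovery point: {point}")
        self.current = point

    def commit(self):
        if self.current is None:
            raise RuntimeError("nothing to commit")
        self.committed = True


failover = Failover(["latest", "one-hour-ago"])
failover.fail_over("latest")         # first attempt
failover.fail_over("one-hour-ago")   # latest was not a good state, so try again
failover.commit()                    # from here on the choice is final
```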

53
00:02:43,380 --> 00:02:46,160
Now, how would you handle a failback?

54
00:02:46,160 --> 00:02:51,040
Let's say the primary environment is repaired, it's back online, what do you do?

55
00:02:51,040 --> 00:02:51,200
Well,

56
00:02:51,200 --> 00:02:53,450
I don't have a separate slide for that so I'll just

57
00:02:53,450 --> 00:02:55,090
walk you through the process.

58
00:02:55,090 --> 00:02:58,180
There is not a simple button that says failback, I'm afraid.

59
00:02:58,180 --> 00:03:01,150
What you're going to do is enable replication,

60
00:03:01,150 --> 00:03:05,520
but in the reverse direction, and wait for synchronization to happen.

61
00:03:05,520 --> 00:03:10,840
Now depending upon how long your downtime was, that could be a significant wait.

62
00:03:10,840 --> 00:03:14,960
Once your storage between your failover and your primary

63
00:03:14,960 --> 00:03:17,340
environment is again synchronized,

64
00:03:17,340 --> 00:03:20,190
you would do the same process you see on this slide,

65
00:03:20,190 --> 00:03:21,910
but in the reverse direction.

66
00:03:21,910 --> 00:03:22,950
You see what I mean?

67
00:03:22,950 --> 00:03:25,660
So once you complete that second failover,

68
00:03:25,660 --> 00:03:29,640
you'll find that your primary environment is now again primary

69
00:03:29,640 --> 00:03:32,950
and your secondary is not being replicated to.

70
00:03:32,950 --> 00:03:36,440
So at the end of that "failback",

71
00:03:36,440 --> 00:03:41,070
don't forget to reverse the direction of replication again so that

72
00:03:41,070 --> 00:03:43,680
in the future you can initiate another failover.

73
00:03:43,680 --> 00:03:46,890
You'll see in just a second when we do the demo that you can

74
00:03:46,890 --> 00:03:50,640
conduct and should conduct test failovers.

75
00:03:50,640 --> 00:03:52,140
This is a way for you, again,

76
00:03:52,140 --> 00:03:55,580
to do fire drills so you and your team are familiar with how

77
00:03:55,580 --> 00:03:59,840
to work with failover recovery in ASR, you know how to do the work,

78
00:03:59,840 --> 00:04:03,540
and it becomes more of a muscle memory thing.

79
00:04:03,540 --> 00:04:04,380
Well, here you go.

80
00:04:04,380 --> 00:04:05,560
I surprised myself.

81
00:04:05,560 --> 00:04:08,480
It looks like I do have a separate slide for failback.

82
00:04:08,480 --> 00:04:10,410
Let's go over it now, shall we?

83
00:04:10,410 --> 00:04:13,740
It's good I've already walked you through it verbally.

84
00:04:13,740 --> 00:04:15,100
Let's take a look graphically.

85
00:04:15,100 --> 00:04:15,730
Yes.

86
00:04:15,730 --> 00:04:19,260
So after you've committed your failover and you're ready to fail back,

87
00:04:19,260 --> 00:04:22,340
you'll reconfigure replication in the opposite direction,

88
00:04:22,340 --> 00:04:27,740
wait for protected status, initiate a failover in the reverse direction,

89
00:04:27,740 --> 00:04:31,510
commit, and then re‑enable replication in the original direction.
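
The failback steps just listed can be written down as an ordered checklist. The sketch below is a plain Python illustration, not an ASR command sequence; `FAILBACK_STEPS` and `next_step` are hypothetical names, and the step text simply restates the slide.

```python
from typing import List, Optional

# The five failback steps, in the order they have to happen.
FAILBACK_STEPS = [
    "reconfigure replication in the opposite direction",
    "wait for protected status",
    "initiate a failover in the reverse direction",
    "commit",
    "re-enable replication in the original direction",
]


def next_step(completed: List[str]) -> Optional[str]:
    """Return the next failback step, or None when all steps are done."""
    # Completed steps must match the start of the checklist exactly.
    if completed != FAILBACK_STEPS[:len(completed)]:
        raise ValueError("steps were run out of order")
    if len(completed) == len(FAILBACK_STEPS):
        return None
    return FAILBACK_STEPS[len(completed)]


done: List[str] = []
while (step := next_step(done)) is not None:
    print(f"{len(done) + 1}. {step}")
    done.append(step)
```

Keeping the last step in the list is deliberate: it is the one that is easiest to forget, and without it the next failover has nothing to fail over to.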

90
00:04:31,510 --> 00:04:33,390
There it is, so I did have it laid out.

91
00:04:33,390 --> 00:04:37,000
Okay, enough theory. Let's get into the demo.