Welcome back to BackSpace Academy. In this lecture I'm going to run through architecture design for solutions architects. We'll start by going through the five pillars of the Well-Architected Framework, those being security, reliability, performance efficiency, cost optimization and operational excellence. Then we'll have a look at an example architecture, and then we'll also point to where you can go to find more of these example architectures that you should have a look at before you sit the exam.

There are a number of general design principles that we should take into consideration when we're doing our architecture. We should scale automatically, and scale horizontally. We don't want to be replacing servers with new servers of a bigger capacity: that takes time, it's not cost-efficient, and it's not the right thing to do. We should scale horizontally, and do that automatically based on demand. We should have production-scale testing: we should be testing in an environment similar to what we would expect from production. We need to automate our creation and replication of systems, obviously to make it easier to tear down and recreate systems, using services such as CloudFormation. We need to allow for evolutionary architectures, making sure that our design is not set in stone and that we are flexible to new technology as it comes in, because AWS is very much an evolutionary service.
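The scale-out principle can be sketched with a toy capacity calculation. Everything here is hypothetical (the capacity units and demand figures are invented, not real AWS instance sizes); it is only meant to show why adding small instances tracks demand more closely than jumping to a bigger server.

```python
# Toy model: horizontal scaling with identical small instances
# (hypothetical capacity units, not real AWS instance sizes).

def instances_needed(demand, capacity_per_instance):
    """Smallest fleet of identical instances that covers demand."""
    return -(-demand // capacity_per_instance)  # ceiling division

def provisioned_capacity(demand, capacity_per_instance):
    """Capacity actually provisioned when scaling horizontally."""
    return instances_needed(demand, capacity_per_instance) * capacity_per_instance
```

With small instances, the over-provisioned slack is never more than one instance's worth of capacity, which is the intuition behind scaling horizontally and automatically rather than stepping up to a bigger box.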
Data-driven architectures: we need to take the emotion out of what services we choose. For example, you might want Oracle, or you might want NoSQL as a database service, so you need to make sure that the data you're using drives that decision, and make sure that you use data to drive improvement going forward as well.

You should also regularly simulate events in a production environment, and a good example of that is Chaos Monkey. That is an open source project created by Netflix, and if you go to the Netflix GitHub repository for Chaos Monkey you can download it and install it on your infrastructure. What it does is basically go inside your infrastructure, running around shutting down instances and creating chaos inside your environment, and it will be looking to see how resilient your environment is and how quickly those EC2 instances that are shut down are recreated and back online. So that's something you can do in a production environment, and it's certainly recommended to do that sort of thing in a production environment: it's far better to do it when it's planned rather than when it's unplanned.
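The Chaos-Monkey-style experiment can be mimicked in miniature with a pure-Python sketch. This is not the Netflix tooling and touches no real instances; the fleet, the kill fraction and the "auto-heal" step are all invented for illustration.

```python
import random

def chaos_round(fleet, kill_fraction, rng):
    """Randomly 'terminate' a fraction of instances, then heal back to the
    original fleet size, mimicking an Auto Scaling group's recovery."""
    survivors = [i for i in fleet if rng.random() > kill_fraction]
    replacements = [f"replacement-{n}" for n in range(len(fleet) - len(survivors))]
    return survivors + replacements

fleet = [f"i-{n:04d}" for n in range(10)]
healed = chaos_round(fleet, 0.3, random.Random(42))
# However many instances the chaos round kills, the healed fleet is back
# at full strength, which is the resilience property being tested.
```

The point of running such an experiment on real infrastructure is exactly what the lecture describes: verifying that terminated instances come back automatically, on a schedule you chose rather than one an outage chose for you.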
So, the five pillars of the AWS Well-Architected Framework. First, security: we need to protect our information, our systems and our assets. We need to conduct regular risk assessments of any changes that we make and of any deployments that we make, and we need to have mitigation strategies in place if something goes wrong.

Reliability: making sure that our infrastructure is highly available, fault tolerant and elastic, so it can cope with changes in demand. We need to have good configuration management, so that if something does go wrong we understand what the configuration should look like, and if there are any changes to the configuration we can look at those as being causes of any problems.

Performance efficiency: we obviously need to make sure that we're getting the best bang for our buck, making sure that our infrastructure is efficient, making sure that it can cope with changes in demand, and also making sure that it can cope with changes in technology.

Cost optimization: reducing costs, making sure that we have billing alerts, making sure that we have our resources tagged, and making sure that we understand where our sub-optimal resources are, looking to downgrade those resources or terminate them if they're not required.

And operational excellence: making sure that we get maximum business value from our infrastructure and that we continuously improve our architecture. We make small gains on a regular basis; we do not assume that we have the best architecture available, and we make sure that we continuously improve that architecture.
The design principles that we need for security: we need to apply security at all layers, not just the firewall. We also need to assume that a threat might come from within our own infrastructure. Our firewall could be penetrated, or we could have, for example, a contractor or a disgruntled employee who may instigate an attack from within our infrastructure. We need to enable traceability, making sure that we know what's going on inside our infrastructure. Granting least privilege: making sure that people only have the access that they need and nothing more. We need to focus on our responsibility to secure our system. It is not AWS's responsibility to secure our system; it is a shared responsibility model, and we need to take responsibility for what we can affect.

We need to automate our security best practices, those best practices being: Identity and Access Management. Root access: don't use it; lock those credentials away and don't use them. Create users, groups and roles, and make sure wherever possible that you have implemented multi-factor authentication. Use identity federation when you need access from outside of your system; when you need to give access to a large number of people and you need to control that, identity federation is the best way to do it. Implement detective controls: CloudTrail is a great service for auditing, generating logs of exactly what is going on inside your infrastructure.
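Least privilege can be written down as a narrowly scoped policy document. The bucket name and actions below are hypothetical, and the checker is only a sketch of the allow-list idea, not AWS's actual policy evaluation (it ignores Deny statements, wildcards and resource matching).

```python
# Hypothetical least-privilege policy: read-only access to one S3 bucket.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-reports",
                "arn:aws:s3:::example-reports/*",
            ],
        }
    ],
}

def is_action_allowed(policy, action):
    """Sketch of an allow-list check: anything not explicitly allowed is denied."""
    return any(
        stmt["Effect"] == "Allow" and action in stmt["Action"]
        for stmt in policy["Statement"]
    )
```

The shape of the policy is the point: it names only the actions and only the resource that the user actually needs, and everything else falls through to an implicit deny.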
You can link CloudTrail to Kinesis streams and CloudWatch alarms, and you can also link it further to a Lambda function that could implement some corrective action. And infrastructure protection: making sure our network is secure. We have a VPN if needed, or Direct Connect; we can implement a WAF if we require one; and CloudFront, which is great for mitigating denial of service attacks. We should also look at penetration testing, but if we do go down the path of penetration testing we need to make sure that we notify AWS that we're going to be doing it, so that they don't think that penetration testing is actually a real penetration attack.

Data protection: making sure that we classify our data. What is sensitive data, what do we need to encrypt, what should be public and what should be private? How are we going to encrypt our data, and how are we going to manage the encryption keys for it? Implementing version control: making sure that if something does go wrong we can always go back to a previous version. And incident response: blocking offending IP addresses in our network access control lists, for example. We can conduct forensics in a clean environment; we don't have to use our production environment as such. We can quarantine an environment and conduct our forensics on that environment.
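That quarantine-rather-than-touch-production idea can be sketched as an event handler of the kind you might wire behind CloudTrail and Lambda. The event shape, the suspicious event names and the blocklist action here are simplified assumptions for illustration, not the real CloudTrail schema.

```python
# Sketch of automated incident response (hypothetical event format).
SUSPICIOUS_EVENTS = {"DeleteTrail", "AuthorizeSecurityGroupIngress"}

def handle_event(event, blocklist):
    """Flag the source IP of a suspicious API call for quarantine,
    leaving the production environment itself untouched."""
    if event.get("eventName") in SUSPICIOUS_EVENTS:
        ip = event.get("sourceIPAddress")
        if ip and ip not in blocklist:
            blocklist.add(ip)
            return {"action": "quarantine", "ip": ip}
    return {"action": "none"}
```

The handler only records and flags; the actual quarantine and forensics then happen in a clean, isolated environment, as described above.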
We can also look at DNS failover in the event that we have a major problem with our infrastructure.

Just looking at the diagram there, which kind of shows the different layers that we can implement at the security group level. At the web layer we've got port 80 and 443 access open to the internet. Then we go down to our app server layer, which only the web server layer is permitted to access, or from outside only port 22, so SSH into that application server, and that would normally be limited to an IP address of your corporate office network. Then we go down to the next layer, the DB server, and access to that is only permitted from the app server; no traffic over the wider internet is allowed access to our database server. So that's obviously a good practice to implement if you've got a web application.

You will hear a lot in discussions within AWS around reliability. You will hear the comment made to design for failure: if you design for failure, then nothing will fail. What that means is that we need to not assume that our infrastructure is bulletproof and that nothing will cause it to fail. We need to assume that something is going to go wrong, and we need to make sure that we have appropriate mitigation strategies in place for that.
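The layered security groups described above can be expressed as data. The web and SSH ports come from the lecture; the app and database ports (8080, 3306), the source labels and the rule format are assumptions for illustration, not the EC2 security group API.

```python
# Each layer only accepts traffic from the layer (or source) in front of it.
SECURITY_GROUPS = {
    "web": [
        {"port": 80, "source": "0.0.0.0/0"},         # open to the internet
        {"port": 443, "source": "0.0.0.0/0"},
    ],
    "app": [
        {"port": 8080, "source": "sg-web"},           # only from the web layer
        {"port": 22, "source": "corporate-office"},   # SSH from the office IP only
    ],
    "db": [
        {"port": 3306, "source": "sg-app"},           # only from the app layer
    ],
}

def allowed(layer, port, source):
    """True if a security group rule for this layer permits the traffic."""
    return any(
        rule["port"] == port and rule["source"] == source
        for rule in SECURITY_GROUPS[layer]
    )
```

Writing the rules down this way makes the "security at all layers" property checkable: there is simply no rule that lets internet traffic reach the database layer.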
So, the design principles around reliability: we need to test recovery procedures, and we need to make sure that we automatically recover from failure, using things such as health checks on our ALBs and on our EC2 instances, and making sure that we recover quickly from a failed EC2 instance, for example. We need to scale horizontally to increase aggregate system availability, so if one of those instances is taken out it's not going to affect the aggregate system overall, and we can recover from that more easily than if we have a small number of instances and one is taken out. Stop guessing capacity: we shouldn't try to predict the future; we should have strategies in place to change our capacity in relation to demand, so obviously having CloudWatch alarms, auto scaling instances and that sort of thing. Use automation for changes: use CloudFormation, use Elastic Beanstalk, use container services, but automate those changes and manage those changes within the automation itself.

The best practices around reliability. Foundations: making sure that our VPCs isolate the environments that we have for testing, and making sure we identify any on-premise issues; do we have appropriate bandwidth available for our connection to Amazon Web Services? Change management: making sure that we can respond to changes, and making sure that when demand exceeds what we expected we have elasticity in place to react to it.
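The automatic-recovery behaviour of health checks plus an Auto Scaling group can be sketched with no AWS calls at all; the instance names, health predicate and desired count below are hypothetical.

```python
def reconcile(instances, is_healthy, desired_count):
    """One pass of an auto-recovery loop: drop instances that fail the
    health check and launch replacements up to the desired count."""
    healthy = [i for i in instances if is_healthy(i)]
    replacements = [f"new-{n}" for n in range(desired_count - len(healthy))]
    return healthy + replacements
```

A real Auto Scaling group runs this kind of reconciliation continuously: the desired count is the source of truth, and instances that fail their health checks are simply terminated and replaced rather than repaired.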
Making sure that we have version control of our CloudFormation templates, and making sure that we understand what has changed, so we can go back to previous versions. We can look at CloudTrail logging, and we can look at AWS Config, which details our resources and the configuration changes that have been made to our infrastructure; that, combined with CloudTrail, is a very good fault-finding resource. Failure management: obviously we need to have a good backup and recovery strategy, we need to have highly available and fault tolerant architecture, and we need to test, on a regular basis.

Performance efficiency, and the design principles around it, are that we need to make sure that we use advanced technologies as a service. One of the great features of Amazon Web Services is that it is a very broad offering, and it has some really quite high-tech services available for you. A good example would be machine learning: for you to actually get something like that up and running, and to understand the very complex algorithms around it, would be quite difficult if you're not a specialist in that area. You can go global in minutes. You should use serverless architectures, S3 and again Lambda; you don't have to worry about your architecture then, you can have a serverless
environment. Experimentation: again, this is all part of continually improving. Experiment with things, see what works and see what doesn't. And making sure that your architecture has mechanical sympathy: it is aligned with our business. We need to understand what our business objectives are, and whether this architecture is aligned towards those business objectives.

The best practices around performance efficiency. Selection: making sure that we have the appropriate compute resources. What are we using: instances, containers, or Lambda functions? Storage: what are the access methods, are we using block, file or object storage? The patterns of access, random or sequential; the throughput required; the frequency of access; the frequency of updates; and the availability and durability constraints that we're working within. With databases we need to have a data-driven approach, again, no emotion there. We need the data to tell us: are we going to go with NoSQL, are we going to go with SQL, and what is the best solution for us? Also look at non-database solutions: using a search engine such as Elasticsearch, or using a data warehouse such as Redshift. And network: looking at CloudFront, using Route 53 for latency-based routing, and also don't forget about Direct Connect, certainly the gold standard of networking with Amazon Web Services.
We need to have regular reviews, making sure that we are up to date with new innovations and releases from AWS and with any other applications that are running on our AWS infrastructure. We need to monitor: we need to use CloudWatch, we need to have performance metrics, and those performance metrics should be aligned with what we're trying to achieve. And we need to have trade-offs in place: we need an optimal approach, and that needs to be variable over time, so we don't have something set in stone. We need to look at consistency, durability, space versus time, latency: what are we trading off here, and what is going to give us the best bang for our buck in relation to the business objectives that we're trying to achieve?

Cost optimization, and the design principles around that: we have a consumption model, we pay as we use, and we have economies of scale with AWS; it's a great tool for that. We can really reduce our costs because we're not employing people to look after our infrastructure, and we don't have the risk involved in that. So again, stop spending money on data center operations: outsource it completely, put it on AWS, it just makes sense. Analyze and attribute expenditure: making sure that we have resource tagging, and making sure that we have billing alerts.
Use managed services to reduce the cost of ownership. So again, if we're looking for something like machine learning, or Hadoop, or Amazon QuickSight for analysis, these are great managed services that will greatly reduce our cost of ownership.

And we have a number of best practices around it. Using cost-effective resources: a good example of that is EC2 instances, where we have a number of different options available. We can look at using Spot Instances to bid on capacity; if we find that we don't particularly need something done at this particular time, we can do it when the price is good, so don't forget about Spot Instances. Matching supply and demand: auto scaling, making sure that we have CloudWatch alarms that are going to trigger auto scaling events, and making sure that we have a decoupled environment, using SQS for example to decouple our environment. We need to make sure that we're aware of our expenditure, again billing alerts and resource tagging. And we need to optimize over time, making sure that we fully understand what new technology is out there, and making sure that we're decommissioning resources that are not required. We can do that using AWS Trusted Advisor, which is a great tool that will identify areas we can improve and resources that we're not utilizing properly.
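Resource tagging is what makes attributing expenditure possible. Here is a toy aggregation with made-up resources and costs; it is not the Cost Explorer API, just the idea of grouping spend by tag so that untagged spend stands out.

```python
from collections import defaultdict

# Hypothetical monthly costs and tags for a handful of resources.
RESOURCES = [
    {"id": "i-1",  "monthly_cost": 120.0, "tags": {"team": "web"}},
    {"id": "i-2",  "monthly_cost": 80.0,  "tags": {"team": "web"}},
    {"id": "db-1", "monthly_cost": 200.0, "tags": {"team": "data"}},
    {"id": "i-3",  "monthly_cost": 50.0,  "tags": {}},  # untagged: a red flag
]

def cost_by_tag(resources, tag_key):
    """Sum monthly cost per value of one tag."""
    totals = defaultdict(float)
    for resource in resources:
        totals[resource["tags"].get(tag_key, "untagged")] += resource["monthly_cost"]
    return dict(totals)
```

Once spend is attributed this way, the "untagged" bucket and any team's sub-optimal resources become obvious candidates for downgrading or termination.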
So here's a simple example of cost optimization in relation to elasticity. We can see there, on the red line, we've got the actual demand, and we don't try to predict that in any way; it's not cost-effective for us to try to predict it, because there is always going to be an error in that. The blue dotted line is a scale-up approach, our vertical scaling approach. What happens is that when we find our demand is starting to creep in on our capacity, and we are at risk of having problems because of that, we go out and replace a server with another, bigger server, and we have a big step change there. The problem is that when we do have that step, we have a big gap between our actual demand and our actual capacity to service that demand, so it's expensive, it's costly to do that. Then we can go to a scale-out approach, a manual scale-out approach, where we're adding smaller instances as demand increases. We're still going to get that step pattern, but it's going to be better than scaling up vertically. And then we can look at automated elasticity, where we would have an Elastic Load Balancer and an Auto Scaling group that would scale out and in depending on our demand, so that capacity would closely follow our actual demand.
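The difference between the scale-up step changes and automated elasticity can be made concrete with a toy simulation; the demand curve and capacity sizes below are invented purely for illustration.

```python
def wasted_capacity(demand, capacities):
    """Total over-provisioning across the period (capacity minus demand)."""
    return sum(c - d for d, c in zip(demand, capacities))

demand = [10, 30, 55, 80, 60, 35]

# Scale-up: one big step to the next server size as demand nears capacity.
step_capacity = [50, 50, 100, 100, 100, 100]

# Automated elasticity: small instances of size 10 added and removed
# so that provisioned capacity hugs the demand curve.
elastic_capacity = [-(-d // 10) * 10 for d in demand]
```

Both approaches always cover demand, but the elastic fleet wastes far less capacity, which is exactly the gap between the lines in the diagram.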
The gap between the capacity implemented through automated elasticity and our actual demand will be significantly less, and we will obviously be saving a significant amount of money if we go down that approach.

Operational excellence, and that is the last of the five pillars. The design principles around it: perform operations with code. Don't manually perform operations; try to do it with code, and then you'll have good version control around it, you can identify what an appropriate configuration is, and you can go back and identify when problems occurred. So always perform operations with code. Make sure that what you're doing is aligned with your business objectives, and of course make sure that if you have performance metrics, they are closely aligned with your business strategy. Make regular, small, incremental changes: don't make big changes, make small ones, but more of them. Test for responses to unexpected events, and learn from unexpected events and failures. And keep your operations procedures current: don't write them once and leave them; they are a living document, and they should be regularly updated as our architecture is also regularly updated.

So, best practices around this. Preparation: we need to identify problems at the design stage. It is far better and far less expensive
to fix problems at the design stage; to do it after it's rolled out into our infrastructure is far more difficult and exposes us to far more risk. We have operations checklists: how are we going to operate this, and what do we need to check off to make sure that the infrastructure is operated correctly? We have runbooks, which clearly detail how to operate a mature infrastructure, and playbooks, which are what we use if something goes wrong. We need to be able to respond if we have a business continuity event: we need to look at our playbooks and see how we're going to mitigate the damage and how we're going to recover from that event. And we have CloudFormation: using CloudFormation to define our architecture is always a good thing in operations. Making sure that we have a continuous integration and continuous deployment environment, and making sure that the monitoring we implement is aligned with our business objectives. And in relation to responses, we need to automate responses wherever possible, using predefined playbooks that detail what needs to occur when we have an unexpected event. That will include a list of the appropriate stakeholders, and it'll have the escalation process that we need to go through to make sure the relevant people are aware of what is going on, and also the procedures that are going to be followed.
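A playbook can itself live in version control as structured data, which is what makes an automated, repeatable response possible. The incident type, steps, stakeholders and escalation timeout below are placeholders invented for illustration, not a real runbook.

```python
# Hypothetical playbook kept under version control alongside the code.
PLAYBOOKS = {
    "db-failover": {
        "steps": [
            "verify standby is in sync",
            "promote standby to master",
            "update application endpoint",
            "notify stakeholders",
        ],
        "stakeholders": ["oncall-dba", "service-owner"],
        "escalate_after_minutes": 15,
    },
}

def next_step(playbook, completed):
    """Return the first step not yet completed, or None when done."""
    for step in playbook["steps"]:
        if step not in completed:
            return step
    return None
```

Because the steps, stakeholders and escalation rules are data rather than tribal knowledge, an automated responder (or a person at 3 a.m.) can walk the same procedure every time.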
So I'm just going to finish up here by having a look at a model architecture for you, an AWS reference architecture for a web application. What we've got there, if we go from the start: we've got Amazon Route 53, which we're going to use for DNS resolution for our domain name. Then we're also going to have Amazon CloudFront, and that is going to be delivering our resources and static content that's not going to be changing too much. We can also use it for our dynamic resources, which would have a low TTL, or else we can just let those pass straight through to our Elastic Load Balancer. Our Elastic Load Balancer is going to distribute requests across the EC2 instances that look after our web application layer, and they will be added and terminated by an Auto Scaling group. Then we have another layer, our application layer, with an internal load balancer, and that internal load balancer will distribute requests from the web application layer across, again, another Auto Scaling group of application EC2 servers. Then those application servers will be able to communicate with an Amazon RDS master database, and because it's operating in a Multi-AZ environment, we're going to also have a standby instance there with synchronous
replication between them. So that is a really good architecture that you can just use if you are going to implement a web application in your enterprise; there's no point in reinventing the wheel here. There are a number of these reference architectures, quite a lot of them, and you need to have a look at those and make sure that you understand them, because they are what AWS defines as being a good framework for certain applications and use cases.

Those reference architectures are available at the AWS architecture site: go to aws.amazon.com/architecture and that will have AWS reference architectures, a whole heap of them, so make sure that you have a look at those and understand them before you go ahead with sitting the exam. It will also have AWS QuickStart reference deployments. You won't necessarily need to understand those for the exam, but be aware that they are there, and they're very good, because again you're not reinventing the wheel. If you want to deploy, for example, a MongoDB database with replication, or whatever, there will be a QuickStart reference deployment for that, so have a look at those as well and take advantage of them in your working environment.

So that's it for architecture as it relates to being a certified Solutions Architect Associate. I will see you in the next lesson.