Provisioners Introduction
In this lesson we'll discuss Terraform provisioners in detail.
Terraform provisioner#
A provisioner in Terraform can run a script either remotely or locally after a resource has been created. HashiCorp added provisioners to Terraform to cover scenarios that the provider you are using does not natively support. I include them here for completeness. However, HashiCorp recommends treating them as a last resort because they are imperative: Terraform has no way of knowing how to apply a change you make to your script to the real world, as it does with normal resources.
For example, say you have a provisioner script that installs some yum packages on a Linux server and then runs a command. If you want to update that provisioner script, you need to ensure that the updated script works on a machine that has never had the script run just as well as on one that has already had the old version run. This headache can be avoided by using native Terraform resources in the first place, which is one of the main reasons provisioners are discouraged.
Project example#
The example code for provisioners is quite a bit more in-depth than the examples we have been using so far. By this point, though, we are ready to take on a bit of a more ambitious Terraform project.
variables.tf
file#
To set up this project, we’ll create a new file called variables.tf
and
paste in the following code:
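The listing itself is not reproduced in this text. As a minimal sketch consistent with the explanation later in this lesson, variables.tf might look like the following. Only the variable name my_ip comes from the lesson; the description, type, and commented-out default are assumptions.

```hcl
# variables.tf (sketch, not the original listing)
variable "my_ip" {
  description = "Your public IP address, used to restrict SSH access" # assumed wording
  type        = string

  # Assumption: uncomment and replace with your own address to avoid being prompted.
  # 203.0.113.10 is a placeholder from the documentation address range.
  # default = "203.0.113.10"
}
```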
output.tf
file#
We’ll create a file called output.tf
and paste in:
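The listing is not reproduced here. The lesson later says the output prints a curl command for testing the server, so a sketch might look like this. The output name command is hypothetical, and the sketch assumes the instance is declared as aws_instance.nginx, the name used in the connection block explanation.

```hcl
# output.tf (sketch, not the original listing)
output "command" {
  description = "curl command for testing the web server" # assumed wording
  value       = "curl http://${aws_instance.nginx.public_ip}"
}
```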
main.tf
file#
Then we’ll create a file called main.tf
and paste in the following code:
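The main.tf listing is not reproduced in this text. A main.tf for an AWS project normally begins with a provider block, so a plausible opening is sketched below. The region is an assumption; use whichever region you normally work in.

```hcl
# main.tf, opening lines (sketch, not the original listing)
provider "aws" {
  region = "us-east-1" # assumption: the lesson does not state a region
}
```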
📝Note: Please don’t forget to run
terraform destroy
to destroy the project after running all the desired commands.
Code explanation#
Let’s start at the top.
my_ip
variable#
The variable my_ip
should be set to your IP address. To find out your current IP address, visit https://www.whatismyip.com/ or run curl ifconfig.co
. You can then supply your IP address as the value of the my_ip
variable either by setting the default
parameter
or by creating a terraform.tfvars
file and setting my_ip = {your ip}
(or any of the other methods we have discussed in the Variables chapter). We are using this variable to allow only your IP address through the firewall that we will set up in AWS since it is generally bad practice to open port
22
to the whole world (0.0.0.0/0
).
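As an illustration of the terraform.tfvars approach, the file could contain a single line like the one below. The address is a placeholder from the documentation range, and the sketch assumes my_ip holds a bare IP address rather than a CIDR block.

```hcl
# terraform.tfvars (sketch)
# Assumption: my_ip is a bare IPv4 address; replace the placeholder with your own.
my_ip = "203.0.113.10"
```

Terraform loads terraform.tfvars automatically, so no extra command-line flag is needed.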
main.tf
file code explanation#
Turn to the file main.tf
. We are setting up a vpc
, which is a virtual private cloud in AWS. Essentially, this is a ring-fenced private network where we are going to put our infrastructure. The
internet gateway allows machines inside our VPC to access the internet. Next, we create a public subnet and a route table. We create a default route (0.0.0.0/0
) and point that to our internet gateway. This basically says that if you are looking for an IP address outside of the range of our VPC CIDR (10.0.0.0/16
), go out through the internet gateway. We then need to
associate this route table with our subnet using an aws_route_table_association
.
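The networking resources described above could be sketched as follows. The VPC CIDR 10.0.0.0/16 and the default route 0.0.0.0/0 come from the lesson; the resource names, the subnet CIDR, and map_public_ip_on_launch are assumptions made for illustration.

```hcl
# Networking portion of main.tf (sketch; resource names are hypothetical)
resource "aws_vpc" "vpc" {
  cidr_block = "10.0.0.0/16"
}

# Lets machines inside the VPC reach the internet.
resource "aws_internet_gateway" "gateway" {
  vpc_id = aws_vpc.vpc.id
}

resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = "10.0.1.0/24" # assumption: any range inside the VPC CIDR
  map_public_ip_on_launch = true          # assumption: instances get a public IP
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.vpc.id

  # Default route: anything outside the VPC CIDR goes out through the gateway.
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gateway.id
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
```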
AWS security groups#
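By default, AWS security groups, which act as a firewall around the resources they are attached to (here, our EC2 instance), deny all inbound traffic. AWS normally adds an allow-all outbound rule to a new security group, but Terraform removes it, so a security group created with Terraform starts out as DENY all. Thus, to allow traffic in and out of our instance, we need to open up ports in (ingress) and out (egress).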
This is what we are doing with the aws_security_group
resource. We will open port 22
ingress so that we can ssh
onto the server and run a script. We will use the my_ip
variable because we only want to open up port 22
to your IP.
Opening port 22
to the world would let anyone attempt to log in to the server. We will also open port 80
ingress so that we can reach the website served over HTTP on the server. This is fine to open to any address, since it only exposes our web server.
Lastly, we will create a rule so that all traffic can egress our server on any port. In practice this isn’t great and you would want to lock it down, but it makes getting our example up and running easier.
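The three rules described above could be sketched like this. The ports and the use of var.my_ip come from the lesson. The resource name, the group name, the /32 suffix (which assumes my_ip is a bare IP address), and the reference to a VPC declared as aws_vpc.vpc are assumptions.

```hcl
# Security group portion of main.tf (sketch; names are hypothetical)
resource "aws_security_group" "nginx" {
  name   = "nginx-example"
  vpc_id = aws_vpc.vpc.id # assumes the VPC is declared as aws_vpc.vpc

  # SSH only from your own address.
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["${var.my_ip}/32"]
  }

  # HTTP from anywhere.
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # All outbound traffic on any port and protocol.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```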
aws_key_pair
resource#
The aws_key_pair
resource creates a key pair in AWS. A keypair is used to ssh
into a server. The
keypair that we are setting up is read from a local file in the project directory. This keypair does not exist on
your machine yet. To create it, follow these steps:
- To generate the keypair, run ssh-keygen and answer the prompts:
  File name: nginx_key
  No passphrase
- You should now have two files in this directory, nginx_key and nginx_key.pub.
📝 Note that you should never check in the private key from a keypair into source control. This is the file without the
.pub
extension. It is also good practice to set the file permissions on your private key to 400.
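A sketch of the key pair resource, assuming the public key file nginx_key.pub sits next to main.tf. The resource name and key_name value are hypothetical.

```hcl
# Key pair portion of main.tf (sketch; names are hypothetical)
resource "aws_key_pair" "nginx" {
  key_name   = "nginx_key"
  public_key = file("${path.module}/nginx_key.pub") # only the public half is uploaded
}
```

Only the .pub file is read here; the private key stays on your machine and is used later for the ssh connection.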
aws_ami
block#
The aws_ami
block simply looks up the latest AWS AMI image to use for the EC2 instance that we are going to create.
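The lesson does not say which image is looked up. Since the script uses yum and the login user is ec2-user, the sketch below assumes Amazon Linux 2; the data source name and the name filter are assumptions.

```hcl
# AMI lookup portion of main.tf (sketch; assumes Amazon Linux 2)
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"] # assumed name pattern
  }
}
```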
aws_instance
#
The last part of the configuration is the aws_instance
. We have finally arrived at the resource where our
provisioner is defined. A provisioner is essentially a script that Terraform will execute. We start the provisioner block on the resource with the keyword provisioner
. After provisioner
, the value remote-exec
is the type of provisioner we want to use. remote-exec
means that Terraform will execute this provisioner remotely on the resource itself; in this case, on the EC2 machine. The other common option is local-exec
, where the script will run on the machine that Terraform is running from (i.e., your machine). Terraform also has a file provisioner for copying files to a resource. Older releases shipped vendor provisioners such as chef
and puppet
, but these were removed in Terraform 0.15 and we’re not going to discuss them in this course.
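For contrast with remote-exec, here is a minimal local-exec sketch. It is not part of the lesson's project: the null_resource, its name, and the echoed message are purely illustrative.

```hcl
# Illustration only: a local-exec provisioner runs on the machine running Terraform.
resource "null_resource" "local_example" {
  provisioner "local-exec" {
    command = "echo 'This runs on the machine where terraform apply was invoked'"
  }
}
```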
Setting the script#
We set the script that we want to run (which is a list of strings) as the inline parameter value.
Each element in the list corresponds to a line in our script. The script installs nginx on our server and starts it on port 80
. For those who do not know, nginx is a web server/proxy program that can do many cool things. For our purposes, though, we are simply using it as a standard web server. The script sets nginx to print the text Hello from nginx on AWS
when you hit it on port 80
.
Connection block#
The connection block tells the provisioner how to connect to the EC2 instance. We set the host
parameter to the server’s IP address that will be created (host = aws_instance.nginx.public_ip
) and type ssh
(as we want Terraform to ssh
onto the box) to run the script. The user is ec2-user
, which is the default user that AWS sets up for use with ssh
. We then set the private key to connect via ssh
. This is the private part of the keypair that we generated earlier.
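Putting the pieces above together, the instance with its provisioner could be sketched as follows. The remote-exec type, the inline list, the ssh connection, the ec2-user login, and the message text come from the lesson. The instance type, the exact install commands (which assume Amazon Linux 2), and the names aws_subnet.public, aws_security_group.nginx, aws_key_pair.nginx, aws_route_table_association.public, and data.aws_ami.amazon_linux are assumptions about the rest of main.tf. The sketch uses self.public_ip for the host, because current Terraform documentation says connection blocks should refer to their parent resource through self rather than by name.

```hcl
# Instance portion of main.tf (sketch; referenced names are hypothetical)
resource "aws_instance" "nginx" {
  ami                         = data.aws_ami.amazon_linux.id
  instance_type               = "t2.micro" # assumption
  subnet_id                   = aws_subnet.public.id
  vpc_security_group_ids      = [aws_security_group.nginx.id]
  key_name                    = aws_key_pair.nginx.key_name
  associate_public_ip_address = true

  # Make sure the subnet can reach the internet before the script runs.
  depends_on = [aws_route_table_association.public]

  provisioner "remote-exec" {
    # Each element is one line of the script, run as ec2-user (hence sudo).
    inline = [
      "sudo amazon-linux-extras install -y nginx1", # assumed install command
      "echo 'Hello from nginx on AWS' | sudo tee /usr/share/nginx/html/index.html",
      "sudo systemctl start nginx",
    ]

    connection {
      type        = "ssh"
      host        = self.public_ip
      user        = "ec2-user"
      private_key = file("${path.module}/nginx_key")
    }
  }
}
```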
Running the project#
Before running the project, you need to create the nginx_key file. Follow these steps:
- Click the RUN button
- Generate the keypair by running ssh-keygen and answering the prompts:
  Filename: nginx_key
  No passphrase
- You should now have two files in the project directory: nginx_key and nginx_key.pub. This keypair will be used with the EC2 instance that we create
- Run terraform apply
Terraform will take a couple of minutes to run because it takes a little while to spin up a new instance in AWS. You will see Terraform ssh
onto the box and run the script that we defined. We have a handy output
in our Terraform code that prints out a curl
command for you to run and test that everything is working. When you run this curl, you should see Hello from nginx on AWS
. You can also visit the IP address of the instance you have created in a browser, where you should see the same text.
Let’s take a moment and absorb what we have just done. We have just created our own private network, set up a firewall, created a web server, run a custom script on it, and got it to print a message when we visit a page. This should start to show you how easy it is to configure infrastructure using Terraform.
Prototyping in Terraform#
Terraform is often so fast to use that I reach for it even when I am doing a quick prototype. This comes with the added advantage that if the prototype works out, you have already written the Terraform code and can take it as a first draft to review. If it does not work out, you can use Terraform to destroy everything.
ssh
onto the machine#
If you want to ssh
onto the machine yourself, then you can use the command: ssh -i nginx_key ec2-user@{ip}
where {ip}
is the IP address of the machine that you created. The -i
flag tells
ssh
which private key file to use when connecting to the machine. Once on the machine, you can run curl localhost
to see the same output.
Do not forget to destroy this project when you are done by running terraform destroy
and
confirming with yes
.
Changing aws_instance
#
We started this chapter by saying that provisioners should be avoided. If that is the case, you may wonder if we can achieve the same outcome as we have just done by not using a provisioner. The
answer is yes! AWS has a built-in way to run a script upon a new machine start (most cloud providers do). To do this, set the script as the user_data
on the machine and remove the provisioner
code. Here is the changed aws_instance
to use user_data
:
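The listing is not reproduced in this text, so the following is a sketch reconstructed from the description in this section rather than the original code. The user_data argument, the shebang line, and the absence of sudo follow the lesson. The instance type, the install commands (which assume Amazon Linux 2), and the names of the referenced subnet, security group, key pair, and AMI data source are assumptions.

```hcl
# aws_instance using user_data instead of a provisioner (sketch)
resource "aws_instance" "nginx" {
  ami                         = data.aws_ami.amazon_linux.id
  instance_type               = "t2.micro" # assumption
  subnet_id                   = aws_subnet.public.id
  vpc_security_group_ids      = [aws_security_group.nginx.id]
  key_name                    = aws_key_pair.nginx.key_name
  associate_public_ip_address = true

  # Runs once as root on first boot, so no sudo is needed.
  user_data = <<-EOF
    #!/bin/bash
    set -ex
    amazon-linux-extras install -y nginx1
    echo 'Hello from nginx on AWS' > /usr/share/nginx/html/index.html
    systemctl start nginx
  EOF
}
```

The `<<-EOF` form strips the leading indentation, so the shebang still ends up at the start of the first line of the script.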
I have altered the script slightly in user_data
. The reason is that in AWS, user_data
is executed as the root user. Contrast that with running the script via a provisioner, where we were running as ec2-user
, which is a limited access user. In that case, to make system changes (such
as installing nginx), we had to elevate to root by prefixing our commands with sudo
. The other small difference is that we are starting the script with a shebang line (#!/bin/bash
). This is so it will be executed with bash. The script also runs set -ex
, which prints each command as it is executed and stops on the first error.
📝Note: You can see the output of the startup script by examining
/var/log/cloud-init-output.log
.