Data Sources in Detail

This lesson will teach you about Terraform data sources and how they are used.

Terraform data source#

A data source in Terraform is used to fetch data from a resource that is not managed by the current Terraform project, so that the data can be used in the current project. You can think of it as a read-only resource: the object already exists, and you want to read specific properties of it for use in your project.


Project example#

Let’s dive into an example:

main.tf file of data source project example
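
A minimal sketch of what this main.tf could look like, based on the description in this lesson. The provider region, the resource identifier my_bucket_policy, and the policy name are assumptions made for illustration; the data block, the POLICY marker, and the arn reference follow the text.

```hcl
# Sketch of main.tf. The region and the policy name are assumptions.
provider "aws" {
  region = "eu-west-1" # assumed; use the region your bucket lives in
}

# Look up an S3 bucket that already exists outside this project.
data "aws_s3_bucket" "bucket" {
  bucket = "kevholditch-already-exists"
}

# Create a policy that allows listing the bucket found by the data source.
resource "aws_iam_policy" "my_bucket_policy" {
  name = "my-bucket-policy" # hypothetical name

  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "${data.aws_s3_bucket.bucket.arn}"
      ]
    }
  ]
}
POLICY
}
```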

As you can see from the above project, a data source block starts with the word “data”. The next word is the type of data source. We are using an aws_s3_bucket data source, which is used to look up an S3 bucket. After the data source type, we give the data source an identifier, in this case "bucket". The identifier is used to reference the data source inside the Terraform project. The data source block is opened with a {. You then specify the properties you want Terraform to use to search for the resource. We are using the complete name of the S3 bucket we are looking for. You then close the data source block with a }.

Referencing a bucket#

Rather than creating the bucket as we did before, here we are referencing a bucket that already exists. So, before you run the above project, you will need to create an S3 bucket with the name that you specify inside the data block. In the example above, the bucket would be called kevholditch-already-exists. S3 bucket names are globally unique, so you will need to choose a different name for your own bucket. Name it anything you want, but make sure to paste the name into the bucket property in the data source.
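
One way to keep the bucket name in a single place is to pass it in through an input variable. This is only a sketch: the variable bucket_name is hypothetical and not part of the lesson's project, and the data block shown here would replace the one in main.tf rather than sit alongside it.

```hcl
# Hypothetical input variable holding the name of the existing bucket.
variable "bucket_name" {
  type        = string
  description = "Name of an S3 bucket that already exists"
  default     = "kevholditch-already-exists"
}

# The data source searches for the bucket by the name given in the variable.
data "aws_s3_bucket" "bucket" {
  bucket = var.bucket_name
}
```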

AWS IAM policy#

At the bottom of this project, we create an AWS IAM policy that gives permission to list the bucket that we looked up in the data source. There are a couple of new concepts in the aws_iam_policy resource that we first need to introduce. The IAM policy itself is a multi-line string enclosed between <<POLICY and POLICY. This is how you define a multi-line string (also known as a heredoc) in Terraform. You open the multi-line string with << followed by any identifier you wish as a single word. I have used POLICY in the example above since we are defining an IAM policy, but you can use any label, such as <<STATEMENT or <<IAM. You then start your multi-line string on the next line and finish it with the same identifier, this time without the <<.

📝Note: The closing marker must be at the start of a new line. Otherwise, it is a syntax error.
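
The multi-line string syntax described above can be tried on its own, without any provider. The sketch below uses the STATEMENT label instead of POLICY; the local values and the output name are made up for illustration.

```hcl
locals {
  # Hypothetical value used only to show interpolation inside the string.
  bucket_name = "example-bucket"

  # Opens with <<STATEMENT and ends with STATEMENT at the start of a line.
  statement = <<STATEMENT
This string spans several lines.
The bucket is called ${local.bucket_name}.
STATEMENT
}

# Prints the rendered multi-line string after terraform apply.
output "statement" {
  value = local.statement
}
```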

S3 bucket data source usage#

Inside the IAM policy, we are using the S3 bucket data source. We take the arn from the S3 bucket so that we can use it in our IAM policy. You will notice that to get the value, we use the interpolation syntax ${data.aws_s3_bucket.bucket.arn}. The opening ${ and closing } are necessary because we are inside a multi-line string. They tell Terraform that we want it to evaluate this expression rather than treat it as literal text. The format of a data source expression is data.<data_type>.<data_identifier>.<attribute_name>. You can get a full list of the attributes that a data source provides from the provider's documentation website.
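
To make the reference format concrete, here is a small standalone sketch that reads two attributes of the bucket. The output names are hypothetical. It assumes a recent Terraform version (0.12 or later), where a reference outside a string needs no ${ }, while a reference inside a string does.

```hcl
# Repeated here so that the snippet stands alone.
data "aws_s3_bucket" "bucket" {
  bucket = "kevholditch-already-exists"
}

# Outside a string: data.<data_type>.<data_identifier>.<attribute_name>
output "bucket_arn" {
  value = data.aws_s3_bucket.bucket.arn
}

# Inside a string: the same references wrapped in ${ }
output "bucket_summary" {
  value = "Bucket ${data.aws_s3_bucket.bucket.id} has ARN ${data.aws_s3_bucket.bucket.arn}"
}
```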


Running the project#

Clicking on the terminal runs terraform init to initialise Terraform, followed by terraform apply.

📝Note: Before running the project, change the bucket name in the data block to the name of your own bucket.

This code requires the following environment variables to execute: access_key_id and secret_access_key.

You will see that Terraform created only a single resource, the IAM policy. This is because the S3 bucket we are using was created outside of Terraform; the data source only reads it.

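📝Note: Once you are done with the project, do not forget to run terraform destroy and then confirm with yes to delete all of the resources managed by this project. The S3 bucket is not managed by Terraform, so it will not be deleted; remove it yourself if you no longer need it.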
