Terraform is an open-source tool developed by HashiCorp that allows developers and DevOps engineers to write infrastructure as code for most public cloud platforms, such as AWS, GCP, and Azure. The core purpose of Terraform is to let users define both cloud and on-premises resources in a human-readable configuration language, HCL (HashiCorp Configuration Language), with JSON available as an alternative syntax. Since the infrastructure is defined as code, it can be added to version-control systems like Git, configured, automated, and easily shared with other users.
Terraform has a modular approach to writing infrastructure as code. It allows users to publish various Terraform providers into a registry for others to download for their projects. There are a plethora of providers available, some of the most popular ones being AWS, Kubernetes, Azure, and GCP.
To make configurations more robust and dynamic, Terraform lets users define input variables and reference them in any resource. Resources and modules can also expose output values, and these outputs can in turn serve as inputs when creating other resources.
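For example, a minimal sketch of this pattern (the variable, resource, and output names here are illustrative, not taken from the article’s code):

```hcl
# An input variable, a resource that uses it, and an output derived from
# the resource. Names and values are placeholders.
variable "environment" {
  type    = string
  default = "dev"
}

resource "aws_s3_bucket" "logs" {
  bucket = "app-logs-${var.environment}"
}

output "logs_bucket_name" {
  value = aws_s3_bucket.logs.bucket
}
```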
While there are a ton of resources available on the internet for working with Terraform, this article dives a bit deeper into some of Terraform’s advanced concepts: multi-module setups, handling multiple state files, managing multiple AWS accounts with workspaces, and using Terragrunt.
To learn more about Terraform in detail, please refer to the official documentation.
The source code used in this article can be found here.
In a simple Terraform project, all the resources are declared in a root module that can be used to deploy resources to the cloud. However, with the increase in infrastructure and project complexities, it becomes almost impossible to manage all the resources under the root module.
Terraform introduces the concept of defining multiple modules that are sub-modules of the root module. These sub-modules can be defined and stored in separate directories and referenced from the root module. One of the critical aspects of using sub-modules is setting dependencies between the modules. For example, if module A depends on module B, then it can be specified that module A gets deployed only after all the resources from module B have been deployed successfully.
Let’s understand this better via a practical example. Consider a scenario where you need to deploy two buckets to S3. We can consider the following naming convention:
main-bucket
dependent-bucket
Although it is a reasonably simple deployment, the idea is to showcase how to define and use a multi-module setup in Terraform. The source code is available here. We will define the resources for both these buckets in separate modules, which will be referenced from the root module. The directory structure of the project is as follows:
Fig. 1: Directory structure for multi-module setup
The root.tf file is the root module, and provider.tf contains information about the AWS provider, the backend state, and credentials. Under the modules directory, we have two sub-modules defined as main-bucket and dependent-bucket:
<script src="https://gist.github.com/aveek22/5a4da55745b936b8128c7ecbc8842fb9.js"></script>
<script src="https://gist.github.com/aveek22/91b0894493d5b357dc365e156e491c25.js"></script>
<script src="https://gist.github.com/aveek22/20849cf8cbde4ec2956c35ff462a950c.js"></script>
Fig. 2: Terraform multi-module resource definitions
The actual S3 resources are defined in the main.tf and dependent.main.tf files under each sub-module, whereas the root module references each sub-module by its local path in the source argument. Also, notice how the dependent bucket’s module block has a depends_on attribute specifying that it depends on the other sub-module.
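The root module follows roughly this shape (a sketch based on the directory structure above; the exact attributes in the gists may differ):

```hcl
# root.tf (sketch): the root module wires both sub-modules together.
module "main_bucket" {
  source = "./modules/main-bucket"
}

module "dependent_bucket" {
  source = "./modules/dependent-bucket"

  # Deploy this module only after everything in main_bucket has been created.
  depends_on = [module.main_bucket]
}
```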
From the terraform-module directory, run the terraform init command to initialize Terraform. Next, run terraform plan to generate the deployment plan. The output of this command will be as follows:
<script src="https://gist.github.com/aveek22/4f7407501d8225e4728ffd10fd393b48.js"></script>
Once the plan is confirmed, run terraform apply to deploy the resources to your AWS account. This will create two buckets with different names in your account, a simple use case for a multi-module setup.
To work with multiple state files in Terraform, it is important to first understand the concept of a state file. When you execute Terraform code, it generates a plan and then executes that plan to deploy resources to the cloud environment. The state file is a JSON mapping between the resources defined in your Terraform files and the infrastructure created in the cloud. Each time Terraform runs, it checks the status of the infrastructure recorded in the state file and generates a plan based on that.
Sometimes, you may need multiple state files to deploy your resources. A popular use case is deploying the same resources to multiple environments, such as development, QA, and production. Although it is possible to deploy to multiple environments using a single state file, it is a best practice to store the state for each environment separately. This also gives developers more granular control over deployments to the different environments in the cloud. Let’s take a look at a practical example.
Consider a scenario where you must deploy resources to two environments: production and QA. Both of these environments will reside on the same AWS account. In such cases, it would be nice to separate both environments using multiple state files, one for each environment. The directory structure of this approach could be as follows:
Fig. 3: Directory structure for multiple state files
Following the codebase here, we have two directories, env-prod and env-qa, that contain the root modules root-prod.tf and root-qa.tf, respectively. The resources are declared as sub-modules to keep them modular; in this case, we have an S3 bucket resource that can be referenced from both environments. Also, notice variables.tf, where we declare the environment variable used as a suffix to the bucket name. The value of this variable is provided from the root module, which passes it on to the sub-module:
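A sketch of what root-prod.tf could look like under these assumptions (the state bucket name, key paths, and sub-module path are placeholders; root-qa.tf would differ only in its backend key and env_name value):

```hcl
# root-prod.tf (sketch): each environment keeps its own backend state key.
terraform {
  backend "s3" {
    bucket = "my-terraform-state"            # assumed state bucket
    key    = "state/prod/terraform.tfstate"  # prod-specific key
    region = "us-east-1"
  }
}

module "s3_bucket" {
  source   = "../modules/s3-bucket"  # assumed sub-module path
  env_name = "prod"                  # suffix appended to the bucket name
}
```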
In this approach, we declare a separate backend key for each environment, and the corresponding state files are created once the resources are deployed. Also, notice how the variable env_name is assigned different values in the prod and QA root modules.
To initialize Terraform, follow the steps discussed in the previous section: run terraform init from the terraform-state directory, then terraform plan and terraform apply to deploy the resources. The output will be as follows:
<script src="https://gist.github.com/aveek22/96c102e29010fa0d3548d3624e6aa8d4.js"></script>
There are several reasons why an organization might set up multiple AWS accounts with Terraform, such as segregating resources by environment, business line, or department.
In larger organizations, where multiple production workloads are running, it is a best practice to isolate production and non-production resources. From a developer’s perspective, this means maintaining the Terraform resources in one codebase while being able to deploy them to multiple AWS accounts.
We’ll explore how to implement a multi-account configuration with Terraform using workspaces combined with provider aliases.
Terraform workspaces are available through the Terraform CLI. Workspaces allow developers to maintain isolated state for the same configuration. To view the existing workspaces, simply open a terminal and type terraform workspace list. This will show the list of available workspaces; by default, there is a single workspace named “default.”
Let’s create two more workspaces, dev and prod, using the following commands:
terraform workspace new dev
terraform workspace new prod
Snippet 1: Input Scripts
Select the development Terraform workspace using terraform workspace select dev.
Next, we’ll define the Terraform providers for the dev and prod workspaces as follows:
<script src="https://gist.github.com/aveek22/19600c678b35c931764db8daf5aa3ede.js"></script>
Notice that there are two provider blocks with different aliases and profiles. The aliases let us explicitly choose which provider to use for a given resource or module; we will use them later to create our resources based on the selected workspace. Each profile refers to an AWS profile defined in your .aws directory, so you need to set up your AWS credentials and config files with the corresponding access and secret keys.
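The aliased provider blocks take roughly this shape (the region and profile names are assumptions):

```hcl
# Aliased AWS providers (sketch): the profile values must match profiles
# defined in your .aws/credentials file.
provider "aws" {
  alias   = "dev"
  region  = "us-east-1"
  profile = "dev"
}

provider "aws" {
  alias   = "prod"
  region  = "us-east-1"
  profile = "prod"
}
```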
For example, your .aws/credentials file should look like this:
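(The profile names below match the aliases above, and the key values are placeholders.)

[dev]
aws_access_key_id     = <dev-access-key>
aws_secret_access_key = <dev-secret-key>

[prod]
aws_access_key_id     = <prod-access-key>
aws_secret_access_key = <prod-secret-key>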
Now, we will define the main module for the multi-account setup. This module will invoke a sub-module that creates a bucket in S3. The bucket name and the target account depend on the selected Terraform workspace. Modify the main module as follows:
<script src="https://gist.github.com/aveek22/5505b47a4886fe9cf06f5e8d39821f58.js"></script>
Here, we have two modules, “s3_bucket_dev” and “s3_bucket_test”, that refer to the same sub-module but with different parameters and provider aliases.
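One plausible shape for these module blocks is sketched below (the workspace check, variable names, and bucket names are assumptions; only the module names come from the text):

```hcl
# Sketch: count keys off the selected workspace, and providers maps each
# module block to one of the aliased providers defined earlier.
module "s3_bucket_dev" {
  source      = "./modules/s3"
  count       = terraform.workspace == "dev" ? 1 : 0
  bucket_name = "demo-bucket-dev"

  providers = {
    aws = aws.dev
  }
}

module "s3_bucket_test" {
  source      = "./modules/s3"
  count       = terraform.workspace == "prod" ? 1 : 0
  bucket_name = "demo-bucket-test"

  providers = {
    aws = aws.prod
  }
}
```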
The sub-module for S3 contains the following script:
<script src="https://gist.github.com/aveek22/e349cfaf49c271de7e408c2560d50b9a.js"></script>
When all these pieces of code are put together, we can initialize Terraform and run terraform plan and terraform apply to deploy the changes. Make sure that the correct Terraform workspace is selected.
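For example, deploying to the development account would look like this (the same sequence applies to prod once its workspace is selected):
terraform workspace select dev
terraform init
terraform plan
terraform apply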
This will create an S3 bucket on the development AWS account. Similarly, to create the S3 bucket on the production account, select the production Terraform workspace and run Terraform:
Fig. 8: Terraform plan output - PROD
Terragrunt is a wrapper around Terraform that allows developers and DevOps engineers to keep their Terraform configurations clean and reuse them across different modules. It supports the use of multiple modules and also helps manage the remote Terraform state.
Some of the key features that Terragrunt provides are described below.
For example, consider a use case where you need to define multiple Terraform state files in your project. By default, Terraform doesn’t allow variables in the backend configuration of your project. However, with the help of Terragrunt, you can parameterize your backend configuration by writing it inside a terragrunt.hcl file.
When using the Terraform CLI, variables can be passed as arguments via the -var-file parameter. If there are multiple variable files to pass, the command line gets cluttered, and it becomes tedious for a developer to remember all of them during execution. Terragrunt lets you define these variables in the terragrunt.hcl configuration file, keeping your CLI invocations short.
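A minimal terragrunt.hcl illustrating this idea (the module source path and input values are assumptions):

```hcl
# terragrunt.hcl (sketch): input values are declared once here instead of
# being passed with -var-file on every run.
terraform {
  source = "../modules/s3-bucket"
}

inputs = {
  env_name    = "qa"
  bucket_name = "demo-bucket-qa"
}
```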
If you maintain multiple environments within your Terraform project, there can be a lot of code duplication across those environments. To keep your Terraform configuration DRY, Terragrunt allows you to define reusable modules and even the code needed to initialize a new module.
You must install Terragrunt as a separate binary, which can be done using Homebrew on macOS/Linux or by downloading the executable for Windows. Let’s take a look at how Terragrunt might solve some of the challenges faced with plain Terraform.
While Terraform allows users to define separate backend states for different modules, it does not allow variables in the backend configuration. This means you need to repeat the same backend statements for every Terraform module in your project, which is not only time-consuming but also error-prone.
Take a scenario where you need to deploy a frontend application and a backend database to AWS using different state files and for different environments. You can create the directory structure as follows:
Fig. 9: Terragrunt project structure
We have a terragrunt.hcl file at the root of the environment directory and one under each of the frontend application and backend database directories. In this arrangement, you can use Terragrunt’s find_in_parent_folders() helper to dynamically generate the backend path for the state files.
In the root, create the backend configuration once and build the key with the path_relative_to_include() helper function. Inside each of the modules, you do not define a path; instead, you reference the root Terragrunt configuration. During initialization, Terragrunt automatically generates the relative path and substitutes it into the bucket key.
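Under these assumptions (the state bucket name and region are placeholders), the root and child configurations could look like this:

```hcl
# Root terragrunt.hcl (sketch): the backend is declared once, and the state
# key is derived from each module's relative path.
remote_state {
  backend = "s3"
  config = {
    bucket  = "my-terragrunt-state"
    key     = "${path_relative_to_include()}/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true
  }
}
```

```hcl
# frontend-app/terragrunt.hcl (sketch): inherit the root backend configuration.
include {
  path = find_in_parent_folders()
}
```

Note that for Terragrunt to inject this backend configuration, the wrapped Terraform module typically declares an empty backend "s3" {} block (or the remote_state block uses Terragrunt's generate option).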
You can also configure Terraform to use dynamic backend states using Terragrunt.
Terraform is a powerful tool that helps developers and DevOps engineers automate deployments across multiple cloud environments. Terraform is highly configurable, so organizations can implement various strategies that suit their needs.
Developers can additionally integrate Terraform with CI/CD pipelines that check the state of resources during the build phase and thus automate the process of creating resources. In practice, a combination of multi-module setups and multiple state files, along with Terragrunt, can help organizations optimize their build pipelines.
This article began with a brief introduction to Terraform and then covered several advanced topics that most introductory articles skip: multi-module setups, handling multiple state files, managing multiple AWS accounts, and Terraform workspaces. For each topic, we discussed what it is, why it is useful, and how to implement it in practice. We then introduced Terragrunt as a wrapper for Terraform workloads, with examples, and closed with the key takeaways and some next steps for the reader.