Testing & Validation
Ensure your Terraform configurations are correct before you apply them
In the previous tutorial, we learned how to import existing resources. Now let's talk about something that'll save your bacon: testing your infrastructure before it hits production.
Deploying broken infrastructure is expensive. Testing catches problems before they cost you money, downtime, or your sanity. Terraform has built-in validation, native testing, and integrates with third-party tools. Let's explore all of them.
Built-in Validation
These are free and built right into Terraform. No excuses not to use them.
terraform validate
Checks syntax and internal consistency, the "does this even make sense?" check:
terraform validate
# Success! The configuration is valid.
What it checks:
- HCL syntax
- Provider requirements
- Required arguments
- Type constraints
- Internal references
What it doesn't check:
- Actual cloud resources (it doesn't talk to AWS)
- State consistency
- Variable values
- Provider credentials
So it's more like a spell-checker than a fact-checker.
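For instance, here's a hypothetical config (the AMI ID is made up) that passes terraform validate cleanly but would still blow up at apply time:

```hcl
# Syntax, types, and required arguments all check out, so
# `terraform validate` passes. But the AMI ID below is fictional,
# so `terraform apply` would fail against a real AWS account.
resource "aws_instance" "web" {
  ami           = "ami-00000000deadbeef" # hypothetical, nonexistent AMI
  instance_type = "t3.micro"
}
```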
terraform fmt
Consistent formatting is a form of validation; messy code hides bugs:
# Check formatting
terraform fmt -check
# Auto-fix formatting
terraform fmt
# Recursive
terraform fmt -recursive
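To see what fmt actually does, here's a before-and-after on a deliberately messy (hypothetical) resource:

```hcl
# Before: inconsistent indentation and spacing
resource "aws_s3_bucket" "logs" {
bucket = "my-logs-bucket"
    force_destroy =     true
}

# After `terraform fmt`: two-space indent, aligned equals signs
resource "aws_s3_bucket" "logs" {
  bucket        = "my-logs-bucket"
  force_destroy = true
}
```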
terraform plan
The real validation: it shows what will actually change. This is your last line of defense:
terraform plan
# Save plan for later apply
terraform plan -out=tfplan
terraform apply tfplan
Variable Validation
"Can I prevent people from deploying with bad inputs?"
Absolutely. Catch garbage in, prevent garbage out:
variable "environment" {
  type = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}

variable "instance_type" {
  type = string

  validation {
    condition     = can(regex("^t[23]\\.", var.instance_type))
    error_message = "Only t2 or t3 instance types allowed."
  }
}

variable "cidr_block" {
  type = string

  validation {
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "Must be a valid CIDR block."
  }
}

variable "port" {
  type = number

  validation {
    condition     = var.port >= 1 && var.port <= 65535
    error_message = "Port must be between 1 and 65535."
  }
}
Multiple Validations
variable "bucket_name" {
  type = string

  validation {
    condition     = length(var.bucket_name) >= 3 && length(var.bucket_name) <= 63
    error_message = "Bucket name must be 3-63 characters."
  }

  validation {
    condition     = can(regex("^[a-z0-9][a-z0-9.-]*[a-z0-9]$", var.bucket_name))
    error_message = "Bucket name must start and end with a lowercase letter or number."
  }

  validation {
    condition     = !can(regex("\\.\\.", var.bucket_name))
    error_message = "Bucket name cannot contain consecutive periods."
  }
}
Preconditions and Postconditions
"Can I validate things during plan/apply, not just at variable time?"
Terraform 1.2+ lets you add checks on resources themselves.
Preconditions
Check assumptions before creating ("make sure this is right before you build it"):
resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = var.instance_type

  lifecycle {
    precondition {
      condition     = data.aws_ami.selected.architecture == "x86_64"
      error_message = "AMI must be x86_64 architecture."
    }

    precondition {
      condition     = data.aws_ami.selected.root_device_type == "ebs"
      error_message = "AMI must use EBS root device."
    }
  }
}
Postconditions
Verify the resource was created correctly ("did we actually get what we asked for?"):
resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = var.instance_type

  lifecycle {
    postcondition {
      condition     = self.public_ip != null
      error_message = "Instance must have a public IP."
    }
  }
}

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-*-amd64-server-*"]
  }

  lifecycle {
    postcondition {
      condition     = self.image_id != null
      error_message = "No matching AMI found."
    }
  }
}
terraform test (Native Testing)
"Wait, Terraform has an actual test framework?!"
Yep! Terraform 1.6+ includes native testing. This is a big deal.
Test Structure
project/
├── main.tf
├── variables.tf
├── outputs.tf
└── tests/
    ├── basic.tftest.hcl
    └── validation.tftest.hcl
Basic Test
# tests/basic.tftest.hcl

# Setup - optional variables for the test
variables {
  environment   = "test"
  instance_type = "t2.micro"
}

# Test run
run "create_instance" {
  command = plan # or apply

  assert {
    condition     = aws_instance.web.instance_type == "t2.micro"
    error_message = "Instance type should be t2.micro"
  }

  assert {
    condition     = aws_instance.web.tags["Environment"] == "test"
    error_message = "Environment tag should be 'test'"
  }
}
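The test framework can also verify that your variable validations actually fire. This sketch (assuming the environment validation rule from earlier) uses expect_failures to demand that a bad input is rejected at plan time:

```hcl
# tests/validation.tftest.hcl
run "rejects_invalid_environment" {
  command = plan

  variables {
    environment = "qa" # not in the allowed list
  }

  # The plan should fail, and the failure should come
  # from this variable's validation block.
  expect_failures = [
    var.environment,
  ]
}
```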
Run Tests
terraform test
# tests/basic.tftest.hcl... pass
# run "create_instance"... pass
# Verbose output
terraform test -verbose
Testing Modules
# tests/vpc_module.tftest.hcl
run "vpc_creation" {
  command = plan

  variables {
    name = "test-vpc"
    cidr = "10.0.0.0/16"
    azs  = ["us-west-2a", "us-west-2b"]
  }

  assert {
    condition     = module.vpc.vpc_id != null
    error_message = "VPC should be created"
  }

  assert {
    condition     = length(module.vpc.public_subnet_ids) == 2
    error_message = "Should create 2 public subnets"
  }
}
Mock Providers
"Can I test without creating real AWS resources?"
Yes! Mock providers (Terraform 1.7+) let you test logic without spending money:
# tests/mocked.tftest.hcl
mock_provider "aws" {
  mock_data "aws_ami" {
    defaults = {
      id           = "ami-mock12345"
      architecture = "x86_64"
    }
  }
}

run "with_mocked_ami" {
  command = plan

  assert {
    condition     = aws_instance.web.ami == "ami-mock12345"
    error_message = "Should use mocked AMI"
  }
}
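Mocking works for resources too, not just data sources. A sketch using mock_resource to supply computed attributes during a mocked apply (the ID and IP values here are invented):

```hcl
mock_provider "aws" {
  mock_resource "aws_instance" {
    defaults = {
      id        = "i-mock1234567890" # invented instance ID
      public_ip = "203.0.113.10"     # invented TEST-NET address
    }
  }
}

run "with_mocked_instance" {
  command = apply # mocked apply, nothing real is created

  assert {
    condition     = aws_instance.web.public_ip == "203.0.113.10"
    error_message = "Should use the mocked public IP"
  }
}
```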
Setup and Cleanup
# tests/integration.tftest.hcl

# Run this first to create dependencies
run "setup_vpc" {
  command = apply

  module {
    source = "./tests/fixtures/vpc"
  }
}

# Main test uses setup output
run "create_instance" {
  command = apply

  variables {
    vpc_id    = run.setup_vpc.vpc_id
    subnet_id = run.setup_vpc.subnet_id
  }

  assert {
    condition     = aws_instance.web.subnet_id == run.setup_vpc.subnet_id
    error_message = "Instance should be in the test VPC"
  }
}
# Cleanup happens automatically after all tests
TFLint
"Is there something that catches mistakes Terraform validate misses?"
TFLint is a static analysis tool that goes deeper than terraform validate:
Installation
# macOS
brew install tflint
# Linux
curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash
Configuration
# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.27.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

rule "terraform_deprecated_interpolation" {
  enabled = true
}

rule "terraform_unused_declarations" {
  enabled = true
}

rule "terraform_naming_convention" {
  enabled = true
  format  = "snake_case"
}
Run TFLint
# Initialize plugins
tflint --init
# Run linting
tflint
# With specific path
tflint ./modules/vpc
AWS-Specific Rules
TFLint catches AWS mistakes that Terraform wouldn't catch until apply:
# TFLint warns: instance type is invalid
resource "aws_instance" "web" {
  instance_type = "t2.superlarge" # Doesn't exist!
}

# TFLint warns: deprecated argument
resource "aws_s3_bucket" "data" {
  acl = "private" # Deprecated in AWS provider 4.0+
}
Checkov
"What about security? How do I know my config isn't leaving things wide open?"
Checkov scans for security and compliance issues:
Installation
pip install checkov
Run Scan
checkov -d .
Output:
Passed checks: 10, Failed checks: 3, Skipped checks: 0

Check: CKV_AWS_8: "Ensure all data stored in the S3 bucket is encrypted"
    FAILED for resource: aws_s3_bucket.data
    File: /main.tf:15-20

Check: CKV_AWS_21: "Ensure S3 bucket has versioning enabled"
    FAILED for resource: aws_s3_bucket.data
    File: /main.tf:15-20
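You can of course just fix the findings instead of skipping them. In AWS provider 4.x, bucket encryption and versioning are separate resources; a sketch for the flagged bucket (resource names assumed from the output above):

```hcl
# Satisfies CKV_AWS_8: default server-side encryption for the bucket
resource "aws_s3_bucket_server_side_encryption_configuration" "data" {
  bucket = aws_s3_bucket.data.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Satisfies CKV_AWS_21: versioning enabled
resource "aws_s3_bucket_versioning" "data" {
  bucket = aws_s3_bucket.data.id

  versioning_configuration {
    status = "Enabled"
  }
}
```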
Configuration
# .checkov.yaml
framework:
  - terraform
skip-check:
  - CKV_AWS_8 # Skip this check
check:
  - CKV_AWS_21 # Only run this check
compact: true
Inline Skip
resource "aws_s3_bucket" "logs" {
  #checkov:skip=CKV_AWS_8:Logging bucket doesn't need encryption
  bucket = "my-logs-bucket"
}
Terratest
"Can I write actual Go tests that deploy real infrastructure?"
Terratest is a Go-based testing framework for the brave: it deploys real infrastructure, tests it, then tears it down:
Installation
go get github.com/gruntwork-io/terratest/modules/terraform
Test File
// test/vpc_test.go
package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestVPCModule(t *testing.T) {
	terraformOptions := &terraform.Options{
		TerraformDir: "../modules/vpc",
		Vars: map[string]interface{}{
			"name": "test-vpc",
			"cidr": "10.0.0.0/16",
		},
	}

	// Clean up after the test
	defer terraform.Destroy(t, terraformOptions)

	// Deploy infrastructure
	terraform.InitAndApply(t, terraformOptions)

	// Get outputs
	vpcId := terraform.Output(t, terraformOptions, "vpc_id")

	// Assert
	assert.NotEmpty(t, vpcId)
}
Run Tests
cd test
go test -v -timeout 30m
Pre-commit Hooks
"Can I automate all this on every commit?"
Yep! Pre-commit hooks catch problems before they even reach the repository:
Installation
pip install pre-commit
Configuration
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.83.5
    hooks:
      - id: terraform_fmt
      - id: terraform_validate
      - id: terraform_tflint
        args:
          - --args=--config=__GIT_WORKING_DIR__/.tflint.hcl
      - id: terraform_checkov
        args:
          - --args=--config-file __GIT_WORKING_DIR__/.checkov.yaml
Enable
pre-commit install
Now terraform fmt, validate, tflint, and checkov run on every commit. Problems get caught before they even make it to a PR. How cool is that?
Test Strategy
Unit Tests
Test modules in isolation with terraform test:
# Mocked providers, just check logic
run "check_naming" {
  command = plan

  assert {
    condition     = aws_instance.web.tags["Name"] == "${var.environment}-web"
    error_message = "Name should follow convention"
  }
}
Integration Tests
Test modules together with real providers:
run "full_stack" {
  command = apply

  # Uses real AWS, creates resources
  assert {
    condition     = aws_instance.web.public_ip != null
    error_message = "Instance should have public IP"
  }
}
End-to-End Tests
Terratest or similar: deploy everything, verify it works:
// Actually curl the deployed endpoint
func TestWebServer(t *testing.T) {
	ip := terraform.Output(t, options, "public_ip")
	url := fmt.Sprintf("http://%s:80", ip)
	http_helper.HttpGetWithRetry(t, url, nil, 200, "Hello", 30, 5*time.Second)
}
CI Pipeline Example
# .github/workflows/terraform.yml
name: Terraform

on:
  pull_request:
    paths:
      - 'terraform/**'

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Format Check
        run: terraform fmt -check -recursive
      - name: Init
        run: terraform init -backend=false
      - name: Validate
        run: terraform validate
      - name: TFLint
        uses: terraform-linters/setup-tflint@v4
      - run: tflint --init && tflint
      - name: Checkov
        uses: bridgecrewio/checkov-action@master
        with:
          directory: .
          framework: terraform
What's Next?
You now have a complete testing toolkit. No more "deploy and pray." You learned:
- Built-in validation (validate, fmt, plan)
- Variable validation for catching bad inputs
- Pre/postconditions for runtime checks
- Native terraform test framework
- TFLint and Checkov for static analysis and security
- Terratest for full integration testing
- Pre-commit hooks to automate everything
But what do you do when things go wrong despite all this testing? Next up: debugging techniques for those inevitable 2 AM moments. Let's go!