Debugging
Troubleshoot Terraform errors and unexpected behavior
In the previous tutorial, we built a whole testing toolkit to catch problems early. But let's be honest — things will still go wrong. Terraform will throw cryptic errors. Resources won't create. State will get weird. It happens to everyone.
Let's learn how to diagnose and fix common problems so you're not panicking at 2 AM.
Enable Logging
"How do I see what Terraform is actually doing under the hood?"
Basic Logging
# Enable detailed logs
export TF_LOG=DEBUG
terraform apply
# Or inline
TF_LOG=DEBUG terraform apply
Log Levels
From "tell me everything" to "only scream if something's on fire":
| Level | Description |
|---|---|
| TRACE | Most verbose, includes all calls |
| DEBUG | Detailed debugging info |
| INFO | General operational entries |
| WARN | Warnings and errors |
| ERROR | Errors only |
Log to File
export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform.log
terraform apply
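If you capture logs often, a small wrapper saves retyping the exports. The following is a minimal sketch, assuming a bash-compatible shell; `tf_logged` and `TF_LOG_DIR` are hypothetical names made up for this example, not part of Terraform. It sets `TF_LOG` and `TF_LOG_PATH` for a single command only, so the rest of your shell session doesn't stay in debug mode.

```shell
# Run any Terraform subcommand with DEBUG logs written to a timestamped file.
# tf_logged and TF_LOG_DIR are hypothetical names used only in this sketch.
tf_logged() {
  local log_dir="${TF_LOG_DIR:-./tf-logs}"
  mkdir -p "$log_dir"
  local log_file="$log_dir/terraform-$(date +%Y%m%d-%H%M%S).log"
  echo "Writing debug log to $log_file" >&2
  # The variables apply to this one invocation only
  TF_LOG=DEBUG TF_LOG_PATH="$log_file" terraform "$@"
}
```

For example, `tf_logged plan` or `tf_logged apply -target=aws_vpc.main`.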
Provider-Specific Logging
"What if I only want logs from the AWS provider, not all the noise?"
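# Provider plugin logs only (this covers every provider in the run, not just AWS)
export TF_LOG_PROVIDER=DEBUG
terraform apply
# The counterpart for Terraform core logs is TF_LOG_CORE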
Common Errors
Here's the hall of fame. You will hit these. Let's learn to fix them fast.
"Resource Already Exists"
Error: error creating S3 bucket: BucketAlreadyExists
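Problem: The resource exists in AWS but not in Terraform state. Classic "someone created it manually" situation. For S3 in particular, bucket names are shared across all AWS accounts, so BucketAlreadyExists can also mean someone else owns the name; in that case importing won't work and only a different name will.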
Solutions:
- Import the existing resource:
terraform import aws_s3_bucket.data my-bucket-name
- Use a different name:
resource "aws_s3_bucket" "data" {
  bucket = "my-bucket-name-v2" # Different name
}
- Delete the existing resource (if safe):
aws s3 rb s3://my-bucket-name
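Before importing, it can help to confirm the address really is missing from state, since importing an address that is already tracked fails. A rough sketch, assuming a bash-compatible shell; `in_state` and `import_if_missing` are hypothetical helper names, while `terraform state list` and `terraform import` are the real commands used above:

```shell
# Succeeds (exit 0) if the given resource address is already tracked in state.
in_state() {
  terraform state list | grep -Fxq -- "$1"
}

# Import a resource only when its address is not yet tracked.
import_if_missing() {
  local addr="$1" id="$2"
  if in_state "$addr"; then
    echo "$addr is already in state; skipping import"
  else
    terraform import "$addr" "$id"
  fi
}
```

Usage would look like `import_if_missing aws_s3_bucket.data my-bucket-name`.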
"Resource Not Found"
Error: error reading S3 bucket: NoSuchBucket
Problem: Terraform state says "this exists" but AWS says "never heard of it." The opposite of the previous error.
Solution: Remove from state:
terraform state rm aws_s3_bucket.data
"Cycle Detected"
Error: Cycle: aws_security_group.web, aws_security_group.db
Problem: Circular dependency. Resource A depends on B, and B depends on A. Terraform's brain short-circuits.
Solution: Break the cycle with separate rule resources:
# Instead of inline rules with circular references
resource "aws_security_group" "web" {
  name = "web"
  # No inline rules
}
resource "aws_security_group" "db" {
  name = "db"
  # No inline rules
}
# Add rules separately
resource "aws_security_group_rule" "web_to_db" {
  type                     = "egress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.web.id
  source_security_group_id = aws_security_group.db.id
}
"Provider Configuration Not Present"
Error: Provider configuration not present
Problem: You're referencing a provider that isn't configured. Terraform's like, "Who is this?"
Solution: Add the provider:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
provider "aws" {
  region = "us-west-2"
}
"Invalid Reference"
Error: Reference to undeclared resource
Problem: Typo in a resource reference. We've all been there.
Solution: Check your spelling (dude, seriously):
# Wrong
subnet_id = aws_subnt.public.id # Typo!
# Right
subnet_id = aws_subnet.public.id
"Inconsistent Dependency Lock File"
Error: Inconsistent dependency lock file
Problem: Your lock file doesn't match the required providers. Usually happens when someone updated providers without committing the lock file.
Solution:
terraform init -upgrade
Or delete and regenerate:
rm .terraform.lock.hcl
terraform init
State Problems
"My state is messed up. What do I do?!"
Deep breaths. Let's fix it.
State Lock Stuck
Error: Error acquiring the state lock
Lock ID: xxx
Problem: A previous Terraform run crashed or timed out, leaving the lock behind. Like someone leaving their stuff on a gym machine.
Solution:
# Force unlock: only when you're sure no other Terraform run is still active
terraform force-unlock LOCK_ID
State Corruption
Symptoms: Unexpected drift, missing resources, strange errors. Basically, Terraform is confused.
Solutions:
- Refresh state (the gentle approach):
# Replaces the deprecated "terraform refresh"
terraform apply -refresh-only
- Pull fresh state:
terraform state pull > state.json
# Inspect state.json
- Last resort — recreate state (the nuclear option):
# Remove all from state
terraform state list | xargs -n1 terraform state rm
# Re-import everything
terraform import aws_vpc.main vpc-123
# ... repeat for all resources
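Since the last-resort commands above rewrite state, it is worth saving a copy first. A minimal sketch, assuming a bash-compatible shell; `backup_state` and the `./state-backups` directory are hypothetical names chosen for this example:

```shell
# Save a timestamped copy of the current state and print the backup's path.
backup_state() {
  local dir="${1:-./state-backups}"
  mkdir -p "$dir"
  local file="$dir/state-$(date +%Y%m%d-%H%M%S).json"
  # Write to a temporary name first so a failed pull never leaves a bogus backup
  if terraform state pull > "$file.partial"; then
    mv "$file.partial" "$file"
    echo "$file"
  else
    rm -f "$file.partial"
    echo "state pull failed; no backup written" >&2
    return 1
  fi
}
```

If the cleanup goes badly, `terraform state push` with the saved file is the usual way back, though you should inspect the file before pushing it.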
State Drift
"Terraform plan is showing changes I didn't make!"
Someone (or something) changed resources outside of Terraform. It happens.
Solutions:
- Accept the change (update config):
resource "aws_instance" "web" {
  # ... existing arguments ...
  tags = {
    SomeTag = "Added outside Terraform"
  }
}
- Revert the change (apply Terraform config):
terraform apply
- Ignore the attribute:
# Inside the resource block
lifecycle {
  ignore_changes = [tags["SomeTag"]]
}
Debugging Techniques
"Okay, but how do I actually figure out what's going on?"
Here's your debugging toolkit.
Targeted Apply
Test one resource at a time instead of the whole enchilada:
terraform apply -target=aws_vpc.main
terraform apply -target=aws_subnet.public
Plan Output
Save the plan, then inspect it as JSON:
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
# Summarize the planned actions per resource
jq '.resource_changes[] | {address, change: .change.actions}' plan.json
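When reviewing a saved plan, deletions and replacements usually deserve a second look. As a sketch, assuming `jq` is installed and the JSON file was produced by `terraform show -json`, this filter lists only the resources whose planned actions include `delete` (replacements show up too, because they combine `delete` and `create`). `planned_deletes` is a hypothetical helper name:

```shell
# Print the address of every resource the plan would delete or replace.
planned_deletes() {
  local plan_json="${1:-plan.json}"
  jq -r '.resource_changes[]?
         | select(any(.change.actions[]; . == "delete"))
         | .address' "$plan_json"
}
```

An empty result means the plan does not destroy anything.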
State Inspection
# List all resources
terraform state list
# Show specific resource
terraform state show aws_instance.web
# Export full state
terraform state pull > state.json
Console
Interactive expression testing — great for "what does this value look like?" moments:
terraform console
> var.environment
"dev"
> aws_instance.web.public_ip
"54.123.45.67"
> [for s in aws_subnet.public : s.id]
["subnet-123", "subnet-456"]
> length(var.subnets)
3
Graph
Visualize dependencies — super helpful when you're trying to understand the cycle error from earlier:
terraform graph | dot -Tsvg > graph.svg
Detailed Exit Code
# Report the result through the exit code: 0 = no changes, 1 = error, 2 = changes pending
terraform plan -detailed-exitcode
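With `-detailed-exitcode`, `terraform plan` exits with 0 when there is nothing to change, 1 on error, and 2 when changes are pending, which makes it easy to script. A minimal sketch, assuming a bash-compatible shell; `plan_status` is a hypothetical helper name:

```shell
# Translate the exit code of "terraform plan -detailed-exitcode" into a word.
plan_status() {
  local rc=0
  # Plan output is discarded here; errors still reach stderr
  terraform plan -detailed-exitcode -input=false "$@" > /dev/null || rc=$?
  case "$rc" in
    0) echo "no-changes" ;;
    2) echo "changes-pending" ;;
    *) echo "error"; return 1 ;;
  esac
}
```

In a scheduled job, checking for `changes-pending` is one way to get notified about drift before it surprises you.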
Provider-Specific Debugging
AWS
"How do I debug AWS-specific issues?"
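# Make the AWS SDK load the shared config file (~/.aws/config), e.g. for named profiles
export AWS_SDK_LOAD_CONFIG=1
# Terraform debug logs, which include the provider's output
export TF_LOG=DEBUG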
# Check credentials
aws sts get-caller-identity
Common AWS Errors
"Access Denied"
# Check IAM permissions (AWS account IDs have 12 digits)
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:user/terraform \
  --action-names ec2:CreateVpc \
  --resource-arns "*"
"InvalidParameterValue"
# Usually wrong region or invalid values
TF_LOG=DEBUG terraform apply 2>&1 | grep -i "request"
Testing Changes Safely
"I'm scared to apply this. How do I not break production?"
Plan Before Apply (Always)
This isn't optional. This is survival.
terraform plan
# Review carefully!
terraform apply
Use Workspaces for Testing
terraform workspace new test
# Make changes
terraform apply
# If good, switch back
terraform workspace select prod
terraform apply
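Every workspace has its own state, so applying in the wrong one is an easy mistake to make. One possible safeguard, sketched here under the assumption of a bash-compatible shell, is to check `terraform workspace show` before doing anything else. `guard_workspace` is a hypothetical helper name:

```shell
# Fail unless the currently selected workspace matches the expected one.
guard_workspace() {
  local expected="$1" actual
  actual=$(terraform workspace show) || return 1
  if [ "$actual" != "$expected" ]; then
    echo "Refusing to continue: in workspace '$actual', expected '$expected'" >&2
    return 1
  fi
}
```

It chains naturally: `guard_workspace test && terraform apply`.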
Dry Run with -refresh-only
# See how real infrastructure differs from state, without changing either
terraform plan -refresh-only
Recovering from Mistakes
"I just destroyed something I shouldn't have. Help!"
Don't panic. Let's fix it.
Wrong Resource Destroyed
If you accidentally destroyed a resource:
- Check if it can be recovered (S3 versioning, RDS snapshots)
- Remove from state:
terraform state rm aws_thing.name
- Re-create with apply
Applied Wrong Configuration
# Revert to previous state
# (if you have backups or S3 versioning on state bucket)
# Or fix config and reapply
terraform apply
Force Replace
# Force recreation of specific resource
terraform apply -replace=aws_instance.web
Debugging Checklist
When something goes wrong, follow this recipe:
- Read the error message: it usually tells you exactly what's wrong (seriously, read it)
- Enable DEBUG logging: TF_LOG=DEBUG
- Check state: terraform state list, terraform state show
- Validate config: terraform validate
- Try targeted apply: terraform apply -target=resource
- Check provider docs: arguments, requirements, limitations
- Check the cloud console: the resource might exist or have issues
- Google the error: someone's probably seen it before (you're never the first)
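The checks in this list that Terraform can run for you can be chained into one command. This is only a sketch, assuming a bash-compatible shell; `tf_triage` and the `./triage.log` path are hypothetical names. None of the three steps changes infrastructure, and the function stops at the first one that fails:

```shell
# Run the non-destructive checks from the checklist, in order.
tf_triage() {
  echo "== validate ==" >&2
  terraform validate || return 1
  echo "== resources in state ==" >&2
  terraform state list || return 1
  echo "== plan (debug log in ./triage.log) ==" >&2
  TF_LOG=DEBUG TF_LOG_PATH=./triage.log terraform plan -input=false
}
```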
Preventing Problems
The best debugging session is the one you never have.
Version Pinning
Pin everything so updates don't surprise you:
terraform {
  required_version = "~> 1.5.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
Lock File
Commit .terraform.lock.hcl to version control. Seriously, do it.
State Backups
terraform {
  backend "s3" {
    bucket  = "my-terraform-state"
    key     = "terraform.tfstate"
    region  = "us-west-2"
    encrypt = true
    # Enable versioning on the bucket!
  }
}
Review Plans
"Can I just auto-approve everything?"
In CI, sure. For manual runs? Read the plan.
# Always review before apply
terraform plan
# -auto-approve skips the confirmation prompt
terraform apply -auto-approve # Only in automated pipelines
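For manual runs, one way to make sure you apply exactly what you reviewed is to save the plan to a file and apply that file; Terraform applies a saved plan without asking again, so the confirmation has to come from you. A rough sketch, assuming a bash-compatible shell; `reviewed_apply` and its prompt are hypothetical:

```shell
# Plan to a file, ask for confirmation, then apply that exact plan.
reviewed_apply() {
  local plan_file="${1:-tfplan}"
  terraform plan -input=false -out="$plan_file" || return 1
  printf 'Apply this exact plan? Type "yes" to continue: ' >&2
  local answer
  read -r answer
  if [ "$answer" = "yes" ]; then
    terraform apply -input=false "$plan_file"
  else
    echo "Aborted; nothing was applied." >&2
    return 1
  fi
}
```

Anything that changes between the plan and the apply makes the saved plan stale, and Terraform refuses to apply it rather than guess.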
Use Modules
Tested, reusable modules reduce bugs. Don't reinvent the wheel:
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.1.0" # Pinned!
}
Example Debug Session
Let's walk through a real debugging scenario, step by step:
# Something went wrong
$ terraform apply
Error: error creating EC2 Instance: InvalidAMIID.NotFound
# Enable logging
$ export TF_LOG=DEBUG
$ terraform apply 2>&1 | tee terraform.log
# Search for the AMI ID in the request
$ grep "ami-" terraform.log
... "ImageId": "ami-12345678" ...
# Check if AMI exists
$ aws ec2 describe-images --image-ids ami-12345678
No images found
# AMI doesn't exist in this region
# Fix: use data source to find correct AMI
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-*-22.04-amd64-server-*"]
  }
}
What's Next?
You're now equipped to tackle the weirdest Terraform errors. You learned:
- How to enable and use Terraform logs
- Common errors and their fixes
- State troubleshooting
- Debugging techniques (console, graph, targeted apply)
- Provider-specific debugging
- Recovering from mistakes
- Prevention best practices
Remember: every Terraform expert has Googled "terraform error" at 2 AM. You're in good company.
Ready for the final boss? Let's learn how to automate everything with CI/CD for production deployments. Let's go!