Introduction
Terraform by HashiCorp is a widely used infrastructure as code (IaC) tool with providers for all major cloud platforms. This guide covers Terraform from the basics to advanced patterns, so that you can provision, update, and version cloud infrastructure safely and efficiently.
Core Concepts and Installation
Install Terraform on Linux: wget https://releases.hashicorp.com/terraform/1.7.0/terraform_1.7.0_linux_amd64.zip, unzip terraform_1.7.0_linux_amd64.zip, sudo mv terraform /usr/local/bin/. Verify: terraform version. Core workflow: init (download providers and modules, configure the backend), plan (preview changes), apply (create or update infrastructure), destroy (remove resources). HCL (HashiCorp Configuration Language) syntax: resource "aws_instance" "web" { ami = "ami-0c55b159cbfafe1f0", instance_type = "t2.micro", tags = { Name = "WebServer" } }. Snippets in this guide are condensed onto one line with commas for brevity; in a real .tf file each argument inside a block goes on its own line, without commas. Providers are declared with a source and a version constraint: terraform { required_providers { aws = { source = "hashicorp/aws", version = "~> 5.0" } } }.
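Written out as it would appear in a file such as main.tf, the inline snippets above look roughly like the sketch below. The region is an assumption, and the AMI ID is the one quoted in the text; AMI IDs are region-specific, so it may not exist in your account.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # any 5.x release
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumed region
}

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0" # AMI ID from the text; region-specific
  instance_type = "t2.micro"

  tags = {
    Name = "WebServer"
  }
}
```

Running terraform init, terraform plan, and terraform apply in the directory containing this file walks through the core workflow.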
Provider Configuration
AWS provider: provider "aws" { region = "us-east-1" }. Credentials can be passed as arguments (access_key = var.aws_access_key, secret_key = var.aws_secret_key), but environment variables are safer: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION. Azure: provider "azurerm" { features {}, subscription_id = var.subscription_id }. GCP: provider "google" { project = var.project_id, region = "us-central1" }. Multiple provider instances: provider "aws" { alias = "west", region = "us-west-2" }, resource "aws_instance" "example" { provider = aws.west, ... }. For cross-account access, add an assume_role block with the target role ARN. Keep secrets out of code: supply them through environment variables, shared credential files, or a .tfvars file that is never committed to version control.
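The sketch below shows a default provider, an aliased provider for a second region, and an assume_role block for cross-account access. The role ARN, account ID, and CIDR range are placeholders, not values from this guide; credentials are assumed to come from the environment variables listed above.

```hcl
# Default provider; credentials are read from the environment
provider "aws" {
  region = "us-east-1"
}

# Second instance of the same provider, selected with provider = aws.west
provider "aws" {
  alias  = "west"
  region = "us-west-2"

  # Cross-account access; the role ARN is a placeholder
  assume_role {
    role_arn     = "arn:aws:iam::123456789012:role/TerraformDeploy"
    session_name = "terraform"
  }
}

# Created in us-west-2 through the aliased provider
resource "aws_vpc" "west" {
  provider   = aws.west
  cidr_block = "10.1.0.0/16" # hypothetical range
}
```

Resources without a provider argument use the default (unaliased) provider.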
Resources and Data Sources
Resources define infrastructure objects: resource "aws_vpc" "main" { cidr_block = "10.0.0.0/16" }. Resource arguments are either required or optional; a typical aws_instance sets ami and instance_type and optionally tags and user_data. Meta-arguments: depends_on (explicit dependency), count (multiple instances addressed by index), for_each (multiple instances addressed by key), provider (alternate provider), lifecycle (create_before_destroy, prevent_destroy, ignore_changes). Data sources query existing infrastructure: data "aws_ami" "ubuntu" { most_recent = true, filter { name = "name", values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"] }, owners = ["099720109477"] }. Reference the result from a resource as data.aws_ami.ubuntu.id.
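As a minimal sketch, the data source and a resource that consumes it can be combined as follows. It assumes an AWS provider is already configured; the resource name web and the instance type are illustrative.

```hcl
# Look up the newest Ubuntu 20.04 image published by Canonical
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}

# The reference to the data source creates an implicit dependency,
# so no depends_on is needed here
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
}
```

Because most_recent = true picks up new images as they are published, a later plan may propose replacing the instance; pinning the AMI or adding ignore_changes = [ami] avoids that.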
Terraform State Management
The state file (terraform.tfstate) maps the resources in your configuration to the real infrastructure objects they manage. Use a remote backend for team collaboration. S3 backend: terraform { backend "s3" { bucket = "my-terraform-state", key = "prod/network/terraform.tfstate", region = "us-east-1", dynamodb_table = "terraform-locks" } }. Azure backend: azurerm. Google Cloud: gcs. State locking prevents concurrent modifications; with the S3 backend shown here, the DynamoDB table needs a string partition key named LockID. State commands: terraform state list (list resources), terraform state show aws_instance.web (details), terraform state mv (rename or move resources), terraform state rm (remove from state without deleting the cloud resource). A related command, terraform import, brings existing resources into state. Avoid editing the state file by hand; use these commands instead.
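A minimal sketch of the S3 backend in file form follows. The bucket and table names are the ones used in the text and must already exist; encrypt = true is an addition that is not in the original snippet and turns on server-side encryption of the state object.

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"             # existing bucket
    key            = "prod/network/terraform.tfstate" # path of the state object
    region         = "us-east-1"
    dynamodb_table = "terraform-locks" # lock table with partition key "LockID"
    encrypt        = true              # assumption: encrypt state at rest
  }
}
```

Backend blocks cannot reference variables. After adding or changing one, terraform init must be run again; terraform init -migrate-state offers to copy existing state to the new location.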
Variables and Outputs
Input variables: variable "instance_type" { type = string, default = "t2.micro", description = "EC2 instance type" }. Variable types: string, number, bool, list(string), map(string), object({...}), tuple([...]). Variable definition files: terraform.tfvars (loaded automatically) containing instance_type = "t3.medium" and environment = "prod". Environment variables: export TF_VAR_instance_type="t3.large". Output values: output "instance_ip" { value = aws_instance.web.public_ip, description = "Public IP of web server", sensitive = true }. Setting sensitive = true redacts the value in plan and apply output, but it is still stored in plain text in the state file. Access outputs: terraform output, terraform output instance_ip. Outputs serve two purposes: a child module's outputs are how the calling module reads its results, and the root module's outputs are printed after apply.
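The sketch below declares variables of several types and an output. The names ami_id and root_volume are hypothetical and exist only for illustration; an AWS provider is assumed to be configured elsewhere.

```hcl
variable "instance_type" {
  type        = string
  default     = "t2.micro"
  description = "EC2 instance type"
}

variable "ami_id" {
  type        = string
  description = "AMI to launch (hypothetical variable, no default, so it must be supplied)"
}

variable "root_volume" {
  type = object({
    size_gb = number
    type    = string
  })
  default = {
    size_gb = 20
    type    = "gp3"
  }
  description = "Root volume settings (hypothetical)"
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = var.instance_type

  root_block_device {
    volume_size = var.root_volume.size_gb
    volume_type = var.root_volume.type
  }
}

output "instance_ip" {
  value       = aws_instance.web.public_ip
  description = "Public IP of web server"
}
```

Values can then be supplied with a terraform.tfvars file, with -var on the command line, or with TF_VAR_ami_id in the environment.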
Modules for Reusability
Modules encapsulate groups of resources. Module structure: modules/vpc/ (main.tf, variables.tf, outputs.tf). Call a local module: module "vpc" { source = "./modules/vpc", name = "my-vpc", cidr = "10.0.0.0/16", environment = "prod" }. The version argument is valid only for modules installed from a registry, not for local paths. Module outputs: vpc_id = module.vpc.vpc_id. The Terraform Registry hosts public modules (AWS VPC, AWS EC2, AWS RDS). Pin registry module versions for stability: source = "terraform-aws-modules/vpc/aws", version = "5.0.0". Create reusable modules for common patterns: web-server cluster, database, networking. Module composition: the root module calls child modules, and child modules can call other modules (nested).
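As a sketch, a pinned registry module call and a reference to one of its outputs might look like this. The availability zones and subnet ranges are illustrative values; name, cidr, azs, private_subnets, public_subnets, and the vpc_id output belong to the public terraform-aws-modules/vpc/aws module.

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0" # exact pin; only registry sources accept "version"

  name = "my-vpc"
  cidr = "10.0.0.0/16"

  # Illustrative layout, not taken from the text
  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]
}

# Child module outputs are read as module.<name>.<output>
output "vpc_id" {
  value = module.vpc.vpc_id
}
```

Run terraform init after adding or changing a module block so the module source is downloaded.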
Workspaces for Environment Isolation
Workspaces manage multiple environments with the same configuration: terraform workspace new dev, terraform workspace new prod, terraform workspace select prod. With the local backend, each workspace's state is stored under terraform.tfstate.d/dev/, terraform.tfstate.d/prod/. Use the workspace name in configuration: tags = { Environment = terraform.workspace }. Conditionals: resource "aws_instance" "web" { instance_type = terraform.workspace == "prod" ? "t3.large" : "t2.micro" }. Workspace limitations: the backend must support multiple workspaces (the local backend and common remote backends such as S3, azurerm, and gcs do), and all workspaces share the same backend configuration and credentials, so isolation between environments is weak. Alternative: a directory-per-environment layout (envs/dev/, envs/prod/) with symlinks or Terragrunt. As a rule of thumb, avoid accumulating large numbers of workspaces (more than about 20), which become hard to track.
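When more than two workspaces are in play, a lookup map tends to scale better than chained conditionals. The sketch below assumes workspaces named dev and prod as in the text; the ami_id variable and the fallback instance type are hypothetical.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI to launch (hypothetical variable)"
}

locals {
  # Per-workspace settings keyed by workspace name
  instance_types = {
    dev  = "t2.micro"
    prod = "t3.large"
  }

  # Fall back to the small size for unknown workspaces such as "default"
  instance_type = lookup(local.instance_types, terraform.workspace, "t2.micro")
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = local.instance_type

  tags = {
    Environment = terraform.workspace
  }
}
```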
Provisioners: Configuration Management
Provisioners run scripts after resource creation: resource "aws_instance" "web" { ..., provisioner "file" { source = "script.sh", destination = "/tmp/script.sh", connection { type = "ssh", host = self.public_ip, user = "ubuntu", private_key = file("~/.ssh/id_rsa") } }, provisioner "remote-exec" { inline = ["chmod +x /tmp/script.sh", "/tmp/script.sh"] }, provisioner "local-exec" { command = "echo ${self.public_ip} >> inventory.txt" } }. The connection block requires a host; if it is declared at the resource level instead of inside one provisioner, all provisioners on that resource share it. Provisioners are a last resort: prefer configuration management (Ansible) or immutable images (Packer). Provisioners are not idempotent and by default run only when the resource is created (destroy-time provisioners use when = destroy). Reserve local-exec for small side tasks such as logging, invoking external scripts, or notifications.
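The provisioner example can be written out as the sketch below, with one resource-level connection block shared by the file and remote-exec provisioners. The variables ami_id and key_name are hypothetical, and the instance is assumed to be reachable over SSH on its public IP with the key at ~/.ssh/id_rsa.

```hcl
variable "ami_id" {
  type        = string
  description = "Ubuntu AMI to launch (hypothetical variable)"
}

variable "key_name" {
  type        = string
  description = "Name of an existing EC2 key pair (hypothetical variable)"
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t2.micro"
  key_name      = var.key_name

  # Shared by the file and remote-exec provisioners below
  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ubuntu"
    private_key = file("~/.ssh/id_rsa")
  }

  # Copy the script to the instance
  provisioner "file" {
    source      = "script.sh"
    destination = "/tmp/script.sh"
  }

  # Run it on the instance
  provisioner "remote-exec" {
    inline = [
      "chmod +x /tmp/script.sh",
      "/tmp/script.sh",
    ]
  }

  # Runs on the machine executing Terraform, not on the instance
  provisioner "local-exec" {
    command = "echo ${self.public_ip} >> inventory.txt"
  }
}
```

If a creation-time provisioner fails, Terraform marks the resource as tainted and replaces it on the next apply.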
Lifecycle Rules and Meta-Arguments
Lifecycle controls resource behavior: lifecycle { create_before_destroy = true (create the replacement before destroying the old resource, which reduces downtime), prevent_destroy = true (protect critical resources), ignore_changes = [tags, user_data] (ignore changes made outside Terraform) }. count for multiple similar resources: resource "aws_instance" "web" { count = 3, ami = data.aws_ami.ubuntu.id, instance_type = "t2.micro", tags = { Name = "web-${count.index}" } }. Access via aws_instance.web[0], aws_instance.web[*].id. for_each for map-based resources: resource "aws_instance" "web" { for_each = { web1 = "t2.micro", web2 = "t3.small" }, instance_type = each.value, tags = { Name = each.key } }. Access via aws_instance.web["web1"]. depends_on for hidden dependencies: resource "aws_instance" "app" { depends_on = [aws_db_instance.database] } (needed only when there is no direct reference between the two resources).
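The sketch below puts count, for_each, and lifecycle side by side and shows how each collection is read back. The resource names counted and keyed and the ami_id variable are hypothetical, chosen so both forms can live in one configuration.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI to launch (hypothetical variable)"
}

# count: instances are addressed by index, e.g. aws_instance.counted[0]
resource "aws_instance" "counted" {
  count         = 3
  ami           = var.ami_id
  instance_type = "t2.micro"

  tags = {
    Name = "web-${count.index}"
  }
}

# for_each: instances are addressed by key, e.g. aws_instance.keyed["web1"]
resource "aws_instance" "keyed" {
  for_each      = { web1 = "t2.micro", web2 = "t3.small" }
  ami           = var.ami_id
  instance_type = each.value

  tags = {
    Name = each.key
  }

  lifecycle {
    create_before_destroy = true        # build the replacement first
    ignore_changes        = [user_data] # tolerate out-of-band edits
  }
}

# A count resource is a list, so the splat operator works directly
output "counted_ids" {
  value = aws_instance.counted[*].id
}

# A for_each resource is a map, so use a for expression instead
output "keyed_ids" {
  value = { for name, inst in aws_instance.keyed : name => inst.id }
}
```

Removing an element from the middle of a count list shifts the indexes of everything after it, whereas for_each keys stay stable; this is a common reason to prefer for_each for non-identical resources.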
Functions and Expressions
Numeric functions: max(5, 12), min(7, 3), ceil(5.2). String functions: join(",", ["a","b"]), split(",", "a,b"), replace("old string", "old", "new"), lower("HELLO"), upper("hello"). Collection functions: length(list), element(["a","b"], 1), lookup(map, "key", "default"), keys(map), values(map), merge(map1, map2), flatten(list_of_lists). File functions: file("path/to/file"), fileexists("path"). Path functions: dirname("path"), basename("path"). Encoding: base64encode, base64decode, jsonencode, yamlencode. Templatefile for complex templating: templatefile("${path.module}/user_data.sh.tpl", { app_port = 8080 }).
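The functions above can be tried out in terraform console or collected in a locals block, as in the sketch below. It needs no provider; the local names are arbitrary and the comments show the value each expression evaluates to.

```hcl
locals {
  # Numeric
  biggest  = max(5, 12) # 12
  smallest = min(7, 3)  # 3
  rounded  = ceil(5.2)  # 6

  # String
  joined   = join(",", ["a", "b"])              # "a,b"
  parts    = split(",", "a,b")                  # ["a", "b"]
  replaced = replace("old string", "old", "new") # "new string"
  shout    = upper("hello")                     # "HELLO"

  # Collection
  second   = element(["a", "b"], 1)              # "b"
  fallback = lookup({ a = "x" }, "b", "default") # "default"
  combined = merge({ a = 1 }, { b = 2 })         # { a = 1, b = 2 }
  flat     = flatten([[1, 2], [3]])              # [1, 2, 3]

  # Filesystem paths
  file_name = basename("/etc/app/config.yml") # "config.yml"
  dir_name  = dirname("/etc/app/config.yml")  # "/etc/app"

  # Encoding
  encoded = base64encode("hello")      # "aGVsbG8="
  as_json = jsonencode({ port = 8080 }) # "{\"port\":8080}"
}

output "examples" {
  value = {
    joined   = local.joined
    fallback = local.fallback
    flat     = local.flat
  }
}
```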
Dynamic Blocks
Dynamic blocks generate repeated nested blocks from a collection: resource "aws_security_group" "web" { name = "web-sg", dynamic "ingress" { for_each = [ { from = 80, to = 80, proto = "tcp", desc = "HTTP" }, { from = 443, to = 443, proto = "tcp", desc = "HTTPS" } ], content { from_port = ingress.value.from, to_port = ingress.value.to, protocol = ingress.value.proto, description = ingress.value.desc, cidr_blocks = ["0.0.0.0/0"] } } }. for_each also accepts maps, and dynamic blocks can be nested (for example in Route 53 records or aws_lb_listener rules). To make a block conditional, give for_each a one-element or empty list: dynamic "ebs_block_device" { for_each = var.attach_extra_disk ? [1] : [], content { device_name = "/dev/xvdb", volume_size = 100 } }.
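A sketch of the same idea driven by a variable instead of an inline list follows. The variable name ingress_rules is hypothetical; the rule fields mirror those used in the text, and an AWS provider is assumed to be configured.

```hcl
variable "ingress_rules" {
  description = "Ingress rules to open (hypothetical variable)"
  type = list(object({
    from  = number
    to    = number
    proto = string
    desc  = string
  }))
  default = [
    { from = 80, to = 80, proto = "tcp", desc = "HTTP" },
    { from = 443, to = 443, proto = "tcp", desc = "HTTPS" },
  ]
}

resource "aws_security_group" "web" {
  name = "web-sg"

  # One ingress block is generated per element of var.ingress_rules
  dynamic "ingress" {
    for_each = var.ingress_rules

    content {
      from_port   = ingress.value.from
      to_port     = ingress.value.to
      protocol    = ingress.value.proto
      description = ingress.value.desc
      cidr_blocks = ["0.0.0.0/0"] # open to the internet; narrow this in practice
    }
  }
}
```

The iterator is named after the block label (ingress here), so each element is read as ingress.value and its index or key as ingress.key.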
Testing and Validation
terraform validate: syntax and internal consistency checks. terraform plan: detailed preview of changes. Terratest (Go library) for integration tests: the test runs apply, asserts on outputs, verifies resources, then runs destroy. tftest (Python) works similarly. terraform-compliance for BDD-style tests: check tags, CIDR blocks, encryption settings. Pre-commit hooks: terraform fmt, terraform validate, tflint (linting), tfsec (security scanning), checkov (policy-as-code). Conftest (Open Policy Agent) for custom policies. Plan assertions: save the plan with terraform plan -out=tfplan, export it with terraform show -json tfplan, and inspect it with jq. CI/CD integration: generate a plan automatically on pull requests, apply on merge.
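Beyond the external tools listed above, Terraform 1.6 and later ship a native terraform test command that reads test files written in HCL. The sketch below is a hypothetical file tests/web.tftest.hcl; it assumes the root module declares a variable named instance_type and a resource aws_instance.web that uses it, and that valid AWS credentials are available because the plan still contacts the provider.

```hcl
# tests/web.tftest.hcl (hypothetical file name)

# Input values applied to every run block in this file
variables {
  instance_type = "t3.micro"
}

run "plan_uses_requested_instance_type" {
  # Only plan; nothing is created or destroyed
  command = plan

  assert {
    condition     = aws_instance.web.instance_type == "t3.micro"
    error_message = "aws_instance.web did not pick up var.instance_type"
  }
}
```

Running terraform test from the root module directory executes every *.tftest.hcl file found there and in tests/.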
Real-World Examples
Multi-tier AWS architecture: VPC with public/private subnets, Internet Gateway, NAT Gateway, ALB, Auto Scaling Group, RDS database, ElastiCache Redis. Security groups with least privilege. S3 bucket for artifacts. Route53 DNS records. CloudWatch alarms. A complete implementation spans 200+ lines across modules. Use the terraform_remote_state data source to share outputs between separately applied root modules. Example: a network configuration outputs the VPC ID, security groups, and subnets; an app configuration reads them through remote state. Terragrunt (a thin wrapper around Terraform) keeps configurations DRY across environments.
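A minimal sketch of the remote-state handoff follows. The bucket and key match the backend example used earlier in this guide; the output names private_subnet_ids and app_security_group_id and the ami_id variable are hypothetical and must match whatever the network configuration actually exports.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI to launch (hypothetical variable)"
}

# Read the outputs of the separately applied network configuration
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "my-terraform-state"
    key    = "prod/network/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # Hypothetical output names exported by the network configuration
  subnet_id              = data.terraform_remote_state.network.outputs.private_subnet_ids[0]
  vpc_security_group_ids = [data.terraform_remote_state.network.outputs.app_security_group_id]
}
```

Only root module outputs are exposed this way, and the reader needs read access to the whole state object, so some teams prefer to publish shared values through a dedicated store instead.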
Best Practices and Anti-Patterns
Best practices: use remote state with locking, pin provider and module versions, format with terraform fmt, structure code with modules, use variable validation, tag resources for cost tracking, and run plan and apply from a CI/CD pipeline rather than from individual workstations. Naming conventions: snake_case for resources, variables, and outputs. Give every variable a description. Anti-patterns: hardcoding secrets, using provisioners instead of immutable images, storing state in git, applying without reviewing a plan, copy-pasting configuration per environment (use modules or workspaces instead), modifying servers in place (replace them instead), and ignoring destroy-time behavior.
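Three of these practices (version pinning, variable validation, and consistent tagging) can be sketched together as follows. The allowed environment names and the ManagedBy tag are illustrative assumptions; default_tags is an AWS provider feature that applies the tags to every taggable resource it creates.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # pinned to the 5.x series
    }
  }
}

variable "environment" {
  type        = string
  description = "Deployment environment"

  # Reject unexpected values at plan time
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}

provider "aws" {
  region = "us-east-1" # assumed region

  # Applied to every taggable resource created through this provider
  default_tags {
    tags = {
      Environment = var.environment
      ManagedBy   = "terraform"
    }
  }
}
```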
Conclusion
Terraform brings discipline and repeatability to infrastructure. Start with simple resources on a single cloud, move state to a remote backend with locking as soon as more than one person works on the configuration, add modules for reusability, and separate environments with workspaces or per-environment directories. Regular state backups and plan reviews prevent production surprises.