name: Terraform Infrastructure as Code
skill_id: terraform-infrastructure
version: 1.0.0
description: Production-grade Terraform development with HCL best practices, module design, state management, multi-cloud patterns, and AI-enhanced infrastructure as code for scalable cloud deployments
category: DevOps & Infrastructure
tags:
  • terraform
  • infrastructure-as-code
  • iac
  • devops
  • cloud
  • aws
  • azure
  • gcp
  • multi-cloud
  • state-management
  • modules
author: mcp-skillset
license: MIT
created: 2025-11-25
last_updated: 2025-11-25
toolchain:
  • Terraform 1.5+
  • HCL
  • terraform-docs
  • tflint
  • checkov
frameworks:
  • Terraform
  • Terragrunt (optional)
  • Atlantis (optional)
related_skills:
  • aws-cdk-development
  • systematic-debugging
  • security-testing
  • test-driven-development

Terraform Infrastructure as Code

Overview

This skill provides comprehensive guidance for building production-grade infrastructure with Terraform following 2024-2025 best practices. Terraform is the industry standard for Infrastructure as Code (IaC), enabling version-controlled, reproducible, and automated cloud infrastructure management across AWS, Azure, GCP, and 100+ providers.

When to Use This Skill

Use this skill when:

  • Provisioning cloud infrastructure (compute, storage, networking, databases)
  • Managing multi-environment deployments (dev, staging, production)
  • Implementing immutable infrastructure patterns
  • Orchestrating complex multi-cloud architectures
  • Automating infrastructure changes with CI/CD pipelines
  • Creating reusable infrastructure modules for teams
  • Migrating from manual cloud console provisioning to IaC

Core Principles

1. State Management is Critical

Terraform state is the source of truth - protect it

# CORRECT: Remote state with locking
terraform {
  backend "s3" {
    bucket         = "myapp-terraform-state"
    key            = "prod/vpc/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"  # Prevents concurrent modifications

    # Note: versioning is not a backend argument; it is enabled on the
    # state bucket itself (see the bucket-side sketch after the checklist)
  }
}

# WRONG: Local state in production
# terraform {
#   backend "local" {
#     path = "terraform.tfstate"  # Never use local state in teams!
#   }
# }

State Best Practices:

  • ✅ Always use remote state backends (S3, Azure Blob, GCS, Terraform Cloud)
  • ✅ Enable state locking to prevent concurrent runs
  • ✅ Enable encryption at rest for sensitive data
  • ✅ Use versioning for state file recovery (bucket-side setup sketched below)
  • ✅ Separate state files per environment and major component
  • ❌ Never commit .tfstate files to version control
  • ❌ Never share state files via email or Slack
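
The backend block above assumes the bucket and lock table already exist. A minimal sketch of that bucket-side setup, assuming the hashicorp/aws provider (names are illustrative and must match the backend configuration):

# Sketch: state bucket with versioning and encryption, plus the lock table
resource "aws_s3_bucket" "state" {
  bucket = "myapp-terraform-state"
}

resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

resource "aws_dynamodb_table" "lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"  # Required key name for Terraform state locking

  attribute {
    name = "LockID"
    type = "S"
  }
}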

2. Module Design for Reusability

Build composable, tested modules with clear interfaces

# modules/vpc/main.tf - Well-designed module
variable "vpc_cidr" {
  description = "CIDR block for VPC"
  type        = string
  validation {
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "VPC CIDR must be valid IPv4 CIDR block"
  }
}

variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod"
  }
}

variable "tags" {
  description = "Additional tags for all resources"
  type        = map(string)
  default     = {}
}

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = merge(
    {
      Name        = "${var.environment}-vpc"
      Environment = var.environment
      ManagedBy   = "Terraform"
    },
    var.tags
  )
}

output "vpc_id" {
  description = "ID of the created VPC"
  value       = aws_vpc.main.id
}

output "vpc_cidr" {
  description = "CIDR block of the VPC"
  value       = aws_vpc.main.cidr_block
}

# Usage in root module
module "vpc" {
  source = "./modules/vpc"

  vpc_cidr    = "10.0.0.0/16"
  environment = "prod"
  tags = {
    Team    = "platform"
    Project = "myapp"
  }
}

Module Design Checklist:

  • ✅ Single responsibility - one purpose per module
  • ✅ Input validation with variable validation blocks
  • ✅ Descriptive variable names and descriptions
  • ✅ Sensible defaults where appropriate
  • ✅ Outputs for all important resource attributes
  • ✅ README.md with examples and terraform-docs
  • ✅ Versioning for published modules (version pinning shown below)
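
A sketch of how the last two items fit together: generating the README with terraform-docs and pinning a published module to a compatible version range (the registry source is illustrative):

# Generate module documentation from variables and outputs
# terraform-docs markdown table ./modules/vpc > ./modules/vpc/README.md

# Consumers pin published modules to a version range
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"  # illustrative registry module
  version = "~> 5.0"                         # allow patches, not breaking changes
}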

3. Resource Naming and Tagging

Consistent naming prevents confusion and enables automation

locals {
  # Standardized naming convention
  name_prefix = "${var.project}-${var.environment}"

  # Common tags applied to all resources
  common_tags = {
    Project     = var.project
    Environment = var.environment
    ManagedBy   = "Terraform"
    Team        = var.team
    CostCenter  = var.cost_center
    CreatedAt   = timestamp()  # Caution: changes every plan; pair with ignore_changes (see Lifecycle Management)
  }
}

resource "aws_instance" "app_server" {
  # Clear, consistent naming
  ami           = var.ami_id
  instance_type = var.instance_type

  tags = merge(
    local.common_tags,
    {
      Name = "${local.name_prefix}-app-server"
      Role = "application"
    }
  )
}

resource "aws_s3_bucket" "data" {
  # DNS-compliant naming
  bucket = "${local.name_prefix}-data-${data.aws_caller_identity.current.account_id}"

  tags = merge(
    local.common_tags,
    {
      Name       = "${local.name_prefix}-data-bucket"
      DataClass  = "sensitive"
      Encryption = "required"
    }
  )
}

Naming Conventions:

  • Use lowercase with hyphens: myapp-prod-web-server
  • Include environment: dev-, staging-, prod-
  • Add resource type suffix: -vpc, -subnet, -sg
  • Ensure uniqueness where required (S3 buckets)

4. Data Sources vs Resources

Use data sources to reference existing infrastructure

# CORRECT: Use data source for existing resources
data "aws_vpc" "existing" {
  tags = {
    Name = "legacy-vpc"
  }
}

resource "aws_subnet" "new_subnet" {
  vpc_id     = data.aws_vpc.existing.id  # Reference existing VPC
  cidr_block = "10.0.100.0/24"
}

# CORRECT: Use data sources for AMIs, availability zones
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

data "aws_availability_zones" "available" {
  state = "available"
}

# WRONG: Don't import existing resources unless migrating
# resource "aws_vpc" "imported" {
#   # This will try to CREATE, not reference
#   cidr_block = "10.0.0.0/16"
# }
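
If you do need to bring an existing resource under management, Terraform 1.5+ supports declarative import blocks instead of hand-run `terraform import` commands. A sketch with an illustrative resource ID:

# Migration path: import the existing VPC instead of recreating it
import {
  to = aws_vpc.imported
  id = "vpc-0abc1234"  # illustrative ID of the existing VPC
}

resource "aws_vpc" "imported" {
  cidr_block = "10.0.0.0/16"  # must match the real resource after import
}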

5. Variable Hierarchy and Precedence

Understand variable precedence for flexible configuration

# 1. variables.tf - Define with defaults
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

# 2. Environment variables - CI/CD pipelines (lowest of the overrides)
# export TF_VAR_instance_type="t3.xlarge"

# 3. terraform.tfvars - Common values
instance_type = "t3.small"

# 4. prod.tfvars - Environment-specific, passed on the CLI
# terraform apply -var-file="prod.tfvars"
instance_type = "t3.large"

# 5. CLI flags - Highest precedence
# terraform apply -var="instance_type=t3.2xlarge"

# Precedence order (highest to lowest):
# CLI -var / -var-file > *.auto.tfvars > terraform.tfvars > TF_VAR_ env vars > defaults

Best Practices

Project Structure

terraform-project/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   ├── staging/
│   │   └── ... (same structure)
│   └── prod/
│       └── ... (same structure)
├── modules/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── README.md
│   │   └── versions.tf
│   ├── compute/
│   │   └── ...
│   └── database/
│       └── ...
├── global/
│   ├── iam/
│   │   └── main.tf
│   └── route53/
│       └── main.tf
├── .tflint.hcl
├── .terraform-version
└── README.md
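
Each environment directory carries its own backend.tf so state files never overlap. A sketch for prod, assuming the S3 backend shown earlier:

# environments/prod/backend.tf
terraform {
  backend "s3" {
    bucket         = "myapp-terraform-state"
    key            = "prod/terraform.tfstate"  # dev/ and staging/ use their own keys
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}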

Dependency Management

# CORRECT: Explicit depends_on for non-obvious dependencies
resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.lambda.name
  policy_arn = aws_iam_policy.lambda_logging.arn
}

resource "aws_lambda_function" "app" {
  # Lambda needs policy attached before creation
  depends_on = [aws_iam_role_policy_attachment.lambda_logs]

  function_name = "my-function"
  role          = aws_iam_role.lambda.arn
  # ...
}

# CORRECT: Use implicit dependencies via references
resource "aws_subnet" "private" {
  vpc_id     = aws_vpc.main.id  # Implicit dependency on VPC
  cidr_block = "10.0.1.0/24"
}

# WRONG: Unnecessary explicit depends_on
resource "aws_subnet" "bad_example" {
  vpc_id     = aws_vpc.main.id
  depends_on = [aws_vpc.main]  # Redundant! Reference creates dependency
  cidr_block = "10.0.2.0/24"
}
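
To verify which dependencies Terraform actually inferred from these references, you can render the dependency graph (requires Graphviz):

# terraform graph | dot -Tsvg > graph.svg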

Conditional Resources

variable "create_database" {
  description = "Whether to create RDS database"
  type        = bool
  default     = true
}

variable "environment" {
  type = string
}

# Create resource conditionally
resource "aws_db_instance" "main" {
  count = var.create_database ? 1 : 0

  identifier     = "myapp-db"
  engine         = "postgres"
  engine_version = "15.3"
  instance_class = "db.t3.micro"
  # ...
}

# Reference conditional resource
output "database_endpoint" {
  value = var.create_database ? aws_db_instance.main[0].endpoint : null
}

# Dynamic blocks for repeated nested blocks
resource "aws_security_group" "app" {
  name   = "app-sg"
  vpc_id = aws_vpc.main.id

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.port
      to_port     = ingress.value.port
      protocol    = "tcp"
      cidr_blocks = ingress.value.cidr_blocks
    }
  }
}
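
The dynamic block above consumes an ingress_rules variable that the snippet leaves undeclared. A sketch of its shape (the default rules are illustrative):

variable "ingress_rules" {
  description = "Ingress rules consumed by the dynamic block above"
  type = list(object({
    port        = number
    cidr_blocks = list(string)
  }))
  default = [
    { port = 443, cidr_blocks = ["0.0.0.0/0"] },
    { port = 80, cidr_blocks = ["10.0.0.0/8"] },
  ]
}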

Lifecycle Management

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type

  lifecycle {
    # Create new resource before destroying old (zero-downtime replacement)
    create_before_destroy = true

    # Ignore changes to specific attributes
    ignore_changes = [
      ami,  # Allow manual AMI updates
      tags["CreatedAt"],
    ]

    # Caution: prevent_destroy would also block replacements (it forbids
    # any planned destroy), so it is shown separately below on a resource
    # that must never be recreated
  }
}

# Prevent deletion of critical resources
resource "aws_s3_bucket" "critical_data" {
  bucket = "critical-data-bucket"

  lifecycle {
    prevent_destroy = true
  }
}

Common Patterns

Multi-Environment with Workspaces

# Use Terraform workspaces for environment separation
# terraform workspace new dev
# terraform workspace new prod

locals {
  environment = terraform.workspace

  # Environment-specific configuration
  config = {
    dev = {
      instance_type = "t3.micro"
      instance_count = 1
    }
    prod = {
      instance_type = "t3.large"
      instance_count = 3
    }
  }

  # Note: this lookup fails on the built-in "default" workspace; select dev or prod first
  current_config = local.config[local.environment]
}

resource "aws_instance" "app" {
  count         = local.current_config.instance_count
  ami           = var.ami_id
  instance_type = local.current_config.instance_type

  tags = {
    Name        = "${local.environment}-app-${count.index}"
    Environment = local.environment
  }
}

For_Each for Resource Sets

# CORRECT: Use for_each for sets of similar resources
variable "availability_zones" {
  default = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

locals {
  # Derive one /24 per AZ from the VPC CIDR: 10.0.1.0/24, 10.0.2.0/24, ...
  subnet_cidrs = {
    for i, az in var.availability_zones :
    az => cidrsubnet("10.0.0.0/16", 8, i + 1)
  }
}

resource "aws_subnet" "private" {
  for_each = local.subnet_cidrs

  vpc_id            = aws_vpc.main.id
  cidr_block        = each.value
  availability_zone = each.key

  tags = {
    Name = "private-${each.key}"
  }
}

# Reference outputs from for_each
output "subnet_ids" {
  value = { for k, v in aws_subnet.private : k => v.id }
}

Remote State Data Sources

# Reference outputs from another Terraform state
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config = {
    bucket = "myapp-terraform-state"
    key    = "prod/vpc/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = data.terraform_remote_state.vpc.outputs.private_subnet_ids[0]
}

Null Resource for Provisioning

resource "null_resource" "setup_cluster" {
  # Trigger on cluster configuration changes
  triggers = {
    cluster_id = aws_eks_cluster.main.id
    config_hash = md5(file("${path.module}/kubeconfig.yaml"))
  }

  provisioner "local-exec" {
    command = <<-EOT
      kubectl apply -f ${path.module}/manifests/
      helm install myapp ./charts/myapp
    EOT

    environment = {
      KUBECONFIG = aws_eks_cluster.main.kubeconfig
    }
  }
}
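
On Terraform 1.4+, the built-in terraform_data resource can replace null_resource without requiring the null provider. A minimal sketch of the same trigger-and-provision pattern:

resource "terraform_data" "setup_cluster" {
  # Re-run the provisioner whenever the cluster ID changes
  triggers_replace = [aws_eks_cluster.main.id]

  provisioner "local-exec" {
    command = "kubectl apply -f ${path.module}/manifests/"
  }
}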

Anti-Patterns to Avoid

❌ DON'T: Hardcode values

# WRONG: Hardcoded configuration
resource "aws_instance" "app" {
  ami           = "ami-12345678"  # Will break in other regions!
  instance_type = "t3.micro"

  tags = {
    Environment = "production"  # Hardcoded environment
  }
}

# CORRECT: Use variables and data sources
data "aws_ami" "app" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["myapp-*"]
  }
}

resource "aws_instance" "app" {
  ami           = data.aws_ami.app.id
  instance_type = var.instance_type

  tags = merge(local.common_tags, {
    Name = "${var.environment}-app"
  })
}

❌ DON'T: Use count with lists that may reorder

# WRONG: Count with list - reordering causes recreations
variable "instance_names" {
  default = ["web1", "web2", "web3"]
}

resource "aws_instance" "app" {
  count         = length(var.instance_names)
  ami           = var.ami_id
  instance_type = "t3.micro"

  tags = {
    Name = var.instance_names[count.index]
  }
}
# If you remove "web2", "web3" gets destroyed and recreated!

# CORRECT: Use for_each with map
variable "instances" {
  default = {
    web1 = { type = "t3.micro" }
    web2 = { type = "t3.small" }
    web3 = { type = "t3.micro" }
  }
}

resource "aws_instance" "app" {
  for_each = var.instances

  ami           = var.ami_id
  instance_type = each.value.type

  tags = {
    Name = each.key
  }
}
# Now you can safely add/remove instances without affecting others

❌ DON'T: Mix environments in same state

# WRONG: All environments in one state file
# terraform apply  # Applies to dev AND prod!

# CORRECT: Separate directories and state files
# environments/dev/main.tf
# environments/prod/main.tf

❌ DON'T: Use local-exec for critical operations

# WRONG: Critical operations in local-exec
resource "null_resource" "bad_db_init" {
  provisioner "local-exec" {
    command = "psql -c 'CREATE DATABASE app'"
  }
}
# Not idempotent, invisible to plan/state, and drift is never detected

# CORRECT: Use proper resources or automation tools
resource "postgresql_database" "app" {
  name = "app"
}
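
The postgresql_database resource assumes the community cyrilgdn/postgresql provider is declared and configured. A sketch (connection details are illustrative):

terraform {
  required_providers {
    postgresql = {
      source  = "cyrilgdn/postgresql"
      version = "~> 1.21"
    }
  }
}

provider "postgresql" {
  host     = aws_db_instance.main.address
  username = "postgres"
  password = var.db_master_password  # illustrative variable
}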

Testing Strategy

# 1. Validate syntax
# terraform validate

# 2. Format code
# terraform fmt -recursive

# 3. Plan before apply
# terraform plan -out=tfplan

# 4. Use Terratest for automated testing (Go)
package test

import (
    "testing"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestVPCCreation(t *testing.T) {
    terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
        TerraformDir: "../modules/vpc",
        Vars: map[string]interface{}{
            "vpc_cidr": "10.0.0.0/16",
            "environment": "test",
        },
    })

    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    vpcID := terraform.Output(t, terraformOptions, "vpc_id")
    assert.NotEmpty(t, vpcID)
}
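
The project layout above also includes a .tflint.hcl. A minimal sketch enabling the AWS ruleset (the plugin version is illustrative):

# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.31.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

rule "terraform_naming_convention" {
  enabled = true
}

# Run with:
# tflint --init && tflint --recursive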

Security & Compliance

Sensitive Data Handling

# Declare the secret, then mark outputs as sensitive
resource "random_password" "db_password" {
  length  = 24
  special = true
}

output "database_password" {
  value     = random_password.db_password.result
  sensitive = true  # Redacted in CLI output (still stored in plaintext in state)
}

# Use AWS Secrets Manager or Parameter Store
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password"
}

resource "aws_db_instance" "main" {
  # ...
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
}

Policy as Code with Checkov

# Install checkov
pip install checkov

# Scan Terraform code
checkov -d ./terraform

# Example output:
# FAILED checks:
# - CKV_AWS_20: S3 Bucket has an ACL defined which allows public access
# - CKV_AWS_21: Ensure S3 bucket has versioning enabled
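
Individual findings can be suppressed inline with a justification, using Checkov's skip-comment syntax (resource names are illustrative):

resource "aws_s3_bucket" "logs" {
  #checkov:skip=CKV_AWS_21:Access-log bucket intentionally unversioned
  bucket = "myapp-logs"
}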

Terraform Security Checklist

  • ✅ Never commit sensitive values to version control
  • ✅ Use encrypted remote state backend
  • ✅ Enable MFA for state backend access
  • ✅ Scan code with Checkov, tfsec, or Snyk
  • ✅ Use least-privilege IAM roles for Terraform execution
  • ✅ Review and approve plans before apply
  • ✅ Use Sentinel or OPA for policy enforcement
  • ✅ Rotate credentials regularly

CI/CD Integration

# GitHub Actions example
name: Terraform

on:
  push:
    branches: [main]
  pull_request:

jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ./environments/prod

    steps:
      - uses: actions/checkout@v3

      - uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.6.0

      - name: Terraform Format
        run: terraform fmt -check

      - name: Terraform Init
        run: terraform init
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

      - name: Terraform Validate
        run: terraform validate

      - name: Terraform Plan
        run: terraform plan -out=tfplan
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply tfplan  # Saved plans apply without a confirmation prompt
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

Multi-Cloud Patterns

# Provider configuration for multi-cloud
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
  alias  = "primary"
}

provider "azurerm" {
  features {}
  alias = "secondary"
}

# Use providers with aliases
resource "aws_s3_bucket" "primary" {
  provider = aws.primary
  bucket   = "myapp-primary-data"
}

resource "azurerm_storage_account" "secondary" {
  provider            = azurerm.secondary
  name                = "myappsecondarystorage"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
}
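
The storage account references a resource group that the snippet leaves undeclared. A sketch (name and location are illustrative):

resource "azurerm_resource_group" "main" {
  provider = azurerm.secondary
  name     = "myapp-rg"
  location = "eastus"
}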

Related Skills

  • aws-cdk-development: Alternative IaC with programming languages
  • systematic-debugging: Debug Terraform state and plan issues
  • security-testing: Security scanning of Terraform configurations
  • test-driven-development: Test infrastructure code before deployment

Additional Resources

Example Questions to Ask

  • "How do I structure a multi-environment Terraform project?"
  • "What's the best way to manage Terraform state for a team?"
  • "Show me how to create reusable modules with input validation"
  • "How do I migrate existing AWS resources to Terraform?"
  • "What's the difference between count and for_each?"
  • "How do I implement a blue-green deployment with Terraform?"
  • "Show me how to integrate Terraform with GitHub Actions"
