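Building Secure, Efficient, Production-Grade Containerized Infrastructure on AWS with Terraform
Building cloud infrastructure means coordinating networking, security, databases, and CI/CD. This article uses Terraform to implement a containerized application deployment that follows the AWS Well-Architected Framework. It covers a tiered VPC, serverless containers on Fargate, managed RDS PostgreSQL, and keyless GitHub Actions authentication through OIDC.
Architecture and technology stack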
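| Layer | Technology | Role |
|---|---|---|
| Infrastructure as code | Terraform + AWS Provider | Declarative, repeatable deployment of the full stack |
| Compute | Amazon ECS Fargate | Container orchestration with no servers to manage |
| Data storage | Amazon RDS PostgreSQL 16 | Managed relational database with automated backups |
| Traffic distribution | Application Load Balancer (ALB) | Public HTTP/HTTPS entry point and load balancing |
| Image hosting | Amazon ECR | Private Docker image storage with security scanning |
| Secrets management | AWS Secrets Manager | Generates and rotates sensitive credentials such as the database password |
| Network isolation | VPC + PrivateLink endpoints | Internal resources have no public access; traffic to AWS services stays inside AWS |
| Automated deployment | GitHub Actions + OIDC | CI/CD pipeline with no long-lived keys |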
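Terraform project layout
To keep the project maintainable, the configuration is split by function into the following files:
cloud-notes-app-iac/
├── variables.tf              # Configurable parameters
├── locals.tf                 # Local constants (common tags, region, and so on)
├── vpc-module.tf             # Network foundation (VPC, subnets, route tables, gateways)
├── sg-module.tf              # Tiered security groups
├── rds-module.tf             # PostgreSQL database instance
├── ecr-module.tf             # Container image repository
├── privatelink-module.tf     # VPC private endpoints
├── iam-module.tf             # Execution/task roles, GitHub OIDC
├── ecs-module.tf             # ECS cluster, task definition, service, autoscaling
├── outputs.tf                # Values printed after deployment
└── terraform.tfvars.example  # Example parameter file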
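1. A tiered, secure VPC network design
The network follows defense in depth: only the ALB is exposed to the internet, while ECS, RDS, and the other core resources run in private subnets and receive no public IP addresses.
Key design decisions
| Decision | Implementation | Rationale |
|---|---|---|
| Multi-AZ high availability | 2 public + 2 private subnets (across ap-northeast-1a/c) | Avoids a single point of failure, in line with the reliability pillar of the Well-Architected Framework |
| No public access for private resources | map_public_ip_on_launch = false | Prevents accidental exposure of containers and the database |
| DNS resolution enabled | enable_dns_hostnames = true, enable_dns_support = true | Lets the private DNS names of AWS services such as ECR and CloudWatch resolve inside the VPC |
Core Terraform configuration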
locals {
  azs          = slice(data.aws_availability_zones.available.names, 0, 2)
  cidr_base    = "10.10.0.0/16"
  pub_subnets  = [cidrsubnet(local.cidr_base, 8, 0), cidrsubnet(local.cidr_base, 8, 1)]
  priv_subnets = [cidrsubnet(local.cidr_base, 8, 2), cidrsubnet(local.cidr_base, 8, 3)]
  tags = {
    Project     = "cloud-notes-api"
    ManagedBy   = "Terraform"
    Environment = "UAT"
  }
}

data "aws_availability_zones" "available" {}

resource "aws_vpc" "main" {
  cidr_block           = local.cidr_base
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags                 = merge(local.tags, { Name = "cloud-notes-vpc" })
}

resource "aws_subnet" "public" {
  count                   = length(local.pub_subnets)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = local.pub_subnets[count.index]
  availability_zone       = local.azs[count.index]
  map_public_ip_on_launch = true
  tags                    = merge(local.tags, { Name = "cloud-notes-pub-subnet-${count.index + 1}" })
}

resource "aws_subnet" "private" {
  count                   = length(local.priv_subnets)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = local.priv_subnets[count.index]
  availability_zone       = local.azs[count.index]
  map_public_ip_on_launch = false
  tags                    = merge(local.tags, { Name = "cloud-notes-priv-subnet-${count.index + 1}" })
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
  tags   = merge(local.tags, { Name = "cloud-notes-igw" })
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
  tags = merge(local.tags, { Name = "cloud-notes-pub-rtb" })
}

resource "aws_route_table_association" "public" {
  count          = length(local.pub_subnets)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}
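The configuration above covers only the public route table, while the S3 gateway endpoint in section 4 references `aws_route_table.private`. A minimal sketch of what that private route table might look like follows. The `Name` tag and the association resource are assumptions, not taken from the original project. It deliberately has no default route, so private subnets have no path to the internet.

```hcl
# Sketch only: a private route table with no 0.0.0.0/0 route.
# Only the implicit local VPC route applies until the S3 gateway
# endpoint adds its prefix-list route.
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
  tags   = merge(local.tags, { Name = "cloud-notes-priv-rtb" }) # assumed name
}

# Attach both private subnets to the private route table.
resource "aws_route_table_association" "private" {
  count          = length(local.priv_subnets)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}
```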
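2. Least-privilege security group tiers
Internal trust relationships reference security group IDs instead of CIDR blocks, so access is granted to a specific tier rather than to whatever happens to hold an address in a range. Each tier opens only the ports it needs:
Internet → ALB-SG(80/443) → ECS-SG(3001) → RDS-SG(5432)
                              ↓
                     VPC-Endpoint-SG(443)
Core configuration snippet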
resource "aws_security_group" "alb" {
  vpc_id = aws_vpc.main.id
  name   = "cloud-notes-alb-sg"
  # Security group descriptions accept only a limited ASCII character set.
  description = "Allow public HTTP access to the ALB"
  ingress {
    from_port = 80
    to_port   = 80
    protocol  = "tcp"
    # IPv4 and IPv6 ranges go in separate arguments.
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
    description      = "Public HTTP ingress"
  }
  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
  tags = merge(local.tags, { Name = "cloud-notes-alb-sg" })
}

resource "aws_security_group" "ecs_task" {
  vpc_id      = aws_vpc.main.id
  name        = "cloud-notes-ecs-task-sg"
  description = "Allow the ALB to reach the container application port"
  ingress {
    from_port       = 3001
    to_port         = 3001
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
    description     = "ALB to container traffic on port 3001"
  }
  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
  tags = merge(local.tags, { Name = "cloud-notes-ecs-task-sg" })
}
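The snippet shows the first two tiers of the chain. The remaining two, which later sections reference as `aws_security_group.rds` and `aws_security_group.vpc_endpoint`, could be sketched along the same lines. The names, descriptions, and the choice to omit egress rules here are assumptions for illustration.

```hcl
# Sketch only: the database accepts PostgreSQL traffic from ECS tasks and nothing else.
resource "aws_security_group" "rds" {
  vpc_id      = aws_vpc.main.id
  name        = "cloud-notes-rds-sg" # assumed name
  description = "Allow PostgreSQL from ECS tasks only"
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.ecs_task.id]
    description     = "ECS tasks to PostgreSQL"
  }
  # No egress block: Terraform removes the default allow-all egress rule.
  tags = merge(local.tags, { Name = "cloud-notes-rds-sg" })
}

# Sketch only: interface endpoints accept HTTPS from ECS tasks.
resource "aws_security_group" "vpc_endpoint" {
  vpc_id      = aws_vpc.main.id
  name        = "cloud-notes-vpc-endpoint-sg" # assumed name
  description = "Allow HTTPS from ECS tasks to VPC endpoints"
  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.ecs_task.id]
    description     = "ECS tasks to AWS service endpoints"
  }
  tags = merge(local.tags, { Name = "cloud-notes-vpc-endpoint-sg" })
}
```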
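3. Managed PostgreSQL database
Setting manage_master_user_password lets RDS create and manage the master credentials itself, so nothing is hardcoded or handled by hand:
- RDS generates a strong password of more than 20 characters automatically
- The password is stored in Secrets Manager and rotated automatically (every 7 days by default; the schedule can be changed, for example to 30 days)
- The ECS execution role reads it through IAM permissions when a task starts
RDS configuration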
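resource "aws_db_subnet_group" "main" {
  name       = "cloud-notes-db-subnet-group"
  subnet_ids = aws_subnet.private[*].id
  tags       = merge(local.tags, { Name = "cloud-notes-db-subnet-group" })
}

resource "aws_db_parameter_group" "pg16" {
  name        = "cloud-notes-pg16-params"
  family      = "postgres16"
  description = "PostgreSQL 16 parameter group tuned for the cloud notes API"
  # Both parameters are measured in 8 kB pages. Formulas based on
  # DBInstanceClassMemory keep them within the memory of the chosen
  # instance class (db.t4g.micro has 1 GiB).
  parameter {
    name         = "shared_buffers"
    value        = "{DBInstanceClassMemory/32768}" # about 25% of instance memory
    apply_method = "pending-reboot"
  }
  parameter {
    name         = "effective_cache_size"
    value        = "{DBInstanceClassMemory/16384}" # about 50% of instance memory
    apply_method = "pending-reboot"
  }
  lifecycle { create_before_destroy = true }
  tags = merge(local.tags, { Name = "cloud-notes-pg16-params" })
}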

resource "aws_db_instance" "main" {
  identifier                    = "cloud-notes-db"
  db_name                       = "cloud_notes_db"
  engine                        = "postgres"
  engine_version                = "16.3"
  instance_class                = "db.t4g.micro"
  username                      = "api_admin"
  manage_master_user_password   = true
  master_user_secret_kms_key_id = aws_kms_key.db.arn
  storage_encrypted             = true
  kms_key_id                    = aws_kms_key.db.arn # without this, storage uses the default aws/rds key
  storage_type                  = "gp3"
  allocated_storage             = 20
  max_allocated_storage         = 100
  db_subnet_group_name          = aws_db_subnet_group.main.name
  parameter_group_name          = aws_db_parameter_group.pg16.name
  vpc_security_group_ids        = [aws_security_group.rds.id]
  multi_az                      = false # UAT setting; enable for production high availability
  publicly_accessible           = false
  skip_final_snapshot           = true # UAT setting; keep a final snapshot in production
  backup_retention_period       = 7
  tags                          = merge(local.tags, { Name = "cloud-notes-db" })
}

resource "aws_kms_key" "db" {
  description             = "Encrypts the RDS master user secret and storage"
  deletion_window_in_days = 7
  tags                    = merge(local.tags, { Name = "cloud-notes-db-kms-key" })
}
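For the execution role to read the managed secret at task start, it needs permission to fetch the secret and to decrypt it with the customer-managed key. A minimal sketch under that assumption is shown below. The policy and statement names are hypothetical, and `aws_iam_role.ecs_execution` is the role the task definition in section 5 refers to.

```hcl
# Sketch only: least-privilege access to the RDS-managed secret.
data "aws_iam_policy_document" "ecs_execution_secrets" {
  statement {
    sid       = "ReadDbSecret"
    actions   = ["secretsmanager:GetSecretValue"]
    resources = [aws_db_instance.main.master_user_secret[0].secret_arn]
  }
  statement {
    sid       = "DecryptDbSecret"
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.db.arn]
  }
}

resource "aws_iam_role_policy" "ecs_execution_secrets" {
  name   = "cloud-notes-read-db-secret" # assumed name
  role   = aws_iam_role.ecs_execution.id
  policy = data.aws_iam_policy_document.ecs_execution_secrets.json
}
```

Scoping both statements to a single secret and a single key keeps the execution role from reading any other secret in the account.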
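4. VPC private endpoints instead of a NAT gateway
ECS tasks in private subnets can reach AWS services such as ECR and CloudWatch through PrivateLink endpoints rather than a NAT gateway. This avoids NAT gateway charges and keeps the traffic off the public internet:
| Endpoint type | Service name | Purpose |
|---|---|---|
| Interface | com.amazonaws.ap-northeast-1.ecr.dkr | Docker registry calls (image pull and push) |
| Interface | com.amazonaws.ap-northeast-1.ecr.api | ECR API operations |
| Interface | com.amazonaws.ap-northeast-1.logs | Sending container logs to CloudWatch |
| Interface | com.amazonaws.ap-northeast-1.secretsmanager | Fetching database credentials |
| Gateway | com.amazonaws.ap-northeast-1.s3 | Access to S3, where ECR stores image layers (no ENI; configured through route tables) |
Endpoint configuration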
locals {
  privatelink_services = ["ecr.dkr", "ecr.api", "logs", "secretsmanager"]
}

resource "aws_vpc_endpoint" "interface" {
  for_each            = toset(local.privatelink_services)
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.aws_region}.${each.value}"
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
  subnet_ids          = aws_subnet.private[*].id
  security_group_ids  = [aws_security_group.vpc_endpoint.id]
  tags                = merge(local.tags, { Name = "cloud-notes-${each.value}-ep" })
}

resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = [aws_route_table.private.id]
  tags              = merge(local.tags, { Name = "cloud-notes-s3-ep" })
}
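5. ECS Fargate service with autoscaling
Fargate tasks run in private subnets with no public IP and are exposed through the ALB. The service is configured with autoscaling and a deployment circuit breaker.
Task definition snippet (environment variables and secret injection)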
resource "aws_ecs_task_definition" "api" {
  family                   = "cloud-notes-api-task"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 256
  memory                   = 512
  execution_role_arn       = aws_iam_role.ecs_execution.arn
  task_role_arn            = aws_iam_role.ecs_task.arn
  runtime_platform {
    operating_system_family = "LINUX"
    cpu_architecture        = "ARM64"
  }
  container_definitions = jsonencode([{
    name         = "cloud-notes-api-container"
    image        = "${aws_ecr_repository.api.repository_url}:${var.default_image_tag}"
    essential    = true
    portMappings = [{ containerPort = 3001, protocol = "tcp" }]
    # Injected from the RDS-managed secret when the task starts.
    secrets = [{
      name      = "DB_PASSWORD"
      valueFrom = "${aws_db_instance.main.master_user_secret[0].secret_arn}:password::"
      }, {
      name      = "DB_USERNAME"
      valueFrom = "${aws_db_instance.main.master_user_secret[0].secret_arn}:username::"
    }]
    environment = [
      { name = "DB_HOST", value = aws_db_instance.main.address },
      { name = "DB_NAME", value = "cloud_notes_db" },
      { name = "DB_PORT", value = "5432" }
    ]
    logConfiguration = {
      logDriver = "awslogs"
      options = {
        awslogs-group         = aws_cloudwatch_log_group.ecs.name,
        awslogs-region        = var.aws_region,
        awslogs-stream-prefix = "ecs",
        awslogs-create-group  = "true"
      }
    }
  }])
  # The CI/CD pipeline registers new revisions with updated image tags,
  # so Terraform ignores drift in the container definitions.
  lifecycle { ignore_changes = [container_definitions] }
  tags = merge(local.tags, { Name = "cloud-notes-api-task" })
}
Autoscaling policy
resource "aws_appautoscaling_target" "ecs_api" {
  max_capacity       = 6
  min_capacity       = 2
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.api.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "ecs_api_cpu" {
  name               = "cpu-target-tracking"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_api.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_api.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_api.service_namespace
  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value       = 60.0
    scale_in_cooldown  = 300
    scale_out_cooldown = 120
  }
}
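The section mentions a deployment circuit breaker, and the autoscaling target refers to `aws_ecs_service.api`. A minimal sketch of how such a service might be declared follows. The target group `aws_lb_target_group.api` is hypothetical. The service name matches the one used by the deployment workflow in section 6.

```hcl
# Sketch only: a Fargate service in private subnets behind the ALB.
resource "aws_ecs_service" "api" {
  name            = "cloud-notes-api-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.api.arn
  desired_count   = 2
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = aws_subnet.private[*].id
    security_groups  = [aws_security_group.ecs_task.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.api.arn # hypothetical target group
    container_name   = "cloud-notes-api-container"
    container_port   = 3001
  }

  # Stop a failing rollout and return to the last healthy revision.
  deployment_circuit_breaker {
    enable   = true
    rollback = true
  }

  # CI/CD changes the task definition and autoscaling changes the count,
  # so Terraform leaves both alone after creation.
  lifecycle {
    ignore_changes = [task_definition, desired_count]
  }
}
```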
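6. Keyless CI/CD with GitHub Actions OIDC
The workflow presents a token from GitHub's OIDC provider to AWS STS and receives temporary credentials in return (valid for 1 hour by default), so no AWS access keys need to be stored in the repository.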
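On the AWS side, this exchange depends on an IAM OIDC identity provider and a role whose trust policy accepts tokens from one repository and branch only. The following is a minimal sketch. The variable `github_repo` and the role name are hypothetical and do not come from the original project.

```hcl
# Hypothetical input: the repository allowed to assume the role, as "owner/name".
variable "github_repo" {
  type = string
}

resource "aws_iam_openid_connect_provider" "github" {
  url            = "https://token.actions.githubusercontent.com"
  client_id_list = ["sts.amazonaws.com"]
  # Note: older AWS provider versions also require thumbprint_list here.
}

# Trust policy: only workflows on the main branch of the given repository.
data "aws_iam_policy_document" "github_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:${var.github_repo}:ref:refs/heads/main"]
    }
  }
}

resource "aws_iam_role" "github_actions" {
  name                 = "cloud-notes-github-actions" # assumed name
  assume_role_policy   = data.aws_iam_policy_document.github_trust.json
  max_session_duration = 3600 # one hour, in seconds
}
```

The `sub` condition is what keeps other repositories, forks, and branches from assuming the role, so it is worth making as specific as the workflow allows.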
GitHub Actions workflow (simplified)
name: Cloud Notes API CI/CD
on:
  push:
    branches: [ "main" ]
permissions:
  id-token: write
  contents: read
jobs:
  build-scan-push:
    runs-on: ubuntu-latest
    outputs:
      img_tag: ${{ steps.meta.outputs.short-sha }}
      ecr_reg: ${{ steps.ecr-login.outputs.registry }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with: { go-version: '1.23', cache: true }
      - run: go test -v ./...
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_GITHUB_OIDC_ROLE_ARN }}
          aws-region: ${{ secrets.AWS_REGION }}
      - id: ecr-login
        uses: aws-actions/amazon-ecr-login@v2
      - id: meta
        run: echo "short-sha=${GITHUB_SHA::7}" >> "$GITHUB_OUTPUT"
      # The task definition targets ARM64, so build an arm64 image on the x86_64 runner.
      - uses: docker/setup-qemu-action@v3
      - run: |
          docker build --platform linux/arm64 -t ${{ steps.ecr-login.outputs.registry }}/${{ secrets.ECR_REPO_NAME }}:${{ steps.meta.outputs.short-sha }} .
          docker push ${{ steps.ecr-login.outputs.registry }}/${{ secrets.ECR_REPO_NAME }}:${{ steps.meta.outputs.short-sha }}
  deploy:
    needs: build-scan-push
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_GITHUB_OIDC_ROLE_ARN }}
          aws-region: ${{ secrets.AWS_REGION }}
      - run: |
          # Scanning is asynchronous; wait for it to finish before reading the findings.
          aws ecr wait image-scan-complete --repository-name ${{ secrets.ECR_REPO_NAME }} --image-id imageTag=${{ needs.build-scan-push.outputs.img_tag }}
          FINDINGS=$(aws ecr describe-image-scan-findings --repository-name ${{ secrets.ECR_REPO_NAME }} --image-id imageTag=${{ needs.build-scan-push.outputs.img_tag }} --query 'imageScanFindings.findingSeverityCounts.HIGH || imageScanFindings.findingSeverityCounts.CRITICAL' --output text)
          # Fail the deployment if any HIGH or CRITICAL findings are reported.
          if [ "$FINDINGS" != "None" ] && [ "$FINDINGS" != "" ]; then exit 1; fi
      - run: |
          # Copy the current task definition, swap in the new image, and drop read-only fields.
          aws ecs describe-task-definition --task-definition cloud-notes-api-task --query taskDefinition > task.json
          jq --arg IMG "${{ needs.build-scan-push.outputs.ecr_reg }}/${{ secrets.ECR_REPO_NAME }}:${{ needs.build-scan-push.outputs.img_tag }}" '.containerDefinitions[0].image = $IMG | del(.taskDefinitionArn,.revision,.status,.requiresAttributes,.compatibilities,.registeredAt,.registeredBy)' task.json > new-task.json
          NEW_REV=$(aws ecs register-task-definition --cli-input-json file://new-task.json | jq -r '.taskDefinition.revision')
          aws ecs update-service --cluster cloud-notes-cluster --service cloud-notes-api-service --task-definition cloud-notes-api-task:${NEW_REV}
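The scan gate in the deploy job only works if ECR actually scans each pushed image. A minimal sketch of a repository configured for that is shown below. The repository name is an assumption, while `aws_ecr_repository.api` is the resource the task definition already references.

```hcl
# Sketch only: scan every image on push so the pipeline can query the findings.
resource "aws_ecr_repository" "api" {
  name                 = "cloud-notes-api" # assumed repository name
  image_tag_mutability = "IMMUTABLE"       # a pushed tag can never be overwritten

  image_scanning_configuration {
    scan_on_push = true
  }

  tags = merge(local.tags, { Name = "cloud-notes-api-ecr" })
}
```

Immutable tags fit the workflow's short-SHA tagging scheme, since each commit produces its own tag and no tag is ever reused.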