Blue-Green Lambda Deployment Guide

This repository demonstrates Blue-Green and Canary deployments for AWS Lambda functions using CDK, CodeDeploy, and trunk-based development.

🎯 Overview

The deployment strategy ensures:

  • Zero-downtime deployments using Lambda aliases
  • Gradual traffic shifting with canary deployments
  • Automatic rollback on errors or alarms
  • Environment-specific strategies (faster in dev, safer in prod)

🏗️ Architecture

📋 Deployment Strategies by Environment

| Environment | Strategy | Traffic Shift | Duration |
|---|---|---|---|
| dev/test | ALL_AT_ONCE | Immediate 100% | ~1 min |
| qa/staging | CANARY_10PERCENT_5MINUTES | 10% → 100% | 5 mins |
| prod/preprod | CANARY_10PERCENT_15MINUTES | 10% → 100% | 15 mins |
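
One way to keep this mapping in a single place is a small lookup keyed by environment name. The sketch below is only illustrative: `deploymentConfigFor` and `CONFIG_BY_ENVIRONMENT` are hypothetical names, not part of this repository, and the returned strings are the strategy names from the table above.

```typescript
// Hypothetical helper: maps an environment name to the deployment strategy
// listed for it in the table above.
type DeploymentConfigName =
  | 'ALL_AT_ONCE'
  | 'CANARY_10PERCENT_5MINUTES'
  | 'CANARY_10PERCENT_15MINUTES';

const CONFIG_BY_ENVIRONMENT: Record<string, DeploymentConfigName> = {
  dev: 'ALL_AT_ONCE',
  test: 'ALL_AT_ONCE',
  qa: 'CANARY_10PERCENT_5MINUTES',
  staging: 'CANARY_10PERCENT_5MINUTES',
  preprod: 'CANARY_10PERCENT_15MINUTES',
  prod: 'CANARY_10PERCENT_15MINUTES',
};

function deploymentConfigFor(environment: string): DeploymentConfigName {
  const config = CONFIG_BY_ENVIRONMENT[environment];
  if (config === undefined) {
    // Fail loudly rather than silently picking a strategy for an unknown name
    throw new Error(`Unknown environment: ${environment}`);
  }
  return config;
}

console.log(deploymentConfigFor('staging')); // CANARY_10PERCENT_5MINUTES
```

Throwing on an unknown environment is a design choice in this sketch: it avoids accidentally deploying a new environment with the fastest strategy.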

🚀 Deployment Workflow

1. Deploy to Environment

# Deploy to dev
npm run deploy:dev

# Deploy to staging
npm run deploy:staging

# Deploy to prod
npm run deploy:prod

2. Monitor Deployment

# Check deployment status
npm run deployment:status -- \
--application example-api-dev \
--deployment-group SimpleStack-dev-ExampleApiDeploymentGroup

3. Rollback if Needed

# Manual rollback
npm run deployment:rollback -- \
--application example-api-dev \
--deployment-group SimpleStack-dev-ExampleApiDeploymentGroup

# Force rollback without confirmation
npm run deployment:rollback -- \
--application example-api-dev \
--deployment-group SimpleStack-dev-ExampleApiDeploymentGroup \
--force

🔄 Understanding Blue-Green Deployments

How Lambda Aliases Enable Blue-Green

  1. Version N (Current/Blue)

    • Production alias points to this version
    • Receives 100% of traffic
    • Stable and tested
  2. Version N+1 (New/Green)

    • New code deployed
    • Receives 0% traffic initially
    • Gradually receives more traffic
  3. Traffic Shifting

    • CodeDeploy gradually shifts traffic from N to N+1
    • Example (CANARY_10PERCENT_15MINUTES):
      • 0:00 - 10% to N+1, 90% to N
      • 15:00 - 100% to N+1, 0% to N
  4. Automatic Rollback

    • CloudWatch alarms monitor error rates
    • If errors exceed threshold, traffic shifts back to N
    • No code deployment needed for rollback
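
The canary schedule described above can be sketched as a small function of elapsed time. This is a toy model of the two-step shift, not CodeDeploy's implementation; `canarySplit` and `TrafficSplit` are hypothetical names introduced for illustration.

```typescript
// Toy model of a canary schedule: the new version gets `canaryPercent` of
// traffic until `intervalMinutes` have passed, then it gets all of it.
interface TrafficSplit {
  previousVersionPercent: number; // share of traffic still on version N
  newVersionPercent: number;      // share of traffic on version N+1
}

function canarySplit(
  elapsedMinutes: number,
  canaryPercent: number,
  intervalMinutes: number,
): TrafficSplit {
  if (elapsedMinutes < 0) {
    throw new Error('elapsedMinutes must not be negative');
  }
  const newVersionPercent =
    elapsedMinutes >= intervalMinutes ? 100 : canaryPercent;
  return {
    previousVersionPercent: 100 - newVersionPercent,
    newVersionPercent,
  };
}

// CANARY_10PERCENT_15MINUTES, sampled at the start and at the 15-minute mark
console.log(canarySplit(0, 10, 15));  // { previousVersionPercent: 90, newVersionPercent: 10 }
console.log(canarySplit(15, 10, 15)); // { previousVersionPercent: 0, newVersionPercent: 100 }
```

A rollback in this model is simply a return to 100% on version N, which is why no new code deployment is needed.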

Key Components

Lambda Alias

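import { Alias } from 'aws-cdk-lib/aws-lambda';

// `lambda` is the Function construct defined elsewhere in the stack
const productionAlias = new Alias(this, 'ProductionAlias', {
  aliasName: 'production',
  version: lambda.currentVersion,
});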
  • Points to a specific Lambda version
  • API Gateway routes traffic through the alias
  • Traffic shifting adjusts the alias's routing weights between the old and new versions until the alias points only at the new version
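
To make the weighted routing concrete, here is a minimal sketch of how an alias with a primary version and one additional weighted version might pick a version per invocation. The field names echo the Lambda alias settings `FunctionVersion` and `RoutingConfig.AdditionalVersionWeights`; `pickVersion` and `AliasRouting` are hypothetical, and the real selection happens inside the Lambda service.

```typescript
// Toy model of weighted alias routing.
interface AliasRouting {
  functionVersion: string; // primary version the alias points to
  additionalVersionWeights: Record<string, number>; // version -> weight in [0, 1]
}

// `roll` is a number in [0, 1), for example Math.random()
function pickVersion(alias: AliasRouting, roll: number): string {
  let cumulative = 0;
  for (const version of Object.keys(alias.additionalVersionWeights)) {
    cumulative += alias.additionalVersionWeights[version];
    if (roll < cumulative) {
      return version;
    }
  }
  // Any remaining weight goes to the primary version
  return alias.functionVersion;
}

// During the canary phase: version 1 is primary, version 2 gets 10%
const duringCanary: AliasRouting = {
  functionVersion: '1',
  additionalVersionWeights: { '2': 0.1 },
};

console.log(pickVersion(duringCanary, 0.05)); // 2
console.log(pickVersion(duringCanary, 0.5));  // 1
```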

CodeDeploy Deployment Group

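import { LambdaDeploymentConfig, LambdaDeploymentGroup } from 'aws-cdk-lib/aws-codedeploy';

new LambdaDeploymentGroup(this, 'DeploymentGroup', {
  alias: productionAlias,
  // Shift 10% of traffic first, then the remaining 90% after 15 minutes
  deploymentConfig: LambdaDeploymentConfig.CANARY_10PERCENT_15MINUTES,
  alarms: [errorAlarm],
  autoRollback: {
    failedDeployment: true,  // roll back when the deployment fails
    deploymentInAlarm: true, // roll back when an associated alarm fires
  },
});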
  • Manages traffic shifting between versions
  • Monitors CloudWatch alarms
  • Triggers automatic rollback on failures
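
The effect of the two `autoRollback` flags can be illustrated with a small decision function. This is a simplified sketch, not CodeDeploy's logic; the event names and `shouldRollBack` are hypothetical.

```typescript
// Toy model: which deployment events lead to a rollback for a given
// autoRollback configuration.
interface AutoRollbackConfig {
  failedDeployment: boolean;
  deploymentInAlarm: boolean;
}

type DeploymentEvent = 'DEPLOYMENT_FAILED' | 'ALARM_TRIGGERED' | 'SUCCEEDED';

function shouldRollBack(
  config: AutoRollbackConfig,
  event: DeploymentEvent,
): boolean {
  switch (event) {
    case 'DEPLOYMENT_FAILED':
      return config.failedDeployment;
    case 'ALARM_TRIGGERED':
      return config.deploymentInAlarm;
    case 'SUCCEEDED':
      return false;
  }
}

const rollbackConfig: AutoRollbackConfig = {
  failedDeployment: true,
  deploymentInAlarm: true,
};

console.log(shouldRollBack(rollbackConfig, 'ALARM_TRIGGERED')); // true
console.log(shouldRollBack(rollbackConfig, 'SUCCEEDED'));       // false
```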

CloudWatch Alarm

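import { Duration } from 'aws-cdk-lib';

// Errors for invocations through the alias, aggregated in one-minute periods
const errorAlarm = productionAlias.metricErrors({
  period: Duration.minutes(1),
}).createAlarm(this, 'ErrorAlarm', {
  threshold: 3,
  evaluationPeriods: 2,
});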
  • Tracks errors reported for the production alias in one-minute periods
  • Triggers rollback when the error count reaches the threshold (3) in 2 consecutive periods
  • Customizable per environment
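
How `threshold` and `evaluationPeriods` interact can be sketched with a simplified evaluator. It assumes a greater-than-or-equal comparison and ignores details such as missing data handling, so treat it as an approximation of CloudWatch's behavior; `isInAlarm` is a hypothetical name.

```typescript
// Simplified alarm evaluation: the alarm fires when each of the most recent
// `evaluationPeriods` data points is at or above `threshold`.
function isInAlarm(
  errorsPerPeriod: number[],
  threshold: number,
  evaluationPeriods: number,
): boolean {
  if (errorsPerPeriod.length < evaluationPeriods) {
    return false; // not enough data points yet
  }
  const recent = errorsPerPeriod.slice(-evaluationPeriods);
  return recent.every((errors) => errors >= threshold);
}

// threshold: 3, evaluationPeriods: 2 (one value per one-minute period)
console.log(isInAlarm([0, 3, 5], 3, 2)); // true: the last two periods both reach 3
console.log(isInAlarm([4, 0, 6], 3, 2)); // false: one of the last two periods is below 3
```

Requiring two consecutive periods trades a slower reaction for fewer rollbacks caused by a single noisy minute.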

🔍 Monitoring Deployments

CloudWatch Metrics

Monitor these metrics during deployments:

  1. Error Rate: Lambda invocation errors
  2. Duration: Lambda execution time
  3. Throttles: Rate limiting events
  4. Concurrent Executions: Active invocations
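
Lambda reports errors and invocations as separate counts, so an error rate has to be derived from the two. A minimal sketch, with `errorRatePercent` as a hypothetical helper name:

```typescript
// Derives an error rate (in percent) from error and invocation counts
// taken over the same time window.
function errorRatePercent(errors: number, invocations: number): number {
  if (invocations === 0) {
    return 0; // no traffic in the window, so no measurable error rate
  }
  return (errors * 100) / invocations;
}

console.log(errorRatePercent(5, 200)); // 2.5
```

During a canary phase the new version only sees a small share of traffic, so a rate computed across all invocations can hide a problem that a per-version count would reveal.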

Deployment States

| State | Description | Action |
|---|---|---|
| Created | Deployment queued | Wait |
| InProgress | Traffic shifting | Monitor alarms |
| Succeeded | Deployment complete | Verify functionality |
| Failed | Deployment failed | Check logs, rollback |
| Stopped | Manually stopped | Auto-rollback triggered |
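
A script that polls deployment status needs to know which states are final. The sketch below covers only the five states in the table above (CodeDeploy may report others); `isTerminal` and `NEXT_ACTION` are hypothetical names.

```typescript
// The states from the table above and the suggested action for each.
type DeploymentState =
  | 'Created'
  | 'InProgress'
  | 'Succeeded'
  | 'Failed'
  | 'Stopped';

const NEXT_ACTION: Record<DeploymentState, string> = {
  Created: 'Wait',
  InProgress: 'Monitor alarms',
  Succeeded: 'Verify functionality',
  Failed: 'Check logs, rollback',
  Stopped: 'Auto-rollback triggered',
};

// A terminal state will not change again, so polling can stop.
function isTerminal(state: DeploymentState): boolean {
  return state === 'Succeeded' || state === 'Failed' || state === 'Stopped';
}

console.log(isTerminal('InProgress')); // false
console.log(NEXT_ACTION['Failed']);    // Check logs, rollback
```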

🛠️ Troubleshooting

Deployment Failures

Issue: Deployment fails immediately

# Check CloudFormation stack events
aws cloudformation describe-stack-events \
--stack-name SimpleStack-dev

# Check Lambda function logs
aws logs tail /aws/lambda/example-api-dev --follow

Issue: High error rate during deployment

# Check CloudWatch alarms
aws cloudwatch describe-alarms \
--alarm-names example-api-dev-errors

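# View Lambda error counts per minute for the last 15 minutes (GNU date syntax)
aws cloudwatch get-metric-statistics \
--namespace AWS/Lambda \
--metric-name Errors \
--dimensions Name=FunctionName,Value=example-api-dev \
--start-time "$(date -u -d '15 minutes ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--period 60 \
--statistics Sum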

Rollback Scenarios

Automatic Rollback Triggers:

  • Error alarm threshold exceeded
  • Deployment timeout
  • Deployment stopped manually

Manual Rollback:

# Stop current deployment (triggers rollback)
npm run deployment:rollback -- \
--application example-api-dev \
--deployment-group SimpleStack-dev-ExampleApiDeploymentGroup

📊 Testing Blue-Green Deployment

1. Initial Deployment

# Deploy version 1
npm run deploy:dev

# Get API endpoint
aws cloudformation describe-stacks \
--stack-name SimpleStack-dev \
--query 'Stacks[0].Outputs[?OutputKey==`ApiUrl`].OutputValue' \
--output text

2. Make Code Changes

Edit src/lambda/exampleApi/index.ts:

// Change the message
message: 'Hello from Blue-Green Lambda v2!',
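
For orientation, a handler containing such a message might look roughly like the sketch below. The actual contents of src/lambda/exampleApi/index.ts are not shown in this guide, so the response shape and the `buildResponse` helper are assumptions made for illustration.

```typescript
// Hypothetical handler shape; only the `message` value comes from this guide.
interface ApiResponse {
  statusCode: number;
  body: string;
}

function buildResponse(): ApiResponse {
  return {
    statusCode: 200,
    body: JSON.stringify({
      // Change this string to tell versions apart during traffic shifting
      message: 'Hello from Blue-Green Lambda v2!',
    }),
  };
}

// Lambda would invoke the async handler; it simply returns the response
const handler = async (): Promise<ApiResponse> => buildResponse();
```

Keeping a version marker in the response body makes it easy to see which Lambda version served a given request.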

3. Deploy New Version

# This creates a new Lambda version and starts traffic shifting
npm run deploy:dev

4. Watch Traffic Shift

# Monitor deployment
npm run deployment:status -- \
--application example-api-dev \
--deployment-group SimpleStack-dev-ExampleApiDeploymentGroup

# Call API during deployment to see traffic distribution
while true; do
  curl https://your-api-url/
  sleep 1
done
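
To turn the raw responses into a traffic distribution, the collected messages can be tallied. The sketch below assumes each response's `message` value has already been extracted into an array; `tallyMessages` is a hypothetical helper.

```typescript
// Counts how often each message was seen and converts counts to percentages.
function tallyMessages(messages: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const message of messages) {
    counts.set(message, (counts.get(message) ?? 0) + 1);
  }
  const percentages = new Map<string, number>();
  counts.forEach((count, message) => {
    percentages.set(message, (count * 100) / messages.length);
  });
  return percentages;
}

// Made-up sample: 9 responses from the old version, 1 from the new one
const sampleMessages: string[] = [];
for (let i = 0; i < 9; i++) {
  sampleMessages.push('v1');
}
sampleMessages.push('v2');

const distribution = tallyMessages(sampleMessages);
console.log(distribution.get('v1')); // 90
console.log(distribution.get('v2')); // 10
```

With only a handful of requests the observed split can differ noticeably from the configured 10%, so a larger sample gives a more reliable picture.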

5. Test Rollback

# Stop deployment to trigger rollback
npm run deployment:rollback -- \
--application example-api-dev \
--deployment-group SimpleStack-dev-ExampleApiDeploymentGroup

🌿 Git Workflow (Trunk-Based Development)

Branch Strategy

# Create feature branch
git checkout -b feature/update-lambda-response

# Make changes
# Commit and push

# Create pull request to main
# After approval, merge triggers deployment

CI/CD Pipeline (GitHub Actions example)

name: Deploy Lambda

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test

  deploy:
    needs: test
    # Deploy only for pushes to main, not for pull requests
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Each job starts on a fresh runner, so dependencies are installed again.
      # AWS credentials must also be configured before the deploy step.
      - name: Install dependencies
        run: npm ci
      - name: Deploy to dev
        run: npm run deploy:dev
      - name: Check deployment status
        run: |
          npm run deployment:status -- \
            --application example-api-dev \
            --deployment-group SimpleStack-dev-ExampleApiDeploymentGroup

🔐 Security Considerations

  1. IAM Permissions: CodeDeploy requires permissions to update Lambda aliases
  2. API Gateway: Use API keys or authorizers for production
  3. Secrets: Store in AWS Secrets Manager, not environment variables
  4. VPC: Configure Lambda VPC settings for database access

💰 Cost Optimization

  • Dev Environment: ALL_AT_ONCE saves deployment time
  • Lambda Versions: Old versions are automatically deleted
  • CloudWatch Logs: Set retention periods appropriately
  • API Gateway: Use caching for frequently accessed endpoints