AWS Deployment

Step-by-step guide for deploying AskGov on AWS infrastructure


This guide assumes you are already familiar with AWS services and covers only the requirements and best practices that are specific to AskGov.

Deployment Duration: 4-6 hours (excluding DNS propagation)

Complexity Level: Intermediate to Advanced

Prerequisites: AWS account with appropriate permissions, familiarity with AWS services


Note: This guide focuses on AskGov-specific configurations. For general AWS setup instructions, refer to AWS documentation.

Pre-Deployment Checklist

Required AWS Services

Prerequisites

Architecture Overview

Step 1: Network Infrastructure

Key Considerations for AskGov

  • Multi-AZ deployment for high availability

  • Private subnets for application and database tiers

  • Public subnets only for load balancer and NAT gateways

  • Strict security groups limiting traffic between tiers

Security Groups Required

  1. ALB Security Group

    • Inbound: 80, 443 from internet

    • Outbound: 8080 to App Security Group

  2. App Security Group

    • Inbound: 8080 from ALB Security Group

    • Outbound: 5432 (PostgreSQL) or 26257 (CockroachDB) to DB, 6379 to Redis, 443 to internet

  3. Database Security Group

    • Inbound: 5432 (PostgreSQL) or 26257 (CockroachDB) from App Security Group only

    • No outbound rules needed

  4. Redis Security Group

    • Inbound: 6379 from App Security Group only
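
One way to keep the four groups consistent is to describe them as data and check that every outbound rule is admitted by a matching inbound rule on the target group. The sketch below is illustrative only: the group keys (`alb`, `app`, `db`, `redis`) and the helper names are hypothetical and are not part of AskGov or any AWS API.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    port: int
    peer: str  # key of another group, or "internet"


# Ports and peers are copied from the list above; the keys are hypothetical labels.
SECURITY_GROUPS = {
    "alb": {
        "inbound": [Rule(80, "internet"), Rule(443, "internet")],
        "outbound": [Rule(8080, "app")],
    },
    "app": {
        "inbound": [Rule(8080, "alb")],
        "outbound": [
            Rule(5432, "db"),
            Rule(26257, "db"),
            Rule(6379, "redis"),
            Rule(443, "internet"),
        ],
    },
    "db": {
        "inbound": [Rule(5432, "app"), Rule(26257, "app")],
        "outbound": [],
    },
    "redis": {
        "inbound": [Rule(6379, "app")],
        "outbound": [],  # the guide lists no outbound rules for Redis
    },
}


def unmatched_egress(groups):
    """Return (source, port, target) for outbound rules the target does not admit."""
    problems = []
    for source, rules in groups.items():
        for rule in rules["outbound"]:
            if rule.peer == "internet":
                continue
            if Rule(rule.port, source) not in groups[rule.peer]["inbound"]:
                problems.append((source, rule.port, rule.peer))
    return problems


def internet_facing(groups):
    """Return the groups that accept traffic directly from the internet."""
    return sorted(
        name
        for name, rules in groups.items()
        if any(rule.peer == "internet" for rule in rules["inbound"])
    )
```

With the rules as listed, `unmatched_egress(SECURITY_GROUPS)` is empty and only the ALB group is internet-facing, which matches the tiering described under Key Considerations.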

Step 2: Database Setup

Option A: RDS PostgreSQL (Simpler)

Specifications for AskGov:

  • Engine: PostgreSQL 14+

  • Instance: db.t3.medium minimum (adjust based on load)

  • Storage: 100 GB gp3 SSD, encrypted

  • Multi-AZ: Required for production

  • Automated backups: 30-day retention
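
The specifications above map closely onto the parameters of the RDS `CreateDBInstance` API. The following is a minimal sketch of that mapping, not AskGov's actual provisioning code: the instance identifier is a hypothetical name, and the engine version, credentials and networking parameters are deliberately left out.

```python
# Keyword arguments shaped like the RDS CreateDBInstance API, e.g. for
# boto3's rds.create_db_instance(**RDS_PARAMS). Values come from the list above.
RDS_PARAMS = {
    "DBInstanceIdentifier": "askgov-prod",  # hypothetical identifier
    "Engine": "postgres",
    "DBInstanceClass": "db.t3.medium",
    "AllocatedStorage": 100,  # GiB
    "StorageType": "gp3",
    "StorageEncrypted": True,
    "MultiAZ": True,
    "BackupRetentionPeriod": 30,  # days; RDS allows at most 35
    # Omitted on purpose: EngineVersion (pin a 14+ release available in your
    # region), master credentials, subnet group and security group IDs.
}


def production_violations(params):
    """Return a list of ways the parameters fall short of the production spec."""
    problems = []
    if not params.get("MultiAZ"):
        problems.append("MultiAZ must be enabled for production")
    if not params.get("StorageEncrypted"):
        problems.append("storage must be encrypted")
    retention = params.get("BackupRetentionPeriod", 0)
    if not 30 <= retention <= 35:
        problems.append("backup retention should be 30 days (RDS maximum is 35)")
    return problems
```

Running `production_violations` in a deployment pipeline before calling the API is one inexpensive way to catch a staging configuration that was copied into production by mistake.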

AskGov-specific configurations:

Option B: CockroachDB on EC2 (Full Compatibility)

Why CockroachDB for AskGov:

  • Full compatibility with AskGov's Prisma schemas

  • Better horizontal scaling for large deployments

  • Built-in geo-replication capabilities

Minimum 3-node cluster:

  • Instance type: m5.xlarge

  • Storage: 500GB SSD per node

  • Placement: Different AZs
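
As a rough illustration of the three-node layout, the sketch below builds one `cockroach start` command per node, with each node advertising its own address, joining the same node list, and recording its Availability Zone as locality. The hostnames, zone names and store path are hypothetical placeholders.

```python
import shlex

# Hypothetical private hostnames and Availability Zones, one node per AZ.
NODES = [
    ("crdb-1.askgov.internal", "us-east-1a"),
    ("crdb-2.askgov.internal", "us-east-1b"),
    ("crdb-3.askgov.internal", "us-east-1c"),
]
PORT = 26257  # same port the App Security Group is allowed to reach


def start_command(host, zone, nodes=NODES):
    """Build the argument list for `cockroach start` on one node."""
    join = ",".join(f"{h}:{PORT}" for h, _ in nodes)
    return [
        "cockroach",
        "start",
        "--certs-dir=certs",
        "--store=/mnt/cockroach-data",  # hypothetical mount point for the data volume
        f"--advertise-addr={host}:{PORT}",
        f"--join={join}",
        f"--locality=zone={zone}",
    ]


def cluster_commands(nodes=NODES):
    """Return one shell command per node, enforcing the placement rule above."""
    zones = [zone for _, zone in nodes]
    if len(nodes) < 3 or len(set(zones)) != len(zones):
        raise ValueError("need at least 3 nodes, each in a different AZ")
    return [shlex.join(start_command(host, zone, nodes)) for host, zone in nodes]
```

After all nodes have started, the cluster still has to be initialized once with `cockroach init` against any one node.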

Step 3: Cache Layer (Redis)

ElastiCache Configuration for AskGov

Cache strategy:

  • Session storage

  • Search results caching (5-minute TTL)

  • Rate limiting counters

  • Popular questions cache

Recommended setup:

  • Node type: cache.t3.micro for < 100k users

  • Node type: cache.r6g.large for > 100k users

  • Parameter group: Custom with maxmemory-policy allkeys-lru
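
Rate limiting counters are commonly implemented in Redis by incrementing a per-client key and letting it expire at the end of the window (the `INCR` plus `EXPIRE` pattern). The sketch below is an in-memory stand-in for that pattern so it can run without a Redis server; the class name, limit and window length are assumptions, not AskGov settings.

```python
import time


class FixedWindowLimiter:
    """In-memory stand-in for a Redis INCR + EXPIRE fixed-window counter."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._counts = {}

    def allow(self, client_id):
        """Count one request and report whether it is within the limit."""
        window_index = int(self._clock() // self.window)
        # Drop counters from earlier windows; Redis would expire these keys.
        self._counts = {
            key: count for key, count in self._counts.items() if key[1] == window_index
        }
        key = (client_id, window_index)
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key] <= self.limit
```

Because rate limiting keys expire on their own, they coexist well with the `allkeys-lru` policy: under memory pressure Redis evicts the least recently used keys first, which tends to be stale cache entries rather than active counters.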

Step 4: Search Engine (Weaviate)

Weaviate Deployment Considerations

Important: Weaviate requires a dedicated EC2 instance (no AWS-managed service is available)

Instance requirements:

  • t3.large minimum (8 GiB RAM; allow at least 4 GB for vectorization)

  • Persistent EBS volume for data

  • Private subnet deployment
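
Weaviate is configured mainly through environment variables. The sketch below renders an env file that enables API-key authentication, disables anonymous access and points persistence at the EBS-backed path. Which vectorizer module AskGov uses is an assumption here (`text2vec-openai` is only an example), as are the user name and data path.

```python
def weaviate_env(api_key, vectorizer_module="text2vec-openai"):
    """Return Weaviate environment variables for a locked-down single node.

    The module name, user name and data path are illustrative assumptions.
    """
    return {
        "PERSISTENCE_DATA_PATH": "/var/lib/weaviate",  # mount the EBS volume here
        "AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED": "false",
        "AUTHENTICATION_APIKEY_ENABLED": "true",
        "AUTHENTICATION_APIKEY_ALLOWED_KEYS": api_key,
        "AUTHENTICATION_APIKEY_USERS": "askgov-app",  # hypothetical user name
        "ENABLE_MODULES": vectorizer_module,
        "DEFAULT_VECTORIZER_MODULE": vectorizer_module,
    }


def render_env_file(env):
    """Render the variables as KEY=value lines, sorted for stable diffs."""
    return "\n".join(f"{key}={value}" for key, value in sorted(env.items())) + "\n"
```

The key passed as `api_key` would be the same value stored as `WEAVIATE_API_KEY` for the application, so that both sides agree.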

AskGov-specific Weaviate configuration:

Step 5: Application Deployment

Container Configuration

Docker image considerations:

  • Multi-stage build to minimize size

  • Non-root user for security

  • Health check endpoint included

ECS Task Definition Key Settings

Environment Variables in Secrets Manager

Store these securely:

  • DATABASE_URL - Full connection string

  • SESSION_SECRET - 32+ character random string

  • REDIS_URL - ElastiCache endpoint

  • WEAVIATE_API_KEY - Weaviate authentication

  • VECTORIZER_API_KEY - OpenAI or Cohere API key, if a hosted vectorizer is used
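
In an ECS task definition, values held in Secrets Manager are injected through the container's `secrets` list, where each entry pairs an environment variable name with a `valueFrom` ARN. The sketch below builds that list for the variables above and generates a session secret of sufficient length. The ARN prefix is a placeholder with a documentation-style account ID; real secret ARNs end in a generated suffix, so copy them from Secrets Manager rather than constructing them.

```python
import secrets

SECRET_NAMES = [
    "DATABASE_URL",
    "SESSION_SECRET",
    "REDIS_URL",
    "WEAVIATE_API_KEY",
    "VECTORIZER_API_KEY",
]

# Placeholder only: region, account ID and the "askgov/" naming are assumptions.
ARN_PREFIX = "arn:aws:secretsmanager:us-east-1:123456789012:secret:askgov/"


def container_secrets(names=SECRET_NAMES, arn_prefix=ARN_PREFIX):
    """Build the `secrets` list for an ECS container definition."""
    return [{"name": name, "valueFrom": arn_prefix + name} for name in names]


def new_session_secret():
    """Return a URL-safe random string built from 32 random bytes."""
    return secrets.token_urlsafe(32)
```

The task execution role also needs permission to read these secrets, otherwise the task fails before the container starts.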

Step 6: Load Balancer Configuration

ALB Settings for AskGov

Target Group Configuration:

  • Health check path: /health or /

  • Health check interval: 30 seconds

  • Deregistration delay: 30 seconds (for graceful shutdown)

Listener Rules:

  • HTTP → HTTPS redirect

  • Host-based routing if serving multiple agencies

  • Path-based routing for API vs frontend
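
The target group and listener settings above can be written as request shapes for the Elastic Load Balancing v2 API. This is a sketch under assumptions: the target group name, the `ip` target type (which suits tasks using `awsvpc` networking), the example host name and the `/api/*` path are all hypothetical, and required fields such as `VpcId` are omitted.

```python
# Shape of a CreateTargetGroup request (VpcId omitted).
TARGET_GROUP = {
    "Name": "askgov-app",  # hypothetical name
    "Protocol": "HTTP",
    "Port": 8080,  # matches the App Security Group inbound port
    "TargetType": "ip",  # assumption: tasks use awsvpc networking
    "HealthCheckPath": "/health",
    "HealthCheckIntervalSeconds": 30,
}

# Deregistration delay is a target group attribute, set separately.
TARGET_GROUP_ATTRIBUTES = [
    {"Key": "deregistration_delay.timeout_seconds", "Value": "30"},
]

# Default action for the port 80 listener: permanent redirect to HTTPS.
HTTP_TO_HTTPS_REDIRECT = {
    "Type": "redirect",
    "RedirectConfig": {"Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"},
}


def api_rule(priority, host, target_group_arn, path="/api/*"):
    """Build a listener rule that forwards one agency's API traffic."""
    return {
        "Priority": priority,
        "Conditions": [
            {"Field": "host-header", "HostHeaderConfig": {"Values": [host]}},
            {"Field": "path-pattern", "PathPatternConfig": {"Values": [path]}},
        ],
        "Actions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }
```

For example, `api_rule(10, "agency-a.example.gov", arn)` routes only requests for that host and under `/api/` to the given target group; everything else falls through to the listener's default action.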

Step 7: Storage Configuration

S3 Buckets Required

  1. askgov-uploads - User file uploads

    • Versioning: Enabled

    • Encryption: SSE-S3

    • Lifecycle: Archive after 90 days

  2. askgov-exports - Data exports

    • Encryption: SSE-KMS

    • Access: Restricted to app role

  3. askgov-backups - Database backups

    • Encryption: SSE-KMS

    • Lifecycle: Delete after retention period

    • Cross-region replication recommended
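
The bucket settings can be captured as the configuration documents that the S3 API expects. In this sketch, "archive" is interpreted as a transition to the `GLACIER` storage class, which is an assumption; the backup retention period is not stated above, so no expiration rule is shown for it. Bucket names are globally unique, so the names listed will likely need an organization-specific prefix or suffix.

```python
def encryption_config(algorithm):
    """Default-encryption document: "AES256" for SSE-S3, "aws:kms" for SSE-KMS."""
    if algorithm not in ("AES256", "aws:kms"):
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": algorithm}}]}


# Lifecycle document for the uploads bucket: archive objects after 90 days.
UPLOADS_LIFECYCLE = {
    "Rules": [
        {
            "ID": "archive-after-90-days",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to every object
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],  # assumed class
        }
    ]
}

BUCKETS = {
    "askgov-uploads": {
        "versioning": {"Status": "Enabled"},
        "encryption": encryption_config("AES256"),
        "lifecycle": UPLOADS_LIFECYCLE,
    },
    "askgov-exports": {
        "encryption": encryption_config("aws:kms"),
    },
    "askgov-backups": {
        "encryption": encryption_config("aws:kms"),
        # Expiration rule omitted: the retention period is deployment-specific.
    },
}
```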

Step 8: Post-Deployment Tasks

8.1 Database Initialization

8.2 Weaviate Search Initialization

8.3 Create First Admin User

Step 9: Monitoring Setup

CloudWatch Dashboards

Create dashboards for:

  • Application metrics (CPU, memory, request count)

  • Database performance (connections, query time)

  • Search latency (Weaviate response times)

  • User activity (questions created, searches performed)

Key Alarms to Configure

  1. High CPU Usage (> 70% for 5 minutes)

  2. Memory Pressure (> 85%)

  3. Database Connection Pool (> 80% utilized)

  4. 4XX/5XX Error Rate (> 1% of requests)

  5. Search Latency (> 1 second p99)
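
The first two alarms correspond to built-in ECS service metrics and can be expressed as CloudWatch `PutMetricAlarm` parameters. The sketch below models "for 5 minutes" as five consecutive one-minute periods. The cluster and service names are placeholders, and applying the same five-minute window to the memory alarm is an assumption, since no duration is given for it above.

```python
def ecs_alarm(name, metric, threshold_percent, minutes, cluster, service):
    """Build PutMetricAlarm parameters for an ECS service utilization metric."""
    return {
        "AlarmName": name,
        "Namespace": "AWS/ECS",
        "MetricName": metric,  # "CPUUtilization" or "MemoryUtilization"
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "ServiceName", "Value": service},
        ],
        "Statistic": "Average",
        "Period": 60,  # seconds per datapoint
        "EvaluationPeriods": minutes,  # consecutive breaching datapoints required
        "Threshold": float(threshold_percent),
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Placeholder cluster and service names.
ALARMS = [
    ecs_alarm("askgov-high-cpu", "CPUUtilization", 70, 5, "askgov", "askgov-app"),
    ecs_alarm("askgov-memory-pressure", "MemoryUtilization", 85, 5, "askgov", "askgov-app"),
]
```

The remaining alarms need more than a single built-in metric: connection pool utilization and search latency depend on custom application metrics, and an error rate expressed as a percentage of requests requires CloudWatch metric math.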

Logging Strategy

  • Application logs → CloudWatch Logs

  • Access logs → S3 with analysis via Athena

  • Security events → CloudWatch Logs with alerts

  • Audit logs → Separate encrypted log group

Step 10: Backup and Disaster Recovery

Backup Components

  1. Database: Automated RDS backups + manual snapshots

  2. Application State: S3 versioning for uploads

  3. Configuration: Infrastructure as Code in Git

  4. Search Index: Periodic Weaviate exports

Recovery Time Objectives

  • RTO: 4 hours (full recovery)

  • RPO: 1 hour (maximum data loss)

Performance Optimization

Scaling Triggers

Configure auto-scaling based on:

  • CPU utilization > 70%

  • Memory utilization > 80%

  • Request count > threshold

  • Queue depth (if using SQS)
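
One common way to implement the CPU and memory triggers for an ECS service is a target tracking policy in Application Auto Scaling. Note that this is a slightly different technique from a fixed threshold: the service is scaled to keep the metric near the target value rather than reacting only once the value is exceeded. The cooldown values in this sketch are assumptions.

```python
def target_tracking(metric_type, target_value, resource_label=None):
    """Build a TargetTrackingScalingPolicyConfiguration document."""
    spec = {"PredefinedMetricType": metric_type}
    if resource_label is not None:
        # Required for ALBRequestCountPerTarget to identify the target group.
        spec["ResourceLabel"] = resource_label
    return {
        "TargetValue": float(target_value),
        "PredefinedMetricSpecification": spec,
        "ScaleOutCooldown": 60,  # seconds; assumed value
        "ScaleInCooldown": 300,  # seconds; assumed value
    }


POLICIES = {
    "cpu": target_tracking("ECSServiceAverageCPUUtilization", 70),
    "memory": target_tracking("ECSServiceAverageMemoryUtilization", 80),
}
```

A request-count policy would use the `ALBRequestCountPerTarget` metric type with a resource label and a target chosen from load testing, since no threshold is specified above.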

Caching Strategy

  1. CloudFront: Static assets, 30-day cache

  2. Redis: Session data, search results (5 min), popular questions (1 hour)

  3. Application: In-memory cache for frequently accessed data
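
Keeping the cache lifetimes in one table avoids the CDN, Redis and application drifting apart. A small sketch, with hypothetical names, that converts the durations above into seconds and into a `Cache-Control` header for static assets:

```python
from datetime import timedelta

# Durations taken from the caching strategy above; the keys are hypothetical.
CACHE_TTLS = {
    "static_assets": timedelta(days=30),
    "search_results": timedelta(minutes=5),
    "popular_questions": timedelta(hours=1),
}


def ttl_seconds(name):
    """Return the TTL in whole seconds, as Redis and HTTP headers expect."""
    return int(CACHE_TTLS[name].total_seconds())


def cache_control(name):
    """Return a Cache-Control header value for publicly cacheable content."""
    return f"public, max-age={ttl_seconds(name)}"
```

For static assets this yields `public, max-age=2592000`, which is 30 days expressed in seconds.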

Security Hardening

AWS-Specific Security

  1. Enable AWS WAF with managed rule sets

  2. Configure AWS Shield for DDoS protection

  3. Use AWS Systems Manager for patch management

  4. Enable VPC Flow Logs for network monitoring

  5. Configure Amazon GuardDuty for threat detection

Compliance Features

  • AWS CloudTrail: API audit logging

  • AWS Config: Compliance monitoring

  • AWS Security Hub: Centralized security view

  • Amazon Macie: Sensitive data discovery (if needed)

Cost Optimization

Cost Reduction Strategies

  1. Use Spot Instances for non-critical workloads

  2. Reserved Instances for predictable workloads (up to 72% savings)

  3. S3 Intelligent-Tiering for automatic cost optimization

  4. Scheduled scaling for non-production environments

  5. Right-sizing based on CloudWatch metrics

Estimated Monthly Costs

Deployment Size    Users       Estimated Cost
Small              < 100k      $500-800
Medium             100k-1M     $1,500-2,500
Large              > 1M        $4,000-8,000

Note: Costs vary by region and usage patterns

Troubleshooting

Common Issues

ECS Tasks Not Starting

  • Check task role permissions

  • Verify secrets/environment variables

  • Review CloudWatch logs

  • Ensure health checks pass

Database Connection Issues

  • Verify security group rules

  • Check RDS parameter group settings

  • Confirm password in Secrets Manager

  • Test from bastion host
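
When testing from the bastion host, a plain TCP check separates network problems (security groups, routing) from credential problems. A minimal sketch using only the standard library; the host names in the usage note are placeholders.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

For example, `can_connect("your-db-endpoint", 5432)` for RDS PostgreSQL or port 26257 for CockroachDB. If this returns `False`, review the security group rules first; if it returns `True` but the application still cannot connect, the connection string or password is the more likely cause.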

Search Not Working

  • Verify Weaviate is running

  • Check API key configuration

  • Ensure vectorization module is loaded

  • Review Weaviate logs

High Memory Usage

  • Review Node.js heap settings

  • Check for memory leaks

  • Scale horizontally

  • Optimize database queries
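
For the Node.js heap setting, a common rule of thumb is to cap the V8 old-space heap below the container's memory limit so the process is not killed by the container runtime before garbage collection runs. The 75% fraction in this sketch is an assumption to tune, not an AskGov recommendation.

```python
def node_heap_mib(container_memory_mib, fraction=0.75):
    """Return a heap cap in MiB that leaves headroom under the container limit."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(container_memory_mib * fraction)


def node_options(container_memory_mib, fraction=0.75):
    """Return a value for the NODE_OPTIONS environment variable."""
    return f"--max-old-space-size={node_heap_mib(container_memory_mib, fraction)}"
```

For a task with 2048 MiB of memory, `node_options(2048)` gives `--max-old-space-size=1536`, which can be set as `NODE_OPTIONS` in the task definition.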

Migration Checklist

Before Going Live

