# Infrastructure Guidance

This guide provides cloud-agnostic infrastructure guidance for deploying AskGov in production environments. We focus on AskGov-specific requirements rather than general cloud setup instructions.

{% hint style="info" %}
**For AWS users**: See our detailed AWS Production Deployment guide for specific instructions.
{% endhint %}

### Architecture Overview

#### High-Level Architecture

{% @mermaid/diagram content="graph TB
subgraph "Internet"
    U[Users]
end

subgraph "Edge Layer"
    CDN[CDN/Cloudflare]
    WAF[Web Application Firewall]
end

subgraph "Application Layer"
    LB[Load Balancer]
    APP1[App Server 1]
    APP2[App Server 2]
    APP3[App Server N]
end

subgraph "Data Layer"
    DB[(CockroachDB/PostgreSQL)]
    CACHE[(Redis Cache)]
    SEARCH[(Weaviate)]
end

subgraph "Storage"
    FILES[Object Storage]
    BACKUPS[Backup Storage]
end

U --> CDN
CDN --> WAF
WAF --> LB
LB --> APP1
LB --> APP2
LB --> APP3
APP1 --> DB
APP1 --> CACHE
APP1 --> SEARCH" %}

#### Component Requirements

| Component         | Purpose                               | AskGov-Specific Needs                     |
| ----------------- | ------------------------------------- | ----------------------------------------- |
| **Load Balancer** | Traffic distribution, SSL termination | Session affinity not required (stateless) |
| **App Servers**   | Remix application                     | Node.js 20+, horizontal scaling           |
| **Database**      | Primary data store                    | PostgreSQL 14+ or CockroachDB             |
| **Weaviate**      | Vector search for Q&A                 | Requires 4GB+ RAM for vectorization       |
| **Redis**         | Caching and rate limiting             | Session storage, search results cache     |

### Infrastructure Sizing

#### Deployment Guidance

**Small deployments (< 100k citizens):**

* Start with minimal resources
* Single instances may be sufficient initially
* Monitor and scale as usage grows

**Medium deployments (100k - 1M citizens):**

* Multiple app server instances for redundancy
* Consider database clustering
* Implement a caching layer

**Large deployments (> 1M citizens):**

* Full high-availability setup
* Multiple instances of each component
* Consider geographic distribution

**Key Principle**: Start small and scale based on actual usage metrics rather than predictions. Most deployments can begin with modest resources and grow as needed.
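
The tiers above can be encoded as a small lookup, for example to drive a provisioning checklist. This is only a sketch: the `Tier` type and `deployment_tier` function are hypothetical names, not part of AskGov, and treating exactly 100k and exactly 1M citizens as "medium" is an assumption about how the ranges are meant to be read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """A sizing tier and the guidance bullets listed for it in this guide."""
    name: str
    guidance: tuple[str, ...]


SMALL = Tier("small", (
    "Start with minimal resources",
    "Single instances may be sufficient initially",
    "Monitor and scale as usage grows",
))
MEDIUM = Tier("medium", (
    "Multiple app server instances for redundancy",
    "Consider database clustering",
    "Implement a caching layer",
))
LARGE = Tier("large", (
    "Full high-availability setup",
    "Multiple instances of each component",
    "Consider geographic distribution",
))


def deployment_tier(citizens: int) -> Tier:
    """Map a population size to a tier (boundary handling is an assumption)."""
    if citizens < 0:
        raise ValueError("citizens must be non-negative")
    if citizens < 100_000:
        return SMALL
    if citizens <= 1_000_000:
        return MEDIUM
    return LARGE
```

The tier is a starting point only. In line with the key principle above, actual usage metrics should drive later scaling decisions.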

### Cloud Provider Quick Reference

#### Service Mapping

| AskGov Component  | AWS               | Azure                   | GCP                  | On-Premise               |
| ----------------- | ----------------- | ----------------------- | -------------------- | ------------------------ |
| **App Servers**   | ECS Fargate / EC2 | App Service / AKS       | Cloud Run / GKE      | Docker / VMs             |
| **Load Balancer** | ALB               | Application Gateway     | Cloud Load Balancing | Nginx / HAProxy          |
| **Database**      | RDS PostgreSQL    | Database for PostgreSQL | Cloud SQL            | PostgreSQL / CockroachDB |
| **Weaviate**      | EC2               | VMs                     | Compute Engine       | Docker / VMs             |
| **Redis**         | ElastiCache       | Azure Cache             | Memorystore          | Redis                    |
| **Storage**       | S3                | Blob Storage            | Cloud Storage        | MinIO / NFS              |
| **Secrets**       | Secrets Manager   | Key Vault               | Secret Manager       | Vault / Encrypted files  |

#### Key Considerations by Provider

**AWS**

* Use RDS for PostgreSQL for a simpler setup; run CockroachDB on EC2
* Self-host Weaviate on AWS (for example on EC2), or use a managed Weaviate option
* Consider Fargate for serverless container management

**Azure**

* App Service provides easy PaaS deployment
* Weaviate needs VMs or AKS
* Consider Azure Database for PostgreSQL Flexible Server

**GCP**

* Cloud Run works well for containerized AskGov
* Weaviate requires Compute Engine
* Cloud SQL supports PostgreSQL with automatic backups

**On-Premise**

* Docker Compose for simple deployments
* Minimum 3 servers for high availability
* Consider OpenShift or Rancher for container orchestration

### Docker Deployment

#### Simple Docker Compose Setup

```yaml
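# Minimal single-host sketch for evaluation; not hardened for production.
# Pin image tags to specific versions instead of `latest` before deploying.
version: '3.8'

services:
  askgov:
    image: askgov:latest
    ports:
      - "8080:8080"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgresql://root@cockroachdb:26257/askgov
      - REDIS_URL=redis://redis:6379
      - WEAVIATE_INSTANCE=http://weaviate:8080
    depends_on:
      - cockroachdb
      - redis
      - weaviate
    deploy:
      # Replicas sharing the fixed host port above only work behind Swarm's
      # routing mesh. With plain `docker compose`, use 1 replica or drop the
      # host port mapping and put a load balancer in front of the replicas.
      replicas: 3

  cockroachdb:
    image: cockroachdb/cockroach:latest
    # Single-node, insecure mode (no TLS, no authentication): testing only.
    # Data is written to the default store, /cockroach/cockroach-data,
    # which is where the volume below is mounted.
    command: start-single-node --insecure
    volumes:
      - cockroach-data:/cockroach/cockroach-data
    ports:
      - "26257:26257"

  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data

  weaviate:
    image: semitechnologies/weaviate:latest
    environment:
      # Anonymous access keeps the example simple; enable authentication in production.
      - AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true
      - PERSISTENCE_DATA_PATH=/var/lib/weaviate
    volumes:
      - weaviate-data:/var/lib/weaviate

volumes:
  cockroach-data:
  redis-data:
  weaviate-data: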
```
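
Before starting the stack, it can help to sanity-check the three connection settings the `askgov` service receives. The sketch below is a hypothetical pre-flight helper, not something AskGov ships. The variable names come from the Compose file above, while `validate_env` and the accepted URL schemes are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Accepted URL schemes per variable (assumed for this sketch).
EXPECTED_SCHEMES = {
    "DATABASE_URL": {"postgresql", "postgres"},
    "REDIS_URL": {"redis", "rediss"},
    "WEAVIATE_INSTANCE": {"http", "https"},
}


def validate_env(env: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, schemes in EXPECTED_SCHEMES.items():
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
            continue
        parts = urlsplit(value)
        if parts.scheme not in schemes:
            problems.append(f"{name} has unexpected scheme {parts.scheme!r}")
        if not parts.hostname:
            problems.append(f"{name} has no host")
    return problems
```

Calling `validate_env(dict(os.environ))` from a container entrypoint or a CI step would surface a missing or malformed setting before the application tries to connect.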

### Network Architecture

#### Essential Firewall Rules

| Source        | Destination   | Port       | Purpose                |
| ------------- | ------------- | ---------- | ---------------------- |
| Internet      | Load Balancer | 443        | HTTPS access           |
| Load Balancer | App Servers   | 8080       | Application            |
| App Servers   | Database      | 5432/26257 | PostgreSQL/CockroachDB |
| App Servers   | Redis         | 6379       | Cache                  |
| App Servers   | Weaviate      | 8080       | Search                 |
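
The table reads as a default-deny allowlist: any flow not listed is blocked. A minimal sketch of that reading follows, which could serve as a reference when reviewing security group or firewall definitions. The `ALLOWED_FLOWS` and `is_allowed` names are hypothetical, and the zone names are simply the labels used in the table.

```python
# (source, destination) -> allowed destination ports, mirroring the table above.
ALLOWED_FLOWS: dict[tuple[str, str], frozenset[int]] = {
    ("Internet", "Load Balancer"): frozenset({443}),
    ("Load Balancer", "App Servers"): frozenset({8080}),
    ("App Servers", "Database"): frozenset({5432, 26257}),
    ("App Servers", "Redis"): frozenset({6379}),
    ("App Servers", "Weaviate"): frozenset({8080}),
}


def is_allowed(source: str, destination: str, port: int) -> bool:
    """Default deny: a flow is permitted only if it appears in ALLOWED_FLOWS."""
    return port in ALLOWED_FLOWS.get((source, destination), frozenset())
```

Under this reading, `is_allowed("Internet", "Database", 5432)` is `False`: the data layer is reachable only from the app servers.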

### High Availability Essentials

#### Minimum HA Setup

* **Application**: At least 2 instances across availability zones
* **Database**: 3-node cluster (odd number for quorum)
* **Redis**: Primary with at least one replica
* **Weaviate**: Can start with single instance, scale as needed
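
The odd-number recommendation for the database cluster follows from majority quorum arithmetic. The sketch below assumes a simple majority-based quorum (as used by consensus protocols such as Raft), and the function names are illustrative only.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can fail while a majority remains available."""
    return nodes - quorum_size(nodes)
```

With 3 nodes the cluster tolerates 1 failure, and with 4 nodes it still tolerates only 1. A fourth node adds cost without adding fault tolerance, while a fifth raises the tolerance to 2.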

#### Database HA Options

**PostgreSQL with Streaming Replication**

* Primary + 2 standby servers
* Automatic failover with Patroni or similar
* Point-in-time recovery capability

**CockroachDB (Recommended for HA)**

* Built-in distributed architecture
* Automatic failover and rebalancing
* No single point of failure (SPOF)

### Monitoring Essentials

#### Key Metrics for AskGov

| Metric                       | Why It Matters         | Alert Threshold Examples |
| ---------------------------- | ---------------------- | ------------------------ |
| **Question Creation Rate**   | User engagement        | Sudden drops             |
| **Search Response Time**     | User experience        | > 1 second               |
| **Answer Feedback Ratio**    | Content quality        | < 60% positive           |
| **Database Connection Pool** | Performance bottleneck | > 80% utilized           |
| **Weaviate Query Time**      | Search performance     | > 500ms p95              |
| **Redis Hit Rate**           | Cache effectiveness    | < 80%                    |
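
As a rough illustration, the numeric example thresholds in the table can be expressed as data and evaluated against a metrics sample. The metric keys below (such as `weaviate_query_p95_ms`) and the `breached_alerts` function are hypothetical names, not identifiers exported by AskGov. "Sudden drops" in question creation rate needs a rate-of-change rule and is left out of this sketch.

```python
import operator
from typing import Callable, NamedTuple


class Threshold(NamedTuple):
    breached: Callable[[float, float], bool]  # comparison that signals an alert
    limit: float
    unit: str


# Example thresholds taken from the table above; metric names are assumed.
THRESHOLDS = {
    "search_response_time_s": Threshold(operator.gt, 1.0, "s"),
    "answer_feedback_positive_pct": Threshold(operator.lt, 60.0, "%"),
    "db_pool_utilization_pct": Threshold(operator.gt, 80.0, "%"),
    "weaviate_query_p95_ms": Threshold(operator.gt, 500.0, "ms"),
    "redis_hit_rate_pct": Threshold(operator.lt, 80.0, "%"),
}


def breached_alerts(sample: dict[str, float]) -> list[str]:
    """Return the sorted names of metrics whose value crosses its threshold."""
    alerts = []
    for name, value in sample.items():
        threshold = THRESHOLDS.get(name)
        if threshold is not None and threshold.breached(value, threshold.limit):
            alerts.append(name)
    return sorted(alerts)
```

In practice these rules would usually live in the alerting system itself (for example Prometheus alert rules). The Python form is only meant to make the comparison direction of each threshold explicit.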

#### Recommended Monitoring Stack

* **Metrics**: Prometheus + Grafana (or cloud provider equivalents)
* **Logs**: ELK Stack or cloud logging services
* **APM**: Datadog, New Relic, or open-source alternatives
* **Uptime**: External monitoring service

***

{% hint style="success" %}
**Next Steps**

* For AWS deployment: See AWS Production Guide
* For customization: Check Component Customization
* For security: Review Security Guide
{% endhint %}


