# Architecture

This document describes the architecture of online-boutique.

## System Overview

```
┌──────────────────────────────────────────────────────────────┐
│                          Developer                           │
│                                                              │
│  Backstage UI → Template → Gitea Repo → CI/CD Workflows      │
└────────────────────┬─────────────────────────────────────────┘
                     │
                     │ git push
                     ▼
┌──────────────────────────────────────────────────────────────┐
│                        Gitea Actions                         │
│                                                              │
│  ┌───────────────┐       ┌──────────────────┐                │
│  │ Build & Push  │──────▶│ Deploy Humanitec │                │
│  │ - Maven       │       │ - humctl score   │                │
│  │ - Docker      │       │ - Environment    │                │
│  │ - ACR Push    │       │ - Orchestration  │                │
│  └───────────────┘       └──────────────────┘                │
└─────────────┬─────────────────┬──────────────────────────────┘
              │                 │
              │ image           │ deployment
              ▼                 ▼
┌────────────────────┐  ┌────────────────────────────────────┐
│ Azure Container    │  │ Humanitec Platform                 │
│ Registry           │  │                                    │
│                    │  │  ┌──────────────────────────────┐  │
│ bstagecjotdevacr   │  │  │ Score Interpretation         │  │
│                    │  │  │ Resource Provisioning        │  │
│ Images:            │  │  │ Environment Management       │  │
│ - app:latest       │  │  └──────────────────────────────┘  │
│ - app:v1.0.0       │  │                                    │
│ - app:git-sha      │  │             │ kubectl apply        │
└────────────────────┘  └─────────────┼──────────────────────┘
                                      │
                                      ▼
        ┌─────────────────────────────────────────────┐
        │ Azure Kubernetes Service (AKS)              │
        │                                             │
        │  ┌────────────────────────────────────┐     │
        │  │ Namespace:                         │     │
        │  │                                    │     │
        │  │  ┌──────────────────────────────┐  │     │
        │  │  │ Deployment                   │  │     │
        │  │  │ - Replicas: 2                │  │     │
        │  │  │ - Health Probes              │  │     │
        │  │  │ - Resource Limits            │  │     │
        │  │  │                              │  │     │
        │  │  │  ┌───────────┐ ┌──────────┐  │  │     │
        │  │  │  │ Pod       │ │ Pod      │  │  │     │
        │  │  │  │ Spring    │ │ Spring   │  │  │     │
        │  │  │  │ Boot      │ │ Boot     │  │  │     │
        │  │  │  │ :8080     │ │ :8080    │  │  │     │
        │  │  │  └─────┬─────┘ └────┬─────┘  │  │     │
        │  │  └────────┼────────────┼────────┘  │     │
        │  │           │            │           │     │
        │  │  ┌────────▼────────────▼────────┐  │     │
        │  │  │ Service (ClusterIP)          │  │     │
        │  │  │ - Port: 80 → 8080            │  │     │
        │  │  └────────┬─────────────────────┘  │     │
        │  │           │                        │     │
        │  │  ┌────────▼─────────────────────┐  │     │
        │  │  │ Ingress                      │  │     │
        │  │  │ - TLS (cert-manager)         │  │     │
        │  │  │ - Host: app.kyndemo.live     │  │     │
        │  │  └──────────────────────────────┘  │     │
        │  └────────────────────────────────────┘     │
        │                                             │
        │  ┌────────────────────────────────────┐     │
        │  │ Monitoring Namespace               │     │
        │  │                                    │     │
        │  │  ┌──────────────────────────────┐  │     │
        │  │  │ Prometheus                   │  │     │
        │  │  │ - ServiceMonitor             │  │     │
        │  │  │ - Scrapes /actuator/         │  │     │
        │  │  │   prometheus every 30s       │  │     │
        │  │  └──────────────────────────────┘  │     │
        │  │                                    │     │
        │  │  ┌──────────────────────────────┐  │     │
        │  │  │ Grafana                      │  │     │
        │  │  │ - Spring Boot Dashboard      │  │     │
        │  │  │ - Alerts                     │  │     │
        │  │  └──────────────────────────────┘  │     │
        │  └────────────────────────────────────┘     │
        └─────────────────────────────────────────────┘
```

## Component Architecture

### 1. Application Layer

#### Spring Boot Application

**Technology Stack:**

- **Framework**: Spring Boot 3.2
- **Java**: OpenJDK 17 (LTS)
- **Build**: Maven 3.9
- **Runtime**: Embedded Tomcat

**Key Components:**

```java
import java.util.Map;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Entry point: enables auto-configuration, component scanning, and property binding.
@SpringBootApplication
public class GoldenPathApplication {
    public static void main(String[] args) {
        SpringApplication.run(GoldenPathApplication.class, args);
    }
}

// REST endpoints. The payload type and return values are illustrative placeholders.
@RestController
class ApiController {
    @GetMapping("/")
    public String root() {
        return "online-boutique";
    }

    @GetMapping("/api/status")
    public ResponseEntity<Map<String, Object>> status() {
        return ResponseEntity.ok(Map.of("status", "UP"));
    }
}
```

**Configuration Management:**

- `application.yml`: Base configuration
- `application-development.yml`: Dev overrides
- `application-production.yml`: Production overrides
- Environment variables: Runtime overrides

### 2. Container Layer

#### Docker Image

**Multi-stage Build:**

```dockerfile
# Stage 1: Build
FROM maven:3.9-eclipse-temurin-17 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests

# Stage 2: Runtime
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
USER 1000
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
```

**Optimizations:**

- Layer caching for dependencies
- Minimal runtime image (Alpine)
- Non-root user (UID 1000)
- Health check support

### 3. Orchestration Layer

#### Humanitec Score

**Resource Specification:**

```yaml
apiVersion: score.dev/v1b1
metadata:
  name: online-boutique
containers:
  app:
    image: bstagecjotdevacr.azurecr.io/online-boutique:latest
    resources:
      requests:
        memory: 512Mi
        cpu: 250m
      limits:
        memory: 1Gi
        cpu: 1000m
service:
  ports:
    http:
      port: 80
      targetPort: 8080
resources:
  route:
    type: route
    params:
      host: online-boutique.kyndemo.live
```

**Capabilities:**

- Environment-agnostic deployment
- Resource dependencies
- Configuration management
- Automatic rollback

#### Kubernetes Resources

**Fallback Manifests:**

- `deployment.yaml`: Pod specification, replicas, health probes
- `service.yaml`: ClusterIP service for internal routing
- `ingress.yaml`: External access with TLS
- `servicemonitor.yaml`: Prometheus scraping config

### 4. CI/CD Pipeline

#### Build & Push Workflow

**Stages:**

1. **Checkout**: Clone repository
2. **Setup**: Install Maven, Docker
3. **Test**: Run unit & integration tests
4. **Build**: Maven package
5. **Docker**: Build multi-stage image
6. **Auth**: Azure OIDC login
7. **Push**: Push to ACR with tags

**Triggers:**

- Push to `main` branch
- Pull requests
- Manual dispatch

#### Deploy Workflow

**Stages:**

1. **Parse Image**: Extract image reference from build
2. **Setup**: Install humctl CLI
3. **Score Update**: Replace image in score.yaml
4. **Deploy**: Execute humctl score deploy
5. **Verify**: Check deployment status

**Secrets:**

- `HUMANITEC_TOKEN`: Platform authentication
- `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`: OIDC federation

### 5. Observability Layer

#### Metrics Collection

**Flow:**

```
Spring Boot App
    │
    └── /actuator/prometheus (HTTP endpoint)
            │
            └── Prometheus (scrape every 30s)
                    │
                    └── TSDB (15-day retention)
                            │
                            └── Grafana (visualization)
```

**ServiceMonitor Configuration:**

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
spec:
  selector:
    matchLabels:
      app: online-boutique
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 30s
```

#### Metrics Categories

1. **HTTP Metrics**:
   - Request count/rate
   - Response time (avg, p95, p99)
   - Status code distribution

2. **JVM Metrics**:
   - Heap/non-heap memory
   - GC pause time
   - Thread count

3. **System Metrics**:
   - CPU usage
   - File descriptors
   - Process uptime

## Data Flow

### Request Flow

```
User Request
    │
    ▼
Ingress Controller (nginx)
    │ TLS termination
    │ Host routing
    ▼
Service (ClusterIP)
    │ Load balancing
    │ Port mapping
    ▼
Pod (Spring Boot)
    │ Request handling
    │ Business logic
    ▼
Response
```

### Metrics Flow

```
Spring Boot (Micrometer)
    │ Collect metrics
    │ Format Prometheus
    ▼
Actuator Endpoint
    │ Expose /actuator/prometheus
    ▼
Prometheus (Scraper)
    │ Pull every 30s
    │ Store in TSDB
    ▼
Grafana
    │ Query PromQL
    │ Render dashboards
    ▼
User Visualization
```

### Deployment Flow

```
Git Push
    │
    ▼
Gitea Actions (Webhook)
    │
    ├── Build Workflow
    │     │ Maven test + package
    │     │ Docker build
    │     │ ACR push
    │     └── Output: image reference
    │
    └── Deploy Workflow
          │ Parse image
          │ Update score.yaml
          │ humctl score deploy
          │
          ▼
Humanitec Platform
    │ Interpret Score
    │ Provision resources
    │ Generate manifests
    │
    ▼
Kubernetes API
    │ Apply deployment
    │ Create/update resources
    │ Schedule pods
    │
    ▼
Running Application
```

## Security Architecture

### Authentication & Authorization

1. **Azure Workload Identity**:
   - OIDC federation for CI/CD
   - No static credentials
   - Scoped permissions

2. **Service Account**:
   - Kubernetes ServiceAccount
   - Bound to Azure Managed Identity
   - Limited RBAC

3. **Image Pull Secrets**:
   - AKS ACR integration
   - Managed identity for registry access

### Network Security

1. **Ingress**:
   - TLS 1.2+ only
   - Cert-manager for automatic cert renewal
   - Rate limiting (optional)

2. **Network Policies**:
   - Restrict pod-to-pod communication
   - Allow only required egress

3. **Service Mesh (Future)**:
   - mTLS between services
   - Fine-grained authorization

### Application Security

1. **Container**:
   - Non-root user (UID 1000)
   - Read-only root filesystem
   - No privilege escalation

2. **Dependencies**:
   - Regular Maven dependency updates
   - Vulnerability scanning (Snyk/Trivy)

3. **Secrets Management**:
   - Azure Key Vault integration
   - CSI driver for secret mounting
   - No secrets in environment variables

## Scalability

### Horizontal Scaling

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: online-boutique
spec:
  # Assumes the Deployment is also named online-boutique
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: online-boutique
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

### Vertical Scaling

Use **VPA (Vertical Pod Autoscaler)** for automatic resource recommendations.

### Database Scaling (Future)

- Connection pooling (HikariCP)
- Read replicas for read-heavy workloads
- Caching layer (Redis)

## High Availability

### Application Level

- **Replicas**: Minimum 2 pods per environment
- **Anti-affinity**: Spread across nodes
- **Readiness probes**: Only route to healthy pods

### Infrastructure Level

- **AKS**: Multi-zone node pools
- **Ingress**: Multiple replicas with PodDisruptionBudget
- **Monitoring**: High availability via Thanos

## Disaster Recovery

### Backup Strategy

1. **Application State**: Stateless, no backup needed
2. **Configuration**: Stored in Git
3. **Metrics**: 15-day retention, export to long-term storage
4. **Container Images**: Retained in ACR with retention policy

### Recovery Procedures

1. **Pod failure**: Automatic restart by kubelet
2. **Node failure**: Automatic rescheduling to healthy nodes
3. **Cluster failure**: Redeploy via Terraform + Humanitec
4. **Regional failure**: Failover to secondary region (if configured)

## Technology Decisions

### Why Spring Boot?

- Industry-standard Java framework
- Rich ecosystem (Actuator, Security, Data)
- Production-ready features out of the box
- Easy testing and debugging

### Why Humanitec?

- Environment-agnostic deployment
- Score specification simplicity
- Resource dependency management
- Reduces K8s complexity

### Why Prometheus + Grafana?

- Cloud-native standard
- Rich query language (PromQL)
- Wide integration support
- Open-source, vendor-neutral

### Why Maven?

- Mature dependency management
- Extensive plugin ecosystem
- Declarative configuration
- Wide adoption in Java community

## Future Enhancements

1. **Database Integration**: PostgreSQL with Flyway migrations
2. **Caching**: Redis for session storage
3. **Messaging**: Kafka for event-driven architecture
4. **Tracing**: Jaeger/Zipkin for distributed tracing
5. **Service Mesh**: Istio for advanced traffic management
6. **Multi-region**: Active-active deployment

## Next Steps

- [Review deployment guide](deployment.md)
- [Configure monitoring](monitoring.md)
- [Return to overview](index.md)
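
For reference, the container hardening listed under Application Security (non-root user, read-only root filesystem, no privilege escalation) could be expressed in the pod template of `deployment.yaml` roughly as follows. This is a minimal sketch, not the project's actual manifest: the container name `app`, the `tmp` volume, and the dropped capabilities are assumptions. The writable `/tmp` mount is included because embedded Tomcat typically needs a temp directory once the root filesystem is read-only.

```yaml
# Sketch of the pod template in deployment.yaml; names are illustrative assumptions.
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000                     # matches USER 1000 in the Dockerfile
      containers:
        - name: app                         # assumed container name
          image: bstagecjotdevacr.azurecr.io/online-boutique:latest
          securityContext:
            readOnlyRootFilesystem: true    # read-only root filesystem
            allowPrivilegeEscalation: false # no privilege escalation
            capabilities:
              drop: ["ALL"]                 # assumption: not stated in this document
          volumeMounts:
            - name: tmp
              mountPath: /tmp               # writable scratch space for the JVM and Tomcat
      volumes:
        - name: tmp
          emptyDir: {}
```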
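
Similarly, the application-level High Availability items (two replicas, anti-affinity across nodes, readiness probes) might translate into a Deployment fragment like the one below. It is a sketch under stated assumptions: the probe paths rely on Spring Boot Actuator's Kubernetes health groups being enabled, and the `app: online-boutique` label, probe timings, and soft (preferred) anti-affinity are illustrative choices rather than settings confirmed by this document.

```yaml
# Sketch of HA-related settings in deployment.yaml; values are assumptions.
spec:
  replicas: 2                               # minimum 2 pods per environment
  template:
    metadata:
      labels:
        app: online-boutique
    spec:
      affinity:
        podAntiAffinity:
          # Prefer placing replicas on different nodes
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: online-boutique
                topologyKey: kubernetes.io/hostname
      containers:
        - name: app                         # assumed container name
          ports:
            - containerPort: 8080
          readinessProbe:                   # traffic is routed only to ready pods
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            periodSeconds: 10
```

A required (hard) anti-affinity rule would guarantee separation but can leave pods unschedulable on small clusters, which is why the sketch uses the preferred form.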