online-boutique/docs/architecture.md
Scaffolder 7e119cad41 initial commit
Change-Id: I9c68c43e939d2c1a3b95a68b71ecc5ba861a4df5
2026-03-05 13:37:56 +00:00

Architecture

This document describes the architecture of online-boutique.

System Overview

┌──────────────────────────────────────────────────────────────┐
│                         Developer                             │
│                                                               │
│  Backstage UI → Template → Gitea Repo → CI/CD Workflows     │
└────────────────────┬─────────────────────────────────────────┘
                     │
                     │ git push
                     ▼
┌──────────────────────────────────────────────────────────────┐
│                      Gitea Actions                            │
│                                                               │
│  ┌───────────────┐      ┌──────────────────┐                │
│  │ Build & Push  │──────▶│ Deploy Humanitec │                │
│  │  - Maven      │      │  - humctl score  │                │
│  │  - Docker     │      │  - Environment   │                │
│  │  - ACR Push   │      │  - Orchestration │                │
│  └───────────────┘      └──────────────────┘                │
└─────────────┬─────────────────┬──────────────────────────────┘
              │                 │
              │ image           │ deployment
              ▼                 ▼
┌────────────────────┐  ┌────────────────────────────────────┐
│  Azure Container   │  │      Humanitec Platform            │
│     Registry       │  │                                    │
│                    │  │  ┌──────────────────────────────┐  │
│  bstagecjotdevacr  │  │  │  Score Interpretation        │  │
│                    │  │  │  Resource Provisioning       │  │
│  Images:           │  │  │  Environment Management      │  │
│  - app:latest      │  │  └──────────────────────────────┘  │
│  - app:v1.0.0      │  │             │                      │
│  - app:git-sha     │  │             │ kubectl apply        │
└────────────────────┘  └─────────────┼──────────────────────┘
                                      │
                                      ▼
                ┌─────────────────────────────────────────────┐
                │        Azure Kubernetes Service (AKS)       │
                │                                             │
                │  ┌────────────────────────────────────┐    │
                │  │      Namespace:        │    │
                │  │                                    │    │
                │  │  ┌──────────────────────────────┐ │    │
                │  │  │  Deployment                  │ │    │
                │  │  │  - Replicas: 2               │ │    │
                │  │  │  - Health Probes             │ │    │
                │  │  │  - Resource Limits           │ │    │
                │  │  │                              │ │    │
                │  │  │  ┌───────────┐ ┌──────────┐ │ │    │
                │  │  │  │ Pod       │ │ Pod      │ │ │    │
                │  │  │  │ Spring    │ │ Spring   │ │ │    │
                │  │  │  │ Boot      │ │ Boot     │ │ │    │
                │  │  │  │ :8080     │ │ :8080    │ │ │    │
                │  │  │  └─────┬─────┘ └────┬─────┘ │ │    │
                │  │  └────────┼────────────┼───────┘ │    │
                │  │           │            │         │    │
                │  │  ┌────────▼────────────▼───────┐ │    │
                │  │  │  Service (ClusterIP)       │ │    │
                │  │  │  - Port: 80 → 8080         │ │    │
                │  │  └────────┬───────────────────┘ │    │
                │  │           │                     │    │
                │  │  ┌────────▼───────────────────┐ │    │
                │  │  │  Ingress                   │ │    │
                │  │  │  - TLS (cert-manager)      │ │    │
                │  │  │  - Host: app.kyndemo.live  │ │    │
                │  │  └────────────────────────────┘ │    │
                │  └────────────────────────────────┘    │
                │                                         │
                │  ┌─────────────────────────────────┐   │
                │  │      Monitoring Namespace       │   │
                │  │                                 │   │
                │  │  ┌────────────────────────────┐ │   │
                │  │  │  Prometheus                │ │   │
                │  │  │  - ServiceMonitor          │ │   │
                │  │  │  - Scrapes /actuator/      │ │   │
                │  │  │    prometheus every 30s    │ │   │
                │  │  └────────────────────────────┘ │   │
                │  │                                 │   │
                │  │  ┌────────────────────────────┐ │   │
                │  │  │  Grafana                   │ │   │
                │  │  │  - Spring Boot Dashboard   │ │   │
                │  │  │  - Alerts                  │ │   │
                │  │  └────────────────────────────┘ │   │
                │  └─────────────────────────────────┘   │
                └─────────────────────────────────────────┘

Component Architecture

1. Application Layer

Spring Boot Application

Technology Stack:

  • Framework: Spring Boot 3.2
  • Java: OpenJDK 17 (LTS)
  • Build: Maven 3.9
  • Runtime: Embedded Tomcat

Key Components:

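// GoldenPathApplication.java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication // enables auto-configuration, component scanning, and property binding
public class GoldenPathApplication {
    public static void main(String[] args) {
        SpringApplication.run(GoldenPathApplication.class, args);
    }
}

// ApiController.java
import java.util.Map;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ApiController {
    // Response bodies are placeholders; the real values are defined in the source.
    @GetMapping("/")
    public String root() {
        return "online-boutique";
    }

    @GetMapping("/api/status")
    public ResponseEntity<Map<String, String>> status() {
        return ResponseEntity.ok(Map.of("status", "UP"));
    }
}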

Configuration Management:

  • application.yml: Base configuration
  • application-development.yml: Dev overrides
  • application-production.yml: Production overrides
  • Environment variables: Runtime overrides
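
As a rough illustration of how these layers combine, the sketch below shows a base application.yml with one value that can be overridden at runtime. The LOG_LEVEL variable and the list of exposed endpoints are assumptions for illustration, not values taken from this repository.

```yaml
# application.yml: base configuration (illustrative values, not from the repo)
server:
  port: 8080                      # matches the container port used elsewhere in this document
logging:
  level:
    root: ${LOG_LEVEL:INFO}       # LOG_LEVEL is a hypothetical env var; INFO is the fallback
management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus   # assumed set; prometheus must be exposed for scraping
```

Setting SPRING_PROFILES_ACTIVE=production makes Spring Boot load application-production.yml on top of this file, and environment variables still take precedence over both.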

2. Container Layer

Docker Image

Multi-stage Build:

# Stage 1: Build
FROM maven:3.9-eclipse-temurin-17 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests

# Stage 2: Runtime
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
USER 1000
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]

Optimizations:

  • Layer caching for dependencies
  • Minimal runtime image (Alpine)
  • Non-root user (UID 1000)
  • Health check support

3. Orchestration Layer

Humanitec Score

Resource Specification:

apiVersion: score.dev/v1b1
metadata:
  name: online-boutique

containers:
  app:
    image: bstagecjotdevacr.azurecr.io/online-boutique:latest
    resources:
      requests:
        memory: 512Mi
        cpu: 250m
      limits:
        memory: 1Gi
        cpu: 1000m

service:
  ports:
    http:
      port: 80
      targetPort: 8080

resources:
  route:
    type: route
    params:
      host: online-boutique.kyndemo.live

Capabilities:

  • Environment-agnostic deployment
  • Resource dependencies
  • Configuration management
  • Automatic rollback

Kubernetes Resources

Fallback Manifests:

  • deployment.yaml: Pod specification, replicas, health probes
  • service.yaml: ClusterIP service for internal routing
  • ingress.yaml: External access with TLS
  • servicemonitor.yaml: Prometheus scraping config
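
A minimal sketch of what deployment.yaml might contain, assuming the pods carry an app: online-boutique label and that Spring Boot's Kubernetes probe endpoints are enabled. The probe paths and timings are assumptions; the replica count, port, and resource values mirror figures stated elsewhere in this document.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: online-boutique
  labels:
    app: online-boutique
spec:
  replicas: 2
  selector:
    matchLabels:
      app: online-boutique
  template:
    metadata:
      labels:
        app: online-boutique
    spec:
      containers:
        - name: app
          image: bstagecjotdevacr.azurecr.io/online-boutique:latest
          ports:
            - name: http
              containerPort: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness   # assumed probe endpoint
              port: http
            initialDelaySeconds: 20              # illustrative timing
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness    # assumed probe endpoint
              port: http
            initialDelaySeconds: 30              # illustrative timing
            periodSeconds: 10
          resources:
            requests:
              memory: 512Mi
              cpu: 250m
            limits:
              memory: 1Gi
              cpu: 1000m
```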

4. CI/CD Pipeline

Build & Push Workflow

Stages:

  1. Checkout: Clone repository
  2. Setup: Install Maven, Docker
  3. Test: Run unit & integration tests
  4. Build: Maven package
  5. Docker: Build multi-stage image
  6. Auth: Azure OIDC login
  7. Push: Push to ACR with tags

Triggers:

  • Push to main branch
  • Pull requests
  • Manual dispatch
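
The stages and triggers above could be expressed roughly as the following workflow. This is a sketch, not the repository's actual file: the file path, runner label, and step layout are assumptions, and the OIDC login step only works if the Gitea runner can issue ID tokens to azure/login.

```yaml
# .gitea/workflows/build-push.yml (hypothetical path and file name)
name: Build & Push
on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:

env:
  REGISTRY: bstagecjotdevacr.azurecr.io
  IMAGE: online-boutique

jobs:
  build-push:
    runs-on: ubuntu-latest          # runner label is an assumption
    permissions:
      id-token: write               # required for OIDC login, if the runner supports it
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: "17"
      - name: Test and package
        run: |
          mvn -B verify
      - name: Build image
        run: |
          docker build \
            -t "$REGISTRY/$IMAGE:${{ github.sha }}" \
            -t "$REGISTRY/$IMAGE:latest" .
      - name: Azure login (OIDC)
        if: github.event_name != 'pull_request'
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          allow-no-subscriptions: true
      - name: Push to ACR
        if: github.event_name != 'pull_request'
        run: |
          az acr login --name bstagecjotdevacr
          docker push --all-tags "$REGISTRY/$IMAGE"
```

Skipping the login and push steps on pull requests keeps unreviewed code from publishing images; that choice is a suggestion rather than documented behavior of this pipeline.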

Deploy Workflow

Stages:

  1. Parse Image: Extract image reference from build
  2. Setup: Install humctl CLI
  3. Score Update: Replace image in score.yaml
  4. Deploy: Execute humctl score deploy
  5. Verify: Check deployment status

Secrets:

  • HUMANITEC_TOKEN: Platform authentication
  • AZURE_CLIENT_ID, AZURE_TENANT_ID: OIDC federation
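
A possible shape for the deploy workflow is sketched below. The trigger, the image input, the organization variable, and the development environment ID are all assumptions, and the humctl flags should be checked against humctl score deploy --help for the installed CLI version.

```yaml
# .gitea/workflows/deploy.yml (hypothetical path; trigger and inputs are assumptions)
name: Deploy Humanitec
on:
  workflow_dispatch:
    inputs:
      image:
        description: Full image reference produced by the build workflow
        required: true

jobs:
  deploy:
    runs-on: ubuntu-latest            # runner label is an assumption
    env:
      HUMANITEC_TOKEN: ${{ secrets.HUMANITEC_TOKEN }}
      HUMANITEC_ORG: my-org           # hypothetical organization ID
      IMAGE: ${{ inputs.image }}
    steps:
      - uses: actions/checkout@v4
      # Assumes humctl is already installed on the runner (stage 2 above).
      - name: Update image in score.yaml
        run: |
          sed -i "s|^\( *image:\).*|\1 ${IMAGE}|" score.yaml
      - name: Deploy
        run: |
          humctl score deploy -f score.yaml \
            --org "$HUMANITEC_ORG" \
            --app online-boutique \
            --env development \
            --token "$HUMANITEC_TOKEN"
```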

5. Observability Layer

Metrics Collection

Flow:

Spring Boot App
    │
    └── /actuator/prometheus (HTTP endpoint)
            │
            └── Prometheus (scrape every 30s)
                    │
                    └── TSDB (15-day retention)
                            │
                            └── Grafana (visualization)

ServiceMonitor Configuration:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: online-boutique
spec:
  selector:
    matchLabels:
      app: online-boutique
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 30s
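
The ServiceMonitor selects a Service by its labels, and its port field refers to a named Service port rather than a number. A Service that this ServiceMonitor would match might look like the sketch below; the pod selector label is an assumption.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: online-boutique
  labels:
    app: online-boutique      # matched by the ServiceMonitor's selector.matchLabels
spec:
  type: ClusterIP
  selector:
    app: online-boutique      # assumed pod label
  ports:
    - name: http              # the ServiceMonitor's "port: http" refers to this name
      port: 80
      targetPort: 8080
```

If the port is left unnamed, the ServiceMonitor has nothing to resolve and Prometheus discovers no scrape targets.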

Metrics Categories

  1. HTTP Metrics:

    • Request count/rate
    • Response time (avg, p95, p99)
    • Status code distribution
  2. JVM Metrics:

    • Heap/non-heap memory
    • GC pause time
    • Thread count
  3. System Metrics:

    • CPU usage
    • File descriptors
    • Process uptime
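
To make these categories concrete, the sketch below defines one recording rule per listed metric, using the metric names Micrometer exports in Prometheus format. The rule names, the PrometheusRule name, and the job="online-boutique" label are assumptions; the p95 rule also assumes percentile histograms are enabled for http.server.requests.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: online-boutique-metrics      # hypothetical name
spec:
  groups:
    - name: online-boutique.http
      rules:
        - record: online_boutique:http_requests:rate5m
          expr: sum(rate(http_server_requests_seconds_count{job="online-boutique"}[5m]))
        - record: online_boutique:http_latency_seconds:p95
          expr: |
            histogram_quantile(0.95,
              sum by (le) (rate(http_server_requests_seconds_bucket{job="online-boutique"}[5m])))
        - record: online_boutique:http_requests:rate5m_by_status
          expr: sum by (status) (rate(http_server_requests_seconds_count{job="online-boutique"}[5m]))
    - name: online-boutique.jvm
      rules:
        - record: online_boutique:jvm_heap_used_bytes
          expr: sum(jvm_memory_used_bytes{job="online-boutique", area="heap"})
        - record: online_boutique:jvm_gc_pause_seconds:rate5m
          expr: sum(rate(jvm_gc_pause_seconds_sum{job="online-boutique"}[5m]))
        - record: online_boutique:jvm_threads_live
          expr: sum(jvm_threads_live_threads{job="online-boutique"})
    - name: online-boutique.system
      rules:
        - record: online_boutique:process_cpu_usage:avg
          expr: avg(process_cpu_usage{job="online-boutique"})
        - record: online_boutique:open_files
          expr: sum(process_files_open_files{job="online-boutique"})
        - record: online_boutique:uptime_seconds:min
          expr: min(process_uptime_seconds{job="online-boutique"})
```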

Data Flow

Request Flow

User Request
    │
    ▼
Ingress Controller (nginx)
    │ TLS termination
    │ Host routing
    ▼
Service (ClusterIP)
    │ Load balancing
    │ Port mapping
    ▼
Pod (Spring Boot)
    │ Request handling
    │ Business logic
    ▼
Response
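
The first hop of this flow could be declared with an Ingress along the following lines. The issuer name, TLS secret name, and ingress class are assumptions; the host and the Service port come from the Score specification in this document.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: online-boutique
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # hypothetical issuer name
spec:
  ingressClassName: nginx                              # assumed ingress class
  tls:
    - hosts:
        - online-boutique.kyndemo.live
      secretName: online-boutique-tls                  # hypothetical; cert-manager populates it
  rules:
    - host: online-boutique.kyndemo.live
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: online-boutique
                port:
                  number: 80
```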

Metrics Flow

Spring Boot (Micrometer)
    │ Collect metrics
    │ Export in Prometheus format
    ▼
Actuator Endpoint
    │ Expose /actuator/prometheus
    ▼
Prometheus (Scraper)
    │ Pull every 30s
    │ Store in TSDB
    ▼
Grafana
    │ Query PromQL
    │ Render dashboards
    ▼
User Visualization

Deployment Flow

Git Push
    │
    ▼
Gitea Actions (Webhook)
    │
    ├── Build Workflow
    │   │ Maven test + package
    │   │ Docker build
    │   │ ACR push
    │   └── Output: image reference
    │
    └── Deploy Workflow
        │ Parse image
        │ Update score.yaml
        │ humctl score deploy
        │
        ▼
Humanitec Platform
    │ Interpret Score
    │ Provision resources
    │ Generate manifests
    │
    ▼
Kubernetes API
    │ Apply deployment
    │ Create/update resources
    │ Schedule pods
    │
    ▼
Running Application

Security Architecture

Authentication & Authorization

  1. Azure Workload Identity:

    • OIDC federation for CI/CD
    • No static credentials
    • Scoped permissions
  2. Service Account:

    • Kubernetes ServiceAccount
    • Bound to Azure Managed Identity
    • Limited RBAC
  3. Image Pull Access:

    • AKS ACR integration
    • Managed identity for registry access, so no image pull secrets are stored in the cluster
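
As a sketch of how a ServiceAccount is typically bound to an Azure identity with Azure Workload Identity, the fragment below uses the standard annotation and pod label. The ServiceAccount name and the client ID are placeholders, not values from this project.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: online-boutique                      # hypothetical name
  annotations:
    # Placeholder client ID of the managed identity
    azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000000"
---
# Pod template fragment: opts the pod into workload identity
metadata:
  labels:
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: online-boutique
```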

Network Security

  1. Ingress:

    • TLS 1.2+ only
    • Cert-manager for automatic cert renewal
    • Rate limiting (optional)
  2. Network Policies:

    • Restrict pod-to-pod communication
    • Allow only required egress
  3. Service Mesh (Future):

    • mTLS between services
    • Fine-grained authorization
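
A NetworkPolicy matching the intent above might look like this sketch. It assumes the pods are labeled app: online-boutique, that the ingress controller and Prometheus run in namespaces named ingress-nginx and monitoring, and that DNS is the only egress the application needs.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: online-boutique
spec:
  podSelector:
    matchLabels:
      app: online-boutique
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # assumed ingress controller namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring      # assumed Prometheus namespace
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:                                               # DNS only
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Further egress rules would need to be added for any external service the application later depends on, such as a database or Key Vault.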

Application Security

  1. Container:

    • Non-root user (UID 1000)
    • Read-only root filesystem
    • No privilege escalation
  2. Dependencies:

    • Regular Maven dependency updates
    • Vulnerability scanning (Snyk/Trivy)
  3. Secrets Management:

    • Azure Key Vault integration
    • CSI driver for secret mounting
    • No secrets in environment variables
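
The container and secrets items above map onto a pod spec roughly as follows. This is an illustrative fragment: the mount paths and the SecretProviderClass name are assumptions, and the SecretProviderClass itself (pointing at a Key Vault) is assumed to exist already.

```yaml
# Pod spec fragment (names are illustrative)
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
  containers:
    - name: app
      securityContext:
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
      volumeMounts:
        - name: tmp
          mountPath: /tmp               # embedded Tomcat needs a writable temp directory
        - name: secrets
          mountPath: /mnt/secrets       # secrets appear as files, not env vars
          readOnly: true
  volumes:
    - name: tmp
      emptyDir: {}
    - name: secrets
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: online-boutique-kv   # hypothetical SecretProviderClass
```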

Scalability

Horizontal Scaling

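apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: online-boutique
spec:
  scaleTargetRef:          # assumes the Deployment is named online-boutique
    apiVersion: apps/v1
    kind: Deployment
    name: online-boutique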
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

Vertical Scaling

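Use the Vertical Pod Autoscaler (VPA) for automatic resource recommendations. Because the HPA above already scales on CPU and memory utilization, run the VPA in recommendation-only mode rather than letting both controllers act on the same metrics.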
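
A minimal sketch of a recommendation-only VPA, assuming the VPA components are installed in the cluster and the Deployment is named online-boutique:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: online-boutique
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: online-boutique     # assumed Deployment name
  updatePolicy:
    updateMode: "Off"         # compute recommendations only; never evict or resize pods
```

The recommendations can then be read with kubectl describe vpa online-boutique and applied by hand to the resource requests.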

Database Scaling (Future)

  • Connection pooling (HikariCP)
  • Read replicas for read-heavy workloads
  • Caching layer (Redis)

High Availability

Application Level

  • Replicas: Minimum 2 pods per environment
  • Anti-affinity: Spread across nodes
  • Readiness probes: Only route to healthy pods
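
One way to express the anti-affinity item is a preferred pod anti-affinity rule keyed on the node hostname, sketched below as a Deployment fragment. The app: online-boutique label is an assumption, and the soft (preferred) form is chosen so that pods still schedule when fewer nodes than replicas are available.

```yaml
# Deployment fragment
spec:
  replicas: 2
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname   # spread across nodes
                labelSelector:
                  matchLabels:
                    app: online-boutique              # assumed pod label
```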

Infrastructure Level

  • AKS: Multi-zone node pools
  • Ingress: Multiple replicas with PodDisruptionBudget
  • Monitoring: High availability via Thanos

Disaster Recovery

Backup Strategy

  1. Application State: Stateless, no backup needed
  2. Configuration: Stored in Git
  3. Metrics: 15-day retention, export to long-term storage
  4. Container Images: Retained in ACR with retention policy

Recovery Procedures

  1. Pod failure: Automatic restart by kubelet
  2. Node failure: Automatic rescheduling to healthy nodes
  3. Cluster failure: Redeploy via Terraform + Humanitec
  4. Regional failure: Failover to secondary region (if configured)

Technology Decisions

Why Spring Boot?

  • Industry-standard Java framework
  • Rich ecosystem (Actuator, Security, Data)
  • Production-ready features out of the box
  • Easy testing and debugging

Why Humanitec?

  • Environment-agnostic deployment
  • Score specification simplicity
  • Resource dependency management
  • Reduces K8s complexity

Why Prometheus + Grafana?

  • Cloud-native standard
  • Rich query language (PromQL)
  • Wide integration support
  • Open-source, vendor-neutral

Why Maven?

  • Mature dependency management
  • Extensive plugin ecosystem
  • Declarative configuration
  • Wide adoption in Java community

Future Enhancements

  1. Database Integration: PostgreSQL with Flyway migrations
  2. Caching: Redis for session storage
  3. Messaging: Kafka for event-driven architecture
  4. Tracing: Jaeger/Zipkin for distributed tracing
  5. Service Mesh: Istio for advanced traffic management
  6. Multi-region: Active-active deployment

Next Steps