initial commit

Change-Id: I9c68c43e939d2c1a3b95a68b71ecc5ba861a4df5

`docs/architecture.md`

# Architecture

This document describes the architecture of online-boutique.

## System Overview

```
┌──────────────────────────────────────────────────────────────┐
│                          Developer                           │
│                                                              │
│   Backstage UI → Template → Gitea Repo → CI/CD Workflows     │
└────────────────────┬─────────────────────────────────────────┘
                     │
                     │ git push
                     ▼
┌──────────────────────────────────────────────────────────────┐
│                        Gitea Actions                         │
│                                                              │
│   ┌───────────────┐       ┌──────────────────┐               │
│   │ Build & Push  │──────▶│ Deploy Humanitec │               │
│   │ - Maven       │       │ - humctl score   │               │
│   │ - Docker      │       │ - Environment    │               │
│   │ - ACR Push    │       │ - Orchestration  │               │
│   └───────────────┘       └──────────────────┘               │
└─────────────┬─────────────────┬──────────────────────────────┘
              │                 │
              │ image           │ deployment
              ▼                 ▼
┌────────────────────┐   ┌────────────────────────────────────┐
│  Azure Container   │   │         Humanitec Platform         │
│  Registry          │   │                                    │
│                    │   │  ┌──────────────────────────────┐  │
│  bstagecjotdevacr  │   │  │ Score Interpretation         │  │
│                    │   │  │ Resource Provisioning        │  │
│  Images:           │   │  │ Environment Management       │  │
│  - app:latest      │   │  └──────────────────────────────┘  │
│  - app:v1.0.0      │   │                 │                  │
│  - app:git-sha     │   │                 │ kubectl apply    │
└────────────────────┘   └─────────────────┼──────────────────┘
                                           │
                                           ▼
                    ┌─────────────────────────────────────────────┐
                    │       Azure Kubernetes Service (AKS)        │
                    │                                             │
                    │  ┌───────────────────────────────────────┐  │
                    │  │ Namespace:                            │  │
                    │  │                                       │  │
                    │  │  ┌───────────────────────────────┐    │  │
                    │  │  │ Deployment                    │    │  │
                    │  │  │ - Replicas: 2                 │    │  │
                    │  │  │ - Health Probes               │    │  │
                    │  │  │ - Resource Limits             │    │  │
                    │  │  │                               │    │  │
                    │  │  │ ┌───────────┐ ┌──────────┐    │    │  │
                    │  │  │ │ Pod       │ │ Pod      │    │    │  │
                    │  │  │ │ Spring    │ │ Spring   │    │    │  │
                    │  │  │ │ Boot      │ │ Boot     │    │    │  │
                    │  │  │ │ :8080     │ │ :8080    │    │    │  │
                    │  │  │ └─────┬─────┘ └────┬─────┘    │    │  │
                    │  │  └───────┼────────────┼──────────┘    │  │
                    │  │          │            │               │  │
                    │  │  ┌───────▼────────────▼────────┐      │  │
                    │  │  │ Service (ClusterIP)         │      │  │
                    │  │  │ - Port: 80 → 8080           │      │  │
                    │  │  └───────┬─────────────────────┘      │  │
                    │  │          │                            │  │
                    │  │  ┌───────▼─────────────────────┐      │  │
                    │  │  │ Ingress                     │      │  │
                    │  │  │ - TLS (cert-manager)        │      │  │
                    │  │  │ - Host: app.kyndemo.live    │      │  │
                    │  │  └─────────────────────────────┘      │  │
                    │  └───────────────────────────────────────┘  │
                    │                                             │
                    │  ┌───────────────────────────────────────┐  │
                    │  │ Monitoring Namespace                  │  │
                    │  │                                       │  │
                    │  │  ┌───────────────────────────────┐    │  │
                    │  │  │ Prometheus                    │    │  │
                    │  │  │ - ServiceMonitor              │    │  │
                    │  │  │ - Scrapes /actuator/          │    │  │
                    │  │  │   prometheus every 30s        │    │  │
                    │  │  └───────────────────────────────┘    │  │
                    │  │                                       │  │
                    │  │  ┌───────────────────────────────┐    │  │
                    │  │  │ Grafana                       │    │  │
                    │  │  │ - Spring Boot Dashboard       │    │  │
                    │  │  │ - Alerts                      │    │  │
                    │  │  └───────────────────────────────┘    │  │
                    │  └───────────────────────────────────────┘  │
                    └─────────────────────────────────────────────┘
```

## Component Architecture

### 1. Application Layer

#### Spring Boot Application

**Technology Stack:**
- **Framework**: Spring Boot 3.2
- **Java**: OpenJDK 17 (LTS)
- **Build**: Maven 3.9
- **Runtime**: Embedded Tomcat

**Key Components:**

```java
@SpringBootApplication
public class GoldenPathApplication {
    // Auto-configuration
    // Component scanning
    // Property binding
}

// Controller signatures (bodies omitted for brevity):
@RestController
public class ApiController {
    @GetMapping("/")
    public String root();                                  // welcome message

    @GetMapping("/api/status")
    public ResponseEntity<Map<String, String>> status();   // service health status
}
```

**Configuration Management:**
- `application.yml`: Base configuration
- `application-development.yml`: Dev overrides
- `application-production.yml`: Production overrides
- Environment variables: Runtime overrides
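
As a concrete example of the precedence above, an environment variable overrides the same key from any `application*.yml` file via Spring's relaxed binding. The values here are illustrative, not the repo's real configuration:

```yaml
# application.yml (base, illustrative)
server:
  port: 8080

# At runtime, setting SERVER_PORT=9090 in the environment overrides
# server.port from every profile file, because environment variables
# rank higher in Spring Boot's property-source precedence.
```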
### 2. Container Layer

#### Docker Image

**Multi-stage Build:**

```dockerfile
# Stage 1: Build
FROM maven:3.9-eclipse-temurin-17 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests

# Stage 2: Runtime
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
USER 1000
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
```

**Optimizations:**
- Layer caching for dependencies
- Minimal runtime image (Alpine)
- Non-root user (UID 1000)
- Health check support
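
The health-check support above can also be made explicit in the image itself. A possible `HEALTHCHECK` instruction for the runtime stage — a sketch, not part of the actual Dockerfile, and it assumes `wget` from the Alpine base is available:

```dockerfile
# Hypothetical addition to the runtime stage: mark the container unhealthy
# when the Spring Boot health endpoint stops answering.
HEALTHCHECK --interval=30s --timeout=3s --start-period=20s --retries=3 \
  CMD wget -qO- http://localhost:8080/actuator/health || exit 1
```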
### 3. Orchestration Layer

#### Humanitec Score

**Resource Specification:**

```yaml
apiVersion: score.dev/v1b1
metadata:
  name: online-boutique

containers:
  app:
    image: bstagecjotdevacr.azurecr.io/online-boutique:latest
    resources:
      requests:
        memory: 512Mi
        cpu: 250m
      limits:
        memory: 1Gi
        cpu: 1000m

service:
  ports:
    http:
      port: 80
      targetPort: 8080

resources:
  route:
    type: route
    params:
      host: online-boutique.kyndemo.live
```

**Capabilities:**
- Environment-agnostic deployment
- Resource dependencies
- Configuration management
- Automatic rollback

#### Kubernetes Resources

**Fallback Manifests:**
- `deployment.yaml`: Pod specification, replicas, health probes
- `service.yaml`: ClusterIP service for internal routing
- `ingress.yaml`: External access with TLS
- `servicemonitor.yaml`: Prometheus scraping config
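
The fallback `deployment.yaml` carries the replica and probe settings that Humanitec otherwise derives from Score. An abbreviated sketch — field values are assumed to mirror the Score spec above, not copied from the real manifest:

```yaml
# deployment.yaml (abbreviated sketch, not the full manifest)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: online-boutique
spec:
  replicas: 2
  selector:
    matchLabels:
      app: online-boutique
  template:
    metadata:
      labels:
        app: online-boutique
    spec:
      containers:
        - name: app
          image: bstagecjotdevacr.azurecr.io/online-boutique:latest
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
```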
### 4. CI/CD Pipeline

#### Build & Push Workflow

**Stages:**

1. **Checkout**: Clone repository
2. **Setup**: Install Maven, Docker
3. **Test**: Run unit & integration tests
4. **Build**: Maven package
5. **Docker**: Build multi-stage image
6. **Auth**: Azure OIDC login
7. **Push**: Push to ACR with tags
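
The push stage tags each image three ways (`latest`, the short git SHA, and the semantic version). A shell sketch of how those tags could be derived — the variable names and values are assumptions for illustration, not a copy of the workflow file:

```shell
# Illustrative tag derivation for the push stage (not the real workflow).
IMAGE="bstagecjotdevacr.azurecr.io/online-boutique"
GIT_SHA="3f9d2c1"        # short commit SHA, e.g. from `git rev-parse --short HEAD`
VERSION="v1.0.0"         # semantic version, e.g. from a git tag

for TAG in latest "$GIT_SHA" "$VERSION"; do
  echo "$IMAGE:$TAG"     # in CI this would be `docker tag` + `docker push`
done
```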
**Triggers:**
- Push to `main` branch
- Pull requests
- Manual dispatch

#### Deploy Workflow

**Stages:**

1. **Parse Image**: Extract image reference from build
2. **Setup**: Install humctl CLI
3. **Score Update**: Replace image in score.yaml
4. **Deploy**: Execute humctl score deploy
5. **Verify**: Check deployment status

**Secrets:**
- `HUMANITEC_TOKEN`: Platform authentication
- `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`: OIDC federation

### 5. Observability Layer

#### Metrics Collection

**Flow:**

```
Spring Boot App
  │
  └── /actuator/prometheus (HTTP endpoint)
        │
        └── Prometheus (scrape every 30s)
              │
              └── TSDB (15-day retention)
                    │
                    └── Grafana (visualization)
```

**ServiceMonitor Configuration:**

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
spec:
  selector:
    matchLabels:
      app: online-boutique
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 30s
```

#### Metrics Categories

1. **HTTP Metrics**:
   - Request count/rate
   - Response time (avg, p95, p99)
   - Status code distribution

2. **JVM Metrics**:
   - Heap/non-heap memory
   - GC pause time
   - Thread count

3. **System Metrics**:
   - CPU usage
   - File descriptors
   - Process uptime
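
Dashboards typically derive the percentile figures above from Micrometer's request-timing histogram. For example, a p95 latency query in PromQL — the metric name is Micrometer's default (`http_server_requests_seconds`), so treat its presence here as an assumption about this app's setup:

```
histogram_quantile(0.95,
  sum(rate(http_server_requests_seconds_bucket[5m])) by (le))
```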
## Data Flow

### Request Flow

```
User Request
     │
     ▼
Ingress Controller (nginx)
     │ TLS termination
     │ Host routing
     ▼
Service (ClusterIP)
     │ Load balancing
     │ Port mapping
     ▼
Pod (Spring Boot)
     │ Request handling
     │ Business logic
     ▼
Response
```

### Metrics Flow

```
Spring Boot (Micrometer)
     │ Collect metrics
     │ Format Prometheus
     ▼
Actuator Endpoint
     │ Expose /actuator/prometheus
     ▼
Prometheus (Scraper)
     │ Pull every 30s
     │ Store in TSDB
     ▼
Grafana
     │ Query PromQL
     │ Render dashboards
     ▼
User Visualization
```

### Deployment Flow

```
Git Push
   │
   ▼
Gitea Actions (Webhook)
   │
   ├── Build Workflow
   │     │ Maven test + package
   │     │ Docker build
   │     │ ACR push
   │     └── Output: image reference
   │
   └── Deploy Workflow
         │ Parse image
         │ Update score.yaml
         │ humctl score deploy
         │
         ▼
   Humanitec Platform
         │ Interpret Score
         │ Provision resources
         │ Generate manifests
         │
         ▼
   Kubernetes API
         │ Apply deployment
         │ Create/update resources
         │ Schedule pods
         │
         ▼
   Running Application
```

## Security Architecture

### Authentication & Authorization

1. **Azure Workload Identity**:
   - OIDC federation for CI/CD
   - No static credentials
   - Scoped permissions

2. **Service Account**:
   - Kubernetes ServiceAccount
   - Bound to Azure Managed Identity
   - Limited RBAC

3. **Image Pull Secrets**:
   - AKS ACR integration
   - Managed identity for registry access

### Network Security

1. **Ingress**:
   - TLS 1.2+ only
   - Cert-manager for automatic cert renewal
   - Rate limiting (optional)

2. **Network Policies**:
   - Restrict pod-to-pod communication
   - Allow only required egress

3. **Service Mesh (Future)**:
   - mTLS between services
   - Fine-grained authorization

### Application Security

1. **Container**:
   - Non-root user (UID 1000)
   - Read-only root filesystem
   - No privilege escalation

2. **Dependencies**:
   - Regular Maven dependency updates
   - Vulnerability scanning (Snyk/Trivy)

3. **Secrets Management**:
   - Azure Key Vault integration
   - CSI driver for secret mounting
   - No secrets in environment variables
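
The container-hardening items above map directly onto a pod `securityContext`. A sketch of what that stanza could look like in the deployment manifest — the field names are standard Kubernetes; their exact presence in this repo's manifests is an assumption:

```yaml
# Pod/container security context matching the hardening list (illustrative)
securityContext:
  runAsNonRoot: true
  runAsUser: 1000                    # non-root UID baked into the image
containers:
  - name: app
    securityContext:
      readOnlyRootFilesystem: true   # read-only root filesystem
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```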
## Scalability

### Horizontal Scaling

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

### Vertical Scaling

Use **VPA (Vertical Pod Autoscaler)** for automatic resource recommendation.
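
A minimal VPA object in recommendation-only mode could look like the following. This is a sketch: the VPA CRD must be installed in the cluster, and the target name is assumed:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: online-boutique-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: online-boutique
  updatePolicy:
    updateMode: "Off"   # recommend only; do not evict pods to resize them
```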
### Database Scaling (Future)

- Connection pooling (HikariCP)
- Read replicas for read-heavy workloads
- Caching layer (Redis)

## High Availability

### Application Level
- **Replicas**: Minimum 2 pods per environment
- **Anti-affinity**: Spread across nodes
- **Readiness probes**: Only route to healthy pods

### Infrastructure Level
- **AKS**: Multi-zone node pools
- **Ingress**: Multiple replicas with PodDisruptionBudget
- **Monitoring**: High availability via Thanos
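
A PodDisruptionBudget like the one mentioned for the ingress also makes sense for the application itself, so voluntary disruptions (node drains, upgrades) never take down both replicas at once. A sketch, with names assumed:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: online-boutique-pdb
spec:
  minAvailable: 1          # keep at least one pod up during voluntary disruptions
  selector:
    matchLabels:
      app: online-boutique
```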
## Disaster Recovery

### Backup Strategy
1. **Application State**: Stateless, no backup needed
2. **Configuration**: Stored in Git
3. **Metrics**: 15-day retention, export to long-term storage
4. **Container Images**: Retained in ACR with retention policy

### Recovery Procedures
1. **Pod failure**: Automatic restart by kubelet
2. **Node failure**: Automatic rescheduling to healthy nodes
3. **Cluster failure**: Redeploy via Terraform + Humanitec
4. **Regional failure**: Failover to secondary region (if configured)

## Technology Decisions

### Why Spring Boot?
- Industry-standard Java framework
- Rich ecosystem (Actuator, Security, Data)
- Production-ready features out of the box
- Easy testing and debugging

### Why Humanitec?
- Environment-agnostic deployment
- Score specification simplicity
- Resource dependency management
- Reduces K8s complexity

### Why Prometheus + Grafana?
- Cloud-native standard
- Rich query language (PromQL)
- Wide integration support
- Open-source, vendor-neutral

### Why Maven?
- Mature dependency management
- Extensive plugin ecosystem
- Declarative configuration
- Wide adoption in Java community

## Future Enhancements

1. **Database Integration**: PostgreSQL with Flyway migrations
2. **Caching**: Redis for session storage
3. **Messaging**: Kafka for event-driven architecture
4. **Tracing**: Jaeger/Zipkin for distributed tracing
5. **Service Mesh**: Istio for advanced traffic management
6. **Multi-region**: Active-active deployment

## Next Steps

- [Review deployment guide](deployment.md)
- [Configure monitoring](monitoring.md)
- [Return to overview](index.md)

---

`docs/deployment.md`

# Deployment Guide

This guide covers deploying online-boutique to Azure Kubernetes Service via Humanitec or ArgoCD.

## Deployment Methods

### 1. Humanitec Platform Orchestrator (Primary)

Humanitec manages deployments using the `score.yaml` specification, automatically provisioning resources and handling promotions across environments.

#### Prerequisites

- Humanitec Organization: `kyn-cjot`
- Application registered in Humanitec
- Environments created (development, staging, production)
- Gitea Actions configured with the `HUMANITEC_TOKEN` secret

#### Automatic Deployment (via Gitea Actions)

Push to trigger workflows:

```bash
git add .
git commit -m "feat: new feature"
git push origin main
```

**Build & Push Workflow** (`.gitea/workflows/build-push.yml`):
1. Maven build & test
2. Docker image build
3. Push to Azure Container Registry (ACR)
4. Tags: `latest`, `git-SHA`, `semantic-version`

**Deploy Workflow** (`.gitea/workflows/deploy-humanitec.yml`):
1. Parses image from build
2. Updates score.yaml with image reference
3. Deploys to Humanitec environment
4. Triggers orchestration
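
Step 2 above — swapping the new image reference into `score.yaml` — can be sketched with `sed`. The pattern, tag, and file contents are assumptions for illustration, not a copy of the workflow (and note `sed -i` takes a different form on macOS):

```shell
# Illustrative image substitution, as the deploy workflow's step 2 might do it.
cat > score.yaml <<'EOF'
containers:
  app:
    image: bstagecjotdevacr.azurecr.io/online-boutique:latest
EOF

NEW_IMAGE="bstagecjotdevacr.azurecr.io/online-boutique:3f9d2c1"
sed -i "s|image: .*|image: ${NEW_IMAGE}|" score.yaml
grep 'image:' score.yaml
```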
#### Manual Deployment with humctl CLI

Install Humanitec CLI:

```bash
# macOS
brew install humanitec/tap/humctl

# Linux/Windows
curl -s https://get.humanitec.io/install.sh | bash
```

Login:

```bash
humctl login --org kyn-cjot
```

Deploy from Score:

```bash
humctl score deploy \
  --org kyn-cjot \
  --app online-boutique \
  --env development \
  --file score.yaml \
  --image bstagecjotdevacr.azurecr.io/online-boutique:latest \
  --message "Manual deployment from local"
```

Deploy specific version:

```bash
humctl score deploy \
  --org kyn-cjot \
  --app online-boutique \
  --env production \
  --file score.yaml \
  --image bstagecjotdevacr.azurecr.io/online-boutique:v1.2.3 \
  --message "Production release v1.2.3"
```

#### Environment Promotion

Promote from development → staging:

```bash
humctl deploy \
  --org kyn-cjot \
  --app online-boutique \
  --env staging \
  --from development \
  --message "Promote to staging after testing"
```

Promote to production:

```bash
humctl deploy \
  --org kyn-cjot \
  --app online-boutique \
  --env production \
  --from staging \
  --message "Production release"
```

#### Check Deployment Status

```bash
# List deployments
humctl get deployments \
  --org kyn-cjot \
  --app online-boutique \
  --env development

# Get specific deployment
humctl get deployment <DEPLOYMENT_ID> \
  --org kyn-cjot \
  --app online-boutique \
  --env development

# View deployment logs
humctl logs \
  --org kyn-cjot \
  --app online-boutique \
  --env development
```

### 2. ArgoCD GitOps (Fallback)

If Humanitec is unavailable, use ArgoCD with the Kubernetes manifests in `deploy/`.

#### Create ArgoCD Application

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: online-boutique
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitea.kyndemo.live/validate/online-boutique.git
    targetRevision: main
    path: deploy
  destination:
    server: https://kubernetes.default.svc
    namespace: <NAMESPACE>
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```

Apply:

```bash
kubectl apply -f argocd-app.yaml
```

#### Manual Deploy with kubectl

Update the image in `deploy/kustomization.yaml`:

```yaml
images:
  - name: app-image
    newName: bstagecjotdevacr.azurecr.io/online-boutique
    newTag: v1.2.3
```

Deploy:

```bash
kubectl apply -k deploy/
```

Verify:

```bash
kubectl -n <NAMESPACE> get pods
kubectl -n <NAMESPACE> get svc
kubectl -n <NAMESPACE> get ing
```

## Kubernetes Access

### Get AKS Credentials

```bash
az aks get-credentials \
  --resource-group bstage-cjot-dev \
  --name bstage-cjot-dev-aks \
  --overwrite-existing
```

### View Application

```bash
# List pods
kubectl -n <NAMESPACE> get pods

# Check pod logs
kubectl -n <NAMESPACE> logs -f deployment/online-boutique

# Describe deployment
kubectl -n <NAMESPACE> describe deployment online-boutique

# Port-forward for local access
kubectl -n <NAMESPACE> port-forward svc/online-boutique-service 8080:80
```

### Check Health

```bash
# Health endpoint
kubectl -n <NAMESPACE> exec -it deployment/online-boutique -- \
  curl http://localhost:8080/actuator/health

# Metrics endpoint
kubectl -n <NAMESPACE> exec -it deployment/online-boutique -- \
  curl http://localhost:8080/actuator/prometheus
```

## Environment Configuration

### Development

- **Purpose**: Active development, frequent deployments
- **Image Tag**: `latest` or `git-SHA`
- **Replicas**: 1
- **Resources**: Minimal (requests: 256Mi RAM, 250m CPU)
- **Monitoring**: Prometheus scraping enabled

### Staging

- **Purpose**: Pre-production testing, integration tests
- **Image Tag**: Semantic version (e.g., `v1.2.3-rc.1`)
- **Replicas**: 2
- **Resources**: Production-like (requests: 512Mi RAM, 500m CPU)
- **Monitoring**: Full observability stack

### Production

- **Purpose**: Live traffic, stable releases
- **Image Tag**: Semantic version (e.g., `v1.2.3`)
- **Replicas**: 3+ (autoscaling)
- **Resources**: Right-sized (requests: 1Gi RAM, 1 CPU)
- **Monitoring**: Alerts enabled, SLO tracking

## Rollback Procedures

### Humanitec Rollback

```bash
# List previous deployments
humctl get deployments \
  --org kyn-cjot \
  --app online-boutique \
  --env production

# Rollback to specific deployment
humctl deploy \
  --org kyn-cjot \
  --app online-boutique \
  --env production \
  --deployment-id <PREVIOUS_DEPLOYMENT_ID> \
  --message "Rollback due to issue"
```

### Kubernetes Rollback

```bash
# Rollback to previous revision
kubectl -n <NAMESPACE> rollout undo deployment/online-boutique

# Rollback to specific revision
kubectl -n <NAMESPACE> rollout undo deployment/online-boutique --to-revision=2

# Check rollout status
kubectl -n <NAMESPACE> rollout status deployment/online-boutique

# View rollout history
kubectl -n <NAMESPACE> rollout history deployment/online-boutique
```

## Troubleshooting

### Pod Not Starting

```bash
# Check pod events
kubectl -n <NAMESPACE> describe pod <POD_NAME>

# Check logs
kubectl -n <NAMESPACE> logs <POD_NAME>

# Check previous container logs (if restarting)
kubectl -n <NAMESPACE> logs <POD_NAME> --previous
```

### Image Pull Errors

```bash
# Verify ACR access
az acr login --name bstagecjotdevacr

# Check image exists
az acr repository show-tags --name bstagecjotdevacr --repository online-boutique

# Verify AKS ACR integration
az aks check-acr \
  --resource-group bstage-cjot-dev \
  --name bstage-cjot-dev-aks \
  --acr bstagecjotdevacr.azurecr.io
```

### Service Not Accessible

```bash
# Check service endpoints
kubectl -n <NAMESPACE> get endpoints online-boutique-service

# Check ingress
kubectl -n <NAMESPACE> describe ingress online-boutique-ingress

# Test internal connectivity
kubectl -n <NAMESPACE> run curl-test --image=curlimages/curl:latest --rm -it --restart=Never -- \
  curl http://online-boutique-service/actuator/health
```

### Humanitec Deployment Stuck

```bash
# Check deployment status
humctl get deployment <DEPLOYMENT_ID> \
  --org kyn-cjot \
  --app online-boutique \
  --env development

# View error logs
humctl logs \
  --org kyn-cjot \
  --app online-boutique \
  --env development \
  --deployment-id <DEPLOYMENT_ID>

# Cancel stuck deployment
humctl delete deployment <DEPLOYMENT_ID> \
  --org kyn-cjot \
  --app online-boutique \
  --env development
```

### Resource Issues

```bash
# Check resource usage
kubectl -n <NAMESPACE> top pods

# Describe pod for resource constraints
kubectl -n <NAMESPACE> describe pod <POD_NAME> | grep -A 10 "Conditions:"

# Check node capacity
kubectl describe nodes | grep -A 10 "Allocated resources:"
```

## Blue-Green Deployments

For zero-downtime deployments with Humanitec:

1. Deploy new version to staging
2. Run smoke tests
3. Promote to production with traffic splitting
4. Monitor metrics
5. Complete cutover or rollback
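
If the traffic splitting in step 3 is done at the ingress layer, nginx-ingress canary annotations are one option. A sketch — whether this setup uses nginx canaries at all is an assumption:

```yaml
# Canary ingress sending 10% of traffic to the new version (illustrative)
metadata:
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
```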
## Next Steps

- [Configure monitoring](monitoring.md)
- [Review architecture](architecture.md)
- [Return to overview](index.md)

---

`docs/index.md`

# online-boutique

## Overview

**online-boutique** is a production-ready Java microservice built using the Kyndryl Platform Engineering Golden Path.

!!! info "Service Information"
    - **Description**: Java microservice via Golden Path
    - **Environment**: development
    - **Technology**: Spring Boot 3.2, Java 17
    - **Orchestration**: Humanitec
    - **Observability**: Prometheus + Grafana

## Quick Links

- [Repository](https://gitea.kyndemo.live/validate/online-boutique)
- [Humanitec Console](https://app.humanitec.io/orgs/kyn-cjot/apps/online-boutique)
- [Grafana Dashboard](https://grafana.kyndemo.live/d/spring-boot-dashboard?var-app=online-boutique)

## Features

✅ **Production-Ready Configuration**
- Health checks (liveness, readiness, startup)
- Graceful shutdown
- Resource limits and requests
- Security contexts

✅ **Observability**
- Prometheus metrics integration
- Pre-configured Grafana dashboards
- Structured logging
- Request tracing

✅ **CI/CD**
- Automated builds via Gitea Actions
- Azure Container Registry integration
- Humanitec deployment automation
- GitOps fallback with ArgoCD

✅ **Developer Experience**
- Local development support
- Hot reload with Spring DevTools
- Comprehensive tests
- API documentation

## Architecture

This service follows the golden path architecture:

```
┌─────────────────────────────────────────┐
│          Developer Experience           │
│    (Backstage Template → Gitea Repo)    │
└─────────────────────────────────────────┘
                    │
                    │ git push
                    ▼
┌─────────────────────────────────────────┐
│           Gitea Actions CI/CD           │
│   1. Build with Maven                   │
│   2. Run tests                          │
│   3. Build Docker image                 │
│   4. Push to ACR                        │
│   5. Deploy via Humanitec               │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│         Humanitec Orchestrator          │
│   - Interprets score.yaml               │
│   - Provisions resources                │
│   - Deploys to AKS                      │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│            Azure AKS Cluster            │
│   - Pods with app containers            │
│   - Prometheus scraping metrics         │
│   - Service mesh (optional)             │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│          Grafana + Prometheus           │
│   - Real-time metrics                   │
│   - Dashboards                          │
│   - Alerting                            │
└─────────────────────────────────────────┘
```

## API Endpoints

### Application Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Welcome message |
| `/api/status` | GET | Service health status |

### Actuator Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/actuator/health` | GET | Overall health |
| `/actuator/health/liveness` | GET | Liveness probe |
| `/actuator/health/readiness` | GET | Readiness probe |
| `/actuator/metrics` | GET | Available metrics |
| `/actuator/prometheus` | GET | Prometheus metrics |
| `/actuator/info` | GET | Application info |

## Technology Stack

- **Language**: Java 17
- **Framework**: Spring Boot 3.2.0
- **Build Tool**: Maven 3.9
- **Metrics**: Micrometer + Prometheus
- **Container**: Docker (Alpine-based)
- **Orchestration**: Humanitec (Score)
- **CI/CD**: Gitea Actions
- **Registry**: Azure Container Registry
- **Kubernetes**: Azure AKS
- **Monitoring**: Prometheus + Grafana

## Next Steps

- [Set up local development environment](local-development.md)
- [Learn about deployment process](deployment.md)
- [Configure monitoring and alerts](monitoring.md)
- [Understand the architecture](architecture.md)

---

`docs/local-development.md`

# Local Development

This guide covers setting up and running online-boutique on your local machine.

## Prerequisites

- **Java 17** or higher ([Download](https://adoptium.net/))
- **Maven 3.9+** (included via Maven Wrapper)
- **Docker** (optional, for container testing)
- **Git**

## Quick Start

### 1. Clone the Repository

```bash
git clone https://gitea.kyndemo.live/validate/online-boutique.git
cd online-boutique
```

### 2. Build the Application

```bash
# Using Maven Wrapper (recommended)
./mvnw clean package

# Or with system Maven
mvn clean package
```

### 3. Run the Application

```bash
# Run with Spring Boot Maven plugin
./mvnw spring-boot:run

# Or run the JAR directly
java -jar target/online-boutique-1.0.0-SNAPSHOT.jar
```

The application will start on **http://localhost:8080**.

### 4. Verify It's Running

```bash
# Check health
curl http://localhost:8080/actuator/health

# Check status
curl http://localhost:8080/api/status

# View metrics
curl http://localhost:8080/actuator/prometheus
```

## Development Workflow

### Hot Reload with Spring DevTools

For automatic restarts during development, add Spring DevTools to `pom.xml`:

```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-devtools</artifactId>
    <scope>runtime</scope>
    <optional>true</optional>
</dependency>
```

Changes to Java files will trigger automatic restarts.

### Running Tests

```bash
# Run all tests
./mvnw test

# Run specific test class
./mvnw test -Dtest=GoldenPathApplicationTests

# Run tests with coverage
./mvnw test jacoco:report
```

### Active Profile

Set the active profile via environment variable:

```bash
# Development profile
export SPRING_PROFILES_ACTIVE=development
./mvnw spring-boot:run

# Or inline
SPRING_PROFILES_ACTIVE=development ./mvnw spring-boot:run
```
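
What the `development` profile actually changes lives in `application-development.yml`. An illustrative set of overrides — the values are assumptions for the sake of example, not the repo's real file:

```yaml
# application-development.yml — possible dev-only overrides (illustrative)
logging:
  level:
    com.kyndryl.goldenpath: DEBUG   # chattier logs while developing
spring:
  devtools:
    restart:
      enabled: true                 # pairs with the DevTools dependency above
```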
## Docker Development
|
||||
|
||||
### Build Image Locally
|
||||
|
||||
```bash
|
||||
docker build -t online-boutique:dev .
|
||||
```
|
||||
|
||||
### Run in Docker
|
||||
|
||||
```bash
|
||||
docker run -p 8080:8080 \
|
||||
-e SPRING_PROFILES_ACTIVE=development \
|
||||
online-boutique:dev
|
||||
```
|
||||
|
||||
### Docker Compose (if needed)
|
||||
|
||||
Create `docker-compose.yml`:
|
||||
|
||||
```yaml
|
||||
version: '3.8'
|
||||
services:
|
||||
app:
|
||||
build: .
|
||||
ports:
|
||||
- "8080:8080"
|
||||
environment:
|
||||
- SPRING_PROFILES_ACTIVE=development
|
||||
```
|
||||
|
||||
Run with:
|
||||
```bash
|
||||
docker-compose up
|
||||
```
|
||||
|
||||
## IDE Setup

### IntelliJ IDEA

1. **Import Project**: File → New → Project from Existing Sources
2. **Select Maven**: Choose Maven as build tool
3. **SDK**: Configure Java 17 SDK
4. **Run Configuration**:
   - Main class: `com.kyndryl.goldenpath.GoldenPathApplication`
   - VM options: `-Dspring.profiles.active=development`

### VS Code

1. **Install Extensions**:
   - Extension Pack for Java
   - Spring Boot Extension Pack

2. **Open Folder**: Open the project root

3. **Run/Debug**: Use Spring Boot Dashboard or F5

### Eclipse

1. **Import**: File → Import → Maven → Existing Maven Projects
2. **Update Project**: Right-click → Maven → Update Project
3. **Run**: Right-click on Application class → Run As → Java Application

## Debugging

### Enable Debug Logging

In `application-development.yml`:

```yaml
logging:
  level:
    root: DEBUG
    com.kyndryl.goldenpath: TRACE
```

### Remote Debugging

Start with the JDWP agent enabled:

```bash
./mvnw spring-boot:run -Dspring-boot.run.jvmArguments="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
```

Connect your debugger to `localhost:5005`.

## Common Development Tasks

### Adding a New Endpoint

Add a handler method to a `@RestController`:

```java
@RestController
public class HelloController {

    @GetMapping("/api/hello")
    public ResponseEntity<String> hello() {
        return ResponseEntity.ok("Hello, World!");
    }
}
```

### Adding Custom Metrics

```java
@Autowired
private MeterRegistry meterRegistry;

@GetMapping("/api/data")
public String getData() {
    // register() returns the existing counter on repeat calls,
    // so building it per request is safe
    Counter counter = Counter.builder("custom_api_calls")
        .tag("endpoint", "data")
        .register(meterRegistry);
    counter.increment();
    return "data";
}
```

### Database Integration (Future)

To add PostgreSQL:

1. Add the dependencies in `pom.xml`:
   ```xml
   <dependency>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-data-jpa</artifactId>
   </dependency>
   <dependency>
       <groupId>org.postgresql</groupId>
       <artifactId>postgresql</artifactId>
   </dependency>
   ```

2. Configure in `application.yml`:
   ```yaml
   spring:
     datasource:
       url: jdbc:postgresql://localhost:5432/mydb
       username: user
       password: pass
     jpa:
       hibernate:
         ddl-auto: update
   ```
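
For local development, a Postgres container can back this configuration. A sketch extending the `docker-compose.yml` from above — the service name, credentials, and database name are illustrative and must match your datasource settings:

```yaml
services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=development
      # Override the datasource URL to point at the db service
      - SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/mydb
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      - POSTGRES_DB=mydb
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
    ports:
      - "5432:5432"
```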

## Troubleshooting

### Port 8080 Already in Use

```bash
# Find process using port 8080
lsof -i :8080

# Kill process
kill -9 <PID>

# Or use a different port
./mvnw spring-boot:run -Dspring-boot.run.arguments=--server.port=8081
```

### Maven Build Fails

```bash
# Clean and rebuild
./mvnw clean install -U

# Skip tests temporarily
./mvnw clean package -DskipTests
```

### Tests Fail

```bash
# Run with verbose output
./mvnw test -X

# Run single test
./mvnw test -Dtest=GoldenPathApplicationTests#contextLoads
```

## Next Steps

- [Learn about deployment](deployment.md)
- [Configure monitoring](monitoring.md)
- [Review architecture](architecture.md)

395
docs/monitoring.md
Normal file
# Monitoring & Observability

This guide covers monitoring online-boutique with Prometheus and Grafana.

## Overview

The Java Golden Path includes comprehensive observability:

- **Metrics**: Prometheus metrics via Spring Boot Actuator
- **Dashboards**: Pre-configured Grafana dashboard
- **Scraping**: Automatic discovery via ServiceMonitor
- **Retention**: 15 days of metrics storage

## Metrics Endpoint

Spring Boot Actuator exposes Prometheus metrics at:

```
http://<pod-ip>:8080/actuator/prometheus
```

### Verify Metrics Locally

```bash
curl http://localhost:8080/actuator/prometheus
```

### Sample Metrics Output

```
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 5.2428800E7

# HELP http_server_requests_seconds Duration of HTTP server request handling
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/status",} 42.0
http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/status",} 0.351234567
```

## Available Metrics

### HTTP Metrics

- `http_server_requests_seconds_count`: Total request count
- `http_server_requests_seconds_sum`: Total request duration
- **Labels**: method, status, uri, outcome, exception

### JVM Metrics

#### Memory
- `jvm_memory_used_bytes`: Current memory usage
- `jvm_memory_max_bytes`: Maximum memory available
- `jvm_memory_committed_bytes`: Committed memory
- **Areas**: heap, nonheap
- **Pools**: G1 Eden Space, G1 Old Gen, G1 Survivor Space

#### Garbage Collection
- `jvm_gc_pause_seconds_count`: GC pause count
- `jvm_gc_pause_seconds_sum`: Total GC pause time
- `jvm_gc_memory_allocated_bytes_total`: Total memory allocated
- `jvm_gc_memory_promoted_bytes_total`: Memory promoted to old gen

#### Threads
- `jvm_threads_live_threads`: Current live threads
- `jvm_threads_daemon_threads`: Current daemon threads
- `jvm_threads_peak_threads`: Peak thread count
- `jvm_threads_states_threads`: Threads by state (runnable, blocked, waiting)

#### CPU
- `process_cpu_usage`: Process CPU usage (0-1)
- `system_cpu_usage`: System CPU usage (0-1)
- `system_cpu_count`: Number of CPU cores

### Application Metrics

- `application_started_time_seconds`: Application start timestamp
- `application_ready_time_seconds`: Application ready timestamp
- `process_uptime_seconds`: Process uptime
- `process_files_open_files`: Open file descriptors

### Custom Metrics

Add custom metrics with Micrometer:

```java
@Autowired
private MeterRegistry meterRegistry;

// Counter
Counter.builder("business_operations")
    .tag("operation", "checkout")
    .register(meterRegistry)
    .increment();

// Gauge
Gauge.builder("active_users", this, obj -> obj.getActiveUsers())
    .register(meterRegistry);

// Timer
Timer.builder("api_processing_time")
    .tag("endpoint", "/api/process")
    .register(meterRegistry)
    .record(() -> {
        // Timed operation
    });
```

## Prometheus Configuration

### ServiceMonitor

Deployed automatically in `deploy/servicemonitor.yaml`:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: online-boutique
  namespace:
  labels:
    app: online-boutique
    prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app: online-boutique
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 30s
```

### Verify Scraping

Check Prometheus targets:

1. Access Prometheus: `https://prometheus.kyndemo.live`
2. Navigate to **Status → Targets**
3. Find `online-boutique` in `monitoring/` namespace
4. Status should be **UP**

Or via kubectl:

```bash
# Port-forward Prometheus
kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090

# Check targets API
curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job == "online-boutique")'
```

### Query Metrics

Access the Prometheus UI and run queries:

```promql
# Request rate
rate(http_server_requests_seconds_count{job="online-boutique"}[5m])

# Average request duration
rate(http_server_requests_seconds_sum{job="online-boutique"}[5m])
/ rate(http_server_requests_seconds_count{job="online-boutique"}[5m])

# Error rate
sum(rate(http_server_requests_seconds_count{job="online-boutique",status=~"5.."}[5m]))
/ sum(rate(http_server_requests_seconds_count{job="online-boutique"}[5m]))

# Memory usage
jvm_memory_used_bytes{job="online-boutique",area="heap"}
/ jvm_memory_max_bytes{job="online-boutique",area="heap"}
```

## Grafana Dashboard

### Access Dashboard

1. Open Grafana: `https://grafana.kyndemo.live`
2. Navigate to **Dashboards → Spring Boot Application**
3. Select `online-boutique` from the dropdown

### Dashboard Panels

#### HTTP Metrics
- **Request Rate**: Requests per second by endpoint
- **Request Duration**: Average, 95th, 99th percentile latency
- **Status Codes**: Breakdown of 2xx, 4xx, 5xx responses
- **Error Rate**: Percentage of failed requests

#### JVM Metrics
- **Heap Memory**: Used vs. max heap memory over time
- **Non-Heap Memory**: Metaspace, code cache, compressed class space
- **Garbage Collection**: GC pause frequency and duration
- **Thread Count**: Live threads, daemon threads, peak threads

#### System Metrics
- **CPU Usage**: Process and system CPU utilization
- **File Descriptors**: Open file count
- **Uptime**: Application uptime

### Custom Dashboards

Import dashboard JSON from `/k8s/monitoring/spring-boot-dashboard.json`:

1. Grafana → Dashboards → New → Import
2. Upload `spring-boot-dashboard.json`
3. Select Prometheus data source
4. Click **Import**

## Alerting

### Prometheus Alerting Rules

Create alerting rules in Prometheus:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: online-boutique-alerts
  namespace:
  labels:
    prometheus: kube-prometheus
spec:
  groups:
    - name: online-boutique
      interval: 30s
      rules:
        # High error rate
        - alert: HighErrorRate
          expr: |
            sum(rate(http_server_requests_seconds_count{job="online-boutique",status=~"5.."}[5m]))
            / sum(rate(http_server_requests_seconds_count{job="online-boutique"}[5m]))
            > 0.05
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High error rate on online-boutique"
            description: "Error rate is {{ $value | humanizePercentage }}"

        # High latency
        - alert: HighLatency
          expr: |
            histogram_quantile(0.95,
              sum(rate(http_server_requests_seconds_bucket{job="online-boutique"}[5m])) by (le)
            ) > 1.0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High latency on online-boutique"
            description: "95th percentile latency is {{ $value }}s"

        # High memory usage
        - alert: HighMemoryUsage
          expr: |
            jvm_memory_used_bytes{job="online-boutique",area="heap"}
            / jvm_memory_max_bytes{job="online-boutique",area="heap"}
            > 0.90
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "High memory usage on online-boutique"
            description: "Heap usage is {{ $value | humanizePercentage }}"

        # Pod not ready
        - alert: PodNotReady
          expr: |
            kube_pod_status_ready{namespace="",pod=~"online-boutique-.*",condition="true"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "online-boutique pod not ready"
            description: "Pod {{ $labels.pod }} not ready for 5 minutes"
```

Note: the `HighLatency` rule relies on `http_server_requests_seconds_bucket`, which Spring Boot only exports when percentile histograms are enabled (`management.metrics.distribution.percentiles-histogram.http.server.requests: true`).

Apply:

```bash
kubectl apply -f prometheus-rules.yaml
```

### Grafana Alerts

Configure alerts in Grafana dashboard panels:

1. Edit panel
2. Click **Alert** tab
3. Set conditions (e.g., "when avg() of query(A) is above 0.8")
4. Configure notification channels (Slack, email, PagerDuty)

### Alert Testing

Trigger test alerts:

```bash
# Generate 404s (note: the HighErrorRate alert counts 5xx responses)
for i in {1..100}; do
  curl http://localhost:8080/api/nonexistent
done

# Trigger high latency
ab -n 10000 -c 100 http://localhost:8080/api/status

# Exercise memory via the heap dump endpoint (it is a GET endpoint)
curl http://localhost:8080/actuator/heapdump --output /dev/null
```

## Distributed Tracing (Future)

To add tracing with Jaeger/Zipkin:

1. Add the dependencies:
   ```xml
   <dependency>
       <groupId>io.micrometer</groupId>
       <artifactId>micrometer-tracing-bridge-otel</artifactId>
   </dependency>
   <dependency>
       <groupId>io.opentelemetry</groupId>
       <artifactId>opentelemetry-exporter-zipkin</artifactId>
   </dependency>
   ```

2. Configure in `application.yml`:
   ```yaml
   management:
     tracing:
       sampling:
         probability: 1.0
     zipkin:
       tracing:
         endpoint: http://zipkin:9411/api/v2/spans
   ```

## Log Aggregation

For centralized logging:

1. **Loki**: Add Promtail to collect pod logs
2. **Grafana Logs**: Query logs alongside metrics
3. **Log Correlation**: Link traces to logs via trace ID
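
With Micrometer Tracing on the classpath, trace and span IDs are placed in the logging MDC. A sketch of a log pattern that surfaces them for correlation — the property is standard Spring Boot, the exact format is illustrative:

```yaml
logging:
  pattern:
    # Prefix each log line's level with app name, traceId, spanId
    level: "%5p [${spring.application.name:-},%X{traceId:-},%X{spanId:-}]"
```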

## Best Practices

1. **Metric Cardinality**: Avoid high-cardinality labels (user IDs, timestamps)
2. **Naming**: Follow Prometheus naming conventions (`_total`, `_seconds`, `_bytes`)
3. **Aggregation**: Use recording rules for expensive queries
4. **Retention**: Adjust retention period based on storage capacity
5. **Dashboarding**: Create business-specific dashboards for stakeholders
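
Point 3 can be sketched as a `PrometheusRule` with a `record:` entry that precomputes the error-rate expression used earlier — the rule and recorded metric names here are illustrative:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: online-boutique-recording-rules
  labels:
    prometheus: kube-prometheus
spec:
  groups:
    - name: online-boutique-recording
      interval: 30s
      rules:
        # Dashboards query the cheap recorded series instead of
        # re-evaluating the rate expressions on every refresh
        - record: job:http_server_requests_error_ratio:rate5m
          expr: |
            sum(rate(http_server_requests_seconds_count{job="online-boutique",status=~"5.."}[5m]))
            / sum(rate(http_server_requests_seconds_count{job="online-boutique"}[5m]))
```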

## Troubleshooting

### Metrics Not Appearing

```bash
# Check if actuator is enabled
kubectl -n exec -it deployment/online-boutique -- \
  curl http://localhost:8080/actuator

# Check ServiceMonitor
kubectl -n get servicemonitor online-boutique -o yaml

# Check Prometheus logs
kubectl -n monitoring logs -l app.kubernetes.io/name=prometheus --tail=100 | grep online-boutique
```

### High Memory Usage

```bash
# Take heap dump (the heapdump endpoint is a GET endpoint)
kubectl -n exec -it deployment/online-boutique -- \
  curl http://localhost:8080/actuator/heapdump --output heapdump.hprof

# Analyze with Eclipse Memory Analyzer (MAT) or VisualVM
```

### Slow Queries

Enable query logging in Prometheus:

```bash
kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090
# Access http://localhost:9090/graph
# Enable query stats in settings
```

## Next Steps

- [Review architecture](architecture.md)
- [Learn about deployment](deployment.md)
- [Return to overview](index.md)