initial commit

Change-Id: I9c68c43e939d2c1a3b95a68b71ecc5ba861a4df5
Scaffolder, 2026-03-05 13:37:56 +00:00, commit 7e119cad41
24 changed files with 3024 additions and 0 deletions

docs/architecture.md

# Architecture
This document describes the architecture of online-boutique.
## System Overview
```
┌──────────────────────────────────────────────────────────────┐
│ Developer │
│ │
│ Backstage UI → Template → Gitea Repo → CI/CD Workflows │
└────────────────────┬─────────────────────────────────────────┘
│ git push
┌──────────────────────────────────────────────────────────────┐
│ Gitea Actions │
│ │
│ ┌───────────────┐ ┌──────────────────┐ │
│ │ Build & Push │──────▶│ Deploy Humanitec │ │
│ │ - Maven │ │ - humctl score │ │
│ │ - Docker │ │ - Environment │ │
│ │ - ACR Push │ │ - Orchestration │ │
│ └───────────────┘ └──────────────────┘ │
└─────────────┬─────────────────┬──────────────────────────────┘
│ │
│ image │ deployment
▼ ▼
┌────────────────────┐ ┌────────────────────────────────────┐
│ Azure Container │ │ Humanitec Platform │
│ Registry │ │ │
│ │ │ ┌──────────────────────────────┐ │
│ bstagecjotdevacr │ │ │ Score Interpretation │ │
│ │ │ │ Resource Provisioning │ │
│ Images: │ │ │ Environment Management │ │
│ - app:latest │ │ └──────────────────────────────┘ │
│ - app:v1.0.0 │ │ │ │
│ - app:git-sha │ │ │ kubectl apply │
└────────────────────┘ └─────────────┼──────────────────────┘
┌─────────────────────────────────────────────┐
│ Azure Kubernetes Service (AKS) │
│ │
│ ┌────────────────────────────────────┐ │
│ │ Namespace: │ │
│ │ │ │
│ │ ┌──────────────────────────────┐ │ │
│ │ │ Deployment │ │ │
│ │ │ - Replicas: 2 │ │ │
│ │ │ - Health Probes │ │ │
│ │ │ - Resource Limits │ │ │
│ │ │ │ │ │
│ │ │ ┌───────────┐ ┌──────────┐ │ │ │
│ │ │ │ Pod │ │ Pod │ │ │ │
│ │ │ │ Spring │ │ Spring │ │ │ │
│ │ │ │ Boot │ │ Boot │ │ │ │
│ │ │ │ :8080 │ │ :8080 │ │ │ │
│ │ │ └─────┬─────┘ └────┬─────┘ │ │ │
│ │ └────────┼────────────┼───────┘ │ │
│ │ │ │ │ │
│ │ ┌────────▼────────────▼───────┐ │ │
│ │ │ Service (ClusterIP) │ │ │
│ │ │ - Port: 80 → 8080 │ │ │
│ │ └────────┬───────────────────┘ │ │
│ │ │ │ │
│ │ ┌────────▼───────────────────┐ │ │
│ │ │ Ingress │ │ │
│ │ │ - TLS (cert-manager) │ │ │
│ │ │ - Host: app.kyndemo.live │ │ │
│ │ └────────────────────────────┘ │ │
│ └────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────┐ │
│ │ Monitoring Namespace │ │
│ │ │ │
│ │ ┌────────────────────────────┐ │ │
│ │ │ Prometheus │ │ │
│ │ │ - ServiceMonitor │ │ │
│ │ │ - Scrapes /actuator/ │ │ │
│ │ │ prometheus every 30s │ │ │
│ │ └────────────────────────────┘ │ │
│ │ │ │
│ │ ┌────────────────────────────┐ │ │
│ │ │ Grafana │ │ │
│ │ │ - Spring Boot Dashboard │ │ │
│ │ │ - Alerts │ │ │
│ │ └────────────────────────────┘ │ │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────┘
```
## Component Architecture
### 1. Application Layer
#### Spring Boot Application
**Technology Stack:**
- **Framework**: Spring Boot 3.2
- **Java**: OpenJDK 17 (LTS)
- **Build**: Maven 3.9
- **Runtime**: Embedded Tomcat
**Key Components:**
```java
@SpringBootApplication
public class GoldenPathApplication {
    // Auto-configuration
    // Component scanning
    // Property binding
}

@RestController
public class ApiController {

    @GetMapping("/")
    public String root() { /* ... */ }

    @GetMapping("/api/status")
    public ResponseEntity<Map<String, String>> status() { /* ... */ }
}
```
**Configuration Management:**
- `application.yml`: Base configuration
- `application-development.yml`: Dev overrides
- `application-production.yml`: Production overrides
- Environment variables: Runtime overrides
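This layering can be sketched as follows; the property keys shown are illustrative, but the merge order is what Spring Boot guarantees (base file, then active-profile file, then environment variables):

```yaml
# application.yml (base configuration, always loaded)
server:
  port: 8080
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
```

```yaml
# application-production.yml (loaded only when SPRING_PROFILES_ACTIVE=production)
logging:
  level:
    root: INFO
```

At runtime, an environment variable such as `SERVER_PORT=9090` overrides both files via Spring Boot's relaxed binding.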
### 2. Container Layer
#### Docker Image
**Multi-stage Build:**
```dockerfile
# Stage 1: Build
FROM maven:3.9-eclipse-temurin-17 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests
# Stage 2: Runtime
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
USER 1000
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
```
**Optimizations:**
- Layer caching for dependencies
- Minimal runtime image (Alpine)
- Non-root user (UID 1000)
- Health check support
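Health-check support can also be wired into the image itself. A minimal sketch, assuming the Alpine runtime image's BusyBox `wget`:

```dockerfile
# Container-level health check against the Actuator endpoint.
# start-period gives the JVM time to boot before failures count.
HEALTHCHECK --interval=30s --timeout=3s --start-period=60s --retries=3 \
  CMD wget -qO- http://localhost:8080/actuator/health || exit 1
```

Note that Kubernetes ignores Dockerfile `HEALTHCHECK` instructions and relies on its own liveness/readiness probes; this is mainly useful for local `docker run` testing.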
### 3. Orchestration Layer
#### Humanitec Score
**Resource Specification:**
```yaml
apiVersion: score.dev/v1b1
metadata:
  name: online-boutique
containers:
  app:
    image: bstagecjotdevacr.azurecr.io/online-boutique:latest
    resources:
      requests:
        memory: 512Mi
        cpu: 250m
      limits:
        memory: 1Gi
        cpu: 1000m
service:
  ports:
    http:
      port: 80
      targetPort: 8080
resources:
  route:
    type: route
    params:
      host: online-boutique.kyndemo.live
```
**Capabilities:**
- Environment-agnostic deployment
- Resource dependencies
- Configuration management
- Automatic rollback
#### Kubernetes Resources
**Fallback Manifests:**
- `deployment.yaml`: Pod specification, replicas, health probes
- `service.yaml`: ClusterIP service for internal routing
- `ingress.yaml`: External access with TLS
- `servicemonitor.yaml`: Prometheus scraping config
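A condensed sketch of what `deployment.yaml` covers, with probe paths and resource values taken from elsewhere in this document (the label names are assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: online-boutique
spec:
  replicas: 2
  selector:
    matchLabels:
      app: online-boutique
  template:
    metadata:
      labels:
        app: online-boutique
    spec:
      containers:
        - name: app
          image: bstagecjotdevacr.azurecr.io/online-boutique:latest
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
          resources:
            requests:
              memory: 512Mi
              cpu: 250m
            limits:
              memory: 1Gi
              cpu: 1000m
```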
### 4. CI/CD Pipeline
#### Build & Push Workflow
**Stages:**
1. **Checkout**: Clone repository
2. **Setup**: Install Maven, Docker
3. **Test**: Run unit & integration tests
4. **Build**: Maven package
5. **Docker**: Build multi-stage image
6. **Auth**: Azure OIDC login
7. **Push**: Push to ACR with tags
**Triggers:**
- Push to `main` branch
- Pull requests
- Manual dispatch
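The stages and triggers above map to a workflow file roughly like the following sketch (step names and action versions are illustrative; Gitea Actions uses the GitHub Actions workflow syntax):

```yaml
# .gitea/workflows/build-push.yml (abridged sketch)
on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build and test
        run: mvn -B verify
      - name: Build image
        run: docker build -t bstagecjotdevacr.azurecr.io/online-boutique:${GITHUB_SHA} .
```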
#### Deploy Workflow
**Stages:**
1. **Parse Image**: Extract image reference from build
2. **Setup**: Install humctl CLI
3. **Score Update**: Replace image in score.yaml
4. **Deploy**: Execute humctl score deploy
5. **Verify**: Check deployment status
**Secrets:**
- `HUMANITEC_TOKEN`: Platform authentication
- `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`: OIDC federation
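The deploy stage reduces to a few CLI calls. A hedged sketch using the secrets above; how the image reference is passed between jobs (`IMAGE` here) is an assumption:

```yaml
# .gitea/workflows/deploy-humanitec.yml (abridged sketch)
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy via Humanitec
        env:
          HUMANITEC_TOKEN: ${{ secrets.HUMANITEC_TOKEN }}
        run: |
          humctl score deploy \
            --org kyn-cjot \
            --app online-boutique \
            --env development \
            --file score.yaml \
            --image "$IMAGE"
```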
### 5. Observability Layer
#### Metrics Collection
**Flow:**
```
Spring Boot App
└── /actuator/prometheus (HTTP endpoint)
└── Prometheus (scrape every 30s)
└── TSDB (15-day retention)
└── Grafana (visualization)
```
**ServiceMonitor Configuration:**
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
spec:
  selector:
    matchLabels:
      app: online-boutique
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 30s
```
#### Metrics Categories
1. **HTTP Metrics**:
- Request count/rate
- Response time (avg, p95, p99)
- Status code distribution
2. **JVM Metrics**:
- Heap/non-heap memory
- GC pause time
- Thread count
3. **System Metrics**:
- CPU usage
- File descriptors
- Process uptime
## Data Flow
### Request Flow
```
User Request
Ingress Controller (nginx)
│ TLS termination
│ Host routing
Service (ClusterIP)
│ Load balancing
│ Port mapping
Pod (Spring Boot)
│ Request handling
│ Business logic
Response
```
### Metrics Flow
```
Spring Boot (Micrometer)
│ Collect metrics
│ Format Prometheus
Actuator Endpoint
│ Expose /actuator/prometheus
Prometheus (Scraper)
│ Pull every 30s
│ Store in TSDB
Grafana
│ Query PromQL
│ Render dashboards
User Visualization
```
### Deployment Flow
```
Git Push
Gitea Actions (Webhook)
├── Build Workflow
│ │ Maven test + package
│ │ Docker build
│ │ ACR push
│ └── Output: image reference
└── Deploy Workflow
│ Parse image
│ Update score.yaml
│ humctl score deploy
Humanitec Platform
│ Interpret Score
│ Provision resources
│ Generate manifests
Kubernetes API
│ Apply deployment
│ Create/update resources
│ Schedule pods
Running Application
```
## Security Architecture
### Authentication & Authorization
1. **Azure Workload Identity**:
- OIDC federation for CI/CD
- No static credentials
- Scoped permissions
2. **Service Account**:
- Kubernetes ServiceAccount
- Bound to Azure Managed Identity
- Limited RBAC
3. **Image Pull Secrets**:
- AKS ACR integration
- Managed identity for registry access
### Network Security
1. **Ingress**:
- TLS 1.2+ only
- Cert-manager for automatic cert renewal
- Rate limiting (optional)
2. **Network Policies**:
- Restrict pod-to-pod communication
- Allow only required egress
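As a sketch, a restrictive ingress policy might look like this; the namespace labels (`ingress-nginx`, `monitoring`) are assumptions and must match the actual cluster:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: online-boutique-allow-ingress
spec:
  podSelector:
    matchLabels:
      app: online-boutique
  policyTypes: [Ingress]
  ingress:
    # Traffic from the ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 8080
    # Prometheus scraping from the monitoring namespace
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - port: 8080
```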
3. **Service Mesh (Future)**:
- mTLS between services
- Fine-grained authorization
### Application Security
1. **Container**:
- Non-root user (UID 1000)
- Read-only root filesystem
- No privilege escalation
2. **Dependencies**:
- Regular Maven dependency updates
- Vulnerability scanning (Snyk/Trivy)
3. **Secrets Management**:
- Azure Key Vault integration
- CSI driver for secret mounting
- No secrets in environment variables
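With the Secrets Store CSI driver, the Key Vault integration is declared as a `SecretProviderClass`. A minimal sketch; every `<...>` value is a placeholder for environment-specific configuration:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: online-boutique-secrets
spec:
  provider: azure
  parameters:
    clientID: <managed-identity-client-id>
    keyvaultName: <key-vault-name>
    tenantId: <tenant-id>
    objects: |
      array:
        - |
          objectName: db-password
          objectType: secret
```

Pods then mount the secrets as a CSI volume, keeping them out of environment variables.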
## Scalability
### Horizontal Scaling
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
### Vertical Scaling
Use the **Vertical Pod Autoscaler (VPA)** for automatic resource recommendations.
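In recommendation-only mode, a VPA can be sketched as:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: online-boutique-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: online-boutique
  updatePolicy:
    updateMode: "Off"   # surface recommendations only; no automatic pod evictions
```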
### Database Scaling (Future)
- Connection pooling (HikariCP)
- Read replicas for read-heavy workloads
- Caching layer (Redis)
## High Availability
### Application Level
- **Replicas**: Minimum 2 pods per environment
- **Anti-affinity**: Spread across nodes
- **Readiness probes**: Only route to healthy pods
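Anti-affinity spreading can be expressed in the pod template as follows; soft (`preferred`) rather than hard anti-affinity is assumed, so scheduling still succeeds on a small cluster:

```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: online-boutique
          topologyKey: kubernetes.io/hostname
```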
### Infrastructure Level
- **AKS**: Multi-zone node pools
- **Ingress**: Multiple replicas with PodDisruptionBudget
- **Monitoring**: High availability via Thanos
## Disaster Recovery
### Backup Strategy
1. **Application State**: Stateless, no backup needed
2. **Configuration**: Stored in Git
3. **Metrics**: 15-day retention, export to long-term storage
4. **Container Images**: Retained in ACR with retention policy
### Recovery Procedures
1. **Pod failure**: Automatic restart by kubelet
2. **Node failure**: Automatic rescheduling to healthy nodes
3. **Cluster failure**: Redeploy via Terraform + Humanitec
4. **Regional failure**: Failover to secondary region (if configured)
## Technology Decisions
### Why Spring Boot?
- Industry-standard Java framework
- Rich ecosystem (Actuator, Security, Data)
- Production-ready features out of the box
- Easy testing and debugging
### Why Humanitec?
- Environment-agnostic deployment
- Score specification simplicity
- Resource dependency management
- Reduces K8s complexity
### Why Prometheus + Grafana?
- Cloud-native standard
- Rich query language (PromQL)
- Wide integration support
- Open-source, vendor-neutral
### Why Maven?
- Mature dependency management
- Extensive plugin ecosystem
- Declarative configuration
- Wide adoption in Java community
## Future Enhancements
1. **Database Integration**: PostgreSQL with Flyway migrations
2. **Caching**: Redis for session storage
3. **Messaging**: Kafka for event-driven architecture
4. **Tracing**: Jaeger/Zipkin for distributed tracing
5. **Service Mesh**: Istio for advanced traffic management
6. **Multi-region**: Active-active deployment
## Next Steps
- [Review deployment guide](deployment.md)
- [Configure monitoring](monitoring.md)
- [Return to overview](index.md)

docs/deployment.md

# Deployment Guide
This guide covers deploying online-boutique to Azure Kubernetes Service via Humanitec or ArgoCD.
## Deployment Methods
### 1. Humanitec Platform Orchestrator (Primary)
Humanitec manages deployments using the `score.yaml` specification, automatically provisioning resources and handling promotions across environments.
#### Prerequisites
- Humanitec Organization: `kyn-cjot`
- Application registered in Humanitec
- Environments created (development, staging, production)
- Gitea Actions configured with HUMANITEC_TOKEN secret
#### Automatic Deployment (via Gitea Actions)
Push to trigger workflows:
```bash
git add .
git commit -m "feat: new feature"
git push origin main
```
**Build & Push Workflow** (`.gitea/workflows/build-push.yml`):
1. Maven build & test
2. Docker image build
3. Push to Azure Container Registry (ACR)
4. Tags: `latest`, `git-SHA`, `semantic-version`
**Deploy Workflow** (`.gitea/workflows/deploy-humanitec.yml`):
1. Parses image from build
2. Updates score.yaml with image reference
3. Deploys to Humanitec environment
4. Triggers orchestration
#### Manual Deployment with humctl CLI
Install Humanitec CLI:
```bash
# macOS
brew install humanitec/tap/humctl
# Linux/Windows
curl -s https://get.humanitec.io/install.sh | bash
```
Login:
```bash
humctl login --org kyn-cjot
```
Deploy from Score:
```bash
humctl score deploy \
--org kyn-cjot \
--app online-boutique \
--env development \
--file score.yaml \
--image bstagecjotdevacr.azurecr.io/online-boutique:latest \
--message "Manual deployment from local"
```
Deploy specific version:
```bash
humctl score deploy \
--org kyn-cjot \
--app online-boutique \
--env production \
--file score.yaml \
--image bstagecjotdevacr.azurecr.io/online-boutique:v1.2.3 \
--message "Production release v1.2.3"
```
#### Environment Promotion
Promote from development → staging:
```bash
humctl deploy \
--org kyn-cjot \
--app online-boutique \
--env staging \
--from development \
--message "Promote to staging after testing"
```
Promote to production:
```bash
humctl deploy \
--org kyn-cjot \
--app online-boutique \
--env production \
--from staging \
--message "Production release"
```
#### Check Deployment Status
```bash
# List deployments
humctl get deployments \
--org kyn-cjot \
--app online-boutique \
--env development
# Get specific deployment
humctl get deployment <DEPLOYMENT_ID> \
--org kyn-cjot \
--app online-boutique \
--env development
# View deployment logs
humctl logs \
--org kyn-cjot \
--app online-boutique \
--env development
```
### 2. ArgoCD GitOps (Fallback)
If Humanitec is unavailable, use ArgoCD with Kubernetes manifests in `deploy/`.
#### Create ArgoCD Application
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: online-boutique
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitea.kyndemo.live/validate/online-boutique.git
    targetRevision: main
    path: deploy
  destination:
    server: https://kubernetes.default.svc
    namespace: <namespace>
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```
Apply:
```bash
kubectl apply -f argocd-app.yaml
```
#### Manual Deploy with kubectl
Update image in `deploy/kustomization.yaml`:
```yaml
images:
  - name: app-image
    newName: bstagecjotdevacr.azurecr.io/online-boutique
    newTag: v1.2.3
```
Deploy:
```bash
kubectl apply -k deploy/
```
Verify:
```bash
kubectl -n <namespace> get pods
kubectl -n <namespace> get svc
kubectl -n <namespace> get ing
```
## Kubernetes Access
### Get AKS Credentials
```bash
az aks get-credentials \
--resource-group bstage-cjot-dev \
--name bstage-cjot-dev-aks \
--overwrite-existing
```
### View Application
```bash
# List pods
kubectl -n <namespace> get pods
# Check pod logs
kubectl -n <namespace> logs -f deployment/online-boutique
# Describe deployment
kubectl -n <namespace> describe deployment online-boutique
# Port-forward for local access
kubectl -n <namespace> port-forward svc/online-boutique-service 8080:80
```
### Check Health
```bash
# Health endpoint
kubectl -n <namespace> exec -it deployment/online-boutique -- \
  curl http://localhost:8080/actuator/health
# Metrics endpoint
kubectl -n <namespace> exec -it deployment/online-boutique -- \
  curl http://localhost:8080/actuator/prometheus
```
## Environment Configuration
### Development
- **Purpose**: Active development, frequent deployments
- **Image Tag**: `latest` or `git-SHA`
- **Replicas**: 1
- **Resources**: Minimal (requests: 256Mi RAM, 250m CPU)
- **Monitoring**: Prometheus scraping enabled
### Staging
- **Purpose**: Pre-production testing, integration tests
- **Image Tag**: Semantic version (e.g., `v1.2.3-rc.1`)
- **Replicas**: 2
- **Resources**: Production-like (requests: 512Mi RAM, 500m CPU)
- **Monitoring**: Full observability stack
### Production
- **Purpose**: Live traffic, stable releases
- **Image Tag**: Semantic version (e.g., `v1.2.3`)
- **Replicas**: 3+ (autoscaling)
- **Resources**: Right-sized (requests: 1Gi RAM, 1 CPU)
- **Monitoring**: Alerts enabled, SLO tracking
## Rollback Procedures
### Humanitec Rollback
```bash
# List previous deployments
humctl get deployments \
--org kyn-cjot \
--app online-boutique \
--env production
# Rollback to specific deployment
humctl deploy \
--org kyn-cjot \
--app online-boutique \
--env production \
--deployment-id <PREVIOUS_DEPLOYMENT_ID> \
--message "Rollback due to issue"
```
### Kubernetes Rollback
```bash
# Rollback to previous revision
kubectl -n <namespace> rollout undo deployment/online-boutique
# Rollback to specific revision
kubectl -n <namespace> rollout undo deployment/online-boutique --to-revision=2
# Check rollout status
kubectl -n <namespace> rollout status deployment/online-boutique
# View rollout history
kubectl -n <namespace> rollout history deployment/online-boutique
```
## Troubleshooting
### Pod Not Starting
```bash
# Check pod events
kubectl -n <namespace> describe pod <POD_NAME>
# Check logs
kubectl -n <namespace> logs <POD_NAME>
# Check previous container logs (if restarting)
kubectl -n <namespace> logs <POD_NAME> --previous
```
### Image Pull Errors
```bash
# Verify ACR access
az acr login --name bstagecjotdevacr
# Check image exists
az acr repository show-tags --name bstagecjotdevacr --repository online-boutique
# Verify AKS ACR integration
az aks check-acr \
--resource-group bstage-cjot-dev \
--name bstage-cjot-dev-aks \
--acr bstagecjotdevacr.azurecr.io
```
### Service Not Accessible
```bash
# Check service endpoints
kubectl -n <namespace> get endpoints online-boutique-service
# Check ingress
kubectl -n <namespace> describe ingress online-boutique-ingress
# Test internal connectivity
kubectl -n <namespace> run curl-test --image=curlimages/curl:latest --rm -it --restart=Never -- \
  curl http://online-boutique-service/actuator/health
```
### Humanitec Deployment Stuck
```bash
# Check deployment status
humctl get deployment <DEPLOYMENT_ID> \
--org kyn-cjot \
--app online-boutique \
--env development
# View error logs
humctl logs \
--org kyn-cjot \
--app online-boutique \
--env development \
--deployment-id <DEPLOYMENT_ID>
# Cancel stuck deployment
humctl delete deployment <DEPLOYMENT_ID> \
--org kyn-cjot \
--app online-boutique \
--env development
```
### Resource Issues
```bash
# Check resource usage
kubectl -n <namespace> top pods
# Describe pod for resource constraints
kubectl -n <namespace> describe pod <POD_NAME> | grep -A 10 "Conditions:"
# Check node capacity
kubectl describe nodes | grep -A 10 "Allocated resources:"
```
## Blue-Green Deployments
For zero-downtime deployments with Humanitec:
1. Deploy new version to staging
2. Run smoke tests
3. Promote to production with traffic splitting
4. Monitor metrics
5. Complete cutover or rollback
## Next Steps
- [Configure monitoring](monitoring.md)
- [Review architecture](architecture.md)
- [Return to overview](index.md)

docs/index.md

# online-boutique
## Overview
**online-boutique** is a production-ready Java microservice built using the Kyndryl Platform Engineering Golden Path.
!!! info "Service Information"
- **Description**: Java microservice via Golden Path
- **Environment**: development
- **Technology**: Spring Boot 3.2, Java 17
- **Orchestration**: Humanitec
- **Observability**: Prometheus + Grafana
## Quick Links
- [Repository](https://gitea.kyndemo.live/validate/online-boutique)
- [Humanitec Console](https://app.humanitec.io/orgs/kyn-cjot/apps/online-boutique)
- [Grafana Dashboard](https://grafana.kyndemo.live/d/spring-boot-dashboard?var-app=online-boutique)
## Features
**Production-Ready Configuration**
- Health checks (liveness, readiness, startup)
- Graceful shutdown
- Resource limits and requests
- Security contexts
**Observability**
- Prometheus metrics integration
- Pre-configured Grafana dashboards
- Structured logging
- Request tracing
**CI/CD**
- Automated builds via Gitea Actions
- Azure Container Registry integration
- Humanitec deployment automation
- GitOps fallback with ArgoCD
**Developer Experience**
- Local development support
- Hot reload with Spring DevTools
- Comprehensive tests
- API documentation
## Architecture
This service follows the golden path architecture:
```
┌─────────────────────────────────────────┐
│ Developer Experience │
│ (Backstage Template → Gitea Repo) │
└─────────────────────────────────────────┘
│ git push
┌─────────────────────────────────────────┐
│ Gitea Actions CI/CD │
│ 1. Build with Maven │
│ 2. Run tests │
│ 3. Build Docker image │
│ 4. Push to ACR │
│ 5. Deploy via Humanitec │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Humanitec Orchestrator │
│ - Interprets score.yaml │
│ - Provisions resources │
│ - Deploys to AKS │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Azure AKS Cluster │
│ - Pods with app containers │
│ - Prometheus scraping metrics │
│ - Service mesh (optional) │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ Grafana + Prometheus │
│ - Real-time metrics │
│ - Dashboards │
│ - Alerting │
└─────────────────────────────────────────┘
```
## API Endpoints
### Application Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Welcome message |
| `/api/status` | GET | Service health status |
### Actuator Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/actuator/health` | GET | Overall health |
| `/actuator/health/liveness` | GET | Liveness probe |
| `/actuator/health/readiness` | GET | Readiness probe |
| `/actuator/metrics` | GET | Available metrics |
| `/actuator/prometheus` | GET | Prometheus metrics |
| `/actuator/info` | GET | Application info |
## Technology Stack
- **Language**: Java 17
- **Framework**: Spring Boot 3.2.0
- **Build Tool**: Maven 3.9
- **Metrics**: Micrometer + Prometheus
- **Container**: Docker (Alpine-based)
- **Orchestration**: Humanitec (Score)
- **CI/CD**: Gitea Actions
- **Registry**: Azure Container Registry
- **Kubernetes**: Azure AKS
- **Monitoring**: Prometheus + Grafana
## Next Steps
- [Set up local development environment](local-development.md)
- [Learn about deployment process](deployment.md)
- [Configure monitoring and alerts](monitoring.md)
- [Understand the architecture](architecture.md)

docs/local-development.md

# Local Development
This guide covers setting up and running online-boutique on your local machine.
## Prerequisites
- **Java 17** or higher ([Download](https://adoptium.net/))
- **Maven 3.9+** (included via Maven Wrapper)
- **Docker** (optional, for container testing)
- **Git**
## Quick Start
### 1. Clone the Repository
```bash
git clone https://gitea.kyndemo.live/validate/online-boutique.git
cd online-boutique
```
### 2. Build the Application
```bash
# Using Maven Wrapper (recommended)
./mvnw clean package
# Or with system Maven
mvn clean package
```
### 3. Run the Application
```bash
# Run with Spring Boot Maven plugin
./mvnw spring-boot:run
# Or run the JAR directly
java -jar target/online-boutique-1.0.0-SNAPSHOT.jar
```
The application starts on **http://localhost:8080**.
### 4. Verify It's Running
```bash
# Check health
curl http://localhost:8080/actuator/health
# Check status
curl http://localhost:8080/api/status
# View metrics
curl http://localhost:8080/actuator/prometheus
```
## Development Workflow
### Hot Reload with Spring DevTools
For automatic restarts during development, add Spring DevTools to `pom.xml`:
```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-devtools</artifactId>
    <scope>runtime</scope>
    <optional>true</optional>
</dependency>
```
Changes to Java files will trigger automatic restarts.
### Running Tests
```bash
# Run all tests
./mvnw test
# Run specific test class
./mvnw test -Dtest=GoldenPathApplicationTests
# Run tests with coverage
./mvnw test jacoco:report
```
### Active Profile
Set active profile via environment variable:
```bash
# Development profile
export SPRING_PROFILES_ACTIVE=development
./mvnw spring-boot:run
# Or inline
SPRING_PROFILES_ACTIVE=development ./mvnw spring-boot:run
```
## Docker Development
### Build Image Locally
```bash
docker build -t online-boutique:dev .
```
### Run in Docker
```bash
docker run -p 8080:8080 \
-e SPRING_PROFILES_ACTIVE=development \
online-boutique:dev
```
### Docker Compose (if needed)
Create `docker-compose.yml`:
```yaml
version: '3.8'
services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=development
```
Run with:
```bash
docker-compose up
```
## IDE Setup
### IntelliJ IDEA
1. **Import Project**: File → New → Project from Existing Sources
2. **Select Maven**: Choose Maven as build tool
3. **SDK**: Configure Java 17 SDK
4. **Run Configuration**:
- Main class: `com.kyndryl.goldenpath.GoldenPathApplication`
- VM options: `-Dspring.profiles.active=development`
### VS Code
1. **Install Extensions**:
- Extension Pack for Java
- Spring Boot Extension Pack
2. **Open Folder**: Open the project root
3. **Run/Debug**: Use Spring Boot Dashboard or F5
### Eclipse
1. **Import**: File → Import → Maven → Existing Maven Projects
2. **Update Project**: Right-click → Maven → Update Project
3. **Run**: Right-click on Application class → Run As → Java Application
## Debugging
### Enable Debug Logging
In `application-development.yml`:
```yaml
logging:
  level:
    root: DEBUG
    com.kyndryl.goldenpath: TRACE
```
### Remote Debugging
Start with debug enabled:
```bash
./mvnw spring-boot:run -Dspring-boot.run.jvmArguments="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005"
```
Connect your debugger to `localhost:5005`.
## Common Development Tasks
### Adding a New Endpoint
```java
@GetMapping("/api/hello")
public ResponseEntity<String> hello() {
    return ResponseEntity.ok("Hello, World!");
}
```
### Adding Custom Metrics
```java
@Autowired
private MeterRegistry meterRegistry;

@GetMapping("/api/data")
public String getData() {
    Counter counter = Counter.builder("custom_api_calls")
        .tag("endpoint", "data")
        .register(meterRegistry);
    counter.increment();
    return "data";
}
```
### Database Integration (Future)
To add PostgreSQL:
1. Add dependency in `pom.xml`:
```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
</dependency>
```
2. Configure in `application.yml`:
```yaml
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/mydb
    username: user
    password: pass
  jpa:
    hibernate:
      ddl-auto: update
```
## Troubleshooting
### Port 8080 Already in Use
```bash
# Find process using port 8080
lsof -i :8080
# Kill process
kill -9 <PID>
# Or use different port
./mvnw spring-boot:run -Dspring-boot.run.arguments=--server.port=8081
```
### Maven Build Fails
```bash
# Clean and rebuild
./mvnw clean install -U
# Skip tests temporarily
./mvnw clean package -DskipTests
```
### Tests Fail
```bash
# Run with verbose output
./mvnw test -X
# Run single test
./mvnw test -Dtest=GoldenPathApplicationTests#contextLoads
```
## Next Steps
- [Learn about deployment](deployment.md)
- [Configure monitoring](monitoring.md)
- [Review architecture](architecture.md)

docs/monitoring.md

# Monitoring & Observability
This guide covers monitoring online-boutique with Prometheus and Grafana.
## Overview
The Java Golden Path includes comprehensive observability:
- **Metrics**: Prometheus metrics via Spring Boot Actuator
- **Dashboards**: Pre-configured Grafana dashboard
- **Scraping**: Automatic discovery via ServiceMonitor
- **Retention**: 15 days of metrics storage
## Metrics Endpoint
Spring Boot Actuator exposes Prometheus metrics at:
```
http://<pod-ip>:8080/actuator/prometheus
```
### Verify Metrics Locally
```bash
curl http://localhost:8080/actuator/prometheus
```
### Sample Metrics Output
```
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 5.2428800E7
# HELP http_server_requests_seconds Duration of HTTP server request handling
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/status",} 42.0
http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/status",} 0.351234567
```
## Available Metrics
### HTTP Metrics
- `http_server_requests_seconds_count`: Total request count
- `http_server_requests_seconds_sum`: Total request duration
- **Labels**: method, status, uri, outcome, exception
### JVM Metrics
#### Memory
- `jvm_memory_used_bytes`: Current memory usage
- `jvm_memory_max_bytes`: Maximum memory available
- `jvm_memory_committed_bytes`: Committed memory
- **Areas**: heap, nonheap
- **Pools**: G1 Eden Space, G1 Old Gen, G1 Survivor Space
#### Garbage Collection
- `jvm_gc_pause_seconds_count`: GC pause count
- `jvm_gc_pause_seconds_sum`: Total GC pause time
- `jvm_gc_memory_allocated_bytes_total`: Total memory allocated
- `jvm_gc_memory_promoted_bytes_total`: Memory promoted to old gen
#### Threads
- `jvm_threads_live_threads`: Current live threads
- `jvm_threads_daemon_threads`: Current daemon threads
- `jvm_threads_peak_threads`: Peak thread count
- `jvm_threads_states_threads`: Threads by state (runnable, blocked, waiting)
#### CPU
- `process_cpu_usage`: Process CPU usage (0-1)
- `system_cpu_usage`: System CPU usage (0-1)
- `system_cpu_count`: Number of CPU cores
### Application Metrics
- `application_started_time_seconds`: Application start timestamp
- `application_ready_time_seconds`: Application ready timestamp
- `process_uptime_seconds`: Process uptime
- `process_files_open_files`: Open file descriptors
### Custom Metrics
Add custom metrics with Micrometer:
```java
@Autowired
private MeterRegistry meterRegistry;

// Counter
Counter.builder("business_operations")
    .tag("operation", "checkout")
    .register(meterRegistry)
    .increment();

// Gauge
Gauge.builder("active_users", this, obj -> obj.getActiveUsers())
    .register(meterRegistry);

// Timer
Timer.builder("api_processing_time")
    .tag("endpoint", "/api/process")
    .register(meterRegistry)
    .record(() -> {
        // Timed operation
    });
```
## Prometheus Configuration
### ServiceMonitor
Deployed automatically in `deploy/servicemonitor.yaml`:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: online-boutique
  namespace: <namespace>
  labels:
    app: online-boutique
    prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app: online-boutique
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 30s
```
### Verify Scraping
Check Prometheus targets:
1. Access Prometheus: `https://prometheus.kyndemo.live`
2. Navigate to **Status → Targets**
3. Find the `online-boutique` ServiceMonitor target
4. Status should be **UP**
Or via kubectl:
```bash
# Port-forward Prometheus
kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090
# Check targets API
curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job == "online-boutique")'
```
### Query Metrics
Access Prometheus UI and run queries:
```promql
# Request rate
rate(http_server_requests_seconds_count{job="online-boutique"}[5m])
# Average request duration
rate(http_server_requests_seconds_sum{job="online-boutique"}[5m])
/ rate(http_server_requests_seconds_count{job="online-boutique"}[5m])
# Error rate
sum(rate(http_server_requests_seconds_count{job="online-boutique",status=~"5.."}[5m]))
/ sum(rate(http_server_requests_seconds_count{job="online-boutique"}[5m]))
# Memory usage
jvm_memory_used_bytes{job="online-boutique",area="heap"}
/ jvm_memory_max_bytes{job="online-boutique",area="heap"}
```
## Grafana Dashboard
### Access Dashboard
1. Open Grafana: `https://grafana.kyndemo.live`
2. Navigate to **Dashboards → Spring Boot Application**
3. Select `online-boutique` from dropdown
### Dashboard Panels
#### HTTP Metrics
- **Request Rate**: Requests per second by endpoint
- **Request Duration**: Average, 95th, 99th percentile latency
- **Status Codes**: Breakdown of 2xx, 4xx, 5xx responses
- **Error Rate**: Percentage of failed requests
#### JVM Metrics
- **Heap Memory**: Used vs. max heap memory over time
- **Non-Heap Memory**: Metaspace, code cache, compressed class space
- **Garbage Collection**: GC pause frequency and duration
- **Thread Count**: Live threads, daemon threads, peak threads
#### System Metrics
- **CPU Usage**: Process and system CPU utilization
- **File Descriptors**: Open file count
- **Uptime**: Application uptime
### Custom Dashboards
Import dashboard JSON from `/k8s/monitoring/spring-boot-dashboard.json`:
1. Grafana → Dashboards → New → Import
2. Upload `spring-boot-dashboard.json`
3. Select Prometheus data source
4. Click **Import**
## Alerting
### Prometheus Alerting Rules
Create alerting rules in Prometheus:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: online-boutique-alerts
  namespace: <namespace>
labels:
prometheus: kube-prometheus
spec:
groups:
- name: online-boutique
interval: 30s
rules:
# High error rate
- alert: HighErrorRate
expr: |
sum(rate(http_server_requests_seconds_count{job="online-boutique",status=~"5.."}[5m]))
/ sum(rate(http_server_requests_seconds_count{job="online-boutique"}[5m]))
> 0.05
for: 5m
labels:
severity: warning
annotations:
summary: "High error rate on online-boutique"
description: "Error rate is {{ $value | humanizePercentage }}"
# High latency
- alert: HighLatency
expr: |
histogram_quantile(0.95,
sum(rate(http_server_requests_seconds_bucket{job="online-boutique"}[5m])) by (le)
) > 1.0
for: 5m
labels:
severity: warning
annotations:
summary: "High latency on online-boutique"
description: "95th percentile latency is {{ $value }}s"
# High memory usage
- alert: HighMemoryUsage
expr: |
jvm_memory_used_bytes{job="online-boutique",area="heap"}
/ jvm_memory_max_bytes{job="online-boutique",area="heap"}
> 0.90
for: 5m
labels:
severity: critical
annotations:
summary: "High memory usage on online-boutique"
description: "Heap usage is {{ $value | humanizePercentage }}"
# Pod not ready
- alert: PodNotReady
expr: |
        kube_pod_status_ready{namespace="<namespace>",pod=~"online-boutique-.*",condition="true"} == 0
for: 5m
labels:
severity: critical
annotations:
summary: "online-boutique pod not ready"
description: "Pod {{ $labels.pod }} not ready for 5 minutes"
```
Apply:
```bash
kubectl apply -f prometheus-rules.yaml
```
### Grafana Alerts
Configure alerts in Grafana dashboard panels:
1. Edit panel
2. Click **Alert** tab
3. Set conditions (e.g., "when avg() of query(A) is above 0.8")
4. Configure notification channels (Slack, email, PagerDuty)
### Alert Testing
Trigger test alerts:
```bash
# Generate client errors (note: 404s count toward total traffic but do not
# match the 5xx-based HighErrorRate alert)
for i in {1..100}; do
  curl -s -o /dev/null http://localhost:8080/api/nonexistent
done
# Trigger high latency
ab -n 10000 -c 100 http://localhost:8080/api/status
# Trigger a heap dump (the actuator endpoint responds to GET and streams an .hprof file)
curl http://localhost:8080/actuator/heapdump --output /dev/null
```
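To sanity-check what the `HighErrorRate` expression will report during such a test, the ratio can be reproduced offline: over a fixed window the per-second `rate()` divisors cancel, so the ratio of counter increases is enough. The numbers below are made up for illustration:

```shell
#!/usr/bin/env sh
# Hypothetical counter increases over one 5m window (not real scrape data)
errors_delta=100    # growth of http_server_requests_seconds_count{status=~"5.."}
total_delta=2000    # growth of http_server_requests_seconds_count (all statuses)

# Same ratio the alert computes: rate(5xx) / rate(all)
awk -v e="$errors_delta" -v t="$total_delta" \
  'BEGIN { printf "error rate: %.2f\n", e / t }'
```

With these values the printed rate is exactly the 0.05 alert threshold, so the alert would not yet fire (it requires `> 0.05` sustained for 5 minutes).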
## Distributed Tracing (Future)
To add tracing with Jaeger/Zipkin:
1. Add dependency:
```xml
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-exporter-zipkin</artifactId>
</dependency>
```
2. Configure in `application.yml`:
```yaml
management:
tracing:
sampling:
probability: 1.0
zipkin:
tracing:
endpoint: http://zipkin:9411/api/v2/spans
```
## Log Aggregation
For centralized logging:
1. **Loki**: Add Promtail to collect pod logs
2. **Grafana Logs**: Query logs alongside metrics
3. **Log Correlation**: Link traces to logs via trace ID
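Point 3 (log correlation) largely works out of the box once Micrometer Tracing is on the classpath: Spring Boot 3 places `traceId`/`spanId` in the logging MDC, and a correlation pattern appends them to every log line. A sketch of the `application.yml` addition (property name per Spring Boot 3.2+; treat as an assumption if the app targets an older version):

```yaml
logging:
  pattern:
    # Prepends trace context to each line so Loki/Grafana can link logs to traces
    correlation: "[%X{traceId:-},%X{spanId:-}] "
```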
## Best Practices
1. **Metric Cardinality**: Avoid high-cardinality labels (user IDs, timestamps)
2. **Naming**: Follow Prometheus naming conventions (`_total`, `_seconds`, `_bytes`)
3. **Aggregation**: Use recording rules for expensive queries
4. **Retention**: Adjust retention period based on storage capacity
5. **Dashboarding**: Create business-specific dashboards for stakeholders
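Point 3 above (recording rules) can be sketched as another PrometheusRule: the expensive error-ratio query is evaluated once on a schedule and stored under a new series name, which dashboards then read cheaply. The series name follows the `level:metric:operations` convention; the rule name and labels here are assumptions:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: online-boutique-recording
  namespace: <namespace>
  labels:
    prometheus: kube-prometheus
spec:
  groups:
    - name: online-boutique.recording
      interval: 30s
      rules:
        # Pre-computed 5m error ratio; query this instead of the raw division
        - record: job:http_server_requests_error_ratio:rate5m
          expr: |
            sum(rate(http_server_requests_seconds_count{job="online-boutique",status=~"5.."}[5m]))
            / sum(rate(http_server_requests_seconds_count{job="online-boutique"}[5m]))
```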
## Troubleshooting
### Metrics Not Appearing
```bash
# Check if actuator is enabled
kubectl -n <namespace> exec -it deployment/online-boutique -- \
  curl http://localhost:8080/actuator
# Check ServiceMonitor
kubectl -n <namespace> get servicemonitor online-boutique -o yaml
# Check Prometheus logs
kubectl -n monitoring logs -l app.kubernetes.io/name=prometheus --tail=100 | grep online-boutique
```
### High Memory Usage
```bash
# Take heap dump inside the pod (the actuator endpoint responds to GET)
kubectl -n <namespace> exec -it deployment/online-boutique -- \
  curl http://localhost:8080/actuator/heapdump --output /tmp/heapdump.hprof
# Copy it out of the pod, then analyze with jmap, Eclipse Memory Analyzer, or VisualVM
kubectl -n <namespace> cp <pod-name>:/tmp/heapdump.hprof ./heapdump.hprof
```
### Slow Queries
Inspect query performance in the Prometheus UI:
```bash
kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090
# Open http://localhost:9090/graph, run the slow query, and check the
# load time and series count reported alongside the expression box
```
## Next Steps
- [Review architecture](architecture.md)
- [Learn about deployment](deployment.md)
- [Return to overview](index.md)