# Deployment Guide - Observability Platform

Complete guide for deploying the Observability Platform to production environments.

## Table of Contents

1. [Local Development](#local-development)
2. [Docker Deployment](#docker-deployment)
3. [Docker Compose](#docker-compose)
4. [Kubernetes](#kubernetes)
5. [Cloud Platforms](#cloud-platforms)
6. [Monitoring](#monitoring)
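7. [Troubleshooting](#troubleshooting)
8. [Security Recommendations](#security-recommendations)
9. [Performance Tuning](#performance-tuning)
10. [Maintenance](#maintenance)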

---

## Local Development

### Prerequisites

- Go 1.23+
- Git

### Installation

```bash
# Clone repository
git clone https://github.com/yourusername/observability-platform.git
cd observability-platform

# Install dependencies
go mod download
go mod verify

# Build application
make build

# Run tests
make test

# Start server
make run
```

The server will be available at `http://localhost:8080`.

---

## Docker Deployment

### Building the Image

```bash
# Build image
make docker-build

# Verify image
docker images | grep observability-platform

# Tag for registry
docker tag observability-platform:latest <registry>/observability-platform:1.0.0
```

### Running Container

```bash
# Run container
docker run -d \
  --name observability-platform \
  -p 8080:8080 \
  -e PORT=8080 \
  observability-platform:latest

# Check logs
docker logs -f observability-platform

# Stop container
docker stop observability-platform
docker rm observability-platform
```
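
The `-e PORT=8080` flag above suggests the server takes its listen port from the `PORT` environment variable. A minimal sketch of how that could look in Go, assuming a default of 8080; `listenAddr` is a hypothetical helper, not necessarily how the platform parses its configuration:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// listenAddr converts a PORT value into a listen address.
// An empty value falls back to 8080, the port used throughout this guide.
// (Hypothetical helper; the real server may handle configuration differently.)
func listenAddr(port string) (string, error) {
	if port == "" {
		return ":8080", nil
	}
	n, err := strconv.Atoi(port)
	if err != nil || n < 1 || n > 65535 {
		return "", fmt.Errorf("invalid PORT %q", port)
	}
	return ":" + strconv.Itoa(n), nil
}

func main() {
	addr, err := listenAddr(os.Getenv("PORT"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("would listen on", addr)
}
```

Rejecting an invalid value at startup makes a misconfigured container fail fast, and the reason then shows up in `docker logs`.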

### Docker Best Practices

- Use multi-stage builds (already implemented in Dockerfile)
- Run as non-root user (created in Dockerfile)
- Include health checks
- Set resource limits
- Use Alpine Linux for minimal image size

---

## Docker Compose

### Quick Start

```bash
# Start all services
docker-compose up -d

# Check services
docker-compose ps

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```

### Service Configuration

**Services Started:**
- **Platform** (8080): Main observability platform
- **Jaeger** (16686): Distributed tracing UI
- **Prometheus** (9090): Metrics server

### Scaling

```bash
# Run multiple platform instances
docker-compose up -d --scale platform=3

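# Access via a load balancer or reverse proxy.
# Note: scaling fails if docker-compose.yml pins one host port (e.g. "8080:8080").
# Publish a host port range (e.g. "8080-8082:8080") or only the container port instead.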
```

### Custom Configuration

```bash
# Override port (takes effect only if docker-compose.yml references ${PORT})
PORT=3000 docker-compose up -d

# Use custom file
docker-compose -f docker-compose.prod.yml up -d

# Stop and remove volumes
docker-compose down -v
```

---

## Kubernetes

### Prerequisites

- Kubernetes 1.20+
- kubectl configured
- Container registry access

### Prepare Image

```bash
# Build and push to registry
docker build -t <registry>/observability-platform:1.0.0 .
docker push <registry>/observability-platform:1.0.0
```

### Deploy to Kubernetes

```bash
# Create namespace
kubectl create namespace observability

# Apply deployment (update image in manifest first)
kubectl apply -f k8s/deployment.yaml

# Verify deployment
kubectl get pods -n observability
kubectl get svc -n observability

# Check logs
kubectl logs -f -l app=observability-platform -n observability
```

### Kubernetes Manifest

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: observability-platform
  namespace: observability
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: observability-platform
  template:
    metadata:
      labels:
        app: observability-platform
    spec:
      containers:
      - name: platform
        image: <registry>/observability-platform:1.0.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http
        env:
        - name: PORT
          value: "8080"
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 2
---
apiVersion: v1
kind: Service
metadata:
  name: observability-platform
  namespace: observability
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: observability-platform
```
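
The probe settings in the manifest determine how quickly Kubernetes reacts to an unhealthy pod. As a rough worked example, the sketch below estimates the worst-case reaction time as `periodSeconds × failureThreshold + timeoutSeconds`. This is an approximation that ignores kubelet scheduling jitter; the `probe` type and `worstCaseDetection` are hypothetical names used only for illustration:

```go
package main

import "fmt"

// probe mirrors the timing fields of a Kubernetes probe, all in seconds.
type probe struct {
	period, timeout, failureThreshold int
}

// worstCaseDetection estimates the longest gap between the app becoming
// unhealthy and Kubernetes acting on it: the failure starts right after a
// successful probe, then failureThreshold probes in a row each time out.
func worstCaseDetection(p probe) int {
	return p.period*p.failureThreshold + p.timeout
}

func main() {
	// Values copied from the manifest above.
	liveness := probe{period: 10, timeout: 5, failureThreshold: 3}
	readiness := probe{period: 5, timeout: 3, failureThreshold: 2}

	fmt.Printf("liveness: restart within about %ds\n", worstCaseDetection(liveness))
	fmt.Printf("readiness: removed from Service within about %ds\n", worstCaseDetection(readiness))
}
```

With the values above this gives roughly 35 seconds before a hung container is restarted and roughly 13 seconds before an unready pod stops receiving traffic.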

### Management

```bash
# Scale deployment
kubectl scale deployment observability-platform --replicas=5 -n observability

# Update image
kubectl set image deployment/observability-platform \
  platform=<registry>/observability-platform:2.0.0 -n observability

# Delete deployment
kubectl delete -f k8s/deployment.yaml

# View events
kubectl get events -n observability

# Port forward (for local testing)
kubectl port-forward svc/observability-platform 8080:80 -n observability
```
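
Scaling down and updating the image both terminate running pods, so the server should drain in-flight requests before it exits. The sketch below shows one way to do that with the standard library, assuming the platform uses `net/http`. `serveUntilDone` and `demo` are hypothetical names. The demo cancels its context on a timer so that it terminates by itself, whereas a real pod would typically derive the context from `signal.NotifyContext(context.Background(), syscall.SIGTERM)`:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serveUntilDone serves h on ln until ctx is cancelled, then gives
// in-flight requests up to grace to finish before returning.
func serveUntilDone(ctx context.Context, ln net.Listener, h http.Handler, grace time.Duration) error {
	srv := &http.Server{Handler: h}
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return err // the server stopped before shutdown was requested
	case <-ctx.Done():
	}

	shutdownCtx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err // grace period expired with requests still running
	}
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demo runs the server on a random loopback port and cancels the context
// after a short delay, standing in for the SIGTERM a pod would receive.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return serveUntilDone(ctx, ln, http.NotFoundHandler(), 5*time.Second)
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("shutdown error:", err)
		return
	}
	fmt.Println("server drained and stopped")
}
```

The grace period should stay below the pod's `terminationGracePeriodSeconds` (30 seconds by default), otherwise Kubernetes kills the container before draining completes.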

---

## Cloud Platforms

### AWS ECS

```bash
# Create ECR repository
aws ecr create-repository --repository-name observability-platform

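# Authenticate Docker to ECR
aws ecr get-login-password --region <region> | \
  docker login --username AWS --password-stdin <account>.dkr.ecr.<region>.amazonaws.com

# Build and push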
docker build -t observability-platform:latest .
docker tag observability-platform:latest <account>.dkr.ecr.<region>.amazonaws.com/observability-platform:latest
docker push <account>.dkr.ecr.<region>.amazonaws.com/observability-platform:latest

# Create task definition (save as task-definition.json)
aws ecs register-task-definition --cli-input-json file://task-definition.json

# Create service
aws ecs create-service \
  --cluster my-cluster \
  --service-name observability-platform \
  --task-definition observability-platform:1 \
  --desired-count 3 \
  --launch-type EC2
```

### Google Cloud Run

```bash
# Configure Docker for GCR
gcloud auth configure-docker

# Build and push
gcloud builds submit --tag gcr.io/<project>/observability-platform

# Deploy
gcloud run deploy observability-platform \
  --image gcr.io/<project>/observability-platform \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --port 8080 \
  --memory 512Mi \
  --cpu 2
```

### Azure Container Instances

```bash
# Create container registry
az acr create --resource-group mygroup --name myregistry --sku Basic

# Build image in ACR
az acr build --registry myregistry --image observability-platform:latest .

# Deploy container (a private ACR also needs pull credentials,
# e.g. --registry-username and --registry-password)
az container create \
  --resource-group mygroup \
  --name observability-platform \
  --image myregistry.azurecr.io/observability-platform:latest \
  --dns-name-label observability-platform \
  --ports 8080 \
  --cpu 2 \
  --memory 0.5
```

---

## Monitoring

### Health Checks

```bash
# Check platform health
curl http://localhost:8080/health

# Response:
# OK
```
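
For reference, an endpoint with this behavior can be very small. The following is a sketch of a handler that returns `200` with the body `OK`, exercised in-process with `httptest` so no running server is needed. `healthHandler` and `probe` are hypothetical names, and the platform's actual handler may report more detail:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// healthHandler replies 200 with the plain-text body "OK".
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	w.WriteHeader(http.StatusOK)
	io.WriteString(w, "OK")
}

// probe calls the handler in-process and returns the status code and body.
func probe() (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	healthHandler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe()
	fmt.Println(code, body)
}
```

In a real server the handler would be registered with something like `mux.HandleFunc("/health", healthHandler)`, which is the path the Kubernetes probes in this guide point at.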

### Prometheus Metrics

```bash
# Open the Prometheus UI in a browser:
#   http://localhost:9090

# Useful queries:
# - up{job="observability-platform"}
# - rate(http_requests_total[5m])
# - histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
```
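
To make the `rate(http_requests_total[5m])` query concrete: it reports the average per-second increase of the counter over the window. The sketch below is a simplified approximation of that calculation. It ignores Prometheus's extrapolation to the window boundaries, handles a counter reset only crudely, and uses made-up sample values:

```go
package main

import "fmt"

// perSecondRate approximates rate(): the counter increase over the window
// divided by the window length in seconds. A counter that went backwards
// is treated as having restarted from zero.
func perSecondRate(first, last, windowSeconds float64) float64 {
	if windowSeconds <= 0 {
		return 0
	}
	increase := last - first
	if last < first {
		increase = last // counter reset: only the post-restart count is known
	}
	return increase / windowSeconds
}

func main() {
	// Made-up samples: the counter grew from 1200 to 1500 over 5 minutes.
	fmt.Printf("%.2f req/s\n", perSecondRate(1200, 1500, 300))
}
```

An increase of 300 requests over 300 seconds works out to 1 request per second.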

### Jaeger Tracing

```bash
# Open the Jaeger UI in a browser:
#   http://localhost:16686

# View traces by service
# Search for specific operations
# Analyze latency and errors
```

### Logs

```bash
# Container logs
docker logs observability-platform

# Kubernetes logs
kubectl logs -f deployment/observability-platform -n observability

# Docker Compose logs
docker-compose logs -f platform
```

---

## Troubleshooting

### Port Already in Use

```bash
# Find process using port 8080
lsof -i :8080

# Stop the process (use kill -9 only if it ignores the default signal)
kill <PID>

# Or use different port
PORT=3000 make run
```
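
Where `lsof` is not available, the same check can be done from Go by trying to open a listener. This is a small sketch, not part of the platform. `portFree` and `demo` are hypothetical names, and the result is only a snapshot because another process can claim the port right after the check:

```go
package main

import (
	"fmt"
	"net"
)

// portFree reports whether a TCP listener can be opened on addr right now.
func portFree(addr string) bool {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return false
	}
	ln.Close()
	return true
}

// demo occupies a random loopback port and reports whether portFree
// sees that port as busy while it is held.
func demo() (busyWhileHeld bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	defer ln.Close()
	return !portFree(ln.Addr().String()), nil
}

func main() {
	fmt.Println("port 8080 free:", portFree(":8080"))
}
```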

### Container Won't Start

```bash
# Check logs
docker logs observability-platform

# Test locally
go run cmd/server/main.go

# Verify dependencies
go mod verify
```

### High Memory Usage

```bash
# Check memory
docker stats observability-platform

# Increase limits
docker run -m 1g observability-platform:latest

# Or in docker-compose.yml (Compose v1 applies deploy limits only with --compatibility):
#   services:
#     platform:
#       deploy:
#         resources:
#           limits:
#             memory: 1G
```

### Slow Queries

```bash
# Check query performance
curl -X POST http://localhost:8080/api/logs/query \
  -H "Content-Type: application/json" \
  -d '{...}'

# Reduce time range
# Increase batch size
# Add indexing
```

---

## Security Recommendations

### Environment Variables

```bash
# Use .env file (not in version control)
cp .env.example .env
# Edit .env with secrets
# Export every variable defined in .env into the current shell
set -a
. ./.env
set +a
```
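
Secrets loaded this way are easy to forget in a new environment, so it can help to check for them at startup. Below is a minimal sketch of such a check. `missingVars` is a hypothetical helper, `PORT` comes from this guide, and `API_TOKEN` is a made-up example of a secret name:

```go
package main

import (
	"fmt"
	"os"
)

// missingVars returns the names in required that are unset or empty,
// using lookup (for example os.LookupEnv) to read each value.
func missingVars(required []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range required {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// PORT is used in this guide; API_TOKEN is only an illustrative name.
	required := []string{"PORT", "API_TOKEN"}
	if m := missingVars(required, os.LookupEnv); len(m) > 0 {
		fmt.Println("missing required environment variables:", m)
		return
	}
	fmt.Println("environment looks complete")
}
```

Passing the lookup function as a parameter keeps the check testable without touching the real process environment.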

### TLS/HTTPS

```bash
# Generate self-signed certificate (development only)
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes

# Configure in application
# Use Let's Encrypt for production
```

### Network Security

```bash
# Use network policies in Kubernetes
# Restrict ingress/egress
# Use service mesh (Istio)
```

### Container Security

```bash
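# Scan for vulnerabilities (docker scan has been retired in favor of Docker Scout)
docker scout cves observability-platform:latest

# Sign images: load a signing key, then sign the pushed tag
docker trust key load key.key
docker trust sign <registry>/observability-platform:1.0.0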

# Use private registry
```

---

## Performance Tuning

### Resource Allocation

```bash
# Monitor resource usage
docker stats

# Adjust limits based on actual usage
# (--cpus sets a hard CPU limit; -c only sets a relative CPU share weight)
docker run -m 512m --cpus 0.5 observability-platform:latest
```

### Caching

```bash
# Enable caching in reverse proxy
# Set appropriate cache headers
# Use Redis for hot data
```

### Scaling

```bash
# Horizontal scaling
docker-compose up -d --scale platform=5

# Load balancing
# Use nginx, HAProxy, or cloud load balancer
```

---

## Maintenance

### Regular Updates

```bash
# Update dependencies
go get -u ./...
go mod tidy

# Rebuild and test
make build
make test
```

### Backup

```bash
# For database backends, implement regular backups
# For in-memory storage, data is not persisted
```

### Cleanup

```bash
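# Review logs from a specific date (this only filters output; nothing is deleted)
docker-compose logs --timestamps platform | grep "2024-01-01"

# Remove all unused containers, networks, and images (destructive)
docker system prune -a

# Or remove only unused images
docker image prune -a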
```

---

**For questions or issues, refer to the main README.md or open an issue on GitHub.**
