This comprehensive project synthesizes everything you've learned in the Advanced Topics section by building a production-grade distributed job queue system. You'll use generics for type-safe job handling, apply design patterns, rely on dependency injection for testability, and write race-free concurrent code.
Problem Statement
Modern applications frequently need to process tasks asynchronously:
- Email notifications after user actions
- Data processing pipelines that take minutes or hours
- Image/video processing that's CPU-intensive
- Report generation with complex aggregations
- Webhooks and API integrations that may be unreliable
These tasks shouldn't block the main application flow. A distributed job queue system solves this by:
- Accepting job submissions via API
- Storing jobs in a reliable queue
- Processing jobs asynchronously with worker pools
- Providing retry logic with exponential backoff
- Tracking job status and results
- Offering observability through metrics
Requirements
Functional Requirements
Job Management:
- Submit jobs via HTTP API with type-specific payloads
- Support multiple job types
- Priority-based job scheduling
- Configurable retry logic with exponential backoff
- Job status tracking
Worker Pool:
- Concurrent job processing with configurable worker count
- Graceful shutdown
- Context-based timeout and cancellation
- Observer pattern for job lifecycle events
Queue Operations:
- Redis-backed distributed queue
- Atomic operations for job state transitions
- Result storage and retrieval
- Queue statistics
Monitoring:
- Prometheus metrics for job counts and durations
- Web dashboard for real-time monitoring
- Health check endpoint
Non-Functional Requirements
- Type Safety: Use generics for job interface
- Concurrency: Race-free implementation
- Testability: Dependency injection with interfaces
- Observability: Comprehensive metrics and logging
- Scalability: Horizontal scaling via worker replicas
- Reliability: Jobs survive process restarts
Constraints
- Go 1.21 or higher
- Redis 7.0 or higher
- Docker and Docker Compose for orchestration
- Maximum job payload size: 1MB
- Job execution timeout: 5 minutes
- Result retention: 24 hours in Redis
Design Considerations
High-Level Architecture
The system consists of three main components:

1. API Server: HTTP API for job submission, status queries, and monitoring
   - Accepts job submissions via REST endpoints
   - Enqueues jobs to Redis with priority and metadata
   - Serves real-time statistics and health checks
   - Exposes Prometheus metrics endpoint

2. Worker Pool: Distributed job processors
   - Multiple workers dequeue and process jobs concurrently
   - Factory pattern for creating job instances from envelopes
   - Observer pattern for lifecycle event notifications
   - Graceful shutdown with context cancellation

3. Redis Queue: Persistent job storage and queue management
   - Priority queue using sorted sets
   - Atomic state transitions with pipelining
   - Result storage with automatic expiration
   - Statistics tracking for monitoring
Key Design Patterns
- Generics: Type-safe `Job[T any]` interface eliminates runtime type assertions
- Factory Pattern: Centralized job creation enables easy extensibility
- Observer Pattern: Decoupled event notifications for job lifecycle
- Dependency Injection: Interface-based abstractions for testability
- Worker Pool Pattern: Bounded concurrency with goroutines and channels
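A minimal sketch of the generics idea: the method names (`Type`, `Payload`, `Execute`) and the `EmailPayload`/`EmailJob` types below are illustrative assumptions, not the reference solution's API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Job is parameterized over its payload type T, so each concrete job
// works with a typed payload and needs no runtime type assertions.
type Job[T any] interface {
	Type() string
	Payload() T
	Execute() error
}

// EmailPayload is a JSON-serializable payload (a generic "constraint" in
// spirit: every payload crossing the queue must round-trip through JSON).
type EmailPayload struct {
	To      string `json:"to"`
	Subject string `json:"subject"`
}

type EmailJob struct {
	payload EmailPayload
}

func (j EmailJob) Type() string          { return "email" }
func (j EmailJob) Payload() EmailPayload { return j.payload }
func (j EmailJob) Execute() error {
	fmt.Printf("sending %q to %s\n", j.payload.Subject, j.payload.To)
	return nil
}

func main() {
	raw := []byte(`{"to":"user@example.com","subject":"Welcome!"}`)
	var p EmailPayload
	if err := json.Unmarshal(raw, &p); err != nil {
		panic(err)
	}
	job := EmailJob{payload: p}
	var _ Job[EmailPayload] = job // compile-time check, no assertion needed
	_ = job.Execute()
}
```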
Technology Choices
- Redis: Fast, reliable queue with atomic operations and persistence
- Gin Framework: Lightweight HTTP router with middleware support
- Prometheus: Industry-standard metrics collection and visualization
- Docker Compose: Simple orchestration for multi-container deployment
- Go Generics: Compile-time type safety without code generation
Acceptance Criteria
Your implementation is complete when it meets the following criteria:
Core Functionality:
- Jobs can be submitted via HTTP POST with JSON payloads
- Multiple job types are supported
- Jobs are processed asynchronously by worker pool
- Failed jobs retry automatically with exponential backoff
- Job status can be queried by job ID
- Queue statistics are accessible via API endpoint
Quality Requirements:
- All tests pass with `go test ./...`
- No data races detected with `go test -race ./...`
- Code coverage above 70% for core packages
- Integration tests verify end-to-end job processing
- Graceful shutdown completes running jobs before exit
Production Readiness:
- Prometheus metrics expose job counts and durations
- Health check endpoint returns service status
- Web dashboard displays real-time queue statistics
- Docker Compose setup runs all services correctly
- Workers can be scaled horizontally
- Jobs persist across process restarts
Code Quality:
- Generic `Job[T]` interface provides type safety
- Factory pattern allows adding new job types without code changes
- Observer pattern decouples metrics/logging from worker logic
- All dependencies injected via interfaces
- Context used throughout for timeouts and cancellation
Documentation:
- README includes setup, usage, and API documentation
- Code comments explain design decisions and patterns
- API endpoints documented with example requests/responses
- Docker deployment instructions provided
Usage Examples
Submit Email Job
```bash
curl -X POST http://localhost:8080/api/v1/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "type": "email",
    "payload": {
      "to": "user@example.com",
      "subject": "Welcome!",
      "body": "Thanks for signing up",
      "from": "noreply@example.com"
    },
    "max_attempts": 3,
    "priority": 5
  }'

# Response:
# {"job_id": "550e8400-e29b-41d4-a716-446655440000"}
```
Submit Data Processing Job
```bash
curl -X POST http://localhost:8080/api/v1/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "type": "data_process",
    "payload": {
      "source_url": "https://api.example.com/data",
      "operation": "transform",
      "filters": {"status": "active"},
      "destination": "s3://bucket/output"
    },
    "max_attempts": 5,
    "priority": 8
  }'
```
Check Job Status
```bash
curl http://localhost:8080/api/v1/jobs/550e8400-e29b-41d4-a716-446655440000

# Response:
# {
#   "job_id": "550e8400-e29b-41d4-a716-446655440000",
#   "status": "completed",
#   "started_at": "2025-10-21T10:30:00Z",
#   "completed_at": "2025-10-21T10:30:02Z",
#   "attempts": 1
# }
```
View Queue Stats
```bash
curl http://localhost:8080/api/v1/stats

# Response:
# {
#   "pending": 12,
#   "running": 4,
#   "completed": 358,
#   "failed": 3,
#   "total": 377
# }
```
View Prometheus Metrics
```bash
curl http://localhost:8080/metrics

# Sample output:
# jobqueue_jobs_started_total{job_type="email"} 45
# jobqueue_jobs_completed_total{job_type="email"} 42
# jobqueue_jobs_failed_total{job_type="email"} 3
# jobqueue_job_duration_seconds_bucket{job_type="email",status="completed",le="2.5"} 40
```
Access Web Dashboard
```bash
# Open in browser:
http://localhost:8080/

# The dashboard displays:
# - Real-time queue statistics
# - Job submission form
# - Auto-refreshing stats every 2 seconds
```
Deploy with Docker Compose
```bash
# Start all services
docker-compose up -d

# Scale workers
docker-compose up -d --scale worker=4

# View logs
docker-compose logs -f worker

# Stop services
docker-compose down
```
Run Tests
```bash
# Unit tests
go test ./...

# With race detection
go test -race ./...

# With coverage
go test -cover ./...

# Integration tests
go test -v ./tests/integration_test.go

# Benchmarks
go test -bench=. ./internal/queue
```
Key Takeaways
Generics
- Type-safe abstractions: `Job[T any]` eliminates runtime type assertions for payloads
- Flexible yet safe: Different job types share the same interface with compile-time safety
- Practical trade-offs: Handling heterogeneous job types in a single worker still requires type switches
- Generic constraints: Payloads must be JSON-serializable
Design Patterns
- Factory Pattern: Centralized job creation makes adding new job types trivial
- Observer Pattern: Decoupled event notifications for metrics, logging, and monitoring
- Worker Pool: Bounded concurrency pattern prevents resource exhaustion
- Dependency Injection: Interfaces enable comprehensive testing
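A compact sketch of the Factory pattern named above. The `Runner` interface and the `Register`/`New` functions are illustrative names, not the reference solution's API; the point is that a new job type means one `Register` call, with no edits to worker code.

```go
package main

import (
	"errors"
	"fmt"
)

// Runner is the minimal behavior the worker needs from any job type.
type Runner interface {
	Run(payload []byte) error
}

// factory maps a job-type string from the envelope to a constructor.
var factory = map[string]func() Runner{}

// Register adds a new job type; typically called from an init function.
func Register(jobType string, ctor func() Runner) {
	factory[jobType] = ctor
}

// New creates a job instance for an envelope's type field.
func New(jobType string) (Runner, error) {
	ctor, ok := factory[jobType]
	if !ok {
		return nil, errors.New("unknown job type: " + jobType)
	}
	return ctor(), nil
}

type emailRunner struct{}

func (emailRunner) Run(payload []byte) error {
	fmt.Printf("email job with payload %s\n", payload)
	return nil
}

func main() {
	Register("email", func() Runner { return emailRunner{} })
	r, err := New("email")
	if err != nil {
		panic(err)
	}
	_ = r.Run([]byte(`{"to":"user@example.com"}`))
}
```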
Advanced Go Features
- Reflection: Used sparingly, only indirectly via `encoding/json` serialization/deserialization
- Concurrency: Goroutines, channels, `sync.WaitGroup`, and context propagation
- Race-free code: Mutex protection for the observer list; atomic Redis operations
- Context propagation: Timeouts and cancellation flow through the entire call stack
- Blocking dequeues: `BZPopMin` blocks until a job is available, avoiding busy polling
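The mutex-protected observer list mentioned above can be sketched as follows; `Notifier`, `Subscribe`, and `Publish` are assumed names for illustration, and the event strings are placeholders.

```go
package main

import (
	"fmt"
	"sync"
)

// Observer receives job lifecycle events (started, completed, failed, ...).
type Observer interface {
	Notify(event, jobID string)
}

// Notifier guards its observer list with an RWMutex so Subscribe and
// Publish are safe to call concurrently from workers and setup code.
type Notifier struct {
	mu        sync.RWMutex
	observers []Observer
}

func (n *Notifier) Subscribe(o Observer) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.observers = append(n.observers, o)
}

func (n *Notifier) Publish(event, jobID string) {
	n.mu.RLock()
	defer n.mu.RUnlock()
	for _, o := range n.observers {
		o.Notify(event, jobID)
	}
}

// logObserver is a trivial subscriber standing in for metrics or logging.
type logObserver struct{ count int }

func (l *logObserver) Notify(event, jobID string) {
	l.count++
	fmt.Printf("event=%s job=%s\n", event, jobID)
}

func main() {
	n := &Notifier{}
	l := &logObserver{}
	n.Subscribe(l)
	n.Publish("started", "job-1")
	n.Publish("completed", "job-1")
	fmt.Println("events seen:", l.count)
}
```

Because workers only call `Publish`, metrics and logging stay fully decoupled from worker logic: adding a Prometheus observer is another `Subscribe` call.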
Production Patterns
- Exponential backoff: attempts² seconds between retries prevents a thundering herd
- Graceful shutdown: Workers complete current jobs before stopping via context cancellation
- Observability: Prometheus metrics provide real-time monitoring and alerting
- Horizontal scaling: Stateless workers share Redis queue for linear scalability
- Idempotency: Job execution should be idempotent for safe retries
- Circuit breakers: Consider adding for external service calls in jobs
Performance Considerations
- Redis pipelining: Atomic multi-operation transactions reduce network round trips
- Blocking operations: `BZPopMin` avoids CPU-intensive polling loops
- Connection pooling: The Redis client reuses connections for efficiency
- TTL on job data: Automatic cleanup prevents unbounded memory growth
- Priority queues: Sorted sets enable O(log N) priority-based dequeuing
- Batching: Consider batch operations for high-throughput scenarios
Testing Strategies
- Unit tests: Mock interfaces for isolated component testing
- Integration tests: Use real Redis with test containers for realistic scenarios
- Race detection: Always run with the `-race` flag to catch concurrency bugs
- Benchmarks: Measure enqueue/dequeue throughput and identify bottlenecks
- Chaos testing: Simulate Redis failures, network partitions, and worker crashes
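The "mock interfaces" strategy looks like this in practice. The `Queue` interface, `mockQueue`, and `drain` are hypothetical names for illustration: because the worker depends on an interface rather than a Redis client, a unit test can inject an in-memory stand-in.

```go
package main

import "fmt"

// Queue is the abstraction a worker depends on; injecting it via an
// interface lets unit tests substitute a mock for Redis.
type Queue interface {
	Dequeue() (string, bool)
}

// mockQueue is an in-memory stand-in used in unit tests.
type mockQueue struct{ jobs []string }

func (m *mockQueue) Dequeue() (string, bool) {
	if len(m.jobs) == 0 {
		return "", false
	}
	j := m.jobs[0]
	m.jobs = m.jobs[1:]
	return j, true
}

// drain processes everything in the queue and returns how many jobs ran.
func drain(q Queue) int {
	n := 0
	for {
		job, ok := q.Dequeue()
		if !ok {
			return n
		}
		fmt.Println("processed", job)
		n++
	}
}

func main() {
	q := &mockQueue{jobs: []string{"a", "b", "c"}}
	fmt.Println("total:", drain(q))
}
```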
Extensions and Improvements
For learning purposes, consider adding:
- Dead Letter Queue: Move permanently failed jobs to separate queue for analysis
- Scheduled Jobs: Support cron-like job scheduling with delayed execution
- Job Dependencies: Wait for other jobs to complete before starting
- Rate Limiting: Limit job execution rate per type or globally using token bucket
- Persistent Results: Store results in PostgreSQL instead of Redis with TTL
- Admin API: Cancel running jobs, requeue failed jobs, purge queues
- Webhook Notifications: Notify external systems on job completion/failure
- Multi-tenancy: Isolate jobs by tenant/organization with separate queues
- Priority Queues: Multiple priority levels with dedicated worker pools
- Distributed Tracing: OpenTelemetry integration for request tracing across services
- Job Timeouts: Per-job-type timeout configuration instead of global timeout
- Result Streaming: Stream large results via chunked responses or S3
- Job Chaining: Automatically enqueue follow-up jobs on completion
- Circuit Breakers: Prevent cascading failures in external service calls
- Authentication: API key or JWT-based authentication for job submission
Next Steps
After completing this project:
- Section 4: Production Engineering - Learn Kubernetes, Docker, gRPC, microservices
- Section 5: Practice Exercises - Reinforce concepts with targeted exercises
- Section 6: Applied Projects - Build 7 production-ready applications
- Section 7: Capstone Projects - Tackle comprehensive distributed systems
Recommended path:
- Complete this project thoroughly, understanding every pattern
- Run with `go test -race` to verify concurrency correctness
- Deploy with Docker Compose and observe Prometheus metrics
- Extend with at least 2 features from the Extensions list
- Review code with focus on testability and maintainability
- Load test with hundreds of concurrent job submissions
- Simulate failures and verify recovery
This project represents the culmination of Advanced Topics—you've built a real distributed system using production patterns. Well done!
Download Complete Solution
The complete implementation includes:
- Generic job interface with concrete implementations
- Factory pattern for extensible job creation
- Observer pattern in worker pool for lifecycle events
- Redis queue with atomic operations and pipelining
- HTTP API with Gin framework
- Prometheus metrics collection
- Real-time web dashboard
- Docker Compose orchestration for multi-container deployment
- Comprehensive unit and integration tests with race detection
- Benchmarks for performance testing
- Detailed README with full implementation guide, setup instructions, and API documentation
File Size: ~22KB
README Contents:
- Complete project structure breakdown
- Step-by-step implementation guide for all components
- Detailed code explanations with inline comments
- Development and deployment instructions
- Testing strategies and examples
- Performance tuning recommendations
- Troubleshooting guide
- Extension ideas with implementation hints
Congratulations! You've built a production-grade distributed job queue system. This project demonstrates mastery of generics, design patterns, concurrency, and distributed systems fundamentals. Move forward to Section 4 to learn cloud-native deployment patterns.