Monitoring
Every product exposes health and metrics endpoints for monitoring.
Health Endpoints
| Endpoint | Purpose | Use for |
|---|---|---|
| `/health` | Full health check with component status | Dashboards, alerting |
| `/health/live` | Liveness probe (is the process running?) | Kubernetes liveness probe |
| `/health/ready` | Readiness probe (can it serve traffic?) | Kubernetes readiness probe, load balancers |
# Full health check
curl http://localhost:8100/health
# {"status":"healthy","product":"haagsman-document-search","version":"1.0.0","uptime_seconds":3600,"checks":{"vectorstore":"ok"}}
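A deploy script or load-balancer hook often needs to wait for the readiness probe before sending traffic. The following is a minimal sketch, not part of the product: the function name `wait_for_ready` and its defaults are hypothetical, and it assumes that `/health/ready` answers with HTTP 200 once the service can serve traffic.

```shell
#!/usr/bin/env bash
# Poll a readiness endpoint until it returns HTTP 200 or the attempts run out.
# Hypothetical helper; assumes 200 means "ready".
# Usage: wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_ready() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" code="" i
  for ((i = 1; i <= attempts; i++)); do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url" 2>/dev/null || true)"
    if [ "$code" = "200" ]; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "not ready after $attempts attempt(s), last status: ${code:-none}" >&2
  return 1
}
```

For example, `wait_for_ready http://localhost:8100/health/ready 30 2` would poll for up to about a minute and return non-zero if the service never reports ready.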
Prometheus Metrics
Every product exposes metrics at /metrics in Prometheus format:
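To inspect the endpoint by hand, the raw output can be fetched with curl and narrowed to one metric. The sketch below assumes `/metrics` is served on the same port as the health example (8100); the helper name `metric_family` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the samples of one metric family from Prometheus text exposition
# format read on stdin, skipping "# HELP" and "# TYPE" comment lines.
# Hypothetical helper.
# Usage: curl -s http://localhost:8100/metrics | metric_family NAME
metric_family() {
  local name="$1"
  # Match the bare name, a labelled series (name{...}), or the histogram
  # suffixes name_bucket / name_sum / name_count.
  grep -E "^${name}(_bucket|_sum|_count)?(\{| )" || true
}
```

Piping `curl -s http://localhost:8100/metrics` into `metric_family haagsman_http_requests_total` would show only the request counter series, one line per label combination.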
Key metrics
| Metric | Type | Description |
|---|---|---|
| `haagsman_http_requests_total` | Counter | Total HTTP requests (by method, endpoint, status) |
| `haagsman_http_latency_seconds` | Histogram | Request latency (by method, endpoint) |
| `haagsman_llm_requests_total` | Counter | LLM API calls (by provider, model, status) |
| `haagsman_llm_latency_seconds` | Histogram | LLM response time |
| `haagsman_llm_tokens_total` | Counter | Token usage (by provider, direction) |
| `haagsman_documents_indexed` | Gauge | Documents in vector store |
Grafana integration
Add Prometheus as a data source in Grafana, then import or build dashboards using the metrics above. Useful panels:
- Request rate and error rate over time
- LLM latency p50/p95/p99
- Token usage and cost estimation
- Active documents indexed
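The panels listed above could be backed by PromQL expressions along the following lines. This is a sketch under stated assumptions: Prometheus is reachable at `PROM_URL` (default `http://localhost:9090`), the histograms expose the conventional `_bucket` series, and the `status` label on HTTP requests holds the response code. The variable names and the `prom_query` helper are hypothetical.

```shell
#!/usr/bin/env bash
# PromQL sketches for the suggested panels. Assumptions: Prometheus at
# PROM_URL, conventional *_bucket histogram series, and an HTTP "status"
# label that holds the response code (so 5xx marks server errors).
PROM_URL="${PROM_URL:-http://localhost:9090}"

REQUEST_RATE='sum(rate(haagsman_http_requests_total[5m]))'
ERROR_RATE='sum(rate(haagsman_http_requests_total{status=~"5.."}[5m])) / sum(rate(haagsman_http_requests_total[5m]))'
LLM_P95='histogram_quantile(0.95, sum by (le) (rate(haagsman_llm_latency_seconds_bucket[5m])))'
TOKENS_PER_HOUR='sum by (provider, direction) (increase(haagsman_llm_tokens_total[1h]))'
DOCS_INDEXED='haagsman_documents_indexed'

# Run an instant query through the Prometheus HTTP API and print the raw JSON.
prom_query() {
  curl -sG --max-time 5 "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
# Example: prom_query "$LLM_P95"
```

The same expressions can be pasted into a Grafana panel's query editor; swapping `0.95` for `0.5` or `0.99` gives the other latency percentiles.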
Docker health checks
All containers include built-in Docker health checks:
# View container health
docker inspect hai-document-search --format='{{.State.Health.Status}}'
# healthy
# View recent health check results
docker inspect hai-document-search --format='{{json .State.Health}}' | jq
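For scripting, the same `docker inspect` call can be wrapped so that a missing container or a container without a health check yields a fixed value instead of an error. The helper names below are hypothetical; the `health` filter on `docker ps` is a standard Docker CLI option.

```shell
#!/usr/bin/env bash
# Print a container's Docker health status, or "unknown" when the container
# does not exist, has no health check, or Docker is unavailable.
# Hypothetical helper.
container_health() {
  local status
  if status="$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)"; then
    echo "$status"
  else
    echo "unknown"
  fi
}

# List the names of all running containers that Docker reports as unhealthy.
list_unhealthy() {
  docker ps --filter health=unhealthy --format '{{.Names}}'
}
# Example: [ "$(container_health hai-document-search)" = "healthy" ] || echo "check the container"
```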
Alerting recommendations
| Condition | Severity | Action |
|---|---|---|
| `/health` returns non-200 | Critical | Investigate immediately |
| LLM error rate > 5% | Warning | Check API key, provider status |
| p99 latency > 10s | Warning | Check LLM provider, consider scaling |
| Disk usage > 80% | Warning | Archive old data, increase storage |
Logging
All products log to stdout in a structured format.
View logs:
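Because the products log to stdout, a Docker deployment exposes the stream through `docker logs`. A minimal sketch, assuming the container name from the health check example above; the helper name `follow_logs` is hypothetical.

```shell
#!/usr/bin/env bash
# Follow a container's log stream, starting from the last N lines.
# Hypothetical helper; the default container name is taken from the
# Docker health check example.
# Usage: follow_logs [CONTAINER] [LINES]
follow_logs() {
  local container="${1:-hai-document-search}" lines="${2:-100}"
  docker logs --follow --tail "$lines" "$container"
}
```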
Set the log level with `HAAGSMAN_LOG_LEVEL` (`DEBUG`, `INFO`, `WARNING`, or `ERROR`).