Prometheus is an open-source systems monitoring and alerting toolkit. It collects and stores time-series data as metrics, which makes it ideal for monitoring container environments. In our project, Prometheus serves as the backbone for monitoring service health, performance metrics, and resource utilization.
Our Prometheus instance is configured in Docker Compose as follows:
```yaml
prometheus:
  <<: *common
  build:
    context: ./src/grafana
    dockerfile: Dockerfile.prometheus
  profiles: ["grafanaprofile"]
  container_name: prometheus
  ports:
    - "9090:9090"
  volumes:
    - prometheus_data:/prometheus
  logging:
    driver: gelf
    options:
      gelf-address: "udp://${LOG_HOST}:12201"
      tag: "prometheus"
  healthcheck:
    test: ["CMD-SHELL", "curl -f http://localhost:9090/-/healthy || exit 1"]
    interval: 30s
    timeout: 10s
    retries: 5
```
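The same endpoint used by the healthcheck can be queried manually from the host, since port 9090 is published:

```sh
# Returns HTTP 200 with a short "Healthy" message while Prometheus is up.
curl -f http://localhost:9090/-/healthy
```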
Our Prometheus is configured to scrape metrics from various services in our stack:
```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'transcendence'

# This ensures that Prometheus sends alerts to Alertmanager.
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

rule_files:
  - /etc/prometheus/rules.yaml
  - /etc/prometheus/alerts.yaml
  - /etc/node_exporter_recording_rules.yml

# Prometheus scrapes metrics from the services.
scrape_configs:
  - job_name: 'caddy'
    static_configs:
      - targets: ['caddy:80']
  - job_name: 'backend'
    static_configs:
      - targets: ['backend:8000']
  - job_name: "node"
    static_configs:
      - targets: ["node-exporter:9100"]
  - job_name: 'postgresql'
    static_configs:
      - targets: ['postgres-exporter:9187']
  - job_name: 'alertmanager'
    static_configs:
      - targets: ['alertmanager:9093']
  - job_name: 'prometheus'
    scrape_interval: 5s
    scrape_timeout: 5s
    static_configs:
      - targets: ['prometheus:9090']
```
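The scrape configuration can be validated with promtool, which ships in the official Prometheus image. A minimal check, assuming the file is mounted at the conventional `/etc/prometheus/prometheus.yml` inside the container:

```sh
# Exits non-zero and reports the offending line if the config or rule files are invalid.
docker exec prometheus promtool check config /etc/prometheus/prometheus.yml
```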
- `scrape_interval`: Default interval (15s) for collecting metrics
- `evaluation_interval`: How often recording and alerting rules are evaluated
- `external_labels`: Labels added to any time series or alerts

Prometheus collects metrics from the following services:
- Caddy: see caddy_metrics.md
- Django backend: see django_metrics.md
- Node Exporter: see node_exporter.md
- PostgreSQL: see postgres logs.md
- Alertmanager: see alertmanager.md
Prometheus data is persisted in a Docker volume (`prometheus_data`) so that metrics history is preserved across container restarts or rebuilds. The storage location is configured via the `--storage.tsdb.path=/prometheus` parameter in the Dockerfile CMD.
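As a rough sketch of what that looks like (our actual Dockerfile.prometheus may use a different base image, copy additional rule files, and set more flags):

```dockerfile
FROM prom/prometheus:latest
COPY prometheus.yml /etc/prometheus/prometheus.yml
# The base image's ENTRYPOINT is the prometheus binary; CMD supplies its flags.
CMD ["--config.file=/etc/prometheus/prometheus.yml", "--storage.tsdb.path=/prometheus"]
```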
The Prometheus web interface is accessible at:

http://localhost:9090

Key pages include Graph (/graph) for running ad-hoc PromQL queries, Status > Targets (/targets) for scrape target health, Alerts (/alerts) for pending and firing alerts, and Rules (/rules) for the loaded recording and alerting rules.
The Node Exporter is a key component that provides system-level metrics (CPU, memory, disk, and network). To use these metrics in Grafana, import the Node Exporter Quickstart dashboard (ID 13978, see the references below) against the Prometheus data source; an example of the kind of query it relies on is shown below.
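As an illustration of the kind of expression such a dashboard is built on (not necessarily the exact panel queries of dashboard 13978), per-instance CPU utilization can be derived from the idle CPU counter exported by the Node Exporter:

```promql
# Percentage of CPU time spent doing work, averaged over 5 minutes.
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```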
Grafana uses Prometheus as a data source for its dashboards, including the Node Exporter and Django dashboards referenced below as well as the Prometheus self-monitoring dashboard described later in this document.
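A minimal sketch of how that data source can be provisioned (the file path, name, and options in our Grafana setup may differ):

```yaml
# grafana/provisioning/datasources/prometheus.yaml (illustrative path)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    # Container name from the Compose file, resolved over the shared Docker network.
    url: http://prometheus:9090
    isDefault: true
```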
To verify that Prometheus is correctly scraping metrics from all targets, open the web interface at http://localhost:9090 and check Status > Targets: every job should be reported as UP.
For example, to check if Caddy is exposing metrics correctly:

```sh
docker run --rm --network transcendence_network curlimages/curl curl http://caddy:80/metrics
```
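The scrape status of every job can also be checked in one shot through the Prometheus HTTP API; the `up` metric is 1 for targets that were scraped successfully and 0 otherwise:

```sh
# Returns a JSON result with one up/down sample per scrape target.
curl -s 'http://localhost:9090/api/v1/query?query=up'
```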
To monitor Prometheus itself, we’ve integrated a specialized Grafana dashboard that tracks the health and performance of our monitoring system. This “meta-monitoring” approach ensures we can quickly detect and resolve issues with our observability infrastructure.
The Prometheus self-monitoring dashboard provides visibility into how the monitoring system itself is behaving, such as scrape performance, sample ingestion, and resource usage. This dashboard is especially valuable when troubleshooting performance issues or planning capacity for our monitoring infrastructure.
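Illustrative expressions behind this kind of dashboard (the actual panels may differ) include the sample ingestion rate and per-job scrape durations:

```promql
# Samples appended to the TSDB per second.
rate(prometheus_tsdb_head_samples_appended_total[5m])

# Average scrape duration per job.
avg by (job) (scrape_duration_seconds)
```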
Consider also setting up alerts for the monitoring stack itself, for example when a scrape target goes down; a sketch of such a rule is shown below.
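A minimal sketch of such a rule, in the same rule-file format as alerts.yaml (group name, labels, and thresholds are illustrative, not our actual rules):

```yaml
groups:
  - name: meta-monitoring
    rules:
      - alert: TargetDown
        expr: up == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Scrape target {{ $labels.job }} on {{ $labels.instance }} is down"
```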
Further reading and dashboard references:

- Prometheus getting started guide: https://prometheus.io/docs/prometheus/latest/getting_started/
- Prometheus source repository: https://github.com/prometheus/prometheus/tree/main?tab=readme-ov-file
- Monitoring a Django project with Prometheus and Grafana: https://medium.com/@tommyraspati/monitoring-your-django-project-with-prometheus-and-grafana-b06a5ca78744
- Node Exporter Quickstart and Dashboard (ID 13978): https://grafana.com/grafana/dashboards/13978-node-exporter-quickstart-and-dashboard/
- Django dashboard (ID 17658): https://grafana.com/grafana/dashboards/17658-django/