CM Dashboard
A real-time infrastructure monitoring system with intelligent status aggregation and email notifications, built with Rust and ZMQ.
Current Implementation
This is a complete rewrite implementing an individual metrics architecture where:
- Agent collects individual metrics (e.g., cpu_load_1min, memory_usage_percent) and calculates status
- Dashboard subscribes to specific metrics and composes widgets
- Status Aggregation provides intelligent email notifications with batching
- Persistent Cache prevents false notifications on restart
Dashboard Interface
cm-dashboard • ● cmbox ● srv01 ● srv02 ● steambox
┌system──────────────────────────────┐┌services─────────────────────────────────────────┐
│CPU: ││Service: Status: RAM: Disk: │
│● Load: 0.10 0.52 0.88 • 400.0 MHz ││● docker active 27M 496MB │
│RAM: ││● docker-registry active 19M 496MB │
│● Used: 30% 2.3GB/7.6GB ││● gitea active 579M 2.6GB │
│● tmp: 0.0% 0B/2.0GB ││● gitea-runner-default active 11M 2.6GB │
│Disk nvme0n1: ││● haasp-core active 9M 1MB │
│● Health: PASSED ││● haasp-mqtt active 3M 1MB │
│● Usage @root: 8.3% • 75.4/906.2 GB ││● haasp-webgrid active 10M 1MB │
│● Usage @boot: 5.9% • 0.1/1.0 GB ││● immich-server active 240M 45.1GB │
│ ││● mosquitto active 1M 1MB │
│ ││● mysql active 38M 225MB │
│ ││● nginx active 28M 24MB │
│ ││ ├─ ● gitea.cmtec.se 51ms │
│ ││ ├─ ● haasp.cmtec.se 43ms │
│ ││ ├─ ● haasp.net 43ms │
│ ││ ├─ ● pages.cmtec.se 45ms │
└────────────────────────────────────┘│ ├─ ● photos.cmtec.se 41ms │
┌backup──────────────────────────────┐│ ├─ ● unifi.cmtec.se 46ms │
│Latest backup: ││ ├─ ● vault.cmtec.se 47ms │
│● Status: OK ││ ├─ ● www.kryddorten.se 81ms │
│Duration: 54s • Last: 4h ago ││ ├─ ● www.mariehall2.se 86ms │
│Disk usage: 48.2GB/915.8GB ││● postgresql active 112M 357MB │
│P/N: Samsung SSD 870 QVO 1TB ││● redis-immich active 8M 45.1GB │
│S/N: S5RRNF0W800639Y ││● sshd active 2M 0 │
│● gitea 2 archives 2.7GB ││● unifi active 594M 495MB │
│● immich 2 archives 45.0GB ││● vaultwarden active 12M 1MB │
│● kryddorten 2 archives 67.6MB ││ │
│● mariehall2 2 archives 321.8MB ││ │
│● nixosbox 2 archives 4.5MB ││ │
│● unifi 2 archives 2.9MB ││ │
│● vaultwarden 2 archives 305kB ││ │
└────────────────────────────────────┘└─────────────────────────────────────────────────┘
Navigation: ←→ switch hosts, r refresh, q quit
Features
- Real-time monitoring - Dashboard updates every 1-2 seconds
- Individual metric collection - Granular data for flexible dashboard composition
- Intelligent status aggregation - Host-level status calculated from all services
- Smart email notifications - Batched, detailed alerts with service groupings
- Persistent state - Prevents false notifications on restarts
- ZMQ communication - Efficient agent-to-dashboard messaging
- Clean TUI - Terminal-based dashboard with color-coded status indicators
Architecture
Core Components
- Agent (cm-dashboard-agent) - Collects metrics and sends them via ZMQ
- Dashboard (cm-dashboard) - Real-time TUI display consuming metrics
- Shared (cm-dashboard-shared) - Common types and protocol
- Status Aggregation - Intelligent batching and notification management
- Persistent Cache - Maintains state across restarts
Status Levels
- 🟢 Ok - Service running normally
- 🔵 Pending - Service starting/stopping/reloading
- 🟡 Warning - Service issues (high load, memory, disk usage)
- 🔴 Critical - Service failed or critical thresholds exceeded
- ❓ Unknown - Service state cannot be determined
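These levels admit a simple worst-case ordering, which is what the agent's worst_case aggregation method relies on. A minimal sketch, assuming the shared crate models the levels as an ordered enum (the real definition lives in shared/src/metrics.rs and may differ):

// Hypothetical sketch: an ordered status enum where a later variant is
// more severe, so worst-case aggregation is simply a max over metrics.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Ok,
    Pending,
    Warning,
    Critical,
    Unknown, // treated as most severe here; the real ordering may differ
}

/// Worst-case aggregation: a host is only as healthy as its worst metric.
pub fn aggregate(statuses: &[Status]) -> Status {
    statuses.iter().copied().max().unwrap_or(Status::Unknown)
}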
Quick Start
Build
# With Nix (recommended)
nix-shell -p openssl pkg-config --run "cargo build --workspace"
# Or with system dependencies
sudo apt install libssl-dev pkg-config # Ubuntu/Debian
cargo build --workspace
Run
# Start agent (requires configuration file)
./target/debug/cm-dashboard-agent --config /etc/cm-dashboard/agent.toml
# Start dashboard
./target/debug/cm-dashboard --config /path/to/dashboard.toml
Configuration
Agent Configuration (agent.toml)
The agent requires a comprehensive TOML configuration file:
collection_interval_seconds = 2
[zmq]
publisher_port = 6130
command_port = 6131
bind_address = "0.0.0.0"
timeout_ms = 5000
heartbeat_interval_ms = 30000
[collectors.cpu]
enabled = true
interval_seconds = 2
load_warning_threshold = 9.0
load_critical_threshold = 10.0
temperature_warning_threshold = 100.0
temperature_critical_threshold = 110.0
[collectors.memory]
enabled = true
interval_seconds = 2
usage_warning_percent = 80.0
usage_critical_percent = 95.0
[collectors.disk]
enabled = true
interval_seconds = 300
usage_warning_percent = 80.0
usage_critical_percent = 90.0
[[collectors.disk.filesystems]]
name = "root"
uuid = "4cade5ce-85a5-4a03-83c8-dfd1d3888d79"
mount_point = "/"
fs_type = "ext4"
monitor = true
[collectors.systemd]
enabled = true
interval_seconds = 10
memory_warning_mb = 1000.0
memory_critical_mb = 2000.0
service_name_filters = [
"nginx", "postgresql", "redis", "docker", "sshd"
]
excluded_services = [
"nginx-config-reload", "sshd-keygen"
]
[notifications]
enabled = true
smtp_host = "localhost"
smtp_port = 25
from_email = "{hostname}@example.com"
to_email = "admin@example.com"
rate_limit_minutes = 0
trigger_on_warnings = true
trigger_on_failures = true
recovery_requires_all_ok = true
suppress_individual_recoveries = true
[status_aggregation]
enabled = true
aggregation_method = "worst_case"
notification_interval_seconds = 30
[cache]
persist_path = "/var/lib/cm-dashboard/cache.json"
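A minimal sketch of how the agent might deserialize this file with serde and the toml crate; the structs are trimmed illustrations, and the real definitions live under agent/src/config/:

use serde::Deserialize;

// Trimmed, hypothetical mirror of agent.toml; the real config structs
// cover every section shown above.
#[derive(Deserialize)]
struct AgentConfig {
    collection_interval_seconds: u64,
    zmq: ZmqConfig,
}

#[derive(Deserialize)]
struct ZmqConfig {
    publisher_port: u16,
    command_port: u16,
    bind_address: String,
}

fn load_config(path: &str) -> anyhow::Result<AgentConfig> {
    let raw = std::fs::read_to_string(path)?;
    Ok(toml::from_str(&raw)?)
}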
Dashboard Configuration (dashboard.toml)
[zmq]
hosts = [
{ name = "server1", address = "192.168.1.100", port = 6130 },
{ name = "server2", address = "192.168.1.101", port = 6130 }
]
connection_timeout_ms = 5000
reconnect_interval_ms = 10000
[ui]
refresh_interval_ms = 1000
theme = "dark"
Collectors
The agent implements several specialized collectors:
CPU Collector (cpu.rs)
- Load average (1, 5, 15 minute)
- CPU temperature monitoring
- Real-time process monitoring (top CPU consumers)
- Status calculation with configurable thresholds
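A sketch of the load-average piece, assuming the collector reads /proc/loadavg and grades the 1-minute value against the thresholds from agent.toml (function name hypothetical; Status as sketched under Status Levels):

// Hypothetical sketch: parse /proc/loadavg and grade the 1-minute load.
fn collect_cpu_load(warning: f64, critical: f64) -> anyhow::Result<(f64, Status)> {
    let raw = std::fs::read_to_string("/proc/loadavg")?;
    // /proc/loadavg begins with the 1, 5 and 15 minute averages.
    let load_1min: f64 = raw
        .split_whitespace()
        .next()
        .ok_or_else(|| anyhow::anyhow!("empty /proc/loadavg"))?
        .parse()?;
    let status = if load_1min >= critical {
        Status::Critical
    } else if load_1min >= warning {
        Status::Warning
    } else {
        Status::Ok
    };
    Ok((load_1min, status))
}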
Memory Collector (memory.rs)
- RAM usage (total, used, available)
- Swap monitoring
- Real-time process monitoring (top RAM consumers)
- Memory pressure detection
Disk Collector (disk.rs)
- Filesystem usage per mount point
- SMART health monitoring
- Temperature and wear tracking
- Configurable filesystem monitoring
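The SMART check could be approximated by shelling out to smartctl, which prints an overall health verdict; a sketch under that assumption (the real collector may use another mechanism):

// Hypothetical sketch: ask smartctl for the overall health verdict,
// e.g. "SMART overall-health self-assessment test result: PASSED".
fn smart_health(device: &str) -> anyhow::Result<Status> {
    let out = std::process::Command::new("smartctl")
        .args(["-H", device])
        .output()?;
    let text = String::from_utf8_lossy(&out.stdout);
    Ok(if text.contains("PASSED") { Status::Ok } else { Status::Critical })
}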
Systemd Collector (systemd.rs)
- Service status monitoring (active, inactive, failed)
- Memory usage per service
- Service filtering and exclusions
- Handles transitional states (Status::Pending), as sketched below
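A sketch of the per-service query, assuming the collector shells out to systemctl (a D-Bus client would work just as well):

// Hypothetical sketch: fetch a service's state and memory use from systemd.
fn collect_service(name: &str) -> anyhow::Result<(String, u64)> {
    let out = std::process::Command::new("systemctl")
        .args(["show", name, "--property=ActiveState,MemoryCurrent"])
        .output()?;
    let text = String::from_utf8_lossy(&out.stdout);
    let mut state = String::from("unknown");
    let mut memory_bytes = 0u64;
    for line in text.lines() {
        if let Some(v) = line.strip_prefix("ActiveState=") {
            state = v.to_string();
        } else if let Some(v) = line.strip_prefix("MemoryCurrent=") {
            // systemd reports "[not set]" when memory accounting is off.
            memory_bytes = v.parse().unwrap_or(0);
        }
    }
    Ok((state, memory_bytes))
}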
Backup Collector (backup.rs)
- Reads TOML status files from backup systems
- Archive age verification
- Disk usage tracking
- Repository health monitoring
Email Notifications
Intelligent Batching
The system implements smart notification batching to prevent email spam:
- Real-time dashboard updates - Status changes appear immediately
- Batched email notifications - Aggregated every 30 seconds
- Detailed groupings - Services organized by severity
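A rough sketch of that batching loop, assuming status changes are funneled through a channel and flushed on a 30-second tick (all names hypothetical):

use tokio::sync::mpsc;
use tokio::time::{interval, Duration};

// One status transition for one service.
struct StatusChange {
    service: String,
    from: Status,
    to: Status,
}

fn send_summary_email(changes: &[StatusChange]) {
    // Stub: group changes by severity and hand the summary to the
    // SMTP sender (see the lettre sketch after the example email).
}

// Accumulate changes as they arrive; flush one summary email per tick.
async fn notification_loop(mut rx: mpsc::Receiver<StatusChange>) {
    let mut tick = interval(Duration::from_secs(30));
    let mut pending: Vec<StatusChange> = Vec::new();
    loop {
        tokio::select! {
            Some(change) = rx.recv() => pending.push(change),
            _ = tick.tick() => {
                if !pending.is_empty() {
                    send_summary_email(&pending);
                    pending.clear();
                }
            }
        }
    }
}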
Example Alert Email
Subject: Status Alert: 2 critical, 1 warning, 15 started
Status Summary (30s duration)
Host Status: Ok → Warning
🔴 CRITICAL ISSUES (2):
postgresql: Ok → Critical
nginx: Warning → Critical
🟡 WARNINGS (1):
redis: Ok → Warning (memory usage 85%)
✅ RECOVERIES (0):
🟢 SERVICE STARTUPS (15):
docker: Unknown → Ok
sshd: Unknown → Ok
...
--
CM Dashboard Agent
Generated at 2025-10-21 19:42:42 CET
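Sending such a message with lettre over the configured relay might look like the following; a minimal sketch assuming plain, unauthenticated SMTP on localhost:25 as in the agent configuration above:

use lettre::{Message, SmtpTransport, Transport};

fn send_alert(subject: &str, body: String) -> anyhow::Result<()> {
    let email = Message::builder()
        .from("cmbox@example.com".parse()?) // from_email with {hostname} expanded
        .to("admin@example.com".parse()?)   // to_email
        .subject(subject)
        .body(body)?;
    // builder_dangerous: plaintext SMTP without TLS, as used for local relays.
    let mailer = SmtpTransport::builder_dangerous("localhost").port(25).build();
    mailer.send(&email)?;
    Ok(())
}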
Individual Metrics Architecture
The system follows a metrics-first architecture:
Agent Side
// Agent collects individual metrics
vec![
    Metric::new("cpu_load_1min".to_string(), MetricValue::Float(2.5), Status::Ok),
    Metric::new("memory_usage_percent".to_string(), MetricValue::Float(78.5), Status::Warning),
    Metric::new("service_nginx_status".to_string(), MetricValue::String("active".to_string()), Status::Ok),
]
Dashboard Side
// Widgets subscribe to specific metrics
impl Widget for CpuWidget {
    fn update_from_metrics(&mut self, metrics: &[&Metric]) {
        for metric in metrics {
            match metric.name.as_str() {
                "cpu_load_1min" => self.load_1min = metric.value.as_f32(),
                "cpu_load_5min" => self.load_5min = metric.value.as_f32(),
                "cpu_temperature_celsius" => self.temperature = metric.value.as_f32(),
                _ => {}
            }
        }
    }
}
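The ZMQ leg between these two sides can be sketched as a PUB/SUB pair. JSON is an assumption here, as is Metric being serde-serializable; the actual wire format is defined in shared/src/protocol.rs:

// Agent side: publish a serialized metric batch (JSON assumed).
fn publish(metrics: &[Metric]) -> anyhow::Result<()> {
    let ctx = zmq::Context::new();
    let publisher = ctx.socket(zmq::PUB)?;
    publisher.bind("tcp://0.0.0.0:6130")?; // [zmq] bind_address / publisher_port
    publisher.send(serde_json::to_vec(metrics)?, 0)?;
    Ok(())
}

// Dashboard side: subscribe to everything a configured host publishes.
fn consume(address: &str) -> anyhow::Result<Vec<Metric>> {
    let ctx = zmq::Context::new();
    let subscriber = ctx.socket(zmq::SUB)?;
    subscriber.connect(&format!("tcp://{address}:6130"))?;
    subscriber.set_subscribe(b"")?; // no topic filtering
    let payload = subscriber.recv_bytes(0)?;
    Ok(serde_json::from_slice(&payload)?)
}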
Persistent Cache
The cache system prevents false notifications:
- Automatic saving - Saves when service status changes
- Persistent storage - Maintains state across agent restarts
- Simple design - No complex TTL or cleanup logic
- Status preservation - Prevents duplicate notifications
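A sketch of that save-on-change design, assuming the cache is a JSON map from metric name to last known status stored at the configured persist_path (Status as sketched earlier, with serde derives assumed):

use std::collections::HashMap;

// Hypothetical cache shape: last known status per metric name.
type StatusCache = HashMap<String, Status>;

fn save_cache(path: &str, cache: &StatusCache) -> anyhow::Result<()> {
    std::fs::write(path, serde_json::to_vec_pretty(cache)?)?;
    Ok(())
}

fn load_cache(path: &str) -> StatusCache {
    std::fs::read(path)
        .ok()
        .and_then(|bytes| serde_json::from_slice(&bytes).ok())
        // First run or unreadable cache: start empty rather than re-alerting.
        .unwrap_or_default()
}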
Development
Project Structure
cm-dashboard/
├── agent/                     # Metrics collection agent
│   ├── src/
│   │   ├── collectors/        # CPU, memory, disk, systemd, backup
│   │   ├── status/            # Status aggregation and notifications
│   │   ├── cache/             # Persistent metric caching
│   │   ├── config/            # TOML configuration loading
│   │   └── notifications/     # Email notification system
├── dashboard/                 # TUI dashboard application
│   ├── src/
│   │   ├── ui/widgets/        # CPU, memory, services, backup widgets
│   │   ├── metrics/           # Metric storage and filtering
│   │   └── communication/     # ZMQ metric consumption
├── shared/                    # Shared types and utilities
│   └── src/
│       ├── metrics.rs         # Metric, Status, and Value types
│       ├── protocol.rs        # ZMQ message format
│       └── cache.rs           # Cache configuration
└── README.md                  # This file
Building
# Debug build
cargo build --workspace
# Release build
cargo build --workspace --release
# Run tests
cargo test --workspace
# Check code formatting
cargo fmt --all -- --check
# Run clippy linter
cargo clippy --workspace -- -D warnings
Dependencies
- tokio - Async runtime
- zmq - Message passing between agent and dashboard
- ratatui - Terminal user interface
- serde - Serialization for metrics and config
- anyhow/thiserror - Error handling
- tracing - Structured logging
- lettre - SMTP email notifications
- clap - Command-line argument parsing
- toml - Configuration file parsing
NixOS Integration
This project is designed for declarative deployment via NixOS:
Configuration Generation
The NixOS module automatically generates the agent configuration:
# hosts/common/cm-dashboard.nix
services.cm-dashboard-agent = {
  enable = true;
  port = 6130;
};
Deployment
# Update NixOS configuration
git add hosts/common/cm-dashboard.nix
git commit -m "Update cm-dashboard configuration"
git push
# Rebuild system (user-performed)
sudo nixos-rebuild switch --flake .
Monitoring Intervals
- CPU/Memory: 2 seconds (real-time monitoring)
- Disk usage: 300 seconds (5 minutes)
- Systemd services: 10 seconds
- SMART health: 600 seconds (10 minutes)
- Backup status: 60 seconds (1 minute)
- Email notifications: 30 seconds (batched)
- Dashboard updates: 1 second (real-time display)
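These cadences fit a one-task-per-collector layout on the tokio runtime; a sketch with hypothetical collector functions:

use tokio::time::{interval, Duration};

// Hypothetical helper: run one collector on its own cadence.
fn spawn_collector<F: Fn() + Send + 'static>(every: Duration, collect: F) {
    tokio::spawn(async move {
        let mut tick = interval(every);
        loop {
            tick.tick().await;
            collect(); // gather metrics and publish them over ZMQ
        }
    });
}

// Usage matching the intervals above:
// spawn_collector(Duration::from_secs(2), collect_cpu_and_memory);
// spawn_collector(Duration::from_secs(10), collect_systemd_services);
// spawn_collector(Duration::from_secs(300), collect_disk_usage);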
License
MIT License - see LICENSE file for details