CM Dashboard
A high-performance, Rust-based TUI dashboard for monitoring CMTEC infrastructure. It uses ZMQ-based metric collection and an individual-metrics architecture.
Features
Core Monitoring
- Real-time metrics: CPU, RAM, Storage, and Service status
- Multi-host support: Monitor multiple servers from a single dashboard
- Service management: Start/stop services with intelligent status tracking
- NixOS integration: System rebuild via SSH + tmux popup
- Backup monitoring: Borgbackup status and scheduling
- Email notifications: Intelligent batching prevents spam
User-Stopped Service Tracking
Services stopped via the dashboard are intelligently tracked to prevent false alerts:
- Smart status reporting: User-stopped services show as Status::OK instead of Warning
- Persistent storage: Tracking survives agent restarts via JSON storage
- Automatic management: Flags are cleared when services are restarted via the dashboard
- Maintenance friendly: No false alerts during intentional service operations
Architecture
Individual Metrics Philosophy
- Agent: Collects individual metrics, calculates status using thresholds
- Dashboard: Subscribes to specific metrics, composes widgets from individual data
- ZMQ Communication: Efficient real-time metric transmission
- Status Aggregation: Host-level status calculated from all service metrics
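As a rough illustration of the status aggregation described above, the host-level status can be thought of as the most severe status among its service metrics. The sketch below is not taken from the project source: the `Status` variants are assumed from the statuses named in this README (Ok, Warning, Critical), and `aggregate_host_status` is a hypothetical helper name.

```rust
/// Severity levels, ordered from least to most severe.
/// Variant names are an assumption based on the statuses mentioned in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Hypothetical helper: the host status is the worst status among all
/// service metrics. An empty list is treated as `Ok` (an assumption).
pub fn aggregate_host_status(service_statuses: &[Status]) -> Status {
    service_statuses
        .iter()
        .copied()
        .max() // derived `Ord` follows declaration order: Ok < Warning < Critical
        .unwrap_or(Status::Ok)
}
```

Deriving `Ord` on the enum keeps the aggregation to a single `max()` call; the real agent may weigh metrics differently.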
Components
┌─────────────────┐    ZMQ     ┌─────────────────┐
│                 │◄──────────►│                 │
│ Agent           │  Metrics   │ Dashboard       │
│ - Collectors    │            │ - TUI           │
│ - Status        │            │ - Widgets       │
│ - Tracking      │            │ - Commands      │
│                 │            │                 │
└─────────────────┘            └─────────────────┘
         │                              │
         ▼                              ▼
┌─────────────────┐            ┌─────────────────┐
│ JSON Storage    │            │ SSH + tmux      │
│ - User-stopped  │            │ - Remote rebuild│
│ - Cache         │            │ - Process       │
│ - State         │            │   isolation     │
└─────────────────┘            └─────────────────┘
Service Control Flow
- User Action: Dashboard sends UserStart/UserStop commands
- Agent Processing:
  - Marks service as user-stopped (if stopping)
  - Executes systemctl start/stop service
  - Syncs state to global tracker
- Status Calculation:
  - Systemd collector checks user-stopped flag
  - Reports Status::OK for user-stopped inactive services
  - Normal Warning status for system failures
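The status calculation step of this flow can be sketched as a small decision function. This is a simplified illustration, not the collector's actual code: `service_status` and its two boolean inputs are hypothetical, and real systemd states (failed, activating, and so on) are left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
}

/// Hypothetical helper mirroring the flow above.
/// `is_active`: systemd reports the unit as active.
/// `user_stopped`: the tracker has flagged the unit as stopped via the dashboard.
pub fn service_status(is_active: bool, user_stopped: bool) -> Status {
    match (is_active, user_stopped) {
        // A running service is healthy regardless of the flag.
        (true, _) => Status::Ok,
        // Inactive, but stopped on purpose: no alert.
        (false, true) => Status::Ok,
        // Inactive without a user-stopped flag: treat as a system issue.
        (false, false) => Status::Warning,
    }
}
```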
Interface
cm-dashboard • ● cmbox ● srv01 ● srv02 ● steambox
┌system──────────────────────────────┐┌services─────────────────────────────────────────┐
│NixOS: ││Service: Status: RAM: Disk: │
│Build: 25.05.20251004.3bcc93c ││● docker active 27M 496MB │
│Agent: v0.1.43 ││● gitea active 579M 2.6GB │
│Active users: cm, simon ││● nginx active 28M 24MB │
│CPU: ││ ├─ ● gitea.cmtec.se 51ms │
│● Load: 0.10 0.52 0.88 • 3000MHz ││ ├─ ● photos.cmtec.se 41ms │
│RAM: ││● postgresql active 112M 357MB │
│● Usage: 33% 2.6GB/7.6GB ││● redis-immich user-stopped │
│● /tmp: 0% 0B/2.0GB ││● sshd active 2M 0 │
│Storage: ││● unifi active 594M 495MB │
│● root (Single): ││ │
│ ├─ ● nvme0n1 W: 1% ││ │
│ └─ ● 18% 167.4GB/928.2GB ││ │
└────────────────────────────────────┘└─────────────────────────────────────────────────┘
Navigation
- Tab: Switch between hosts
- ↑↓ or j/k: Navigate services
- s: Start selected service (UserStart)
- S: Stop selected service (UserStop)
- J: Show service logs (journalctl in tmux popup)
- L: Show custom log files (tail -f custom paths in tmux popup)
- R: Rebuild current host
- B: Run backup on current host
- q: Quit
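The key bindings above could be modeled as a plain mapping from key to action. The sketch below is illustrative only: `Key`, `Action`, and `action_for_key` are hypothetical names, and it assumes the vim convention that `j` moves down and `k` moves up.

```rust
/// Hypothetical, simplified key type (a real TUI library provides its own).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Char(char),
    Tab,
    Up,
    Down,
}

/// Hypothetical actions corresponding to the navigation list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    NextHost,
    SelectPrevious,
    SelectNext,
    StartService,
    StopService,
    ShowJournal,
    ShowCustomLogs,
    RebuildHost,
    RunBackup,
    Quit,
}

/// Maps a key press to an action; unbound keys return `None`.
/// Note that bindings are case-sensitive: `s` starts, `S` stops.
pub fn action_for_key(key: Key) -> Option<Action> {
    match key {
        Key::Tab => Some(Action::NextHost),
        Key::Up | Key::Char('k') => Some(Action::SelectPrevious),
        Key::Down | Key::Char('j') => Some(Action::SelectNext),
        Key::Char('s') => Some(Action::StartService),
        Key::Char('S') => Some(Action::StopService),
        Key::Char('J') => Some(Action::ShowJournal),
        Key::Char('L') => Some(Action::ShowCustomLogs),
        Key::Char('R') => Some(Action::RebuildHost),
        Key::Char('B') => Some(Action::RunBackup),
        Key::Char('q') => Some(Action::Quit),
        _ => None,
    }
}
```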
Status Indicators
- Green ●: Active service
- Yellow ◐: Inactive service (system issue)
- Red ◯: Failed service
- Blue arrows: Service transitioning (↑ starting, ↓ stopping, ↻ restarting)
- "user-stopped": Service stopped via dashboard (Status::OK)
Quick Start
Building
# With Nix (recommended)
nix-shell -p openssl pkg-config --run "cargo build --workspace"
# Or with system dependencies
sudo apt install libssl-dev pkg-config # Ubuntu/Debian
cargo build --workspace
Running
# Start agent (requires configuration)
./target/debug/cm-dashboard-agent --config /etc/cm-dashboard/agent.toml
# Start dashboard (inside tmux session)
tmux
./target/debug/cm-dashboard --config /etc/cm-dashboard/dashboard.toml
Configuration
Agent Configuration
collection_interval_seconds = 2
[zmq]
publisher_port = 6130
command_port = 6131
bind_address = "0.0.0.0"
transmission_interval_seconds = 2
[collectors.cpu]
enabled = true
interval_seconds = 2
load_warning_threshold = 5.0
load_critical_threshold = 10.0
[collectors.memory]
enabled = true
interval_seconds = 2
usage_warning_percent = 80.0
usage_critical_percent = 90.0
[collectors.systemd]
enabled = true
interval_seconds = 10
service_name_filters = ["nginx*", "postgresql*", "docker*", "sshd*"]
excluded_services = ["nginx-config-reload", "systemd-", "getty@"]
nginx_latency_critical_ms = 1000.0
http_timeout_seconds = 10
[notifications]
enabled = true
smtp_host = "localhost"
smtp_port = 25
from_email = "{hostname}@example.com"
to_email = "admin@example.com"
aggregation_interval_seconds = 30
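The warning and critical thresholds in the agent configuration suggest a two-level comparison per metric. The following is a minimal sketch under that assumption; `Thresholds` and `evaluate` are hypothetical names, and whether the agent compares with `>=` or `>` is not stated in this README.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Hypothetical pair of thresholds, e.g. `load_warning_threshold` and
/// `load_critical_threshold` from the `[collectors.cpu]` section.
pub struct Thresholds {
    pub warning: f64,
    pub critical: f64,
}

impl Thresholds {
    /// Classifies a metric value. Inclusive comparison is an assumption.
    pub fn evaluate(&self, value: f64) -> Status {
        if value >= self.critical {
            Status::Critical
        } else if value >= self.warning {
            Status::Warning
        } else {
            Status::Ok
        }
    }
}
```

With the CPU values shown above (5.0 and 10.0), a load of 8.5 would be classified as a warning; with the memory values (80.0 and 90.0), 95% usage would be critical.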
Dashboard Configuration
[zmq]
subscriber_ports = [6130]
[hosts]
predefined_hosts = ["cmbox", "srv01", "srv02"]
[ssh]
rebuild_user = "cm"
rebuild_alias = "nixos-rebuild-cmtec"
backup_alias = "cm-backup-run"
Technical Implementation
Collectors
Systemd Collector
- Service Discovery: Uses systemctl list-unit-files + list-units --all
- Status Calculation: Checks user-stopped flag before assigning Warning status
- Memory Tracking: Per-service memory usage via systemctl show
- Sub-services: Nginx site latency, Docker containers
- User-stopped Integration: UserStoppedServiceTracker::is_service_user_stopped()
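Service discovery presumably narrows the discovered units using `service_name_filters` and `excluded_services` from the agent configuration. The sketch below shows one plausible interpretation and is not the collector's actual matching logic: it assumes a trailing `*` means prefix match and that exclusions are prefix matches as well. `matches_filter` and `is_monitored` are hypothetical names.

```rust
/// Hypothetical: a pattern ending in `*` matches by prefix,
/// any other pattern must match the name exactly.
fn matches_filter(name: &str, pattern: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => name.starts_with(prefix),
        None => name == pattern,
    }
}

/// Hypothetical: a service is monitored when it matches at least one
/// filter and does not start with any excluded entry (an assumption).
pub fn is_monitored(name: &str, filters: &[&str], excluded: &[&str]) -> bool {
    let included = filters.iter().any(|p| matches_filter(name, p));
    let is_excluded = excluded.iter().any(|e| name.starts_with(*e));
    included && !is_excluded
}
```

Under these assumptions, `nginx` is monitored with the example configuration while `nginx-config-reload` is filtered out by the exclusion list.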
User-Stopped Service Tracker
- Storage: /var/lib/cm-dashboard/user-stopped-services.json
- Thread Safety: Global singleton with Arc<Mutex<>>
- Persistence: Automatic save on state changes
- Global Access: Static methods for collector integration
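A global singleton behind `Arc<Mutex<>>` with static accessor methods might look roughly like the sketch below. Only `is_service_user_stopped` is named in this README; `mark_user_stopped`, `clear_user_stopped`, and the `registry` helper are hypothetical, and JSON persistence is deliberately left out to keep the example dependency-free.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex, OnceLock};

/// Namespace type exposing static methods, as described above.
pub struct UserStoppedServiceTracker;

/// Hypothetical accessor for the lazily initialised global set.
fn registry() -> &'static Arc<Mutex<HashSet<String>>> {
    static REGISTRY: OnceLock<Arc<Mutex<HashSet<String>>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Arc::new(Mutex::new(HashSet::new())))
}

impl UserStoppedServiceTracker {
    /// Hypothetical: flag a service as stopped via the dashboard.
    /// The real tracker would also save its state to JSON here.
    pub fn mark_user_stopped(service: &str) {
        registry().lock().unwrap().insert(service.to_string());
    }

    /// Hypothetical: clear the flag when the service is started via the dashboard.
    pub fn clear_user_stopped(service: &str) {
        registry().lock().unwrap().remove(service);
    }

    /// Named in this README; the body is an assumption.
    pub fn is_service_user_stopped(service: &str) -> bool {
        registry().lock().unwrap().contains(service)
    }
}
```

Collectors can then query the flag without holding a reference to the tracker, at the cost of hidden global state.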
Other Collectors
- CPU: Load average, temperature, frequency monitoring
- Memory: RAM/swap usage, tmpfs monitoring
- Disk: Filesystem usage, SMART health data
- NixOS: Build version, active users, agent version
- Backup: Borgbackup repository status and metrics
ZMQ Protocol
use serde::{Deserialize, Serialize};

// Metric message published by the agent.
// `Metric` is defined in shared/src/metrics.rs.
#[derive(Serialize, Deserialize)]
pub struct MetricMessage {
    pub hostname: String,
    pub timestamp: u64,
    pub metrics: Vec<Metric>,
}

// Commands sent from the dashboard to the agent
pub enum AgentCommand {
    ServiceControl {
        service_name: String,
        action: ServiceAction,
    },
    SystemRebuild { /* SSH config */ },
    CollectNow,
}

pub enum ServiceAction {
    Start,     // System-initiated
    Stop,      // System-initiated
    UserStart, // User via dashboard (clears user-stopped)
    UserStop,  // User via dashboard (marks user-stopped)
    Status,
}
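To illustrate how the agent might interpret each `ServiceAction`, the sketch below maps an action to a systemctl verb and to the effect on the user-stopped flag. `TrackerEffect` and `plan` are hypothetical names, and mapping `Status` to the `status` verb is an assumption; only the start/stop behaviour and the flag handling are described in this README.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServiceAction {
    Start,
    Stop,
    UserStart,
    UserStop,
    Status,
}

/// Hypothetical: what should happen to the user-stopped flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrackerEffect {
    Unchanged,
    MarkUserStopped,
    ClearUserStopped,
}

/// Hypothetical planner: returns the systemctl verb to run and the
/// tracker effect to apply for a given action.
pub fn plan(action: ServiceAction) -> (&'static str, TrackerEffect) {
    match action {
        ServiceAction::Start => ("start", TrackerEffect::Unchanged),
        ServiceAction::Stop => ("stop", TrackerEffect::Unchanged),
        ServiceAction::UserStart => ("start", TrackerEffect::ClearUserStopped),
        ServiceAction::UserStop => ("stop", TrackerEffect::MarkUserStopped),
        // Assumption: a status query maps to `systemctl status`.
        ServiceAction::Status => ("status", TrackerEffect::Unchanged),
    }
}
```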
Maintenance Mode
Suppress notifications during planned maintenance:
# Enable maintenance mode
touch /tmp/cm-maintenance
# Perform maintenance
systemctl stop service
# ... work ...
systemctl start service
# Disable maintenance mode
rm /tmp/cm-maintenance
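On the agent side, a flag file like this can be checked with a simple existence test before sending notifications. The sketch below assumes that is all the check involves; `MAINTENANCE_FLAG`, `maintenance_active`, and `should_send_notification` are hypothetical names.

```rust
use std::path::Path;

/// Flag file path taken from the commands above.
pub const MAINTENANCE_FLAG: &str = "/tmp/cm-maintenance";

/// Hypothetical: maintenance mode is active while the flag file exists.
pub fn maintenance_active(flag: &Path) -> bool {
    flag.exists()
}

/// Hypothetical: notifications are suppressed during maintenance.
pub fn should_send_notification(flag: &Path) -> bool {
    !maintenance_active(flag)
}
```

Passing the path as a parameter keeps the check testable; the agent would call it with `Path::new(MAINTENANCE_FLAG)`.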
Email Notifications
Intelligent Batching
- Real-time dashboard: Immediate status updates
- Batched emails: Aggregated every 30 seconds
- Smart grouping: Services organized by severity
- Recovery suppression: Reduces notification spam
Example Alert
Subject: Status Alert: 1 critical, 2 warnings, 0 recoveries
Status Summary (30s duration)
Host Status: Ok → Warning
🔴 CRITICAL ISSUES (1):
postgresql: Ok → Critical (memory usage 95%)
🟡 WARNINGS (2):
nginx: Ok → Warning (high load 8.5)
redis: user-stopped → Warning (restarted by system)
✅ RECOVERIES (0):
--
CM Dashboard Agent v0.1.43
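The subject line of a batched alert can be derived by counting the status changes collected during one aggregation window. This is a minimal sketch, not the agent's notification code: `StatusChange` and `alert_subject` are hypothetical, the user-stopped state is not modeled, and the fixed plural wording simply copies the example above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Hypothetical record of one status transition within a batch window.
pub struct StatusChange {
    pub service: String,
    pub from: Status,
    pub to: Status,
}

/// Builds a subject line in the format of the example alert.
/// A recovery is assumed to be any transition from a non-Ok status back to Ok.
pub fn alert_subject(changes: &[StatusChange]) -> String {
    let critical = changes.iter().filter(|c| c.to == Status::Critical).count();
    let warnings = changes.iter().filter(|c| c.to == Status::Warning).count();
    let recoveries = changes
        .iter()
        .filter(|c| c.to == Status::Ok && c.from != Status::Ok)
        .count();
    format!("Status Alert: {critical} critical, {warnings} warnings, {recoveries} recoveries")
}
```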
Development
Project Structure
cm-dashboard/
├── agent/ # Metrics collection agent
│ ├── src/
│ │ ├── collectors/ # CPU, memory, disk, systemd, backup, nixos
│ │ ├── service_tracker.rs # User-stopped service tracking
│ │ ├── status/ # Status aggregation and notifications
│ │ ├── config/ # TOML configuration loading
│ │ └── communication/ # ZMQ message handling
├── dashboard/ # TUI dashboard application
│ ├── src/
│ │ ├── ui/widgets/ # CPU, memory, services, backup, system
│ │ ├── communication/ # ZMQ consumption and commands
│ │ └── app.rs # Main application loop
├── shared/ # Shared types and utilities
│ └── src/
│ ├── metrics.rs # Metric, Status, StatusTracker types
│ ├── protocol.rs # ZMQ message format
│ └── cache.rs # Cache configuration
└── CLAUDE.md # Development guidelines and rules
Testing
# Build and test
nix-shell -p openssl pkg-config --run "cargo build --workspace"
nix-shell -p openssl pkg-config --run "cargo test --workspace"
# Code quality
cargo fmt --all
cargo clippy --workspace -- -D warnings
Deployment
Automated Binary Releases
# Create new release
cd ~/projects/cm-dashboard
git tag v0.1.X
git push origin v0.1.X
This triggers automated:
- Static binary compilation with RUSTFLAGS="-C target-feature=+crt-static"
- GitHub-style release creation
- Tarball upload to Gitea
NixOS Integration
Update ~/projects/nixosbox/hosts/services/cm-dashboard.nix:
version = "v0.1.43";
src = pkgs.fetchurl {
  url = "https://gitea.cmtec.se/cm/cm-dashboard/releases/download/${version}/cm-dashboard-linux-x86_64.tar.gz";
  sha256 = "sha256-HASH";
};
Get hash via:
cd ~/projects/nixosbox
nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchurl {
  url = "URL_HERE";
  sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
}' 2>&1 | grep "got:"
Monitoring Intervals
- Metrics Collection: 2 seconds (CPU, memory, services)
- Metric Transmission: 2 seconds (ZMQ publish)
- Dashboard Updates: 1 second (UI refresh)
- Email Notifications: 30 seconds (batched)
- Disk Monitoring: 300 seconds (5 minutes)
- Service Discovery: 300 seconds (5 minutes cache)
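Running tasks at different intervals like these usually comes down to tracking when each one last ran. The sketch below shows one generic way to do that; `Schedule`, `is_due`, and `mark_run` are hypothetical names and do not reflect how the agent actually schedules its collectors.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-task schedule: an interval plus the last run time.
pub struct Schedule {
    interval: Duration,
    last_run: Option<Instant>,
}

impl Schedule {
    pub fn new(interval_seconds: u64) -> Self {
        Schedule {
            interval: Duration::from_secs(interval_seconds),
            last_run: None,
        }
    }

    /// A task that has never run is due immediately; afterwards it is
    /// due once a full interval has elapsed since the last run.
    pub fn is_due(&self, now: Instant) -> bool {
        match self.last_run {
            None => true,
            Some(last) => now.duration_since(last) >= self.interval,
        }
    }

    /// Records that the task ran at `now`.
    pub fn mark_run(&mut self, now: Instant) {
        self.last_run = Some(now);
    }
}
```

Taking `now` as a parameter, rather than calling `Instant::now()` inside, makes the schedule easy to test with synthetic times.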
License
MIT License - see LICENSE file for details.