CM Dashboard
A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure. It is built on ZMQ-based metric collection and an individual-metrics architecture.
Features
Core Monitoring
- Real-time metrics: CPU, RAM, Storage, and Service status
- Multi-host support: Monitor multiple servers from a single dashboard
- Service management: Start/stop services with intelligent status tracking
- NixOS integration: System rebuild via SSH + tmux popup
- Backup monitoring: Borgbackup status and scheduling
- Email notifications: Intelligent batching prevents spam
User-Stopped Service Tracking
Services stopped via the dashboard are intelligently tracked to prevent false alerts:
- Smart status reporting: User-stopped services show as Status::OK instead of Warning
- Persistent storage: Tracking survives agent restarts via JSON storage
- Automatic management: Flags are cleared when services are restarted via the dashboard
- Maintenance friendly: No false alerts during intentional service operations
Architecture
Individual Metrics Philosophy
- Agent: Collects individual metrics, calculates status using thresholds
- Dashboard: Subscribes to specific metrics, composes widgets from individual data
- ZMQ Communication: Efficient real-time metric transmission
- Status Aggregation: Host-level status calculated from all service metrics
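The status aggregation described above can be pictured as a "worst status wins" reduction. This is only a minimal sketch: the `Status` enum here is a stand-in for the project's real type, and both the severity ordering (Ok < Warning < Critical) and the function name `aggregate_host_status` are assumptions made for illustration.

```rust
/// Stand-in status levels (hypothetical; not the project's `Status` type).
/// Deriving `Ord` orders the variants by declaration: Ok < Warning < Critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Returns the most severe status among the given service statuses.
/// An empty slice is treated as `Status::Ok` (an assumption).
pub fn aggregate_host_status(service_statuses: &[Status]) -> Status {
    service_statuses
        .iter()
        .copied()
        .max()
        .unwrap_or(Status::Ok)
}
```

Under these assumptions, a host with one Warning service and otherwise healthy services would report Warning at the host level.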
Components
┌─────────────────┐    ZMQ     ┌─────────────────┐
│                 │◄──────────►│                 │
│      Agent      │  Metrics   │    Dashboard    │
│ - Collectors    │            │ - TUI           │
│ - Status        │            │ - Widgets       │
│ - Tracking      │            │ - Commands      │
│                 │            │                 │
└─────────────────┘            └─────────────────┘
         │                              │
         ▼                              ▼
┌─────────────────┐            ┌─────────────────┐
│  JSON Storage   │            │   SSH + tmux    │
│ - User-stopped  │            │ - Remote rebuild│
│ - Cache         │            │ - Process       │
│ - State         │            │   isolation     │
└─────────────────┘            └─────────────────┘
Service Control Flow
- User Action: Dashboard sends UserStart/UserStop commands
- Agent Processing:
  - Marks service as user-stopped (if stopping)
  - Executes systemctl start/stop service
  - Syncs state to global tracker
- Status Calculation:
  - Systemd collector checks user-stopped flag
  - Reports Status::OK for user-stopped inactive services
  - Normal Warning status for system failures
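The status calculation step of this flow might look roughly like the sketch below. The types `ActiveState` and `Status` and the function `service_status` are hypothetical names, not the agent's actual code. Mapping a failed unit to `Critical` is also an assumption, since the text above only states the Ok and Warning cases.

```rust
use std::collections::HashSet;

/// Stand-in status type (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Simplified view of a systemd unit's state (hypothetical).
pub enum ActiveState {
    Active,
    Inactive,
    Failed,
}

/// Decides a service's status, consulting the user-stopped set first
/// so that intentionally stopped services do not raise a warning.
pub fn service_status(
    name: &str,
    state: ActiveState,
    user_stopped: &HashSet<String>,
) -> Status {
    match state {
        ActiveState::Active => Status::Ok,
        // Inactive because the user stopped it via the dashboard: no alert.
        ActiveState::Inactive if user_stopped.contains(name) => Status::Ok,
        // Inactive for any other reason: treated as a system issue.
        ActiveState::Inactive => Status::Warning,
        // Assumption: failed units are reported as critical.
        ActiveState::Failed => Status::Critical,
    }
}
```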
Interface
cm-dashboard • ● cmbox ● srv01 ● srv02 ● steambox
┌system──────────────────────────────┐┌services─────────────────────────────────────────┐
│NixOS: ││Service: Status: RAM: Disk: │
│Build: 25.05.20251004.3bcc93c ││● docker active 27M 496MB │
│Agent: v0.1.43 ││● gitea active 579M 2.6GB │
│Active users: cm, simon ││● nginx active 28M 24MB │
│CPU: ││ ├─ ● gitea.cmtec.se 51ms │
│● Load: 0.10 0.52 0.88 • 3000MHz ││ ├─ ● photos.cmtec.se 41ms │
│RAM: ││● postgresql active 112M 357MB │
│● Usage: 33% 2.6GB/7.6GB ││● redis-immich user-stopped │
│● /tmp: 0% 0B/2.0GB ││● sshd active 2M 0 │
│Storage: ││● unifi active 594M 495MB │
│● root (Single): ││ │
│ ├─ ● nvme0n1 W: 1% ││ │
│ └─ ● 18% 167.4GB/928.2GB ││ │
└────────────────────────────────────┘└─────────────────────────────────────────────────┘
Navigation
- Tab: Switch between hosts
- ↑↓ or j/k: Navigate services
- s: Start selected service (UserStart)
- S: Stop selected service (UserStop)
- J: Show service logs (journalctl in tmux popup)
- L: Show custom log files (tail -f custom paths in tmux popup)
- R: Rebuild current host
- B: Run backup on current host
- q: Quit
Status Indicators
- Green ●: Active service
- Yellow ◐: Inactive service (system issue)
- Red ◯: Failed service
- Blue arrows: Service transitioning (↑ starting, ↓ stopping, ↻ restarting)
- "user-stopped": Service stopped via dashboard (Status::OK)
Quick Start
Building
# With Nix (recommended)
nix-shell -p openssl pkg-config --run "cargo build --workspace"
# Or with system dependencies
sudo apt install libssl-dev pkg-config # Ubuntu/Debian
cargo build --workspace
Running
# Start agent (requires configuration)
./target/debug/cm-dashboard-agent --config /etc/cm-dashboard/agent.toml
# Start dashboard (inside tmux session)
tmux
./target/debug/cm-dashboard --config /etc/cm-dashboard/dashboard.toml
Configuration
Agent Configuration
collection_interval_seconds = 2
[zmq]
publisher_port = 6130
command_port = 6131
bind_address = "0.0.0.0"
transmission_interval_seconds = 2
[collectors.cpu]
enabled = true
interval_seconds = 2
load_warning_threshold = 5.0
load_critical_threshold = 10.0
[collectors.memory]
enabled = true
interval_seconds = 2
usage_warning_percent = 80.0
usage_critical_percent = 90.0
[collectors.systemd]
enabled = true
interval_seconds = 10
service_name_filters = ["nginx*", "postgresql*", "docker*", "sshd*"]
excluded_services = ["nginx-config-reload", "systemd-", "getty@"]
nginx_latency_critical_ms = 1000.0
http_timeout_seconds = 10
[notifications]
enabled = true
smtp_host = "localhost"
smtp_port = 25
from_email = "{hostname}@example.com"
to_email = "admin@example.com"
aggregation_interval_seconds = 30
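To illustrate how warning/critical threshold pairs such as load_warning_threshold and load_critical_threshold could be applied, here is a minimal sketch. The function name `classify` and the inclusive comparison (`>=`) are assumptions, and the agent may evaluate thresholds differently.

```rust
/// Stand-in status type (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Maps a measured value to a status using a warning and a critical threshold.
/// Assumes `warning <= critical` and that reaching a threshold triggers it.
pub fn classify(value: f32, warning: f32, critical: f32) -> Status {
    if value >= critical {
        Status::Critical
    } else if value >= warning {
        Status::Warning
    } else {
        Status::Ok
    }
}
```

With the sample values above, a load of 8.5 against thresholds 5.0/10.0 would classify as Warning, and memory usage of 95% against 80.0/90.0 as Critical.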
Dashboard Configuration
[zmq]
subscriber_ports = [6130]
[hosts]
predefined_hosts = ["cmbox", "srv01", "srv02"]
[ssh]
rebuild_user = "cm"
rebuild_alias = "nixos-rebuild-cmtec"
backup_alias = "cm-backup-run"
Technical Implementation
Collectors
Systemd Collector
- Service Discovery: Uses systemctl list-unit-files + list-units --all
- Status Calculation: Checks user-stopped flag before assigning Warning status
- Memory Tracking: Per-service memory usage via systemctl show
- Sub-services: Nginx site latency, Docker containers
- User-stopped Integration: UserStoppedServiceTracker::is_service_user_stopped()
User-Stopped Service Tracker
- Storage: /var/lib/cm-dashboard/user-stopped-services.json
- Thread Safety: Global singleton with Arc<Mutex<>>
- Persistence: Automatic save on state changes
- Global Access: Static methods for collector integration
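A global singleton with static accessors, as listed above, could be structured along these lines. This is a sketch under assumptions: only `is_service_user_stopped` is named in this README, while `mark_user_stopped`, `clear_user_stopped`, and the `store` helper are hypothetical. JSON persistence is left out to keep the example dependency-free.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex, OnceLock};

/// Namespace type for the static tracker methods.
pub struct UserStoppedServiceTracker;

/// Lazily initialized global set of user-stopped service names
/// (hypothetical helper; the real tracker also saves to JSON).
fn store() -> &'static Arc<Mutex<HashSet<String>>> {
    static STORE: OnceLock<Arc<Mutex<HashSet<String>>>> = OnceLock::new();
    STORE.get_or_init(|| Arc::new(Mutex::new(HashSet::new())))
}

impl UserStoppedServiceTracker {
    /// Records that the user stopped `service` (hypothetical name).
    pub fn mark_user_stopped(service: &str) {
        store().lock().unwrap().insert(service.to_string());
    }

    /// Clears the flag, e.g. after a user-initiated start (hypothetical name).
    pub fn clear_user_stopped(service: &str) {
        store().lock().unwrap().remove(service);
    }

    /// Returns true if `service` is currently flagged as user-stopped.
    pub fn is_service_user_stopped(service: &str) -> bool {
        store().lock().unwrap().contains(service)
    }
}
```

Because the accessors are static, a collector can query the flag without holding a reference to any tracker instance.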
Other Collectors
- CPU: Load average, temperature, frequency monitoring
- Memory: RAM/swap usage, tmpfs monitoring
- Disk: Filesystem usage, SMART health data
- NixOS: Build version, active users, agent version
- Backup: Borgbackup repository status and metrics
ZMQ Protocol
// Metric Message
#[derive(Serialize, Deserialize)]
pub struct MetricMessage {
pub hostname: String,
pub timestamp: u64,
pub metrics: Vec<Metric>,
}
// Service Commands
pub enum AgentCommand {
ServiceControl {
service_name: String,
action: ServiceAction,
},
SystemRebuild { /* SSH config */ },
CollectNow,
}
pub enum ServiceAction {
Start, // System-initiated
Stop, // System-initiated
UserStart, // User via dashboard (clears user-stopped)
UserStop, // User via dashboard (marks user-stopped)
Status,
}
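As a rough illustration of how an agent might interpret `ServiceAction`, the sketch below maps each action to a systemctl verb and to its effect on the user-stopped flag. The helper names `systemctl_verb` and `user_stopped_after` are hypothetical, and using the `status` verb for `ServiceAction::Status` is an assumption.

```rust
/// Mirror of the enum above, repeated so the sketch is self-contained.
#[derive(Debug, Clone, Copy)]
pub enum ServiceAction {
    Start,
    Stop,
    UserStart,
    UserStop,
    Status,
}

/// Chooses the systemctl verb for an action (hypothetical helper).
pub fn systemctl_verb(action: ServiceAction) -> &'static str {
    match action {
        ServiceAction::Start | ServiceAction::UserStart => "start",
        ServiceAction::Stop | ServiceAction::UserStop => "stop",
        ServiceAction::Status => "status",
    }
}

/// Computes the user-stopped flag after an action (hypothetical helper).
/// Only the user-initiated variants change the flag.
pub fn user_stopped_after(action: ServiceAction, currently_flagged: bool) -> bool {
    match action {
        ServiceAction::UserStop => true,
        ServiceAction::UserStart => false,
        _ => currently_flagged,
    }
}
```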
Maintenance Mode
Suppress notifications during planned maintenance:
# Enable maintenance mode
touch /tmp/cm-maintenance
# Perform maintenance
systemctl stop service
# ... work ...
systemctl start service
# Disable maintenance mode
rm /tmp/cm-maintenance
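On the agent side, honoring the flag file could be as simple as checking whether it exists before sending a notification. The function names below are hypothetical. The path is passed in as a parameter, and a caller would supply /tmp/cm-maintenance.

```rust
use std::path::Path;

/// Returns true while the maintenance flag file exists (hypothetical helper).
pub fn maintenance_active(flag: &Path) -> bool {
    flag.exists()
}

/// Notifications are suppressed whenever maintenance mode is active.
pub fn should_notify(flag: &Path) -> bool {
    !maintenance_active(flag)
}
```

A caller might write `should_notify(Path::new("/tmp/cm-maintenance"))` immediately before dispatching an email batch.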
Email Notifications
Intelligent Batching
- Real-time dashboard: Immediate status updates
- Batched emails: Aggregated every 30 seconds
- Smart grouping: Services organized by severity
- Recovery suppression: Reduces notification spam
Example Alert
Subject: Status Alert: 1 critical, 2 warnings, 0 recoveries
Status Summary (30s duration)
Host Status: Ok → Warning
🔴 CRITICAL ISSUES (1):
postgresql: Ok → Critical (memory usage 95%)
🟡 WARNINGS (2):
nginx: Ok → Warning (high load 8.5)
redis: user-stopped → Warning (restarted by system)
✅ RECOVERIES (0):
--
CM Dashboard Agent v0.1.43
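The subject line of the example alert could be derived from a batch of status transitions roughly as follows. This is a simplified sketch: `Transition` and `subject_line` are hypothetical names, the user-stopped state is not modeled, and counting any non-Ok to Ok change as a recovery is an assumption.

```rust
/// Stand-in status type (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// One status change collected during the aggregation interval (hypothetical).
pub struct Transition {
    pub service: String,
    pub from: Status,
    pub to: Status,
}

/// Builds an email subject summarizing a batch of transitions by severity.
pub fn subject_line(batch: &[Transition]) -> String {
    let critical = batch.iter().filter(|t| t.to == Status::Critical).count();
    let warnings = batch.iter().filter(|t| t.to == Status::Warning).count();
    let recoveries = batch
        .iter()
        .filter(|t| t.to == Status::Ok && t.from != Status::Ok)
        .count();
    format!("Status Alert: {critical} critical, {warnings} warnings, {recoveries} recoveries")
}
```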
Development
Project Structure
cm-dashboard/
├── agent/ # Metrics collection agent
│ ├── src/
│ │ ├── collectors/ # CPU, memory, disk, systemd, backup, nixos
│ │ ├── service_tracker.rs # User-stopped service tracking
│ │ ├── status/ # Status aggregation and notifications
│ │ ├── config/ # TOML configuration loading
│ │ └── communication/ # ZMQ message handling
├── dashboard/ # TUI dashboard application
│ ├── src/
│ │ ├── ui/widgets/ # CPU, memory, services, backup, system
│ │ ├── communication/ # ZMQ consumption and commands
│ │ └── app.rs # Main application loop
├── shared/ # Shared types and utilities
│ └── src/
│ ├── metrics.rs # Metric, Status, StatusTracker types
│ ├── protocol.rs # ZMQ message format
│ └── cache.rs # Cache configuration
└── CLAUDE.md # Development guidelines and rules
Testing
# Build and test
nix-shell -p openssl pkg-config --run "cargo build --workspace"
nix-shell -p openssl pkg-config --run "cargo test --workspace"
# Code quality
cargo fmt --all
cargo clippy --workspace -- -D warnings
Deployment
Automated Binary Releases
# Create new release
cd ~/projects/cm-dashboard
git tag v0.1.X
git push origin v0.1.X
This triggers an automated pipeline:
- Static binary compilation with RUSTFLAGS="-C target-feature=+crt-static"
- GitHub-style release creation
- Tarball upload to Gitea
NixOS Integration
Update ~/projects/nixosbox/hosts/services/cm-dashboard.nix:
version = "v0.1.43";
src = pkgs.fetchurl {
url = "https://gitea.cmtec.se/cm/cm-dashboard/releases/download/${version}/cm-dashboard-linux-x86_64.tar.gz";
sha256 = "sha256-HASH";
};
Get hash via:
cd ~/projects/nixosbox
nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchurl {
url = "URL_HERE";
sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
}' 2>&1 | grep "got:"
Monitoring Intervals
- Metrics Collection: 2 seconds (CPU, memory, services)
- Metric Transmission: 2 seconds (ZMQ publish)
- Dashboard Updates: 1 second (UI refresh)
- Email Notifications: 30 seconds (batched)
- Disk Monitoring: 300 seconds (5 minutes)
- Service Discovery: 300 seconds (5 minutes cache)
License
MIT License - see LICENSE file for details.