CM Dashboard - Infrastructure Monitoring TUI
A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure. Built to replace Glance with a custom solution tailored for specific monitoring needs and API integrations. Features real-time monitoring of all infrastructure components with intelligent email notifications and automatic status calculation.
System Widget
┌System───────────────────────────────────────────────────────┐
│ Memory usage │
│✔ 3.0 / 7.8 GB │
│ CPU load CPU temp │
│✔ 1.05 • 0.96 • 0.58 64.0°C │
│ C1E C3 C6 C8 C9 C10 │
│✔ 0.5% 0.5% 10.4% 10.2% 0.4% 77.9% │
│ GPU load GPU temp │
│✔ — — │
└─────────────────────────────────────────────────────────────┘
Services Widget (Enhanced)
┌Services────────────────────────────────────────────────────┐
│ Service Memory (GB) CPU Disk │
│✔ Service Memory 7.1/23899.7 MiB — │
│✔ Disk Usage — — 45/100 GB │
│⚠ CPU Load — 2.18 — │
│✔ CPU Temperature — 47.0°C — │
│✔ docker-registry 0.0 GB 0.0% <1 MB │
│✔ gitea 0.4/4.1 GB 0.2% 970 MB │
│ 1 active connections │
│✔ nginx 0.0/1.0 GB 0.0% <1 MB │
│✔ ├─ docker.cmtec.se │
│✔ ├─ git.cmtec.se │
│✔ ├─ gitea.cmtec.se │
│✔ ├─ haasp.cmtec.se │
│✔ ├─ pages.cmtec.se │
│✔ └─ www.kryddorten.se │
│✔ postgresql 0.1 GB 0.0% 378 MB │
│ 1 active connections │
│✔ redis-immich 0.0 GB 0.4% <1 MB │
│✔ sshd 0.0 GB 0.0% <1 MB │
│ 1 SSH connection │
│✔ unifi 0.9/2.0 GB 0.4% 391 MB │
└────────────────────────────────────────────────────────────┘
Storage Widget
┌Storage──────────────────────────────────────────────────────┐
│ Drive Temp Wear Spare Hours Capacity Usage │
│✔ nvme0n1 57°C 4% 100% 11463 932G 23G (2%) │
│ │
└─────────────────────────────────────────────────────────────┘
Backups Widget
┌Backups──────────────────────────────────────────────────────┐
│ Backup Status Details │
│✔ Latest 3h ago 1.4 GiB │
│ 8 archives, 2.4 GiB total │
│✔ Disk ok 2.4/468 GB (1%) │
└─────────────────────────────────────────────────────────────┘
Hosts Widget
┌Hosts────────────────────────────────────────────────────────┐
│ Host Status Timestamp │
│✔ cmbox ok 2025-10-13 05:45:28 │
│✔ srv01 ok 2025-10-13 05:45:28 │
│? labbox No data received — │
└─────────────────────────────────────────────────────────────┘
Navigation: ←→ hosts, r refresh, q quit
Key Features
Real-time Monitoring
- Multi-host support for cmbox, labbox, simonbox, steambox, srv01
- Performance-focused with minimal resource usage
- Keyboard-driven interface for power users
- ZMQ gossip network for efficient data distribution
Infrastructure Monitoring
- NVMe health monitoring with wear prediction and temperature tracking
- CPU/Memory/GPU telemetry with automatic thresholding
- Service resource monitoring with per-service CPU and RAM usage
- Disk usage overview for root filesystems
- Backup status with detailed metrics and history
- C-state monitoring for CPU power management analysis
Intelligent Alerting
- Agent-calculated status with predefined thresholds
- Email notifications via SMTP with rate limiting
- Recovery notifications with context about original issues
- Stockholm timezone support for email timestamps
- Unified alert pipeline summarizing host health
Architecture
Agent-Dashboard Separation
The system follows a strict separation of concerns:
- Agent: Single source of truth for all status calculations using defined thresholds
- Dashboard: Display-only interface that shows agent-provided status
- Data Flow: Agent (calculations) → Status → Dashboard (display) → Colors
Agent Thresholds (Production)
- CPU Load: Warning ≥ 5.0, Critical ≥ 8.0
- Memory Usage: Warning ≥ 80%, Critical ≥ 95%
- CPU Temperature: Warning ≥ 100°C, Critical ≥ 100°C (effectively disabled)
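The thresholds above can be read as a simple classification rule. The following is a minimal sketch, not the agent's actual code: the `Status`, `Thresholds`, and `classify` names are hypothetical, and only the numeric bounds come from the list above.

```rust
/// Health status as calculated by the agent (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Warning/critical pair for one metric; a value at or above a bound trips it.
#[derive(Debug, Clone, Copy)]
pub struct Thresholds {
    pub warning: f64,
    pub critical: f64,
}

// Bounds copied from the threshold list above.
pub const CPU_LOAD: Thresholds = Thresholds { warning: 5.0, critical: 8.0 };
pub const MEMORY_PERCENT: Thresholds = Thresholds { warning: 80.0, critical: 95.0 };
pub const CPU_TEMP_C: Thresholds = Thresholds { warning: 100.0, critical: 100.0 };

/// Classify a reading. The critical bound is checked first so it wins
/// when both bounds are met.
pub fn classify(value: f64, t: Thresholds) -> Status {
    if value >= t.critical {
        Status::Critical
    } else if value >= t.warning {
        Status::Warning
    } else {
        Status::Ok
    }
}
```

Because the CPU temperature bounds are equal, this rule never yields a warning for temperature: a reading is either ok or critical.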
Email Notification System
- From: {hostname}@cmtec.se (e.g., cmbox@cmtec.se)
- To: cm@cmtec.se
- SMTP: localhost:25 (postfix)
- Rate Limiting: 30 minutes (configurable)
- Triggers: Status degradation and recovery with detailed context
Installation
Requirements
- Rust toolchain 1.75+ (install via rustup)
- Root privileges for agent (hardware monitoring access)
- Network access for ZMQ communication (default port 6130)
- SMTP server for notifications (postfix recommended)
Build from Source
git clone https://github.com/cmtec/cm-dashboard.git
cd cm-dashboard
cargo build --release
Optimized binaries available at:
- Dashboard: target/release/cm-dashboard
- Agent: target/release/cm-dashboard-agent
Installation
# Install dashboard
cargo install --path dashboard
# Install agent (requires root for hardware access)
sudo cargo install --path agent
Quick Start
Dashboard
# Run with default configuration
cm-dashboard
# Specify host to monitor
cm-dashboard --host cmbox
# Override ZMQ endpoints
cm-dashboard --zmq-endpoint tcp://srv01:6130,tcp://labbox:6130
# Increase logging verbosity
cm-dashboard -v
Agent (Pure Auto-Discovery)
The agent requires no configuration files and auto-discovers all system components:
# Basic agent startup (auto-detects everything)
sudo cm-dashboard-agent
# With verbose logging for troubleshooting
sudo cm-dashboard-agent -v
The agent automatically:
- Discovers storage devices for SMART monitoring
- Detects running systemd services for resource tracking
- Configures collection intervals based on system capabilities
- Sets up email notifications using hostname@cmtec.se
Configuration
Dashboard Configuration
The dashboard creates config/dashboard.toml on first run:
[hosts]
default_host = "srv01"
[[hosts.hosts]]
name = "srv01"
enabled = true
[[hosts.hosts]]
name = "cmbox"
enabled = true
[dashboard]
tick_rate_ms = 250
history_duration_minutes = 60
[data_source]
kind = "zmq"
[data_source.zmq]
endpoints = ["tcp://127.0.0.1:6130"]
Agent Configuration (Optional)
The agent works without configuration but supports optional settings:
# List available options
cm-dashboard-agent --help
# Override specific settings
sudo cm-dashboard-agent \
--hostname cmbox \
--bind "tcp://*:6130" \
--interval 5000
Widget Layout
Services Widget Structure
The Services widget now displays both system metrics and services in a unified table:
┌Services────────────────────────────────────────────────────┐
│ Service Memory (GB) CPU Disk │
│✔ Service Memory 7.1/23899.7 MiB — │ ← System metric as service row
│✔ Disk Usage — — 45/100 GB │ ← System metric as service row
│⚠ CPU Load — 2.18 — │ ← System metric as service row
│✔ CPU Temperature — 47.0°C — │ ← System metric as service row
│✔ docker-registry 0.0 GB 0.0% <1 MB │ ← Regular service
│✔ nginx 0.0/1.0 GB 0.0% <1 MB │ ← Regular service
│✔ ├─ docker.cmtec.se │ ← Nginx site (sub-service)
│✔ ├─ git.cmtec.se │ ← Nginx site (sub-service)
│✔ └─ gitea.cmtec.se │ ← Nginx site (sub-service)
│✔ sshd 0.0 GB 0.0% <1 MB │ ← Regular service
│ 1 SSH connection │ ← Service description
└────────────────────────────────────────────────────────────┘
Row Types:
- System Metrics: CPU Load, Service Memory, Disk Usage, CPU Temperature with status indicators
- Regular Services: Full resource data (memory, CPU, disk) with optional description lines
- Sub-services: Nginx sites with tree structure, status indicators only (no resource columns)
- Description Lines: Connection counts and service-specific info without status indicators
Hosts Widget (formerly Alerts)
The Hosts widget provides a summary view of all monitored hosts:
┌Hosts────────────────────────────────────────────────────────┐
│ Host Status Timestamp │
│✔ cmbox ok 2025-10-13 05:45:28 │
│✔ srv01 ok 2025-10-13 05:45:28 │
│? labbox No data received — │
└─────────────────────────────────────────────────────────────┘
Monitoring Components
System Collector
- CPU Load: 1/5/15 minute averages with warning/critical thresholds
- Memory Usage: Used/total with percentage calculation
- CPU Temperature: x86_pkg_temp prioritized for accuracy
- C-States: Power management state distribution (C0-C10)
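As an illustration of the load and memory items above, here is a minimal sketch of reading the 1/5/15 minute averages from `/proc/loadavg` and computing a memory percentage. The function and type names are hypothetical, and the README does not say which source the agent actually reads from.

```rust
/// 1, 5 and 15 minute load averages.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LoadAvg {
    pub one: f64,
    pub five: f64,
    pub fifteen: f64,
}

/// Parse the first three fields of a /proc/loadavg line,
/// e.g. "1.05 0.96 0.58 2/1234 5678". Returns None on malformed input.
pub fn parse_loadavg(line: &str) -> Option<LoadAvg> {
    let mut fields = line.split_whitespace();
    let one: f64 = fields.next()?.parse().ok()?;
    let five: f64 = fields.next()?.parse().ok()?;
    let fifteen: f64 = fields.next()?.parse().ok()?;
    Some(LoadAvg { one, five, fifteen })
}

/// Read the current load averages (Linux only).
pub fn read_loadavg() -> Option<LoadAvg> {
    let text = std::fs::read_to_string("/proc/loadavg").ok()?;
    parse_loadavg(&text)
}

/// Used memory as a percentage of total; 0.0 when total is not positive.
pub fn memory_percent(used: f64, total: f64) -> f64 {
    if total <= 0.0 {
        0.0
    } else {
        used / total * 100.0
    }
}
```

With the System widget values shown earlier (3.0 of 7.8 GB), `memory_percent` gives roughly 38%, well under the 80% warning bound.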
Service Collector
- System Metrics as Services: CPU Load, Service Memory, Disk Usage, CPU Temperature displayed as individual service rows
- Systemd Services: Auto-discovery of interesting services with resource monitoring
- Nginx Site Monitoring: Individual rows for each nginx virtual host with tree structure (├─ and └─)
- Resource Usage: Per-service memory, CPU, and disk consumption
- Service Health: Running/stopped/degraded status with detailed failure info
- Connection Tracking: SSH connections, database connections as description lines
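The tree structure used for nginx sites could be produced along these lines. This is a sketch with a hypothetical function name; it only shows how the connector is chosen, not how the dashboard lays out columns.

```rust
/// Render sub-service rows with tree connectors: every row except the
/// last gets "├─", and the last row gets "└─".
pub fn render_sub_services(sites: &[&str]) -> Vec<String> {
    let last = sites.len().saturating_sub(1);
    sites
        .iter()
        .enumerate()
        .map(|(i, site)| {
            let connector = if i == last { "└─" } else { "├─" };
            format!("{connector} {site}")
        })
        .collect()
}
```

A single site is rendered with `└─` only, since it is both the first and the last row.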
SMART Collector
- NVMe Health: Temperature, wear leveling, spare blocks
- Drive Capacity: Total/used space with percentage
- SMART Attributes: Critical health indicators
Backup Collector
- Restic Integration: Backup status and history
- Health Monitoring: Success/failure tracking
- Storage Metrics: Backup size and retention
Keyboard Controls
| Key | Action |
|---|---|
| ← / h | Previous host |
| → / l / Tab | Next host |
| ? | Toggle help overlay |
| r | Force refresh |
| q / Esc | Quit |
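The key bindings above amount to a small lookup from key to action. The sketch below uses its own hypothetical `Key` and `Action` enums so that it stays free of any terminal library; a real TUI would translate its backend's key events into something similar.

```rust
/// Keys the dashboard reacts to (hypothetical stand-in for a terminal
/// library's key event type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Left,
    Right,
    Tab,
    Esc,
    Char(char),
}

/// Actions listed in the keyboard controls table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    PreviousHost,
    NextHost,
    ToggleHelp,
    Refresh,
    Quit,
}

/// Map a key press to an action; unbound keys return None.
pub fn action_for(key: Key) -> Option<Action> {
    match key {
        Key::Left | Key::Char('h') => Some(Action::PreviousHost),
        Key::Right | Key::Char('l') | Key::Tab => Some(Action::NextHost),
        Key::Char('?') => Some(Action::ToggleHelp),
        Key::Char('r') => Some(Action::Refresh),
        Key::Char('q') | Key::Esc => Some(Action::Quit),
        _ => None,
    }
}
```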
Email Notifications
Notification Triggers
- Status Degradation: Any status change to warning/critical
- Recovery: Warning/critical status returning to ok
- Service Failures: Individual service stop/start events
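One way to express the first two triggers is as a function over the previous and current status. This is a sketch under assumptions: the type names are hypothetical, and it follows the wording above literally, so any change that ends in warning or critical (including critical to warning) counts as an alert.

```rust
/// Status values as named in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// What kind of email, if any, a status change should produce.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Notification {
    /// Status changed to warning or critical.
    Alert { from: Status, to: Status },
    /// Status returned to ok; `from` keeps context about the original issue.
    Recovery { from: Status },
}

/// Decide whether a transition warrants a notification.
pub fn transition(previous: Status, current: Status) -> Option<Notification> {
    match (previous, current) {
        // No change, nothing to report.
        (p, c) if p == c => None,
        (p, Status::Ok) => Some(Notification::Recovery { from: p }),
        (p, c) => Some(Notification::Alert { from: p, to: c }),
    }
}
```

Carrying the previous status in `Recovery` is what lets a recovery email state what it recovered from.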
Example Recovery Email
✅ RESOLVED: system cpu on cmbox
Status Change Alert
Host: cmbox
Component: system
Metric: cpu
Status Change: warning → ok
Time: 2025-10-12 22:15:30 CET
Details:
Recovered from: CPU load (1/5/15min): 6.20 / 5.80 / 4.50
Current status: CPU load (1/5/15min): 3.30 / 3.17 / 2.84
--
CM Dashboard Agent
Generated at 2025-10-12 22:15:30 CET
Rate Limiting
- Default: 30 minutes between notifications per component
- Testing: Set to 0 for immediate notifications
- Configurable: Adjustable per deployment needs
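Per-component rate limiting can be sketched with a map from component name to the time of the last notification. The `RateLimiter` type below is hypothetical and not taken from the agent's source; it takes the current time as a parameter so the behavior is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Allows at most one notification per component within `min_interval`.
pub struct RateLimiter {
    min_interval: Duration,
    last_sent: HashMap<String, Instant>,
}

impl RateLimiter {
    pub fn new(min_interval: Duration) -> Self {
        Self {
            min_interval,
            last_sent: HashMap::new(),
        }
    }

    /// Returns true, and records `now`, when a notification for
    /// `component` may be sent; returns false while it is rate limited.
    pub fn allow(&mut self, component: &str, now: Instant) -> bool {
        let blocked = self
            .last_sent
            .get(component)
            .is_some_and(|&last| now.duration_since(last) < self.min_interval);
        if blocked {
            return false;
        }
        self.last_sent.insert(component.to_string(), now);
        true
    }
}
```

With `Duration::from_secs(30 * 60)` this matches the 30 minute default, and with `Duration::ZERO` every call is allowed, which corresponds to the testing setting of 0.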
Development
Project Structure
cm-dashboard/
├── agent/ # Monitoring agent
│ ├── src/
│ │ ├── collectors/ # Data collection modules
│ │ ├── notifications.rs # Email notification system
│ │ └── simple_agent.rs # Main agent logic
├── dashboard/ # TUI dashboard
│ ├── src/
│ │ ├── ui/ # Widget implementations
│ │ ├── data/ # Data structures
│ │ └── app.rs # Application state
├── shared/ # Common data structures
└── config/ # Configuration files
Development Commands
# Format code
cargo fmt
# Check all packages
cargo check
# Run tests
cargo test
# Build release
cargo build --release
# Run with logging
RUST_LOG=debug cargo run -p cm-dashboard-agent
Architecture Principles
Status Calculation Rules
- Agent calculates all status using predefined thresholds
- Dashboard never calculates status - only displays agent data
- No hardcoded thresholds in dashboard widgets
- Use "unknown" when agent status missing (never default to "ok")
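On the dashboard side, these rules reduce to a lookup with an explicit fallback. The sketch below assumes the agent sends status as the strings "ok", "warning", and "critical"; that wire format, the type names, and the critical symbol are assumptions, while the other symbols match the widgets shown earlier.

```rust
/// Status as displayed by the dashboard (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DisplayStatus {
    Ok,
    Warning,
    Critical,
    Unknown,
}

/// Map the agent-provided status to a display status. The dashboard
/// applies no thresholds of its own.
pub fn display_status(agent_status: Option<&str>) -> DisplayStatus {
    match agent_status {
        Some("ok") => DisplayStatus::Ok,
        Some("warning") => DisplayStatus::Warning,
        Some("critical") => DisplayStatus::Critical,
        // Missing or unrecognized status: never default to ok.
        _ => DisplayStatus::Unknown,
    }
}

/// Status indicator shown at the start of a row. The symbol for
/// Critical is an assumption; the others appear in the widgets above.
pub fn indicator(status: DisplayStatus) -> char {
    match status {
        DisplayStatus::Ok => '✔',
        DisplayStatus::Warning => '⚠',
        DisplayStatus::Critical => '✖',
        DisplayStatus::Unknown => '?',
    }
}
```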
Data Flow
System Metrics → Agent Collectors → Status Calculation → ZMQ → Dashboard → Display
↓
Email Notifications
Pure Auto-Discovery
- No config files required for basic operation
- Runtime discovery of system capabilities
- Service auto-detection via systemd patterns
- Storage device enumeration via /sys filesystem
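Storage enumeration via the /sys filesystem might look like the following sketch, which lists entries under a directory such as `/sys/block` and skips virtual devices. The skipped prefixes and the function names are assumptions, not the agent's actual filter.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Keep names that look like physical drives. The skipped prefixes
/// (loop, ram, zram, dm-) are an assumption for illustration.
pub fn is_candidate_drive(name: &str) -> bool {
    const SKIP: [&str; 4] = ["loop", "ram", "zram", "dm-"];
    !SKIP.iter().any(|prefix| name.starts_with(prefix))
}

/// List candidate drive names found under `sys_block`, sorted by name.
pub fn list_block_devices(sys_block: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(sys_block)? {
        let name = entry?.file_name().to_string_lossy().into_owned();
        if is_candidate_drive(&name) {
            names.push(name);
        }
    }
    names.sort();
    Ok(names)
}
```

Calling `list_block_devices(Path::new("/sys/block"))` on the host from the Storage widget would be expected to include `nvme0n1`.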
Troubleshooting
Common Issues
Agent Won't Start
# Check permissions (agent requires root)
sudo cm-dashboard-agent -v
# Verify ZMQ binding
sudo netstat -tulpn | grep 6130
# Check system access
sudo smartctl --scan
Dashboard Connection Issues
# Test ZMQ connectivity
cm-dashboard --zmq-endpoint tcp://target-host:6130 -v
# Check network connectivity
telnet target-host 6130
Email Notifications Not Working
# Check postfix status
sudo systemctl status postfix
# Test SMTP manually
telnet localhost 25
# Verify notification settings
sudo cm-dashboard-agent -v | grep notification
Logging
Set RUST_LOG=debug for detailed logging:
RUST_LOG=debug sudo cm-dashboard-agent
RUST_LOG=debug cm-dashboard
License
MIT License - see LICENSE file for details.
Contributing
- Fork the repository
- Create feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open Pull Request
For bugs and feature requests, please use GitHub Issues.
NixOS Integration
Updating cm-dashboard in NixOS Configuration
When new code is pushed to the cm-dashboard repository, follow these steps to update the NixOS configuration:
1. Get the Latest Commit Hash
# Get the latest commit from the API
curl -s "https://gitea.cmtec.se/api/v1/repos/cm/cm-dashboard/commits?sha=main&limit=1" | head -20
# Or use git
git log --oneline -1
2. Update the NixOS Configuration
Edit hosts/common/cm-dashboard.nix and update the rev field:
src = pkgs.fetchFromGitea {
domain = "gitea.cmtec.se";
owner = "cm";
repo = "cm-dashboard";
rev = "f786d054f2ece80823f85e46933857af96e241b2"; # Update this
hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="; # Reset temporarily
};
3. Get the Correct Hash
Build with placeholder hash to get the actual hash:
nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchFromGitea {
domain = "gitea.cmtec.se";
owner = "cm";
repo = "cm-dashboard";
rev = "YOUR_COMMIT_HASH";
hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
}' 2>&1 | grep "got:"
Example output:
error: hash mismatch in fixed-output derivation '/nix/store/...':
specified: sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
got: sha256-x8crxNusOUYRrkP9mYEOG+Ga3JCPIdJLkEAc5P1ZxdQ=
4. Update the Hash
Replace the placeholder with the correct hash from the error message (the "got:" line):
hash = "sha256-x8crxNusOUYRrkP9mYEOG+Ga3JCPIdJLkEAc5P1ZxdQ="; # Use the hash from the "got:" line
5. Update Cargo Dependencies (if needed)
If Cargo.lock has changed, you may need to update cargoHash:
# Build to get cargo hash error
nix-build --no-out-link --expr 'with import <nixpkgs> {}; rustPlatform.buildRustPackage rec {
pname = "cm-dashboard";
version = "0.1.0";
src = fetchFromGitea {
domain = "gitea.cmtec.se";
owner = "cm";
repo = "cm-dashboard";
rev = "YOUR_COMMIT_HASH";
hash = "YOUR_SOURCE_HASH";
};
cargoHash = "";
nativeBuildInputs = [ pkg-config ];
buildInputs = [ openssl ];
buildAndTestSubdir = ".";
cargoBuildFlags = [ "--workspace" ];
}' 2>&1 | grep "got:"
Then update cargoHash in the configuration.
6. Commit the Changes
git add hosts/common/cm-dashboard.nix
git commit -m "Update cm-dashboard to latest version"
git push
Example Update Process
# 1. Get latest commit
LATEST_COMMIT=$(curl -s "https://gitea.cmtec.se/api/v1/repos/cm/cm-dashboard/commits?sha=main&limit=1" | grep '"sha"' | head -1 | cut -d'"' -f4)
# 2. Get source hash
SOURCE_HASH=$(nix-build --no-out-link -E "with import <nixpkgs> {}; fetchFromGitea { domain = \"gitea.cmtec.se\"; owner = \"cm\"; repo = \"cm-dashboard\"; rev = \"$LATEST_COMMIT\"; hash = \"sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=\"; }" 2>&1 | grep "got:" | awk '{print $2}')
# 3. Update configuration and commit
echo "Latest commit: $LATEST_COMMIT"
echo "Source hash: $SOURCE_HASH"