CM Dashboard - Infrastructure Monitoring TUI
A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure. Built to replace Glance with a custom solution tailored for specific monitoring needs and API integrations. Features real-time monitoring of all infrastructure components with intelligent email notifications and automatic status calculation.
System Widget
┌System───────────────────────────────────────────────────────┐
│ Memory usage │
│✔ 3.0 / 7.8 GB │
│ CPU load CPU temp │
│✔ 1.05 • 0.96 • 0.58 64.0°C │
│ C1E C3 C6 C8 C9 C10 │
│✔ 0.5% 0.5% 10.4% 10.2% 0.4% 77.9% │
│ GPU load GPU temp │
│✔ — — │
└─────────────────────────────────────────────────────────────┘
Services Widget (Enhanced)
┌Services────────────────────────────────────────────────────┐
│ Service Memory (GB) CPU Disk │
│✔ Service Memory 7.1/23899.7 MiB — │
│✔ Disk Usage — — 45/100 GB │
│⚠ CPU Load — 2.18 — │
│✔ CPU Temperature — 47.0°C — │
│✔ docker-registry 0.0 GB 0.0% <1 MB │
│✔ gitea 0.4/4.1 GB 0.2% 970 MB │
│ 1 active connections │
│✔ nginx 0.0/1.0 GB 0.0% <1 MB │
│✔ ├─ docker.cmtec.se │
│✔ ├─ git.cmtec.se │
│✔ ├─ gitea.cmtec.se │
│✔ ├─ haasp.cmtec.se │
│✔ ├─ pages.cmtec.se │
│✔ └─ www.kryddorten.se │
│✔ postgresql 0.1 GB 0.0% 378 MB │
│ 1 active connections │
│✔ redis-immich 0.0 GB 0.4% <1 MB │
│✔ sshd 0.0 GB 0.0% <1 MB │
│ 1 SSH connection │
│✔ unifi 0.9/2.0 GB 0.4% 391 MB │
└────────────────────────────────────────────────────────────┘
Storage Widget
┌Storage──────────────────────────────────────────────────────┐
│ Drive Temp Wear Spare Hours Capacity Usage │
│✔ nvme0n1 57°C 4% 100% 11463 932G 23G (2%) │
│ │
└─────────────────────────────────────────────────────────────┘
Backups Widget
┌Backups──────────────────────────────────────────────────────┐
│ Backup Status Details │
│✔ Latest 3h ago 1.4 GiB │
│ 8 archives, 2.4 GiB total │
│✔ Disk ok 2.4/468 GB (1%) │
└─────────────────────────────────────────────────────────────┘
Hosts Widget
┌Hosts────────────────────────────────────────────────────────┐
│ Host Status Timestamp │
│✔ cmbox ok 2025-10-13 05:45:28 │
│✔ srv01 ok 2025-10-13 05:45:28 │
│? labbox No data received — │
└─────────────────────────────────────────────────────────────┘
Navigation: ←→ hosts, r refresh, q quit
Key Features
Real-time Monitoring
- Multi-host support for cmbox, labbox, simonbox, steambox, srv01
- Performance-focused with minimal resource usage
- Keyboard-driven interface for power users
- ZMQ gossip network for efficient data distribution
Infrastructure Monitoring
- NVMe health monitoring with wear prediction and temperature tracking
- CPU/Memory/GPU telemetry with automatic thresholding
- Service resource monitoring with per-service CPU and RAM usage
- Disk usage overview for root filesystems
- Backup status with detailed metrics and history
- C-state monitoring for CPU power management analysis
Intelligent Alerting
- Agent-calculated status with predefined thresholds
- Email notifications via SMTP with rate limiting
- Recovery notifications with context about original issues
- Stockholm timezone support for email timestamps
- Unified alert pipeline summarizing host health
Architecture
Agent-Dashboard Separation
The system follows a strict separation of concerns:
- Agent: Single source of truth for all status calculations using defined thresholds
- Dashboard: Display-only interface that shows agent-provided status
- Data Flow: Agent (calculations) → Status → Dashboard (display) → Colors
Agent Thresholds (Production)
- CPU Load: Warning ≥ 5.0, Critical ≥ 8.0
- Memory Usage: Warning ≥ 80%, Critical ≥ 95%
- CPU Temperature: Warning ≥ 100°C, Critical ≥ 100°C (effectively disabled)
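The threshold rules above can be sketched as a small classifier. This is an illustrative sketch only, not the agent's actual code: the `Status` and `Thresholds` types and the constant names are hypothetical, and only the numeric bounds come from the list above.

```rust
/// Health status, ordered from best to worst (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// A warning/critical pair; a value at or above a bound triggers that level.
#[derive(Debug, Clone, Copy)]
pub struct Thresholds {
    pub warning: f64,
    pub critical: f64,
}

// Bounds copied from the list above; the constant names are illustrative.
pub const CPU_LOAD: Thresholds = Thresholds { warning: 5.0, critical: 8.0 };
pub const MEMORY_PERCENT: Thresholds = Thresholds { warning: 80.0, critical: 95.0 };
pub const CPU_TEMP_C: Thresholds = Thresholds { warning: 100.0, critical: 100.0 };

impl Thresholds {
    /// Classify a measurement, checking the critical bound first so that
    /// equal warning and critical bounds resolve to `Critical`.
    pub fn evaluate(&self, value: f64) -> Status {
        if value >= self.critical {
            Status::Critical
        } else if value >= self.warning {
            Status::Warning
        } else {
            Status::Ok
        }
    }
}
```

With these bounds, `CPU_LOAD.evaluate(6.2)` yields `Status::Warning`, and a CPU temperature can never be reported as a warning because both of its bounds sit at 100°C.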
Email Notification System
- From: {hostname}@cmtec.se (e.g., cmbox@cmtec.se)
- To: cm@cmtec.se
- SMTP: localhost:25 (postfix)
- Rate Limiting: 30 minutes (configurable)
- Triggers: Status degradation and recovery with detailed context
Installation
Requirements
- Rust toolchain 1.75+ (install via rustup)
- Root privileges for agent (hardware monitoring access)
- Network access for ZMQ communication (default port 6130)
- SMTP server for notifications (postfix recommended)
Build from Source
git clone https://github.com/cmtec/cm-dashboard.git
cd cm-dashboard
cargo build --release
Optimized binaries available at:
- Dashboard: target/release/cm-dashboard
- Agent: target/release/cm-dashboard-agent
Installation
# Install dashboard
cargo install --path dashboard
# Install agent (requires root for hardware access)
sudo cargo install --path agent
Quick Start
Dashboard
# Run with default configuration
cm-dashboard
# Specify host to monitor
cm-dashboard --host cmbox
# Override ZMQ endpoints
cm-dashboard --zmq-endpoint tcp://srv01:6130,tcp://labbox:6130
# Increase logging verbosity
cm-dashboard -v
Agent (Pure Auto-Discovery)
The agent requires no configuration files and auto-discovers all system components:
# Basic agent startup (auto-detects everything)
sudo cm-dashboard-agent
# With verbose logging for troubleshooting
sudo cm-dashboard-agent -v
The agent automatically:
- Discovers storage devices for SMART monitoring
- Detects running systemd services for resource tracking
- Configures collection intervals based on system capabilities
- Sets up email notifications using hostname@cmtec.se
Configuration
Dashboard Configuration
The dashboard creates config/dashboard.toml on first run:
[hosts]
default_host = "srv01"
[[hosts.hosts]]
name = "srv01"
enabled = true
[[hosts.hosts]]
name = "cmbox"
enabled = true
[dashboard]
tick_rate_ms = 250
history_duration_minutes = 60
[data_source]
kind = "zmq"
[data_source.zmq]
endpoints = ["tcp://127.0.0.1:6130"]
Agent Configuration (Optional)
The agent works without configuration but supports optional settings:
# List the available command-line options
cm-dashboard-agent --help
# Override specific settings
sudo cm-dashboard-agent \
--hostname cmbox \
--bind tcp://*:6130 \
--interval 5000
Widget Layout
Services Widget Structure
The Services widget now displays both system metrics and services in a unified table:
┌Services────────────────────────────────────────────────────┐
│ Service Memory (GB) CPU Disk │
│✔ Service Memory 7.1/23899.7 MiB — │ ← System metric as service row
│✔ Disk Usage — — 45/100 GB │ ← System metric as service row
│⚠ CPU Load — 2.18 — │ ← System metric as service row
│✔ CPU Temperature — 47.0°C — │ ← System metric as service row
│✔ docker-registry 0.0 GB 0.0% <1 MB │ ← Regular service
│✔ nginx 0.0/1.0 GB 0.0% <1 MB │ ← Regular service
│✔ ├─ docker.cmtec.se │ ← Nginx site (sub-service)
│✔ ├─ git.cmtec.se │ ← Nginx site (sub-service)
│✔ └─ gitea.cmtec.se │ ← Nginx site (sub-service)
│✔ sshd 0.0 GB 0.0% <1 MB │ ← Regular service
│ 1 SSH connection │ ← Service description
└────────────────────────────────────────────────────────────┘
Row Types:
- System Metrics: CPU Load, Service Memory, Disk Usage, CPU Temperature with status indicators
- Regular Services: Full resource data (memory, CPU, disk) with optional description lines
- Sub-services: Nginx sites with tree structure, status indicators only (no resource columns)
- Description Lines: Connection counts and service-specific info without status indicators
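One way to model the four row types is a single enum, sketched below. The type, variant, and field names are hypothetical and not taken from the dashboard source. The sketch only captures two rules stated above: description lines carry no status indicator, and sub-services are drawn with tree branches.

```rust
/// Status shown in the first column (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
    Unknown,
}

/// One line of the Services table (illustrative model).
pub enum Row {
    SystemMetric { status: Status, name: String },
    Service { status: Status, name: String },
    SubService { status: Status, name: String, is_last: bool },
    Description { text: String },
}

impl Row {
    /// Status indicator for the row; description lines have none.
    pub fn status(&self) -> Option<Status> {
        match self {
            Row::SystemMetric { status, .. }
            | Row::Service { status, .. }
            | Row::SubService { status, .. } => Some(*status),
            Row::Description { .. } => None,
        }
    }

    /// Text for the name column, with a tree branch for sub-services.
    pub fn label(&self) -> String {
        match self {
            Row::SystemMetric { name, .. } | Row::Service { name, .. } => name.clone(),
            Row::SubService { name, is_last, .. } => {
                let branch = if *is_last { "└─" } else { "├─" };
                format!("{branch} {name}")
            }
            Row::Description { text } => text.clone(),
        }
    }
}
```

Keeping the row kind in the type makes it hard to accidentally render resource columns for a sub-service or a status icon for a description line.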
Hosts Widget (formerly Alerts)
The Hosts widget provides a summary view of all monitored hosts:
┌Hosts────────────────────────────────────────────────────────┐
│ Host Status Timestamp │
│✔ cmbox ok 2025-10-13 05:45:28 │
│✔ srv01 ok 2025-10-13 05:45:28 │
│? labbox No data received — │
└─────────────────────────────────────────────────────────────┘
Monitoring Components
System Collector
- CPU Load: 1/5/15 minute averages with warning/critical thresholds
- Memory Usage: Used/total with percentage calculation
- CPU Temperature: x86_pkg_temp prioritized for accuracy
- C-States: Power management state distribution (C0-C10)
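On Linux, the 1/5/15 minute load averages are exposed as the first three fields of /proc/loadavg. The sketch below parses that format. Whether the agent reads this file or uses another source is an assumption, and `LoadAvg` and `parse_loadavg` are hypothetical names.

```rust
/// 1/5/15-minute load averages (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LoadAvg {
    pub one: f64,
    pub five: f64,
    pub fifteen: f64,
}

/// Parse the contents of /proc/loadavg, e.g. "1.05 0.96 0.58 2/1234 56789".
/// Returns `None` if fewer than three numeric fields are present.
pub fn parse_loadavg(contents: &str) -> Option<LoadAvg> {
    let mut fields = contents.split_whitespace();
    let one = fields.next()?.parse::<f64>().ok()?;
    let five = fields.next()?.parse::<f64>().ok()?;
    let fifteen = fields.next()?.parse::<f64>().ok()?;
    Some(LoadAvg { one, five, fifteen })
}
```

A caller would typically pass in the result of `std::fs::read_to_string("/proc/loadavg")` and compare the parsed values against the agent's CPU load thresholds.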
Service Collector
- System Metrics as Services: CPU Load, Service Memory, Disk Usage, CPU Temperature displayed as individual service rows
- Systemd Services: Auto-discovery of interesting services with resource monitoring
- Nginx Site Monitoring: Individual rows for each nginx virtual host with tree structure (├─ and └─)
- Resource Usage: Per-service memory, CPU, and disk consumption
- Service Health: Running/stopped/degraded status with detailed failure info
- Connection Tracking: SSH connections, database connections as description lines
SMART Collector
- NVMe Health: Temperature, wear leveling, spare blocks
- Drive Capacity: Total/used space with percentage
- SMART Attributes: Critical health indicators
Backup Collector
- Restic Integration: Backup status and history
- Health Monitoring: Success/failure tracking
- Storage Metrics: Backup size and retention
Keyboard Controls
| Key | Action |
|---|---|
| ← / h | Previous host |
| → / l / Tab | Next host |
| ? | Toggle help overlay |
| r | Force refresh |
| q / Esc | Quit |
Email Notifications
Notification Triggers
- Status Degradation: Any status change to warning/critical
- Recovery: Warning/critical status returning to ok
- Service Failures: Individual service stop/start events
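The first two triggers reduce to comparing a component's previous and current status. The sketch below reads the rules literally, so any change that lands on warning or critical counts as a degradation, including critical → warning. That reading, and the names `Trigger` and `classify_transition`, are assumptions rather than the agent's confirmed behavior.

```rust
/// Component status as calculated by the agent (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Which kind of notification a status change produces.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trigger {
    Degradation,
    Recovery,
}

/// Decide whether a status change should produce a notification.
pub fn classify_transition(previous: Status, current: Status) -> Option<Trigger> {
    if previous == current {
        // No change, nothing to report.
        None
    } else if current == Status::Ok {
        // Warning or critical returning to ok.
        Some(Trigger::Recovery)
    } else {
        // Any change whose new status is warning or critical
        // (literal reading of the rule; an assumption).
        Some(Trigger::Degradation)
    }
}
```

For the recovery email shown in this section, the transition is `classify_transition(Status::Warning, Status::Ok)`, which yields `Some(Trigger::Recovery)`.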
Example Recovery Email
✅ RESOLVED: system cpu on cmbox
Status Change Alert
Host: cmbox
Component: system
Metric: cpu
Status Change: warning → ok
Time: 2025-10-12 22:15:30 CET
Details:
Recovered from: CPU load (1/5/15min): 6.20 / 5.80 / 4.50
Current status: CPU load (1/5/15min): 3.30 / 3.17 / 2.84
--
CM Dashboard Agent
Generated at 2025-10-12 22:15:30 CET
Rate Limiting
- Default: 30 minutes between notifications per component
- Testing: Set to 0 for immediate notifications
- Configurable: Adjustable per deployment needs
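Per-component rate limiting can be sketched with a map from component name to the time of its last notification. This is a minimal illustration under assumed names (`RateLimiter`, `allow`), not the agent's implementation. Passing `now` as a parameter keeps the logic testable without waiting for real time to pass.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Remembers when each component last produced a notification.
pub struct RateLimiter {
    min_interval: Duration,
    last_sent: HashMap<String, Instant>,
}

impl RateLimiter {
    /// `min_interval` is the minimum gap between two notifications for the
    /// same component, e.g. 30 minutes; zero disables limiting.
    pub fn new(min_interval: Duration) -> Self {
        Self { min_interval, last_sent: HashMap::new() }
    }

    /// Returns true, and records `now`, if a notification for `component`
    /// may be sent; returns false while the component is still rate limited.
    pub fn allow(&mut self, component: &str, now: Instant) -> bool {
        match self.last_sent.get(component) {
            Some(&last) if now.saturating_duration_since(last) < self.min_interval => false,
            _ => {
                self.last_sent.insert(component.to_string(), now);
                true
            }
        }
    }
}
```

With `RateLimiter::new(Duration::from_secs(30 * 60))`, a second CPU alert one minute after the first is suppressed, while an alert for a different component goes out immediately.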
Development
Project Structure
cm-dashboard/
├── agent/ # Monitoring agent
│ ├── src/
│ │ ├── collectors/ # Data collection modules
│ │ ├── notifications.rs # Email notification system
│ │ └── simple_agent.rs # Main agent logic
├── dashboard/ # TUI dashboard
│ ├── src/
│ │ ├── ui/ # Widget implementations
│ │ ├── data/ # Data structures
│ │ └── app.rs # Application state
├── shared/ # Common data structures
└── config/ # Configuration files
Development Commands
# Format code
cargo fmt
# Check all packages
cargo check
# Run tests
cargo test
# Build release
cargo build --release
# Run with logging
RUST_LOG=debug cargo run -p cm-dashboard-agent
Architecture Principles
Status Calculation Rules
- Agent calculates all status using predefined thresholds
- Dashboard never calculates status - only displays agent data
- No hardcoded thresholds in dashboard widgets
- Use "unknown" when agent status missing (never default to "ok")
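On the dashboard side, these rules amount to mapping whatever the agent sent onto a display status and never inventing one. The sketch below assumes the agent sends lowercase status strings ("ok", "warning", "critical"). That wire format, the function names, and the critical glyph are hypothetical; the ✔, ⚠, and ? glyphs match the widgets shown earlier.

```rust
/// Display status; `Unknown` covers missing or unrecognized agent data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
    Unknown,
}

/// Map the agent-provided status string (assumed format) to a `Status`.
/// A missing or unrecognized value becomes `Unknown`, never `Ok`.
pub fn status_from_agent(raw: Option<&str>) -> Status {
    match raw {
        Some("ok") => Status::Ok,
        Some("warning") => Status::Warning,
        Some("critical") => Status::Critical,
        _ => Status::Unknown,
    }
}

/// Glyph drawn in the first column of a widget row.
pub fn icon(status: Status) -> char {
    match status {
        Status::Ok => '✔',
        Status::Warning => '⚠',
        Status::Critical => '✖', // assumed glyph; not shown in this document
        Status::Unknown => '?',
    }
}
```

Note that the function takes no metric values at all, so the dashboard has nothing to compare against a threshold even by accident.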
Data Flow
System Metrics → Agent Collectors → Status Calculation → ZMQ → Dashboard → Display
↓
Email Notifications
Pure Auto-Discovery
- No config files required for basic operation
- Runtime discovery of system capabilities
- Service auto-detection via systemd patterns
- Storage device enumeration via /sys filesystem
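Storage enumeration via /sys can be sketched by listing the entries of /sys/block and skipping virtual devices. The prefix filter below is a heuristic assumption, not the agent's actual rule, and both function names are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Heuristic (an assumption): skip virtual devices that carry no SMART data.
pub fn is_physical_candidate(name: &str) -> bool {
    const VIRTUAL_PREFIXES: [&str; 4] = ["loop", "ram", "zram", "dm-"];
    !VIRTUAL_PREFIXES.iter().any(|prefix| name.starts_with(prefix))
}

/// List candidate device names under `sys_block` (normally "/sys/block"),
/// sorted alphabetically.
pub fn enumerate_block_devices(sys_block: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(sys_block)? {
        let name = entry?.file_name().to_string_lossy().into_owned();
        if is_physical_candidate(&name) {
            names.push(name);
        }
    }
    names.sort();
    Ok(names)
}
```

Calling `enumerate_block_devices(Path::new("/sys/block"))` on a host like the one in the Storage widget would be expected to include `nvme0n1`. Taking the directory as a parameter lets the function be pointed at a fixture directory in tests.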
Troubleshooting
Common Issues
Agent Won't Start
# Check permissions (agent requires root)
sudo cm-dashboard-agent -v
# Verify ZMQ binding
sudo netstat -tulpn | grep 6130
# Check system access
sudo smartctl --scan
Dashboard Connection Issues
# Test ZMQ connectivity
cm-dashboard --zmq-endpoint tcp://target-host:6130 -v
# Check network connectivity
telnet target-host 6130
Email Notifications Not Working
# Check postfix status
sudo systemctl status postfix
# Test SMTP manually
telnet localhost 25
# Verify notification settings
sudo cm-dashboard-agent -v | grep notification
Logging
Set RUST_LOG=debug for detailed logging:
sudo RUST_LOG=debug cm-dashboard-agent
RUST_LOG=debug cm-dashboard
License
MIT License - see LICENSE file for details.
Contributing
- Fork the repository
- Create feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open Pull Request
For bugs and feature requests, please use GitHub Issues.
NixOS Integration
Updating cm-dashboard in NixOS Configuration
When new code is pushed to the cm-dashboard repository, follow these steps to update the NixOS configuration:
1. Get the Latest Commit Hash
# Get the latest commit from the API
curl -s "https://gitea.cmtec.se/api/v1/repos/cm/cm-dashboard/commits?sha=main&limit=1" | head -20
# Or use git
git log --oneline -1
2. Update the NixOS Configuration
Edit hosts/common/cm-dashboard.nix and update the rev field:
src = pkgs.fetchFromGitea {
domain = "gitea.cmtec.se";
owner = "cm";
repo = "cm-dashboard";
rev = "f786d054f2ece80823f85e46933857af96e241b2"; # Update this
hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="; # Reset temporarily
};
3. Get the Correct Hash
Build with placeholder hash to get the actual hash:
nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchFromGitea {
domain = "gitea.cmtec.se";
owner = "cm";
repo = "cm-dashboard";
rev = "YOUR_COMMIT_HASH";
hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
}' 2>&1 | grep "got:"
4. Update the Hash
Replace the placeholder with the correct hash:
hash = "sha256-vjy+j91iDCHUf0RE43anK4WZ+rKcyohP/3SykwZGof8="; # Use actual hash
5. Update Cargo Dependencies (if needed)
If Cargo.lock has changed, you may need to update cargoHash:
# Build to get cargo hash error
nix-build --no-out-link --expr 'with import <nixpkgs> {}; rustPlatform.buildRustPackage rec {
pname = "cm-dashboard";
version = "0.1.0";
src = fetchFromGitea {
domain = "gitea.cmtec.se";
owner = "cm";
repo = "cm-dashboard";
rev = "YOUR_COMMIT_HASH";
hash = "YOUR_SOURCE_HASH";
};
cargoHash = "";
nativeBuildInputs = [ pkg-config ];
buildInputs = [ openssl ];
buildAndTestSubdir = ".";
cargoBuildFlags = [ "--workspace" ];
}' 2>&1 | grep "got:"
Then update cargoHash in the configuration.
6. Commit the Changes
git add hosts/common/cm-dashboard.nix
git commit -m "Update cm-dashboard to latest version"
git push
Example Update Process
# 1. Get latest commit (requires jq; reads the sha field of the first entry)
LATEST_COMMIT=$(curl -s "https://gitea.cmtec.se/api/v1/repos/cm/cm-dashboard/commits?sha=main&limit=1" | jq -r '.[0].sha')
# 2. Get source hash
SOURCE_HASH=$(nix-build --no-out-link -E "with import <nixpkgs> {}; fetchFromGitea { domain = \"gitea.cmtec.se\"; owner = \"cm\"; repo = \"cm-dashboard\"; rev = \"$LATEST_COMMIT\"; hash = \"sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=\"; }" 2>&1 | grep "got:" | awk '{print $2}')
# 3. Update configuration and commit
echo "Latest commit: $LATEST_COMMIT"
echo "Source hash: $SOURCE_HASH"