
CM Dashboard - Infrastructure Monitoring TUI

A high-performance, Rust-based TUI dashboard for monitoring CMTEC infrastructure. It was built to replace Glance with a custom solution tailored to specific monitoring needs and API integrations. It monitors all infrastructure components in real time, calculates status automatically, and sends email notifications on status changes.

System Widget

┌System───────────────────────────────────────────────────────┐
│  Memory usage                                               │
│✔ 3.0 / 7.8 GB                                               │
│  CPU load            CPU temp                               │
│✔ 1.05 • 0.96 • 0.58  64.0°C                                 │
│  C1E    C3     C6     C8     C9     C10                     │
│✔ 0.5%   0.5%   10.4%  10.2%  0.4%   77.9%                   │
│  GPU load  GPU temp                                         │
│✔ —         —                                                │
└─────────────────────────────────────────────────────────────┘

Services Widget (Enhanced)

┌Services────────────────────────────────────────────────────┐
│  Service          Memory (GB)  CPU    Disk                 │
│✔ Service Memory   7.1/23899.7 MiB     —                   │
│✔ Disk Usage       —           —       45/100 GB           │
│⚠ CPU Load         —           2.18    —                   │
│✔ CPU Temperature  —           47.0°C  —                   │
│✔ docker-registry  0.0 GB       0.0%   <1 MB               │
│✔ gitea            0.4/4.1 GB   0.2%   970 MB               │
│  1 active connections                                      │
│✔ nginx            0.0/1.0 GB   0.0%   <1 MB                │
│✔  ├─ docker.cmtec.se                                      │
│✔  ├─ git.cmtec.se                                         │
│✔  ├─ gitea.cmtec.se                                       │
│✔  ├─ haasp.cmtec.se                                       │
│✔  ├─ pages.cmtec.se                                       │
│✔  └─ www.kryddorten.se                                    │
│✔ postgresql       0.1 GB       0.0%   378 MB               │
│  1 active connections                                      │
│✔ redis-immich     0.0 GB       0.4%   <1 MB                │
│✔ sshd             0.0 GB       0.0%   <1 MB                │
│  1 SSH connection                                          │
│✔ unifi            0.9/2.0 GB   0.4%   391 MB               │
└────────────────────────────────────────────────────────────┘

Storage Widget

┌Storage──────────────────────────────────────────────────────┐
│  Drive    Temp   Wear   Spare  Hours  Capacity  Usage       │
│✔ nvme0n1  57°C   4%     100%   11463  932G      23G (2%)    │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Backups Widget

┌Backups──────────────────────────────────────────────────────┐
│  Backup  Status  Details                                    │
│✔ Latest  3h ago  1.4 GiB                                    │
│  8 archives, 2.4 GiB total                                  │
│✔ Disk    ok      2.4/468 GB (1%)                            │
└─────────────────────────────────────────────────────────────┘

Hosts Widget

┌Hosts────────────────────────────────────────────────────────┐
│  Host    Status            Timestamp                        │
│✔ cmbox   ok                2025-10-13 05:45:28              │
│✔ srv01   ok                2025-10-13 05:45:28              │
│? labbox  No data received  —                                │
└─────────────────────────────────────────────────────────────┘

Navigation: ←→ hosts, r refresh, q quit

Key Features

Real-time Monitoring

  • Multi-host support for cmbox, labbox, simonbox, steambox, srv01
  • Performance-focused with minimal resource usage
  • Keyboard-driven interface for power users
  • ZMQ gossip network for efficient data distribution

Infrastructure Monitoring

  • NVMe health monitoring with wear prediction and temperature tracking
  • CPU/Memory/GPU telemetry with automatic thresholding
  • Service resource monitoring with per-service CPU and RAM usage
  • Disk usage overview for root filesystems
  • Backup status with detailed metrics and history
  • C-state monitoring for CPU power management analysis

Intelligent Alerting

  • Agent-calculated status with predefined thresholds
  • Email notifications via SMTP with rate limiting
  • Recovery notifications with context about original issues
  • Stockholm timezone support for email timestamps
  • Unified alert pipeline summarizing host health

Architecture

Agent-Dashboard Separation

The system follows a strict separation of concerns:

  • Agent: Single source of truth for all status calculations using defined thresholds
  • Dashboard: Display-only interface that shows agent-provided status
  • Data Flow: Agent (calculations) → Status → Dashboard (display) → Colors

Agent Thresholds (Production)

  • CPU Load: Warning ≥ 5.0, Critical ≥ 8.0
  • Memory Usage: Warning ≥ 80%, Critical ≥ 95%
  • CPU Temperature: Warning ≥ 100°C, Critical ≥ 100°C (effectively disabled)
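
The threshold rules above can be sketched as a small classification function. This is a minimal illustration, not the agent's actual code: the `Status`, `Threshold`, and `classify` names are hypothetical, and only the numeric limits come from the list above.

```rust
/// Health status as reported by the agent (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Warning and critical limits for one metric.
#[derive(Debug, Clone, Copy)]
pub struct Threshold {
    pub warning: f64,
    pub critical: f64,
}

// Production values taken from the threshold list above.
pub const CPU_LOAD: Threshold = Threshold { warning: 5.0, critical: 8.0 };
pub const MEMORY_PERCENT: Threshold = Threshold { warning: 80.0, critical: 95.0 };

/// Classify a reading. The critical limit is checked first so that it wins
/// whenever both limits are met.
pub fn classify(value: f64, t: Threshold) -> Status {
    if value >= t.critical {
        Status::Critical
    } else if value >= t.warning {
        Status::Warning
    } else {
        Status::Ok
    }
}
```

With both limits set to the same value, as for CPU temperature, the warning branch can never be taken: a reading goes straight from ok to critical.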

Email Notification System

  • From: {hostname}@cmtec.se (e.g., cmbox@cmtec.se)
  • To: cm@cmtec.se
  • SMTP: localhost:25 (postfix)
  • Rate Limiting: 30 minutes (configurable)
  • Triggers: Status degradation and recovery with detailed context

Installation

Requirements

  • Rust toolchain 1.75+ (install via rustup)
  • Root privileges for agent (hardware monitoring access)
  • Network access for ZMQ communication (default port 6130)
  • SMTP server for notifications (postfix recommended)

Build from Source

git clone https://github.com/cmtec/cm-dashboard.git
cd cm-dashboard
cargo build --release

Optimized binaries available at:

  • Dashboard: target/release/cm-dashboard
  • Agent: target/release/cm-dashboard-agent

Install Binaries

# Install dashboard
cargo install --path dashboard

# Install agent (requires root for hardware access)
sudo cargo install --path agent

Quick Start

Dashboard

# Run with default configuration
cm-dashboard

# Specify host to monitor
cm-dashboard --host cmbox

# Override ZMQ endpoints
cm-dashboard --zmq-endpoint tcp://srv01:6130,tcp://labbox:6130

# Increase logging verbosity
cm-dashboard -v
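
The --zmq-endpoint value above is a comma-separated list. How the dashboard parses it is not shown in this README; the sketch below is one plausible way to split such a value, and the `parse_endpoints` name is hypothetical.

```rust
/// Split a comma-separated endpoint list into individual endpoints,
/// trimming whitespace and dropping empty entries.
/// Hypothetical helper; the dashboard's real argument handling may differ.
pub fn parse_endpoints(arg: &str) -> Vec<String> {
    arg.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}
```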

Agent (Pure Auto-Discovery)

The agent requires no configuration files and auto-discovers all system components:

# Basic agent startup (auto-detects everything)
sudo cm-dashboard-agent

# With verbose logging for troubleshooting
sudo cm-dashboard-agent -v

The agent automatically:

  • Discovers storage devices for SMART monitoring
  • Detects running systemd services for resource tracking
  • Configures collection intervals based on system capabilities
  • Sets up email notifications using hostname@cmtec.se

Configuration

Dashboard Configuration

The dashboard creates config/dashboard.toml on first run:

[hosts]
default_host = "srv01"

[[hosts.hosts]]
name = "srv01"
enabled = true

[[hosts.hosts]]
name = "cmbox"
enabled = true

[dashboard]
tick_rate_ms = 250
history_duration_minutes = 60

[data_source]
kind = "zmq"

[data_source.zmq]
endpoints = ["tcp://127.0.0.1:6130"]

Agent Configuration (Optional)

The agent works without configuration but supports optional settings:

# List the available options
cm-dashboard-agent --help

# Override specific settings
sudo cm-dashboard-agent \
    --hostname cmbox \
    --bind 'tcp://*:6130' \
    --interval 5000

Widget Layout

Services Widget Structure

The Services widget now displays both system metrics and services in a unified table:

┌Services────────────────────────────────────────────────────┐
│  Service          Memory (GB)  CPU    Disk                 │
│✔ Service Memory   7.1/23899.7 MiB     —                   │ ← System metric as service row
│✔ Disk Usage       —           —       45/100 GB           │ ← System metric as service row  
│⚠ CPU Load         —           2.18    —                   │ ← System metric as service row
│✔ CPU Temperature  —           47.0°C  —                   │ ← System metric as service row
│✔ docker-registry  0.0 GB      0.0%    <1 MB               │ ← Regular service
│✔ nginx            0.0/1.0 GB  0.0%    <1 MB               │ ← Regular service
│✔  ├─ docker.cmtec.se                                      │ ← Nginx site (sub-service)
│✔  ├─ git.cmtec.se                                         │ ← Nginx site (sub-service)  
│✔  └─ gitea.cmtec.se                                       │ ← Nginx site (sub-service)
│✔ sshd             0.0 GB      0.0%    <1 MB               │ ← Regular service
│  1 SSH connection                                          │ ← Service description
└────────────────────────────────────────────────────────────┘

Row Types:

  • System Metrics: CPU Load, Service Memory, Disk Usage, CPU Temperature with status indicators
  • Regular Services: Full resource data (memory, CPU, disk) with optional description lines
  • Sub-services: Nginx sites with tree structure, status indicators only (no resource columns)
  • Description Lines: Connection counts and service-specific info without status indicators
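
The tree structure used for sub-services can be sketched as follows. This is an illustration of the connector rule only (every entry gets ├─ except the last, which gets └─); the `render_sub_services` name is hypothetical and status indicators are left out.

```rust
/// Render sub-service rows with tree connectors: "├─" for every entry
/// except the last one, which gets "└─".
pub fn render_sub_services(sites: &[&str]) -> Vec<String> {
    // For an empty slice this is 0, but the iterator below yields nothing.
    let last = sites.len().saturating_sub(1);
    sites
        .iter()
        .enumerate()
        .map(|(i, site)| {
            let connector = if i == last { "└─" } else { "├─" };
            format!("{connector} {site}")
        })
        .collect()
}
```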

Hosts Widget (formerly Alerts)

The Hosts widget provides a summary view of all monitored hosts:

┌Hosts────────────────────────────────────────────────────────┐
│  Host    Status            Timestamp                        │
│✔ cmbox   ok                2025-10-13 05:45:28              │
│✔ srv01   ok                2025-10-13 05:45:28              │
│? labbox  No data received  —                                │
└─────────────────────────────────────────────────────────────┘

Monitoring Components

System Collector

  • CPU Load: 1/5/15 minute averages with warning/critical thresholds
  • Memory Usage: Used/total with percentage calculation
  • CPU Temperature: x86_pkg_temp prioritized for accuracy
  • C-States: Power management state distribution (C0-C10)
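
As an illustration of the CPU load metric, the 1/5/15 minute averages can be read from /proc/loadavg on Linux. Whether the agent uses this file is an assumption; the function names below are hypothetical.

```rust
/// Parse the three load averages from the contents of /proc/loadavg,
/// for example "1.05 0.96 0.58 2/1234 5678".
/// Returns None if fewer than three numeric fields are present.
pub fn parse_loadavg(contents: &str) -> Option<(f64, f64, f64)> {
    let mut fields = contents.split_whitespace();
    let one: f64 = fields.next()?.parse().ok()?;
    let five: f64 = fields.next()?.parse().ok()?;
    let fifteen: f64 = fields.next()?.parse().ok()?;
    Some((one, five, fifteen))
}

/// Read the current load averages. Returns None if the file is missing
/// (for example on non-Linux systems) or malformed.
pub fn read_loadavg() -> Option<(f64, f64, f64)> {
    let contents = std::fs::read_to_string("/proc/loadavg").ok()?;
    parse_loadavg(&contents)
}
```

Keeping the parser separate from the file read makes it testable without a Linux host.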

Service Collector

  • System Metrics as Services: CPU Load, Service Memory, Disk Usage, CPU Temperature displayed as individual service rows
  • Systemd Services: Auto-discovery of interesting services with resource monitoring
  • Nginx Site Monitoring: Individual rows for each nginx virtual host with tree structure (├─ and └─)
  • Resource Usage: Per-service memory, CPU, and disk consumption
  • Service Health: Running/stopped/degraded status with detailed failure info
  • Connection Tracking: SSH connections, database connections as description lines

SMART Collector

  • NVMe Health: Temperature, wear leveling, spare blocks
  • Drive Capacity: Total/used space with percentage
  • SMART Attributes: Critical health indicators

Backup Collector

  • Restic Integration: Backup status and history
  • Health Monitoring: Success/failure tracking
  • Storage Metrics: Backup size and retention

Keyboard Controls

Key           Action
← / h         Previous host
→ / l / Tab   Next host
?             Toggle help overlay
r             Force refresh
q / Esc       Quit

Email Notifications

Notification Triggers

  • Status Degradation: Any status change to warning/critical
  • Recovery: Warning/critical status returning to ok
  • Service Failures: Individual service stop/start events
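
The first two triggers can be expressed as a function of the previous and current status. This is a sketch under the rules stated above (any change to warning/critical is a degradation, any change back to ok is a recovery); the type and function names are hypothetical, and service start/stop events are not covered.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Trigger {
    Degradation,
    Recovery,
}

/// Decide whether a status transition should produce a notification.
/// An unchanged status produces none.
pub fn trigger_for(previous: Status, current: Status) -> Option<Trigger> {
    match (previous, current) {
        (a, b) if a == b => None,
        (_, Status::Ok) => Some(Trigger::Recovery),
        (_, Status::Warning | Status::Critical) => Some(Trigger::Degradation),
    }
}
```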

Example Recovery Email

✅ RESOLVED: system cpu on cmbox

Status Change Alert

Host: cmbox
Component: system
Metric: cpu
Status Change: warning → ok
Time: 2025-10-12 22:15:30 CET

Details:
Recovered from: CPU load (1/5/15min): 6.20 / 5.80 / 4.50
Current status: CPU load (1/5/15min): 3.30 / 3.17 / 2.84

--
CM Dashboard Agent
Generated at 2025-10-12 22:15:30 CET

Rate Limiting

  • Default: 30 minutes between notifications per component
  • Testing: Set to 0 for immediate notifications
  • Configurable: Adjustable per deployment needs
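
A per-component rate limiter of this kind can be sketched with a map from component name to the time of the last notification. This is an illustration, not the agent's implementation: the `RateLimiter` type is hypothetical, and the current time is passed in explicitly so the logic can be tested.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Allows at most one notification per component per `min_interval`.
pub struct RateLimiter {
    min_interval: Duration,
    last_sent: HashMap<String, Instant>,
}

impl RateLimiter {
    pub fn new(min_interval: Duration) -> Self {
        Self { min_interval, last_sent: HashMap::new() }
    }

    /// Returns true, and records `now`, if a notification for `component`
    /// may be sent. A blocked attempt does not reset the timer.
    pub fn allow(&mut self, component: &str, now: Instant) -> bool {
        let blocked = self
            .last_sent
            .get(component)
            .is_some_and(|&sent| now.duration_since(sent) < self.min_interval);
        if blocked {
            return false;
        }
        self.last_sent.insert(component.to_string(), now);
        true
    }
}
```

Constructing the limiter with `Duration::from_secs(30 * 60)` matches the default above; an interval of zero never blocks, which corresponds to the testing setting.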

Development

Project Structure

cm-dashboard/
├── agent/                 # Monitoring agent
│   ├── src/
│   │   ├── collectors/    # Data collection modules
│   │   ├── notifications.rs # Email notification system
│   │   └── simple_agent.rs # Main agent logic
├── dashboard/             # TUI dashboard
│   ├── src/
│   │   ├── ui/           # Widget implementations
│   │   ├── data/         # Data structures
│   │   └── app.rs        # Application state
├── shared/               # Common data structures
└── config/              # Configuration files

Development Commands

# Format code
cargo fmt

# Check all packages
cargo check

# Run tests
cargo test

# Build release
cargo build --release

# Run with logging
RUST_LOG=debug cargo run -p cm-dashboard-agent

Architecture Principles

Status Calculation Rules

  • Agent calculates all status using predefined thresholds
  • Dashboard never calculates status - only displays agent data
  • No hardcoded thresholds in dashboard widgets
  • Use "unknown" when the agent status is missing (never default to "ok")
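
On the dashboard side, these rules amount to a pure mapping from the agent-provided status to a display state, with no thresholds involved. The sketch below assumes the agent sends the status as a lowercase string such as "ok"; that wire format and the names used are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DisplayStatus {
    Ok,
    Warning,
    Critical,
    Unknown,
}

/// Map the agent-provided status to a display state. A missing or
/// unrecognized status becomes Unknown; it never falls back to Ok.
pub fn display_status(agent_status: Option<&str>) -> DisplayStatus {
    match agent_status {
        Some("ok") => DisplayStatus::Ok,
        Some("warning") => DisplayStatus::Warning,
        Some("critical") => DisplayStatus::Critical,
        _ => DisplayStatus::Unknown,
    }
}
```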

Data Flow

System Metrics → Agent Collectors → Status Calculation → ZMQ → Dashboard → Display
                                         ↓
                                 Email Notifications

Pure Auto-Discovery

  • No config files required for basic operation
  • Runtime discovery of system capabilities
  • Service auto-detection via systemd patterns
  • Storage device enumeration via /sys filesystem
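
Storage enumeration via /sys can be sketched by listing /sys/block, which contains one entry per whole block device on Linux. The name filter below (nvme and sd prefixes) and the function names are assumptions for illustration; the agent's actual selection rules are not described here.

```rust
use std::fs;
use std::io;

/// Keep only device names that look like NVMe or SATA/SCSI disks.
/// The prefixes are an assumption chosen for this example.
pub fn is_monitored_drive(name: &str) -> bool {
    name.starts_with("nvme") || name.starts_with("sd")
}

/// List matching block devices under /sys/block (Linux only),
/// sorted by name.
pub fn discover_drives() -> io::Result<Vec<String>> {
    let mut drives = Vec::new();
    for entry in fs::read_dir("/sys/block")? {
        let name = entry?.file_name().to_string_lossy().into_owned();
        if is_monitored_drive(&name) {
            drives.push(name);
        }
    }
    drives.sort();
    Ok(drives)
}
```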

Troubleshooting

Common Issues

Agent Won't Start

# Check permissions (agent requires root)
sudo cm-dashboard-agent -v

# Verify ZMQ binding
sudo netstat -tulpn | grep 6130

# Check system access
sudo smartctl --scan

Dashboard Connection Issues

# Test ZMQ connectivity
cm-dashboard --zmq-endpoint tcp://target-host:6130 -v

# Check network connectivity
telnet target-host 6130

Email Notifications Not Working

# Check postfix status
sudo systemctl status postfix

# Test SMTP manually
telnet localhost 25

# Verify notification settings
sudo cm-dashboard-agent -v 2>&1 | grep -i notification

Logging

Set RUST_LOG=debug for detailed logging:

sudo RUST_LOG=debug cm-dashboard-agent
RUST_LOG=debug cm-dashboard

License

MIT License - see LICENSE file for details.

Contributing

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open Pull Request

For bugs and feature requests, please use GitHub Issues.

NixOS Integration

Updating cm-dashboard in NixOS Configuration

When new code is pushed to the cm-dashboard repository, follow these steps to update the NixOS configuration:

1. Get the Latest Commit Hash

# Get the latest commit from the API
curl -s "https://gitea.cmtec.se/api/v1/repos/cm/cm-dashboard/commits?sha=main&limit=1" | head -20

# Or use git
git log --oneline -1

2. Update the NixOS Configuration

Edit hosts/common/cm-dashboard.nix and update the rev field:

src = pkgs.fetchFromGitea {
  domain = "gitea.cmtec.se";
  owner = "cm";
  repo = "cm-dashboard";
  rev = "f786d054f2ece80823f85e46933857af96e241b2";  # Update this
  hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";  # Reset temporarily
};

3. Get the Correct Hash

Build with the placeholder hash. The build fails with a hash mismatch and reports the actual hash on the "got:" line:

nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchFromGitea { 
  domain = "gitea.cmtec.se"; 
  owner = "cm"; 
  repo = "cm-dashboard"; 
  rev = "YOUR_COMMIT_HASH"; 
  hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="; 
}' 2>&1 | grep "got:"

4. Update the Hash

Replace the placeholder with the correct hash:

hash = "sha256-vjy+j91iDCHUf0RE43anK4WZ+rKcyohP/3SykwZGof8=";  # Use actual hash

5. Update Cargo Dependencies (if needed)

If Cargo.lock has changed, you may need to update cargoHash:

# Build to get cargo hash error
nix-build --no-out-link --expr 'with import <nixpkgs> {}; rustPlatform.buildRustPackage rec { 
  pname = "cm-dashboard"; 
  version = "0.1.0"; 
  src = fetchFromGitea { 
    domain = "gitea.cmtec.se"; 
    owner = "cm"; 
    repo = "cm-dashboard"; 
    rev = "YOUR_COMMIT_HASH"; 
    hash = "YOUR_SOURCE_HASH"; 
  }; 
  cargoHash = ""; 
  nativeBuildInputs = [ pkg-config ]; 
  buildInputs = [ openssl ]; 
  buildAndTestSubdir = "."; 
  cargoBuildFlags = [ "--workspace" ]; 
}' 2>&1 | grep "got:"

Then update cargoHash in the configuration.

6. Commit the Changes

git add hosts/common/cm-dashboard.nix
git commit -m "Update cm-dashboard to latest version"
git push

Example Update Process

# 1. Get latest commit
LATEST_COMMIT=$(curl -s "https://gitea.cmtec.se/api/v1/repos/cm/cm-dashboard/commits?sha=main&limit=1" | jq -r '.[0].sha')

# 2. Get source hash
SOURCE_HASH=$(nix-build --no-out-link -E "with import <nixpkgs> {}; fetchFromGitea { domain = \"gitea.cmtec.se\"; owner = \"cm\"; repo = \"cm-dashboard\"; rev = \"$LATEST_COMMIT\"; hash = \"sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=\"; }" 2>&1 | grep "got:" | awk '{print $2}')

# 3. Update configuration and commit
echo "Latest commit: $LATEST_COMMIT"
echo "Source hash: $SOURCE_HASH"