cm-dashboard/README.md
Christoffer Martinsson 2581435b10 Implement per-service disk usage monitoring
Replaced system-wide disk usage with accurate per-service tracking by scanning
service-specific directories. Services like sshd now correctly show minimal
disk usage instead of misleading system totals.

- Rename storage widget and add drive capacity/usage columns
- Move host display to main dashboard title for cleaner layout
- Replace separate alert displays with color-coded row highlighting
- Add per-service disk usage collection using du command
- Update services widget formatting to handle small disk values
- Restructure into workspace with dedicated agent and dashboard packages
2025-10-11 22:59:16 +02:00


# CM Dashboard
CM Dashboard is a Rust-powered terminal UI for real-time monitoring of CMTEC infrastructure hosts. It subscribes to the CMTEC ZMQ gossip network where lightweight agents publish SMART, service, and backup metrics, and presents them in an efficient, keyboard-driven interface built with `ratatui`.
```
┌──────────────────────────────────────────────────────────────────────────────┐
│ CM Dashboard                                                                 │
├────────────────────────────┬────────────────────────────┬────────────────────┤
│ NVMe Health                │ Services                   │ CPU / Memory       │
│ Host: srv01                │ Host: srv01                │ Host: srv01        │
│ Status: Healthy            │ Service memory: 1.2G/4.0G  │ RAM: 6.9 / 7.8 GiB │
│ Healthy/Warning/Critical:  │ Disk usage: 45 / 500 GiB   │ CPU load (1/5/15): │
│   4 / 0 / 0                │ Services tracked: 8        │   1.2 0.9 0.7      │
│ Capacity used: 512 / 2048G │                            │ CPU temp: 68°C     │
│ Issue: —                   │ nginx       running  320M  │ GPU temp: —        │
│                            │ immich      running  1.2G  │ Status • ok        │
│                            │ backup-api  running   40M  │                    │
├────────────────────────────┴────────────┬───────────────┴────────────────────┤
│ Backups                                 │ Alerts                             │
│ Host: srv01                             │ srv01: ok                          │
│ Overall: Healthy                        │ labbox: warning: RAM 82%           │
│ Last success: 2024-02-01 03:12:45       │ cmbox: critical: CPU temp 92°C     │
│ Snapshots: 17 • Size: 512.0 GiB         │ Update: 2024-02-01 10:15:32        │
│ Pending jobs: 0 (enabled: true)         │                                    │
└──────────────────────────────┬───────────────────────────────────────────────┘
│ Status                       │                                               │
│ Active host: srv01 (1/3)     │ History retention ≈ 3600s                     │
│ Config: config/dashboard.toml│ Default host: labbox                          │
└──────────────────────────────┴───────────────────────────────────────────────┘
```
## Requirements
- Rust toolchain 1.75+ (install via [`rustup`](https://rustup.rs))
- Network access to the CMTEC metrics gossip agents (default `tcp://<host>:6130`; install `zeromq`/`libzmq` on the host)
- Configuration files under `config/` describing hosts and dashboard preferences
## Installation
Clone the repository and build with Cargo:
```bash
git clone https://github.com/cmtec/cm-dashboard.git
cd cm-dashboard
cargo build --release
```
The optimized binary is available at `target/release/cm-dashboard`. To install into your Cargo bin directory:
```bash
cargo install --path dashboard
```
## Configuration
On first launch, the dashboard will create `config/dashboard.toml` and `config/hosts.toml` automatically if they do not exist.
You can also generate starter configuration files manually with the built-in helper:
```bash
cargo run -p cm-dashboard -- init-config
# or, once installed
cm-dashboard init-config --dir ./config --force
```
This produces `config/dashboard.toml` and `config/hosts.toml`. The primary dashboard config looks like:
```toml
[hosts]
default_host = "srv01"
[[hosts.hosts]]
name = "srv01"
enabled = true
[[hosts.hosts]]
name = "labbox"
enabled = true
[dashboard]
tick_rate_ms = 250
history_duration_minutes = 60
[[dashboard.widgets]]
id = "nvme"
enabled = true
[[dashboard.widgets]]
id = "alerts"
enabled = true
[data_source]
kind = "zmq"
[data_source.zmq]
endpoints = ["tcp://127.0.0.1:6130"]
```
Adjust the host list and `data_source.zmq.endpoints` to match your CMTEC gossip network. If you prefer to manage hosts separately, edit the generated `hosts.toml` file.
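
How the dashboard chooses its starting host from these settings is not spelled out here. The sketch below assumes one plausible precedence: a `--host` override first, then `default_host`, then the first enabled entry. The `HostEntry` type and `resolve_active_host` function are hypothetical names for illustration, not taken from the code base.

```rust
/// Hypothetical mirror of one `[[hosts.hosts]]` entry from dashboard.toml.
#[derive(Debug, Clone, PartialEq)]
pub struct HostEntry {
    pub name: String,
    pub enabled: bool,
}

/// Assumed precedence: CLI override, then `default_host`, then the first
/// enabled host. Unknown or disabled names are skipped at every step.
pub fn resolve_active_host<'a>(
    hosts: &'a [HostEntry],
    default_host: Option<&str>,
    cli_host: Option<&str>,
) -> Option<&'a str> {
    // Look up a host by name, accepting it only if it is enabled.
    let find_enabled = |wanted: &str| {
        hosts
            .iter()
            .find(|h| h.enabled && h.name == wanted)
            .map(|h| h.name.as_str())
    };
    cli_host
        .and_then(|w| find_enabled(w))
        .or_else(|| default_host.and_then(|w| find_enabled(w)))
        .or_else(|| hosts.iter().find(|h| h.enabled).map(|h| h.name.as_str()))
}
```

Under these assumptions, a `default_host` that points at a disabled entry falls through to the first enabled host rather than failing.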
## Features
- Host rotation from the keyboard: `←`/`h` for the previous host, `→`/`l`/`Tab` for the next
- Live NVMe, service, CPU/memory, backup, and alert panels per host
- Health scoring that rolls CPU/RAM/GPU pressure into alerts automatically
- Structured logging with `tracing` (`-v`/`-vv` to increase verbosity)
- Help overlay (`?`) outlining keyboard shortcuts
- Config-driven host discovery via `config/dashboard.toml`
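
The health scoring mentioned above could be sketched as "classify each reading, then let the worst severity win". The thresholds (80/95 % RAM, 80/90 °C) and the names `Severity`, `classify`, and `host_status` are illustrative assumptions; the real scoring rules may differ.

```rust
/// Severity levels, declared in ascending order so that `max` picks the worst.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Ok,
    Warning,
    Critical,
}

/// Classify one reading against warning/critical thresholds (both inclusive).
pub fn classify(value: f64, warn_at: f64, crit_at: f64) -> Severity {
    if value >= crit_at {
        Severity::Critical
    } else if value >= warn_at {
        Severity::Warning
    } else {
        Severity::Ok
    }
}

/// Roll several readings into one host status: the worst severity wins.
/// Missing readings (for example, no GPU) are passed as `None` and ignored.
/// All thresholds here are assumptions chosen for the example.
pub fn host_status(
    ram_pct: Option<f64>,
    cpu_temp_c: Option<f64>,
    gpu_temp_c: Option<f64>,
) -> Severity {
    let checks = [
        ram_pct.map(|v| classify(v, 80.0, 95.0)),
        cpu_temp_c.map(|v| classify(v, 80.0, 90.0)),
        gpu_temp_c.map(|v| classify(v, 80.0, 90.0)),
    ];
    checks
        .iter()
        .flatten()
        .copied()
        .max()
        .unwrap_or(Severity::Ok)
}
```

Deriving `Ord` on the enum keeps the roll-up to a single `max()` call, and a host with no readings at all reports `Ok` in this sketch.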
## Getting Started
```bash
cargo run -p cm-dashboard -- --config config/dashboard.toml
# specify a single host
cargo run -p cm-dashboard -- --host srv01
# override ZMQ endpoints at runtime
cargo run -p cm-dashboard -- --zmq-endpoint tcp://srv01:6130,tcp://labbox:6130
# increase logging verbosity
cargo run -p cm-dashboard -- -v
```
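
The `--zmq-endpoint` flag above takes a comma-separated list. A minimal sketch of splitting and validating such a value follows; `parse_endpoints` is a hypothetical helper, and restricting input to `tcp://host:port` is an assumption (ZeroMQ itself supports other transports).

```rust
/// Split a comma-separated `--zmq-endpoint` value into individual endpoints.
/// Assumption: only `tcp://host:port` endpoints are accepted in this sketch.
pub fn parse_endpoints(arg: &str) -> Result<Vec<String>, String> {
    let mut endpoints = Vec::new();
    for raw in arg.split(',') {
        let ep = raw.trim();
        if ep.is_empty() {
            continue; // tolerate stray or trailing commas
        }
        let rest = ep
            .strip_prefix("tcp://")
            .ok_or_else(|| format!("unsupported endpoint `{}`: expected tcp://host:port", ep))?;
        // The port is whatever follows the last ':' in the remainder.
        let (host, port) = rest
            .rsplit_once(':')
            .ok_or_else(|| format!("endpoint `{}` is missing a port", ep))?;
        if host.is_empty() || port.parse::<u16>().is_err() {
            return Err(format!("endpoint `{}` has an invalid host or port", ep));
        }
        endpoints.push(ep.to_string());
    }
    if endpoints.is_empty() {
        return Err("no endpoints given".to_string());
    }
    Ok(endpoints)
}
```

Rejecting malformed values up front gives a clear error at startup instead of a silent subscription that never receives data.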
### Keyboard Shortcuts
| Key | Action |
| --- | --- |
| `←` / `h` | Previous host |
| `→` / `l` / `Tab` | Next host |
| `?` | Toggle help overlay |
| `r` | Update status message |
| `q` / `Esc` | Quit |
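
The previous/next host keys can be modelled as index arithmetic over the enabled host list. This sketch assumes the selection wraps around at both ends; `next_host`, `previous_host`, and `host_label` are hypothetical helpers, with the label format modelled on the `srv01 (1/3)` text in the mock-up.

```rust
/// Index of the next host, wrapping to the first after the last.
pub fn next_host(current: usize, host_count: usize) -> usize {
    if host_count == 0 {
        0
    } else {
        (current + 1) % host_count
    }
}

/// Index of the previous host, wrapping to the last before the first.
pub fn previous_host(current: usize, host_count: usize) -> usize {
    if host_count == 0 {
        0
    } else {
        // Adding `host_count` before subtracting avoids usize underflow.
        (current % host_count + host_count - 1) % host_count
    }
}

/// Status-bar label such as "srv01 (1/3)", using a 1-based position.
pub fn host_label(names: &[&str], current: usize) -> Option<String> {
    names
        .get(current)
        .map(|name| format!("{} ({}/{})", name, current + 1, names.len()))
}
```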
## Agent
The metrics agent publishes SMART, service, and backup data to the gossip network. Run it on each host, either directly or as a systemd/NixOS service, and point the dashboard at its endpoint. Example:
```bash
cargo run -p cm-dashboard-agent -- --hostname srv01 --bind tcp://*:6130 --interval-ms 5000
```
Use `--disable-*` flags to skip collectors when a host doesn't expose those metrics.
## Development
- Format: `cargo fmt`
- Check workspace: `cargo check`
- Build release binaries: `cargo build --release`

The dashboard subscribes to the CMTEC ZMQ gossip network (default `tcp://127.0.0.1:6130`). Received metrics are cached per host and retained in an in-memory ring buffer for future trend analysis.
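
A per-host cache with bounded history can be sketched with `VecDeque`, the standard library's growable ring buffer. This version evicts by age, which is an assumption suggested by `history_duration_minutes`; the actual buffer may use a fixed capacity instead. `Sample` and `MetricHistory` are hypothetical names.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// One cached reading; `value` stands in for the real metric payload.
#[derive(Debug, Clone, PartialEq)]
pub struct Sample {
    pub at: Instant,
    pub value: f64,
}

/// Per-host history that drops samples older than `retention`.
pub struct MetricHistory {
    retention: Duration,
    per_host: HashMap<String, VecDeque<Sample>>,
}

impl MetricHistory {
    pub fn new(retention: Duration) -> Self {
        Self { retention, per_host: HashMap::new() }
    }

    /// Append a sample for `host`, then evict expired samples from the front.
    pub fn push(&mut self, host: &str, at: Instant, value: f64) {
        let retention = self.retention;
        let buf = self.per_host.entry(host.to_string()).or_default();
        buf.push_back(Sample { at, value });
        while buf
            .front()
            .map_or(false, |s| at.saturating_duration_since(s.at) > retention)
        {
            buf.pop_front();
        }
    }

    /// Most recent sample for `host`, if any.
    pub fn latest(&self, host: &str) -> Option<&Sample> {
        self.per_host.get(host).and_then(|buf| buf.back())
    }

    /// Number of retained samples for `host`.
    pub fn sample_count(&self, host: &str) -> usize {
        self.per_host.get(host).map_or(0, |buf| buf.len())
    }
}
```

Passing the timestamp into `push` instead of calling `Instant::now()` inside it keeps eviction deterministic, which makes the retention logic straightforward to test.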