Update README with actual dashboard interface and implementation details

2025-10-21 20:36:03 +02:00 · 0417e2c1f1
commit 0417e2c1f1
parent a08670071c
1 changed file with 312 additions and 451 deletions
--- a/README.md
+++ b/README.md
@@ -1,544 +1,405 @@
-# CM Dashboard - Infrastructure Monitoring TUI
+# CM Dashboard
-A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure. Built to replace Glance with a custom solution tailored for specific monitoring needs and API integrations. Features real-time monitoring of all infrastructure components with intelligent email notifications and automatic status calculation.
+A real-time infrastructure monitoring system with intelligent status aggregation and email notifications, built with Rust and ZMQ.
 ## Current Implementation
 This is a complete rewrite implementing an **individual metrics architecture** where:
 - **Agent** collects individual metrics (e.g., `cpu_load_1min`, `memory_usage_percent`) and calculates status
 - **Dashboard** subscribes to specific metrics and composes widgets
 - **Status Aggregation** provides intelligent email notifications with batching
 - **Persistent Cache** prevents false notifications on restart
 ## Dashboard Interface
 ### System Widget
 ```
-┌System───────────────────────────────────────────────────────┐
+cm-dashboard • ● cmbox ● srv01 ● srv02 ● steambox
-│  Memory usage                                               │
+┌system───────────────────────────────────────────┐┌services────────────────────────────────────────────────────┐
-│✔ 3.0 / 7.8 GB                                               │
+│CPU:                                             ││Service:                  Status:    RAM:     Disk:         │
-│  CPU load            CPU temp                               │
+│● Load: 0.10 0.52 0.88 • 400.0 MHz               ││● docker                  active     27M      496MB         │
-│✔ 1.05 • 0.96 • 0.58  64.0°C                                 │
+│RAM:                                             ││● docker-registry         active     19M      496MB         │
-│  C1E    C3     C6     C8     C9     C10                     │
+│● Used: 30% 2.3GB/7.6GB                          ││● gitea                   active     579M     2.6GB         │
-│✔ 0.5%   0.5%   10.4%  10.2%  0.4%   77.9%                   │
+│● tmp: 0.0% 0B/2.0GB                             ││● gitea-runner-default    active     11M      2.6GB         │
-│  GPU load  GPU temp                                         │
+│Disk nvme0n1:                                    ││● haasp-core              active     9M       1MB           │
-│✔ —         —                                                │
+│● Health: PASSED                                 ││● haasp-mqtt              active     3M       1MB           │
-└─────────────────────────────────────────────────────────────┘
+│● Usage @root: 8.3% • 75.4/906.2 GB              ││● haasp-webgrid           active     10M      1MB           │
 │● Usage @boot: 5.9% • 0.1/1.0 GB                 ││● immich-server           active     240M     45.1GB        │
 │                                                 ││● mosquitto               active     1M       1MB           │
 │                                                 ││● mysql                   active     38M      225MB         │
 │                                                 ││● nginx                   active     28M      24MB          │
 │                                                 ││  ├─ ● gitea.cmtec.se     51ms                              │
 │                                                 ││  ├─ ● haasp.cmtec.se     43ms                              │
 │                                                 ││  ├─ ● haasp.net          43ms                              │
 │                                                 ││  ├─ ● pages.cmtec.se     45ms                              │
 └─────────────────────────────────────────────────┘│  ├─ ● photos.cmtec.se    41ms                              │
 ┌backup───────────────────────────────────────────┐│  ├─ ● unifi.cmtec.se     46ms                              │
 │Latest backup:                                   ││  ├─ ● vault.cmtec.se     47ms                              │
 │● Status: OK                                     ││  ├─ ● www.kryddorten.se  81ms                              │
 │Duration: 54s • Last: 4h ago                     ││  ├─ ● www.mariehall2.se  86ms                              │
 │Disk usage: 48.2GB/915.8GB                       ││● postgresql              active     112M     357MB         │
 │P/N: Samsung SSD 870 QVO 1TB                     ││● redis-immich            active     8M       45.1GB        │
 │S/N: S5RRNF0W800639Y                             ││● sshd                    active     2M       0             │
 │● gitea 2 archives 2.7GB                         ││● unifi                   active     594M     495MB         │
 │● immich 2 archives 45.0GB                       ││● vaultwarden             active     12M      1MB           │
 │● kryddorten 2 archives 67.6MB                   ││                                                            │
 │● mariehall2 2 archives 321.8MB                  ││                                                            │
 │● nixosbox 2 archives 4.5MB                      ││                                                            │
 │● unifi 2 archives 2.9MB                         ││                                                            │
 │● vaultwarden 2 archives 305kB                   ││                                                            │
 └─────────────────────────────────────────────────┘└────────────────────────────────────────────────────────────┘
 ```
-### Services Widget (Enhanced)
+**Navigation**: `←→` switch hosts, `r` refresh, `q` quit
 ```
 ┌Services────────────────────────────────────────────────────┐
 │  Service          Memory (GB)  CPU    Disk                 │
 │✔ Service Memory   7.1/23899.7 MiB     —                   │
 │✔ Disk Usage       —           —       45/100 GB           │
 │⚠ CPU Load         —           2.18    —                   │
 │✔ CPU Temperature  —           47.0°C  —                   │
 │✔ docker-registry  0.0 GB       0.0%   <1 MB               │
 │✔ gitea            0.4/4.1 GB   0.2%   970 MB               │
 │  1 active connections                                      │
 │✔ nginx            0.0/1.0 GB   0.0%   <1 MB                │
 │✔  ├─ docker.cmtec.se                                      │
 │✔  ├─ git.cmtec.se                                         │
 │✔  ├─ gitea.cmtec.se                                       │
 │✔  ├─ haasp.cmtec.se                                       │
 │✔  ├─ pages.cmtec.se                                       │
 │✔  └─ www.kryddorten.se                                    │
 │✔ postgresql       0.1 GB       0.0%   378 MB               │
 │  1 active connections                                      │
 │✔ redis-immich     0.0 GB       0.4%   <1 MB                │
 │✔ sshd             0.0 GB       0.0%   <1 MB                │
 │  1 SSH connection                                          │
 │✔ unifi            0.9/2.0 GB   0.4%   391 MB               │
 └────────────────────────────────────────────────────────────┘
 ```
-### Storage Widget
+## Features
 ```
 ┌Storage──────────────────────────────────────────────────────┐
 │  Drive    Temp   Wear   Spare  Hours  Capacity  Usage       │
 │✔ nvme0n1  57°C   4%     100%   11463  932G      23G (2%)    │
 │                                                             │
 └─────────────────────────────────────────────────────────────┘
 ```
-### Backups Widget
+- **Real-time monitoring** - Dashboard updates every 1-2 seconds
-```
+- **Individual metric collection** - Granular data for flexible dashboard composition  
-┌Backups──────────────────────────────────────────────────────┐
+- **Intelligent status aggregation** - Host-level status calculated from all services
-│  Backup  Status  Details                                    │
+- **Smart email notifications** - Batched, detailed alerts with service groupings
-│✔ Latest  3h ago  1.4 GiB                                    │
+- **Persistent state** - Prevents false notifications on restarts
-│  8 archives, 2.4 GiB total                                  │
+- **ZMQ communication** - Efficient agent-to-dashboard messaging
-│✔ Disk    ok      2.4/468 GB (1%)                            │
+- **Clean TUI** - Terminal-based dashboard with color-coded status indicators
 └─────────────────────────────────────────────────────────────┘
 ```
 ### Hosts Widget
 ```
 ┌Hosts────────────────────────────────────────────────────────┐
 │  Host    Status            Timestamp                        │
 │✔ cmbox   ok                2025-10-13 05:45:28              │
 │✔ srv01   ok                2025-10-13 05:45:28              │
 │? labbox  No data received  —                                │
 └─────────────────────────────────────────────────────────────┘
 ```
 **Navigation**: `←→` hosts, `r` refresh, `q` quit
 ## Key Features
 ### Real-time Monitoring
 - **Multi-host support** for cmbox, labbox, simonbox, steambox, srv01
 - **Performance-focused** with minimal resource usage
 - **Keyboard-driven interface** for power users
 - **ZMQ gossip network** for efficient data distribution
 ### Infrastructure Monitoring
 - **NVMe health monitoring** with wear prediction and temperature tracking
 - **CPU/Memory/GPU telemetry** with automatic thresholding
 - **Service resource monitoring** with per-service CPU and RAM usage
 - **Disk usage overview** for root filesystems
 - **Backup status** with detailed metrics and history
 - **C-state monitoring** for CPU power management analysis
 ### Intelligent Alerting
 - **Agent-calculated status** with predefined thresholds
 - **Email notifications** via SMTP with rate limiting
 - **Recovery notifications** with context about original issues
 - **Stockholm timezone** support for email timestamps
 - **Unified alert pipeline** summarizing host health
 ## Architecture
-### Agent-Dashboard Separation
+### Core Components
 The system follows a strict separation of concerns:
- **Agent**: Single source of truth for all status calculations using defined thresholds
+- **Agent** (`cm-dashboard-agent`) - Collects metrics and sends via ZMQ
- **Dashboard**: Display-only interface that shows agent-provided status
+- **Dashboard** (`cm-dashboard`) - Real-time TUI display consuming metrics  
- **Data Flow**: Agent (calculations) → Status → Dashboard (display) → Colors
+- **Shared** (`cm-dashboard-shared`) - Common types and protocol
 - **Status Aggregation** - Intelligent batching and notification management
 - **Persistent Cache** - Maintains state across restarts
-### Agent Thresholds (Production)
+### Status Levels
 - **CPU Load**: Warning ≥ 5.0, Critical ≥ 8.0
 - **Memory Usage**: Warning ≥ 80%, Critical ≥ 95%
 - **CPU Temperature**: Warning ≥ 100°C, Critical ≥ 100°C (effectively disabled)
-### Email Notification System
+- **🟢 Ok** - Service running normally
- **From**: `{hostname}@cmtec.se` (e.g., cmbox@cmtec.se)
+- **🔵 Pending** - Service starting/stopping/reloading  
- **To**: `cm@cmtec.se`
+- **🟡 Warning** - Service issues (high load, memory, disk usage)
- **SMTP**: localhost:25 (postfix)
+- **🔴 Critical** - Service failed or critical thresholds exceeded
- **Rate Limiting**: 30 minutes (configurable)
+- **❓ Unknown** - Service state cannot be determined
 - **Triggers**: Status degradation and recovery with detailed context
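The threshold rules above (for example, CPU load: warning at 5.0, critical at 8.0) can be read as a simple comparison. The following is a minimal sketch under that reading, not the agent's actual code: `classify` is a hypothetical helper, and the three-variant `Status` enum is a trimmed-down assumption that leaves out the pending and unknown levels.

```rust
/// Trimmed-down status enum for this sketch (assumption: the real type has more variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// Hypothetical helper: maps a reading to a status using "greater than or equal" thresholds.
/// The critical threshold is checked first, so it wins when both thresholds are met.
pub fn classify(value: f32, warning: f32, critical: f32) -> Status {
    if value >= critical {
        Status::Critical
    } else if value >= warning {
        Status::Warning
    } else {
        Status::Ok
    }
}
```

With the CPU load thresholds listed above, `classify(2.18, 5.0, 8.0)` yields `Status::Ok` and `classify(6.2, 5.0, 8.0)` yields `Status::Warning`. When both thresholds are equal, as with the temperature values, this sketch never produces a warning because the critical check runs first.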
 ## Installation
 ### Requirements
 - Rust toolchain 1.75+ (install via [`rustup`](https://rustup.rs))
 - Root privileges for agent (hardware monitoring access)
 - Network access for ZMQ communication (default port 6130)
 - SMTP server for notifications (postfix recommended)
 ### Build from Source
 ```bash
 git clone https://github.com/cmtec/cm-dashboard.git
 cd cm-dashboard
 cargo build --release
 ```
 Optimized binaries are available at:
 - Dashboard: `target/release/cm-dashboard`
 - Agent: `target/release/cm-dashboard-agent`
 ### Installation
 ```bash
 # Install dashboard
 cargo install --path dashboard
 # Install agent (requires root for hardware access)
 sudo cargo install --path agent
 ```
 ## Quick Start
-### Dashboard
+### Build
 ```bash
 # Run with default configuration
 cm-dashboard
 # Specify host to monitor
 cm-dashboard --host cmbox
 # Override ZMQ endpoints
 cm-dashboard --zmq-endpoint tcp://srv01:6130,tcp://labbox:6130
 # Increase logging verbosity
 cm-dashboard -v
 ```
 ### Agent (Pure Auto-Discovery)
 The agent requires **no configuration files** and auto-discovers all system components:
 ```bash
-# Basic agent startup (auto-detects everything)
+# With Nix (recommended)
-sudo cm-dashboard-agent
+nix-shell -p openssl pkg-config --run "cargo build --workspace"
-# With verbose logging for troubleshooting
+# Or with system dependencies
-sudo cm-dashboard-agent -v
+sudo apt install libssl-dev pkg-config  # Ubuntu/Debian
 cargo build --workspace
 ```
-The agent automatically:
+### Run
- **Discovers storage devices** for SMART monitoring
+
- **Detects running systemd services** for resource tracking
+```bash
- **Configures collection intervals** based on system capabilities
+# Start agent (requires configuration file)
- **Sets up email notifications** using hostname@cmtec.se
+./target/debug/cm-dashboard-agent --config /etc/cm-dashboard/agent.toml
 # Start dashboard 
 ./target/debug/cm-dashboard --config /path/to/dashboard.toml
 ```
 ## Configuration
-### Dashboard Configuration
+### Agent Configuration (`agent.toml`)
-The dashboard creates `config/dashboard.toml` on first run:
+
 The agent requires a comprehensive TOML configuration file:
 ```toml
-[hosts]
+collection_interval_seconds = 2
 default_host = "srv01"
-[[hosts.hosts]]
+[zmq]
-name = "srv01"
+publisher_port = 6130
 command_port = 6131
 bind_address = "0.0.0.0"
 timeout_ms = 5000
 heartbeat_interval_ms = 30000
 [collectors.cpu]
 enabled = true
 interval_seconds = 2
 load_warning_threshold = 9.0
 load_critical_threshold = 10.0
 temperature_warning_threshold = 100.0
 temperature_critical_threshold = 110.0
-[[hosts.hosts]]
+[collectors.memory]
 name = "cmbox"
 enabled = true
 interval_seconds = 2
 usage_warning_percent = 80.0
 usage_critical_percent = 95.0
-[dashboard]
+[collectors.disk]
-tick_rate_ms = 250
+enabled = true
-history_duration_minutes = 60
+interval_seconds = 300
 usage_warning_percent = 80.0
 usage_critical_percent = 90.0
-[data_source]
+[[collectors.disk.filesystems]]
-kind = "zmq"
+name = "root"
 uuid = "4cade5ce-85a5-4a03-83c8-dfd1d3888d79"
 mount_point = "/"
 fs_type = "ext4"
 monitor = true
-[data_source.zmq]
+[collectors.systemd]
-endpoints = ["tcp://127.0.0.1:6130"]
+enabled = true
 interval_seconds = 10
 memory_warning_mb = 1000.0
 memory_critical_mb = 2000.0
 service_name_filters = [
  "nginx", "postgresql", "redis", "docker", "sshd"
 ]
 excluded_services = [
  "nginx-config-reload", "sshd-keygen"
 ]
 [notifications]
 enabled = true
 smtp_host = "localhost"
 smtp_port = 25
 from_email = "{hostname}@example.com"
 to_email = "admin@example.com"
 rate_limit_minutes = 0
 trigger_on_warnings = true
 trigger_on_failures = true
 recovery_requires_all_ok = true
 suppress_individual_recoveries = true
 [status_aggregation]
 enabled = true
 aggregation_method = "worst_case"
 notification_interval_seconds = 30
 [cache]
 persist_path = "/var/lib/cm-dashboard/cache.json"
 ```
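The `aggregation_method = "worst_case"` setting in the configuration above suggests that the host status is the most severe status among its services. A minimal sketch of that idea follows. The severity order (`Ok` < `Pending` < `Unknown` < `Warning` < `Critical`), the function name `aggregate_worst_case`, and the choice to return `Unknown` for an empty input are all assumptions made for illustration, not confirmed behavior of the agent.

```rust
/// Status levels, declared from least to most severe.
/// The derived `Ord` follows declaration order; this ordering is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Ok,
    Pending,
    Unknown,
    Warning,
    Critical,
}

/// Hypothetical worst-case aggregation: the host status is the maximum
/// severity found. With no input at all, the result is `Unknown` rather than `Ok`.
pub fn aggregate_worst_case(statuses: &[Status]) -> Status {
    statuses.iter().copied().max().unwrap_or(Status::Unknown)
}
```

Under these assumptions a single critical service is enough to mark the whole host critical, whatever the other services report.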
-### Agent Configuration (Optional)
+### Dashboard Configuration (`dashboard.toml`)
 The agent works without configuration but supports optional settings:
-```bash
+```toml
-# Generate example configuration
+[zmq]
-cm-dashboard-agent --help
+hosts = [
  { name = "server1", address = "192.168.1.100", port = 6130 },
  { name = "server2", address = "192.168.1.101", port = 6130 }
 ]
 connection_timeout_ms = 5000
 reconnect_interval_ms = 10000
-# Override specific settings
+[ui]
-sudo cm-dashboard-agent \
+refresh_interval_ms = 1000
-    --hostname cmbox \
+theme = "dark"
    --bind tcp://*:6130 \
    --interval 5000
 ```
-## Widget Layout
+## Collectors
-### Services Widget Structure
+The agent implements several specialized collectors:
 The Services widget now displays both system metrics and services in a unified table:
-```
+### CPU Collector (`cpu.rs`)
-┌Services────────────────────────────────────────────────────┐
+- Load average (1, 5, 15 minute)
-│  Service          Memory (GB)  CPU    Disk                 │
+- CPU temperature monitoring
-│✔ Service Memory   7.1/23899.7 MiB     —                   │ ← System metric as service row
+- Real-time process monitoring (top CPU consumers)
-│✔ Disk Usage       —           —       45/100 GB           │ ← System metric as service row  
+- Status calculation with configurable thresholds
 │⚠ CPU Load         —           2.18    —                   │ ← System metric as service row
 │✔ CPU Temperature  —           47.0°C  —                   │ ← System metric as service row
 │✔ docker-registry  0.0 GB      0.0%    <1 MB               │ ← Regular service
 │✔ nginx            0.0/1.0 GB  0.0%    <1 MB               │ ← Regular service
 │✔  ├─ docker.cmtec.se                                      │ ← Nginx site (sub-service)
 │✔  ├─ git.cmtec.se                                         │ ← Nginx site (sub-service)  
 │✔  └─ gitea.cmtec.se                                       │ ← Nginx site (sub-service)
 │✔ sshd             0.0 GB      0.0%    <1 MB               │ ← Regular service
 │  1 SSH connection                                          │ ← Service description
 └────────────────────────────────────────────────────────────┘
 ```
-**Row Types:**
+### Memory Collector (`memory.rs`)  
- **System Metrics**: CPU Load, Service Memory, Disk Usage, CPU Temperature with status indicators
+- RAM usage (total, used, available)
- **Regular Services**: Full resource data (memory, CPU, disk) with optional description lines  
+- Swap monitoring
- **Sub-services**: Nginx sites with tree structure, status indicators only (no resource columns)
+- Real-time process monitoring (top RAM consumers)
- **Description Lines**: Connection counts and service-specific info without status indicators
+- Memory pressure detection
-### Hosts Widget (formerly Alerts)
+### Disk Collector (`disk.rs`)
-The Hosts widget provides a summary view of all monitored hosts:
+- Filesystem usage per mount point
 - SMART health monitoring
 - Temperature and wear tracking
 - Configurable filesystem monitoring
-```
+### Systemd Collector (`systemd.rs`)
-┌Hosts────────────────────────────────────────────────────────┐
+- Service status monitoring (`active`, `inactive`, `failed`)
-│  Host    Status            Timestamp                        │
+- Memory usage per service
-│✔ cmbox   ok                2025-10-13 05:45:28              │
+- Service filtering and exclusions
-│✔ srv01   ok                2025-10-13 05:45:28              │
+- Handles transitional states (`Status::Pending`)
 │? labbox  No data received  —                                │
 └─────────────────────────────────────────────────────────────┘
 ```
-## Monitoring Components
+### Backup Collector (`backup.rs`)
-
+- Reads TOML status files from backup systems
-### System Collector
+- Archive age verification
- **CPU Load**: 1/5/15 minute averages with warning/critical thresholds
+- Disk usage tracking
- **Memory Usage**: Used/total with percentage calculation
+- Repository health monitoring
 - **CPU Temperature**: x86_pkg_temp prioritized for accuracy
 - **C-States**: Power management state distribution (C0-C10)
 ### Service Collector
 - **System Metrics as Services**: CPU Load, Service Memory, Disk Usage, CPU Temperature displayed as individual service rows
 - **Systemd Services**: Auto-discovery of interesting services with resource monitoring
 - **Nginx Site Monitoring**: Individual rows for each nginx virtual host with tree structure (`├─` and `└─`)
 - **Resource Usage**: Per-service memory, CPU, and disk consumption
 - **Service Health**: Running/stopped/degraded status with detailed failure info
 - **Connection Tracking**: SSH connections, database connections as description lines
 ### SMART Collector
 - **NVMe Health**: Temperature, wear leveling, spare blocks
 - **Drive Capacity**: Total/used space with percentage
 - **SMART Attributes**: Critical health indicators
 ### Backup Collector
 - **Restic Integration**: Backup status and history
 - **Health Monitoring**: Success/failure tracking
 - **Storage Metrics**: Backup size and retention
 ## Keyboard Controls
 | Key | Action |
 |-----|--------|
 | `←` / `h` | Previous host |
 | `→` / `l` / `Tab` | Next host |
 | `?` | Toggle help overlay |
 | `r` | Force refresh |
 | `q` / `Esc` | Quit |
 ## Email Notifications
-### Notification Triggers
+### Intelligent Batching
- **Status Degradation**: Any status change to warning/critical
+
- **Recovery**: Warning/critical status returning to ok
+The system implements smart notification batching to prevent email spam:
- **Service Failures**: Individual service stop/start events
+
 - **Real-time dashboard updates** - Status changes appear immediately
 - **Batched email notifications** - Aggregated every 30 seconds
 - **Detailed groupings** - Services organized by severity
 ### Example Alert Email
 ### Example Recovery Email
 ```
-✅ RESOLVED: system cpu on cmbox
+Subject: Status Alert: 2 critical, 1 warning, 15 started
-Status Change Alert
+Status Summary (30s duration)
 Host Status: Ok → Warning
-Host: cmbox
+🔴 CRITICAL ISSUES (2):
-Component: system
+  postgresql: Ok → Critical
-Metric: cpu
+  nginx: Warning → Critical
 Status Change: warning → ok
 Time: 2025-10-12 22:15:30 CET
-Details:
+🟡 WARNINGS (1):
-Recovered from: CPU load (1/5/15min): 6.20 / 5.80 / 4.50
+  redis: Ok → Warning (memory usage 85%)
-Current status: CPU load (1/5/15min): 3.30 / 3.17 / 2.84
+
 ✅ RECOVERIES (0):
 🟢 SERVICE STARTUPS (15):
  docker: Unknown → Ok
  sshd: Unknown → Ok
  ...
 --
 CM Dashboard Agent
-Generated at 2025-10-12 22:15:30 CET
+Generated at 2025-10-21 19:42:42 CET
 ```
-### Rate Limiting
+## Individual Metrics Architecture
- **Default**: 30 minutes between notifications per component
+
- **Testing**: Set to 0 for immediate notifications
+The system follows a **metrics-first architecture**:
- **Configurable**: Adjustable per deployment needs
+
 ### Agent Side
 ```rust
 // Agent collects individual metrics
 vec![
    Metric::new("cpu_load_1min".to_string(), MetricValue::Float(2.5), Status::Ok),
    Metric::new("memory_usage_percent".to_string(), MetricValue::Float(78.5), Status::Warning),
    Metric::new("service_nginx_status".to_string(), MetricValue::String("active".to_string()), Status::Ok),
 ]
 ```
 ### Dashboard Side
 ```rust
 // Widgets subscribe to specific metrics
 impl Widget for CpuWidget {
    fn update_from_metrics(&mut self, metrics: &[&Metric]) {
        for metric in metrics {
            match metric.name.as_str() {
                "cpu_load_1min" => self.load_1min = metric.value.as_f32(),
                "cpu_load_5min" => self.load_5min = metric.value.as_f32(),
                "cpu_temperature_celsius" => self.temperature = metric.value.as_f32(),
                _ => {}
            }
        }
    }
 }
 ```
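The two snippets above rely on `Metric`, `MetricValue`, and `Status` without showing their definitions. The sketch below is one possible shape for those types that is consistent with the calls used above (`Metric::new`, `MetricValue::Float`, `MetricValue::String`, `as_f32`). The field names, the derives, and the decision to have `as_f32` return `Option<f32>` are assumptions; the real definitions live in the shared crate and may differ.

```rust
/// Status levels named in the README; variant order here carries no meaning.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Pending,
    Warning,
    Critical,
    Unknown,
}

/// Only the two variants that appear in the snippets above are sketched.
#[derive(Debug, Clone, PartialEq)]
pub enum MetricValue {
    Float(f32),
    String(String),
}

impl MetricValue {
    /// Assumption: a numeric view that is `None` for non-numeric values.
    pub fn as_f32(&self) -> Option<f32> {
        match self {
            MetricValue::Float(v) => Some(*v),
            MetricValue::String(_) => None,
        }
    }
}

/// One named measurement together with the status the agent calculated for it.
#[derive(Debug, Clone, PartialEq)]
pub struct Metric {
    pub name: String,
    pub value: MetricValue,
    pub status: Status,
}

impl Metric {
    pub fn new(name: String, value: MetricValue, status: Status) -> Self {
        Self { name, value, status }
    }
}
```

With this shape, widget fields such as `load_1min` would be `Option<f32>`, so a missing or non-numeric metric stays distinguishable from a reading of zero.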
 ## Persistent Cache
 The cache system prevents false notifications:
 - **Automatic saving** - Saves when service status changes
 - **Persistent storage** - Maintains state across agent restarts
 - **Simple design** - No complex TTL or cleanup logic
 - **Status preservation** - Prevents duplicate notifications
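The cache behavior listed above can be illustrated with a small change-detection sketch: a notification is only warranted when a service's status differs from the last recorded one. Everything here is hypothetical (`StatusCache`, `record`, `from_entries`), and reading or writing the JSON file at `persist_path` is deliberately left out; `from_entries` stands in for a cache restored from disk after a restart.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Pending,
    Warning,
    Critical,
    Unknown,
}

/// Hypothetical in-memory view of the persisted status cache.
#[derive(Debug, Default)]
pub struct StatusCache {
    last_seen: HashMap<String, Status>,
}

impl StatusCache {
    /// Stand-in for loading previously saved state (file I/O omitted in this sketch).
    pub fn from_entries(entries: HashMap<String, Status>) -> Self {
        Self { last_seen: entries }
    }

    /// Stores the new status and returns `Some((old, new))` only when it changed.
    /// A service that was never seen before is treated as previously `Unknown`.
    pub fn record(&mut self, name: &str, status: Status) -> Option<(Status, Status)> {
        let previous = self
            .last_seen
            .insert(name.to_string(), status)
            .unwrap_or(Status::Unknown);
        if previous == status {
            None
        } else {
            Some((previous, status))
        }
    }
}
```

With an empty cache, the first `Ok` report for a service produces an `Unknown` to `Ok` transition. With a restored cache that already holds `Ok` for that service, the same report produces no transition, which is the case the persistent cache is meant to cover.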
 ## Development
 ### Project Structure
 ```
 cm-dashboard/
-├── agent/                 # Monitoring agent
+├── agent/                  # Metrics collection agent
 │   ├── src/
-│   │   ├── collectors/    # Data collection modules
+│   │   ├── collectors/     # CPU, memory, disk, systemd, backup
-│   │   ├── notifications.rs # Email notification system
+│   │   ├── status/         # Status aggregation and notifications
-│   │   └── simple_agent.rs # Main agent logic
+│   │   ├── cache/          # Persistent metric caching
-├── dashboard/             # TUI dashboard
+│   │   ├── config/         # TOML configuration loading
 │   │   └── notifications/  # Email notification system
 ├── dashboard/              # TUI dashboard application
 │   ├── src/
-│   │   ├── ui/           # Widget implementations
+│   │   ├── ui/widgets/     # CPU, memory, services, backup widgets
-│   │   ├── data/         # Data structures
+│   │   ├── metrics/        # Metric storage and filtering
-│   │   └── app.rs        # Application state
+│   │   └── communication/  # ZMQ metric consumption
-├── shared/               # Common data structures
+├── shared/                 # Shared types and utilities
-└── config/              # Configuration files
+│   └── src/
 │       ├── metrics.rs      # Metric, Status, and Value types
 │       ├── protocol.rs     # ZMQ message format
 │       └── cache.rs        # Cache configuration
 └── README.md              # This file
 ```
-### Development Commands
+### Building
 ```bash
 # Format code
 cargo fmt
-# Check all packages
+```bash
-cargo check
+# Debug build
 cargo build --workspace
 # Release build  
 cargo build --workspace --release
 # Run tests
-cargo test
+cargo test --workspace
-# Build release
+# Check code formatting
-cargo build --release
+cargo fmt --all -- --check
-# Run with logging
+# Run clippy linter
-RUST_LOG=debug cargo run -p cm-dashboard-agent
+cargo clippy --workspace -- -D warnings
 ```
-### Architecture Principles
+### Dependencies
-#### Status Calculation Rules
+- **tokio** - Async runtime
- **Agent calculates all status** using predefined thresholds
+- **zmq** - Message passing between agent and dashboard
- **Dashboard never calculates status** - only displays agent data
+- **ratatui** - Terminal user interface
- **No hardcoded thresholds in dashboard** widgets
+- **serde** - Serialization for metrics and config
- **Use "unknown" when agent status missing** (never default to "ok")
+- **anyhow/thiserror** - Error handling
-
+- **tracing** - Structured logging
-#### Data Flow
+- **lettre** - SMTP email notifications
-```
+- **clap** - Command-line argument parsing
-System Metrics → Agent Collectors → Status Calculation → ZMQ → Dashboard → Display
+- **toml** - Configuration file parsing
                                         ↓
                                 Email Notifications
 ```
 #### Pure Auto-Discovery
 - **No config files required** for basic operation
 - **Runtime discovery** of system capabilities
 - **Service auto-detection** via systemd patterns
 - **Storage device enumeration** via /sys filesystem
 ## Troubleshooting
 ### Common Issues
 #### Agent Won't Start
 ```bash
 # Check permissions (agent requires root)
 sudo cm-dashboard-agent -v
 # Verify ZMQ binding
 sudo netstat -tulpn | grep 6130
 # Check system access
 sudo smartctl --scan
 ```
 #### Dashboard Connection Issues
 ```bash
 # Test ZMQ connectivity
 cm-dashboard --zmq-endpoint tcp://target-host:6130 -v
 # Check network connectivity
 telnet target-host 6130
 ```
 #### Email Notifications Not Working
 ```bash
 # Check postfix status
 sudo systemctl status postfix
 # Test SMTP manually
 telnet localhost 25
 # Verify notification settings
 sudo cm-dashboard-agent -v | grep notification
 ```
 ### Logging
 Set `RUST_LOG=debug` for detailed logging:
 ```bash
 RUST_LOG=debug sudo cm-dashboard-agent
 RUST_LOG=debug cm-dashboard
 ```
 ## License
 MIT License - see LICENSE file for details.
 ## Contributing
 1. Fork the repository
 2. Create feature branch (`git checkout -b feature/amazing-feature`)
 3. Commit changes (`git commit -m 'Add amazing feature'`)
 4. Push to branch (`git push origin feature/amazing-feature`)
 5. Open Pull Request
 For bugs and feature requests, please use GitHub Issues.
 ## NixOS Integration
-### Updating cm-dashboard in NixOS Configuration
+This project is designed for declarative deployment via NixOS:
-When new code is pushed to the cm-dashboard repository, follow these steps to update the NixOS configuration:
+### Configuration Generation
-#### 1. Get the Latest Commit Hash
+The NixOS module automatically generates the agent configuration:
 ```bash
 # Get the latest commit from the API
 curl -s "https://gitea.cmtec.se/api/v1/repos/cm/cm-dashboard/commits?sha=main&limit=1" | head -20
 # Or use git
 git log --oneline -1
 ```
 #### 2. Update the NixOS Configuration
 Edit `hosts/common/cm-dashboard.nix` and update the `rev` field:
 ```nix
-src = pkgs.fetchFromGitea {
+# hosts/common/cm-dashboard.nix
-  domain = "gitea.cmtec.se";
+services.cm-dashboard-agent = {
-  owner = "cm";
+  enable = true;
-  repo = "cm-dashboard";
+  port = 6130;
  rev = "f786d054f2ece80823f85e46933857af96e241b2";  # Update this
  hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";  # Reset temporarily
 };
 ```
-#### 3. Get the Correct Hash
+### Deployment
-Build with placeholder hash to get the actual hash:
+
 ```bash
 nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchFromGitea { 
  domain = "gitea.cmtec.se"; 
  owner = "cm"; 
  repo = "cm-dashboard"; 
  rev = "YOUR_COMMIT_HASH"; 
  hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="; 
 }' 2>&1 | grep "got:"
 ```
 Example output:
 ```
 error: hash mismatch in fixed-output derivation '/nix/store/...':
         specified: sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
            got:    sha256-x8crxNusOUYRrkP9mYEOG+Ga3JCPIdJLkEAc5P1ZxdQ=
 ```
 #### 4. Update the Hash
 Replace the placeholder with the correct hash from the error message (the "got:" line):
 ```nix
 hash = "sha256-x8crxNusOUYRrkP9mYEOG+Ga3JCPIdJLkEAc5P1ZxdQ=";  # Use the hash from the "got:" line
 ```
 #### 5. Update Cargo Dependencies (if needed)
 If Cargo.lock has changed, you may need to update `cargoHash`:
 ```bash
 # Build to get cargo hash error
 nix-build --no-out-link --expr 'with import <nixpkgs> {}; rustPlatform.buildRustPackage rec { 
  pname = "cm-dashboard"; 
  version = "0.1.0"; 
  src = fetchFromGitea { 
    domain = "gitea.cmtec.se"; 
    owner = "cm"; 
    repo = "cm-dashboard"; 
    rev = "YOUR_COMMIT_HASH"; 
    hash = "YOUR_SOURCE_HASH"; 
  }; 
  cargoHash = ""; 
  nativeBuildInputs = [ pkg-config ]; 
  buildInputs = [ openssl ]; 
  buildAndTestSubdir = "."; 
  cargoBuildFlags = [ "--workspace" ]; 
 }' 2>&1 | grep "got:"
 ```
 Then update `cargoHash` in the configuration.
 #### 6. Commit the Changes
 ```bash
 # Update NixOS configuration
 git add hosts/common/cm-dashboard.nix
-git commit -m "Update cm-dashboard to latest version"
+git commit -m "Update cm-dashboard configuration"
 git push
 # Rebuild system (user-performed)
 sudo nixos-rebuild switch --flake .
 ```
-### Example Update Process
+## Monitoring Intervals
 ```bash
 # 1. Get latest commit
 LATEST_COMMIT=$(curl -s "https://gitea.cmtec.se/api/v1/repos/cm/cm-dashboard/commits?sha=main&limit=1" | grep '"sha"' | head -1 | cut -d'"' -f4)
-# 2. Get source hash
+- **CPU/Memory**: 2 seconds (real-time monitoring)
-SOURCE_HASH=$(nix-build --no-out-link -E "with import <nixpkgs> {}; fetchFromGitea { domain = \"gitea.cmtec.se\"; owner = \"cm\"; repo = \"cm-dashboard\"; rev = \"$LATEST_COMMIT\"; hash = \"sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=\"; }" 2>&1 | grep "got:" | cut -d' ' -f12)
+- **Disk usage**: 300 seconds (5 minutes)
 - **Systemd services**: 10 seconds
 - **SMART health**: 600 seconds (10 minutes)  
 - **Backup status**: 60 seconds (1 minute)
 - **Email notifications**: 30 seconds (batched)
 - **Dashboard updates**: 1 second (real-time display)
-# 3. Update configuration and commit
+## License
-echo "Latest commit: $LATEST_COMMIT"
+
-echo "Source hash: $SOURCE_HASH"
+MIT License - see LICENSE file for details
 ```