Implement per-service disk usage monitoring

Replace the system-wide disk usage figure with per-service tracking that
scans each service's own directories. Services such as sshd now report
their own small footprint instead of a misleading system-wide total.

- Rename storage widget and add drive capacity/usage columns
- Move host display to main dashboard title for cleaner layout
- Replace separate alert displays with color-coded row highlighting
- Add per-service disk usage collection using the du command
- Update services widget formatting to handle small disk values
- Restructure into workspace with dedicated agent and dashboard packages
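
The du-based collection and the small-value formatting listed above could look roughly like the following. This is a minimal sketch, not the code from this commit: it assumes GNU du (the `-sb` flags, which give a summarized apparent size in bytes), and the function names and the sshd directory list are hypothetical.

```rust
use std::path::Path;
use std::process::Command;

/// Parse the first field of `du -sb` output ("<bytes>\t<path>") into a byte count.
fn parse_du_bytes(output: &str) -> Option<u64> {
    output.split_whitespace().next()?.parse().ok()
}

/// Sum disk usage over a service's directories by shelling out to `du -sb`.
/// Directories for which du cannot be run or prints no total are skipped.
/// Assumption: GNU du is on PATH; `-b` is not available in every du variant.
fn service_disk_bytes(dirs: &[&Path]) -> u64 {
    dirs.iter()
        .filter_map(|dir| {
            let out = Command::new("du").arg("-sb").arg(dir).output().ok()?;
            parse_du_bytes(&String::from_utf8_lossy(&out.stdout))
        })
        .sum()
}

/// Format a byte count so that small values stay readable
/// instead of rounding down to "0.0 GB".
fn format_disk(bytes: u64) -> String {
    const KB: f64 = 1024.0;
    let b = bytes as f64;
    if b < KB {
        format!("{} B", bytes)
    } else if b < KB * KB {
        format!("{:.0} KB", b / KB)
    } else if b < KB * KB * KB {
        format!("{:.1} MB", b / (KB * KB))
    } else {
        format!("{:.1} GB", b / (KB * KB * KB))
    }
}

fn main() {
    // Hypothetical directory list for sshd; the real mapping is not shown in this commit.
    let dirs = [Path::new("/etc/ssh"), Path::new("/var/lib/sshd")];
    let used = service_disk_bytes(&dirs);
    println!("sshd: {}", format_disk(used));
}
```

Keeping the parsing and formatting separate from the `du` invocation means both can be checked without touching the filesystem.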
2025-10-11 22:59:16 +02:00
parent 82afe3d4f1
commit 2581435b10
30 changed files with 4801 additions and 446 deletions


@@ -14,10 +14,11 @@ A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure.
### Key Features
- **NVMe health monitoring** with wear prediction
- **RAM optimization tracking** (tmpfs, zram, kernel metrics)
- **Service resource monitoring** with sandboxed limits
- **CPU / memory / GPU telemetry** with automatic thresholding
- **Service resource monitoring** with per-service CPU and RAM usage
- **Disk usage overview** for root filesystems
- **Backup status** with detailed metrics and history
- **Email notification integration**
- **Unified alert pipeline** summarising host health
- **Historical data tracking** and trend analysis
## Technical Architecture
@@ -93,8 +94,10 @@ cm-dashboard/
2. **Service Metrics API** (port 6128)
- Service status and resource usage
- Memory consumption vs limits
- Disk usage per service
- Service memory consumption vs limits
- Host CPU load / frequency / temperature
- Root disk utilisation snapshot
- GPU utilisation and temperature (if available)
3. **Backup Metrics API** (port 6129)
- Backup status and history
@@ -119,6 +122,26 @@ pub struct ServiceMetrics {
pub timestamp: u64,
}
#[derive(Deserialize, Debug)]
pub struct ServiceSummary {
pub healthy: usize,
pub degraded: usize,
pub failed: usize,
pub memory_used_mb: f32,
pub memory_quota_mb: f32,
pub system_memory_used_mb: f32,
pub system_memory_total_mb: f32,
pub disk_used_gb: f32,
pub disk_total_gb: f32,
pub cpu_load_1: f32,
pub cpu_load_5: f32,
pub cpu_load_15: f32,
pub cpu_freq_mhz: Option<f32>,
pub cpu_temp_c: Option<f32>,
pub gpu_load_percent: Option<f32>,
pub gpu_temp_c: Option<f32>,
}
#[derive(Deserialize, Debug)]
pub struct BackupMetrics {
pub overall_status: String,
@@ -617,4 +640,4 @@ smartmontools-rs = "0.1" # Or direct smartctl bindings
**Performance Targets**:
- **Agent footprint**: < 2MB RAM, < 1% CPU
- **Metric latency**: < 100ms propagation across network
- **Network efficiency**: < 1KB/s per host steady state