Fix Data_3 timeout by parallelizing SMART collection

Root cause: SMART data was collected sequentially, one drive at a time. With 5 drives taking ~500ms each, total collection time was 2.5+ seconds. Because the disk collector runs every 1 second, collections overlapped and contended for the same drives. The last drive (sda/Data_3) would time out because it was still being accessed by the previous collection.

Solution: Query all drives in parallel using futures::join_all. All drives now have their SMART data collected simultaneously, each with an independent 3-second timeout. This eliminates the contention and reduces total collection time from 2.5+ seconds to ~500ms (the slowest single drive).

Benefits:
- All drives complete in ~500ms instead of 2.5+ seconds
- No overlapping collections causing resource contention
- Each drive gets the full 3-second timeout window
- sda/Data_3 should now show temperature and serial number

Bump version to v0.1.223
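
The timing argument above can be illustrated with a small sketch. This is not the agent's code: it swaps futures::join_all for plain std threads so that it compiles without extra crates, and `query_drive`, `collect_sequential`, and `collect_parallel` are hypothetical names. The returned temperature of 30 is a placeholder, and the sleep stands in for a real smartctl call.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for one SMART query: sleeps for `latency`,
/// then returns the drive name and a placeholder temperature.
fn query_drive(name: &str, latency: Duration) -> (String, u32) {
    thread::sleep(latency);
    (name.to_string(), 30)
}

/// Old behaviour: one drive after another, so total time is the SUM of latencies.
fn collect_sequential(drives: &[(&str, Duration)]) -> Vec<(String, u32)> {
    drives
        .iter()
        .map(|&(name, latency)| query_drive(name, latency))
        .collect()
}

/// New behaviour (thread-based analogue of join_all): every query starts at
/// once, so total time is roughly the SLOWEST drive. Each drive gets the full
/// `timeout` window measured from the common start; a drive that misses it
/// yields `None` without affecting the others.
fn collect_parallel(
    drives: &[(&str, Duration)],
    timeout: Duration,
) -> Vec<(String, Option<u32>)> {
    let start = Instant::now();

    // Start all queries before waiting on any of them.
    let pending: Vec<(String, mpsc::Receiver<u32>)> = drives
        .iter()
        .map(|&(name, latency)| {
            let (tx, rx) = mpsc::channel();
            let owned = name.to_string();
            thread::spawn(move || {
                let (_, temp) = query_drive(&owned, latency);
                // The receiver may already have given up; ignore send errors.
                let _ = tx.send(temp);
            });
            (name.to_string(), rx)
        })
        .collect();

    // Wait for each result, but never past that drive's deadline.
    pending
        .into_iter()
        .map(|(name, rx)| {
            let remaining = timeout.saturating_sub(start.elapsed());
            (name, rx.recv_timeout(remaining).ok())
        })
        .collect()
}
```

Calling `collect_parallel` with five drives at ~500ms each and a 3-second timeout would be expected to return in roughly 500ms, which is shorter than the 1-second collection interval, so consecutive collections no longer overlap.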
Fix empty Storage section by configuring stdio pipes
2025-11-29 23:51:43 +01:00 · 2025-11-29 23:25:17 +01:00 · 2025-11-29 21:29:33 +01:00 · 2025-11-29 21:09:04 +01:00 · 2025-11-29 18:35:14 +01:00 · 2025-11-29 17:59:33 +01:00
19 changed files with 659 additions and 808 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -156,86 +156,6 @@ Complete migration from string-based metrics to structured JSON data. Eliminates
 - ✅ Backward compatibility via bridge conversion to existing UI widgets
 - ✅ All string parsing bugs eliminated
 ### Cached Collector Architecture (✅ IMPLEMENTED)
 **Problem:** Blocking collectors prevent timely ZMQ transmission, causing false "host offline" alerts.
 **Previous (Sequential Blocking):**
 ```
 Every 1 second:
  └─ collect_all_data() [BLOCKS for 2-10+ seconds]
      ├─ CPU (fast: 10ms)
      ├─ Memory (fast: 20ms)
      ├─ Disk SMART (slow: 3s per drive × 4 drives = 12s)
      ├─ Service disk usage (slow: 2-8s per service)
      └─ Docker (medium: 500ms)
  └─ send_via_zmq()  [Only after ALL collection completes]
 Result: If any collector takes >10s → "host offline" false alert
 ```
 **New (Cached Independent Collectors):**
 ```
 Shared Cache: Arc<RwLock<AgentData>>
 Background Collectors (independent async tasks):
 ├─ Fast collectors (CPU, RAM, Network)
 │   └─ Update cache every 1 second
 ├─ Medium collectors (Services, Docker)
 │   └─ Update cache every 5 seconds
 └─ Slow collectors (Disk usage, SMART data)
    └─ Update cache every 60 seconds
 ZMQ Sender (separate async task):
 Every 1 second:
  └─ Read current cache
  └─ Send via ZMQ [Always instant, never blocked]
 ```
 **Benefits:**
 - ✅ ZMQ sends every 1 second regardless of collector speed
 - ✅ No false "host offline" alerts from slow collectors
 - ✅ Different update rates for different metrics (CPU=1s, SMART=60s)
 - ✅ System stays responsive even with slow operations
 - ✅ Slow collectors can use longer timeouts without blocking
 **Implementation Details:**
 - **Shared cache**: `Arc<RwLock<AgentData>>` initialized at agent startup
 - **Collector intervals**: Fully configurable via NixOS config (`interval_seconds` per collector)
  - Recommended: Fast (1-10s): CPU, Memory, Network
  - Recommended: Medium (30-60s): Backup, NixOS
  - Recommended: Slow (60-300s): Disk, Systemd
 - **Independent tasks**: Each collector spawned as separate tokio task in `Agent::new()`
 - **Cache updates**: Collectors acquire write lock → update → release immediately
 - **ZMQ sender**: Main loop reads cache every `collection_interval_seconds` and broadcasts
 - **Notification check**: Runs every `notifications.check_interval_seconds`
 - **Lock strategy**: Short-lived write locks prevent blocking, read locks for transmission
 - **Stale data**: Acceptable for slow-changing metrics (SMART data, disk usage)
 **Configuration (NixOS):**
 All intervals and timeouts configurable in `services/cm-dashboard.nix`:
 Collection Intervals:
 - `collectors.cpu.interval_seconds` (default: 10s)
 - `collectors.memory.interval_seconds` (default: 2s)
 - `collectors.disk.interval_seconds` (default: 300s)
 - `collectors.systemd.interval_seconds` (default: 10s)
 - `collectors.backup.interval_seconds` (default: 60s)
 - `collectors.network.interval_seconds` (default: 10s)
 - `collectors.nixos.interval_seconds` (default: 60s)
 - `notifications.check_interval_seconds` (default: 30s)
 - `collection_interval_seconds` - ZMQ transmission rate (default: 2s)
 Command Timeouts (prevent resource leaks from hung commands):
 - `collectors.disk.command_timeout_seconds` (default: 30s) - lsblk, smartctl, etc.
 - `collectors.systemd.command_timeout_seconds` (default: 15s) - systemctl, docker, du
 - `collectors.network.command_timeout_seconds` (default: 10s) - ip route, ip addr
 **Code Locations:**
 - agent/src/agent.rs:59-133 - Collector task spawning
 - agent/src/agent.rs:151-179 - Independent collector task runner
 - agent/src/agent.rs:199-207 - ZMQ sender in main loop
 ### Maintenance Mode
 - Agent checks for `/tmp/cm-maintenance` file before sending notifications
@@ -407,9 +327,16 @@ Storage:
  ├─ ● Data_2: GGA04461 T: 28°C
  └─ ● Parity: WDZS8RY0 T: 29°C
 Backup:
 ● Repo: 4
  ├─ getea
  ├─ vaultwarden
  ├─ mysql
  └─ immich
 ● W800639Y W: 2%
  ├─ ● Backup: 2025-11-29T04:00:01.324623
  └─ ● Usage: 8% 70GB/916GB
 ● WD-WCC7K1234567 T: 32°C W: 12%
-  ├─ Last: 2h ago (12.3GB)
+  ├─ ● Backup: 2025-11-29T04:00:01.324623
  ├─ Next: in 22h
  └─ ● Usage: 45% 678GB/1.5TB
 ```
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -279,7 +279,7 @@ checksum = "a1d728cc89cf3aee9ff92b05e62b19ee65a02b5702cff7d5a377e32c6ae29d8d"
 [[package]]
 name = "cm-dashboard"
-version = "0.1.193"
+version = "0.1.222"
 dependencies = [
 "anyhow",
 "chrono",
@@ -301,7 +301,7 @@ dependencies = [
 [[package]]
 name = "cm-dashboard-agent"
-version = "0.1.193"
+version = "0.1.222"
 dependencies = [
 "anyhow",
 "async-trait",
@@ -309,6 +309,7 @@ dependencies = [
 "chrono-tz",
 "clap",
 "cm-dashboard-shared",
 "futures",
 "gethostname",
 "lettre",
 "reqwest",
@@ -324,7 +325,7 @@ dependencies = [
 [[package]]
 name = "cm-dashboard-shared"
-version = "0.1.193"
+version = "0.1.222"
 dependencies = [
 "chrono",
 "serde",
@@ -552,6 +553,21 @@ dependencies = [
 "percent-encoding",
 ]
 [[package]]
 name = "futures"
 version = "0.3.31"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "65bc07b1a8bc7c85c5f2e110c476c7389b4554ba72af57d8445ea63a576b0876"
 dependencies = [
 "futures-channel",
 "futures-core",
 "futures-executor",
 "futures-io",
 "futures-sink",
 "futures-task",
 "futures-util",
 ]
 [[package]]
 name = "futures-channel"
 version = "0.3.31"
@@ -559,6 +575,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "2dff15bf788c671c1934e366d07e30c1814a8ef514e1af724a602e8a2fbe1b10"
 dependencies = [
 "futures-core",
 "futures-sink",
 ]
 [[package]]
@@ -567,12 +584,34 @@ version = "0.3.31"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "05f29059c0c2090612e8d742178b0580d2dc940c837851ad723096f87af6663e"
 [[package]]
 name = "futures-executor"
 version = "0.3.31"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "1e28d1d997f585e54aebc3f97d39e72338912123a67330d723fdbb564d646c9f"
 dependencies = [
 "futures-core",
 "futures-task",
 "futures-util",
 ]
 [[package]]
 name = "futures-io"
 version = "0.3.31"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "9e5c1b78ca4aae1ac06c48a526a655760685149f0d465d21f37abfe57ce075c6"
 [[package]]
 name = "futures-macro"
 version = "0.3.31"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "162ee34ebcb7c64a8abebc059ce0fee27c2262618d7b60ed8faf72fef13c3650"
 dependencies = [
 "proc-macro2",
 "quote",
 "syn",
 ]
 [[package]]
 name = "futures-sink"
 version = "0.3.31"
@@ -591,8 +630,11 @@ version = "0.3.31"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "9fa08315bb612088cc391249efdc3bc77536f16c91f6cf495e6fbe85b20a4a81"
 dependencies = [
 "futures-channel",
 "futures-core",
 "futures-io",
 "futures-macro",
 "futures-sink",
 "futures-task",
 "memchr",
 "pin-project-lite",
--- a/agent/Cargo.toml
+++ b/agent/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "cm-dashboard-agent"
-version = "0.1.194"
+version = "0.1.223"
 edition = "2021"
 [dependencies]
@@ -20,4 +20,5 @@ gethostname = { workspace = true }
 chrono-tz = "0.8"
 toml = { workspace = true }
 async-trait = "0.1"
-reqwest = { version = "0.11", features = ["json", "blocking"] }
+reqwest = { version = "0.11", features = ["json", "blocking"] }
 futures = "0.3"
--- a/agent/src/agent.rs
+++ b/agent/src/agent.rs
@@ -1,14 +1,13 @@
 use anyhow::Result;
 use gethostname::gethostname;
 use std::sync::Arc;
 use std::time::Duration;
 use tokio::sync::RwLock;
 use tokio::time::interval;
 use tracing::{debug, error, info};
-use crate::communication::{AgentCommand, ZmqHandler};
+use crate::communication::ZmqHandler;
 use crate::config::AgentConfig;
 use crate::collectors::{
    Collector,
    backup::BackupCollector,
    cpu::CpuCollector,
    disk::DiskCollector,
@@ -24,7 +23,7 @@ pub struct Agent {
    hostname: String,
    config: AgentConfig,
    zmq_handler: ZmqHandler,
-    cache: Arc<RwLock<AgentData>>,
+    collectors: Vec<Box<dyn Collector>>,
    notification_manager: NotificationManager,
    previous_status: Option<SystemStatus>,
 }
@@ -56,94 +55,39 @@ impl Agent {
            config.zmq.publisher_port
        );
-        // Initialize shared cache
+        // Initialize collectors
-        let cache = Arc::new(RwLock::new(AgentData::new(
+        let mut collectors: Vec<Box<dyn Collector>> = Vec::new();
-            hostname.clone(),
+        
-            env!("CARGO_PKG_VERSION").to_string()
+        // Add enabled collectors
        )));
        info!("Initialized shared agent data cache");
        // Spawn independent collector tasks
        let mut collector_count = 0;
        // CPU collector
        if config.collectors.cpu.enabled {
-            let cache_clone = cache.clone();
+            collectors.push(Box::new(CpuCollector::new(config.collectors.cpu.clone())));
            let collector = CpuCollector::new(config.collectors.cpu.clone());
            let interval = config.collectors.cpu.interval_seconds;
            tokio::spawn(async move {
                Self::run_collector_task(cache_clone, collector, Duration::from_secs(interval), "CPU").await;
            });
            collector_count += 1;
        }
-
+        
        // Memory collector
        if config.collectors.memory.enabled {
-            let cache_clone = cache.clone();
+            collectors.push(Box::new(MemoryCollector::new(config.collectors.memory.clone())));
            let collector = MemoryCollector::new(config.collectors.memory.clone());
            let interval = config.collectors.memory.interval_seconds;
            tokio::spawn(async move {
                Self::run_collector_task(cache_clone, collector, Duration::from_secs(interval), "Memory").await;
            });
            collector_count += 1;
        }
-
+        
        // Network collector
        if config.collectors.network.enabled {
            let cache_clone = cache.clone();
            let collector = NetworkCollector::new(config.collectors.network.clone());
            let interval = config.collectors.network.interval_seconds;
            tokio::spawn(async move {
                Self::run_collector_task(cache_clone, collector, Duration::from_secs(interval), "Network").await;
            });
            collector_count += 1;
        }
        // Backup collector
        if config.collectors.backup.enabled {
            let cache_clone = cache.clone();
            let collector = BackupCollector::new();
            let interval = config.collectors.backup.interval_seconds;
            tokio::spawn(async move {
                Self::run_collector_task(cache_clone, collector, Duration::from_secs(interval), "Backup").await;
            });
            collector_count += 1;
        }
        // NixOS collector
        if config.collectors.nixos.enabled {
            let cache_clone = cache.clone();
            let collector = NixOSCollector::new(config.collectors.nixos.clone());
            let interval = config.collectors.nixos.interval_seconds;
            tokio::spawn(async move {
                Self::run_collector_task(cache_clone, collector, Duration::from_secs(interval), "NixOS").await;
            });
            collector_count += 1;
        }
        // Disk collector
        if config.collectors.disk.enabled {
-            let cache_clone = cache.clone();
+            collectors.push(Box::new(DiskCollector::new(config.collectors.disk.clone())));
            let collector = DiskCollector::new(config.collectors.disk.clone());
            let interval = config.collectors.disk.interval_seconds;
            tokio::spawn(async move {
                Self::run_collector_task(cache_clone, collector, Duration::from_secs(interval), "Disk").await;
            });
            collector_count += 1;
        }
-
+        
        // Systemd collector
        if config.collectors.systemd.enabled {
-            let cache_clone = cache.clone();
+            collectors.push(Box::new(SystemdCollector::new(config.collectors.systemd.clone())));
-            let collector = SystemdCollector::new(config.collectors.systemd.clone());
+        }
-            let interval = config.collectors.systemd.interval_seconds;
+        
-            tokio::spawn(async move {
+        if config.collectors.backup.enabled {
-                Self::run_collector_task(cache_clone, collector, Duration::from_secs(interval), "Systemd").await;
+            collectors.push(Box::new(BackupCollector::new()));
            });
            collector_count += 1;
        }
-        info!("Spawned {} independent collector tasks", collector_count);
+        if config.collectors.network.enabled {
            collectors.push(Box::new(NetworkCollector::new(config.collectors.network.clone())));
        }
        if config.collectors.nixos.enabled {
            collectors.push(Box::new(NixOSCollector::new(config.collectors.nixos.clone())));
        }
        info!("Initialized {} collectors", collectors.len());
        // Initialize notification manager
        let notification_manager = NotificationManager::new(&config.notifications, &hostname)?;
@@ -153,82 +97,42 @@ impl Agent {
            hostname,
            config,
            zmq_handler,
-            cache,
+            collectors,
            notification_manager,
            previous_status: None,
        })
    }
-    /// Independent collector task runner
+    /// Main agent loop with structured data collection
    async fn run_collector_task<C>(
        cache: Arc<RwLock<AgentData>>,
        collector: C,
        interval_duration: Duration,
        name: &str,
    ) where
        C: crate::collectors::Collector + Send + 'static,
    {
        let mut interval_timer = interval(interval_duration);
        info!("{} collector task started (interval: {:?})", name, interval_duration);
        loop {
            interval_timer.tick().await;
            // Acquire write lock and update cache
            {
                let mut agent_data = cache.write().await;
                match collector.collect_structured(&mut *agent_data).await {
                    Ok(_) => {
                        debug!("{} collector updated cache", name);
                    }
                    Err(e) => {
                        error!("{} collector failed: {}", name, e);
                    }
                }
            } // Release lock immediately after collection
        }
    }
    /// Main agent loop with cached data architecture
    pub async fn run(&mut self, mut shutdown_rx: tokio::sync::oneshot::Receiver<()>) -> Result<()> {
-        info!("Starting agent main loop with cached collector architecture");
+        info!("Starting agent main loop");
-        // Set up intervals from config
+        // Initial collection
        if let Err(e) = self.collect_and_broadcast().await {
            error!("Initial metric collection failed: {}", e);
        }
        // Set up intervals
        let mut transmission_interval = interval(Duration::from_secs(
-            self.config.collection_interval_seconds,
+            self.config.zmq.transmission_interval_seconds,
        ));
-        let mut notification_interval = interval(Duration::from_secs(
+        let mut notification_interval = interval(Duration::from_secs(30)); // Check notifications every 30s
            self.config.notifications.check_interval_seconds,
        ));
        let mut command_interval = interval(Duration::from_millis(100));
-        // Skip initial ticks
+        // Skip initial ticks to avoid immediate execution
        transmission_interval.tick().await;
        notification_interval.tick().await;
        command_interval.tick().await;
        loop {
            tokio::select! {
                _ = transmission_interval.tick() => {
-                    // Read current cache state and broadcast via ZMQ
+                    if let Err(e) = self.collect_and_broadcast().await {
-                    let agent_data = self.cache.read().await.clone();
+                        error!("Failed to collect and broadcast metrics: {}", e);
                    if let Err(e) = self.zmq_handler.publish_agent_data(&agent_data).await {
                        error!("Failed to broadcast agent data: {}", e);
                    } else {
                        debug!("Successfully broadcast agent data");
                    }
                }
                _ = notification_interval.tick() => {
-                    // Read cache and check for status changes
+                    // Process any pending notifications
-                    let agent_data = self.cache.read().await.clone();
+                    // NOTE: With structured data, we might need to implement status tracking differently
-                    if let Err(e) = self.check_status_changes_and_notify(&agent_data).await {
+                    // For now, we skip this until status evaluation is migrated
                        error!("Failed to check status changes: {}", e);
                    }
                }
                _ = command_interval.tick() => {
                    if let Err(e) = self.handle_commands().await {
                        error!("Error handling commands: {}", e);
                    }
                }
                _ = &mut shutdown_rx => {
                    info!("Shutdown signal received, stopping agent loop");
@@ -241,6 +145,35 @@ impl Agent {
        Ok(())
    }
    /// Collect structured data from all collectors and broadcast via ZMQ
    async fn collect_and_broadcast(&mut self) -> Result<()> {
        debug!("Starting structured data collection");
        // Initialize empty AgentData
        let mut agent_data = AgentData::new(self.hostname.clone(), env!("CARGO_PKG_VERSION").to_string());
        // Collect data from all collectors
        for collector in &self.collectors {
            if let Err(e) = collector.collect_structured(&mut agent_data).await {
                error!("Collector failed: {}", e);
                // Continue with other collectors even if one fails
            }
        }
        // Check for status changes and send notifications
        if let Err(e) = self.check_status_changes_and_notify(&agent_data).await {
            error!("Failed to check status changes: {}", e);
        }
        // Broadcast the structured data via ZMQ
        if let Err(e) = self.zmq_handler.publish_agent_data(&agent_data).await {
            error!("Failed to broadcast agent data: {}", e);
        } else {
            debug!("Successfully broadcast structured agent data");
        }
        Ok(())
    }
    /// Check for status changes and send notifications
    async fn check_status_changes_and_notify(&mut self, agent_data: &AgentData) -> Result<()> {
@@ -320,39 +253,4 @@ impl Agent {
        Ok(())
    }
    /// Handle incoming commands from dashboard
    async fn handle_commands(&mut self) -> Result<()> {
        // Try to receive a command (non-blocking)
        if let Ok(Some(command)) = self.zmq_handler.try_receive_command() {
            info!("Received command: {:?}", command);
            match command {
                AgentCommand::CollectNow => {
                    info!("Received immediate transmission request");
                    // With cached architecture, collectors run independently
                    // Just send current cache state immediately
                    let agent_data = self.cache.read().await.clone();
                    if let Err(e) = self.zmq_handler.publish_agent_data(&agent_data).await {
                        error!("Failed to broadcast on demand: {}", e);
                    }
                }
                AgentCommand::SetInterval { seconds } => {
                    info!("Received interval change request: {}s", seconds);
                    // Note: This would require more complex handling to update the interval
                    // For now, just acknowledge
                }
                AgentCommand::ToggleCollector { name, enabled } => {
                    info!("Received collector toggle request: {} -> {}", name, enabled);
                    // Note: This would require more complex handling to enable/disable collectors
                    // For now, just acknowledge
                }
                AgentCommand::Ping => {
                    info!("Received ping command");
                    // Maybe send back a pong or status
                }
            }
        }
        Ok(())
    }
 }
--- a/agent/src/collectors/backup.rs
+++ b/agent/src/collectors/backup.rs
@@ -1,36 +1,66 @@
 use async_trait::async_trait;
-use cm_dashboard_shared::{AgentData, BackupData, BackupDiskData};
+use cm_dashboard_shared::{AgentData, BackupData, BackupDiskData, Status};
 use serde::{Deserialize, Serialize};
-use std::collections::HashMap;
+use std::collections::{HashMap, HashSet};
 use std::fs;
-use std::path::Path;
+use std::path::{Path, PathBuf};
-use tracing::debug;
+use tracing::{debug, warn};
 use super::{Collector, CollectorError};
 /// Backup collector that reads backup status from TOML files with structured data output
 pub struct BackupCollector {
-    /// Path to backup status file
+    /// Directory containing backup status files
-    status_file_path: String,
+    status_dir: String,
 }
 impl BackupCollector {
    pub fn new() -> Self {
        Self {
-            status_file_path: "/var/lib/backup/backup-status.toml".to_string(),
+            status_dir: "/var/lib/backup/status".to_string(),
        }
    }
-    /// Read backup status from TOML file
+    /// Scan directory for all backup status files
-    async fn read_backup_status(&self) -> Result<Option<BackupStatusToml>, CollectorError> {
+    async fn scan_status_files(&self) -> Result<Vec<PathBuf>, CollectorError> {
-        if !Path::new(&self.status_file_path).exists() {
+        let status_path = Path::new(&self.status_dir);
-            debug!("Backup status file not found: {}", self.status_file_path);
+
-            return Ok(None);
+        if !status_path.exists() {
            debug!("Backup status directory not found: {}", self.status_dir);
            return Ok(Vec::new());
        }
-        let content = fs::read_to_string(&self.status_file_path)
+        let mut status_files = Vec::new();
        match fs::read_dir(status_path) {
            Ok(entries) => {
                for entry in entries {
                    if let Ok(entry) = entry {
                        let path = entry.path();
                        if path.is_file() {
                            if let Some(filename) = path.file_name().and_then(|n| n.to_str()) {
                                if filename.starts_with("backup-status-") && filename.ends_with(".toml") {
                                    status_files.push(path);
                                }
                            }
                        }
                    }
                }
            }
            Err(e) => {
                warn!("Failed to read backup status directory: {}", e);
                return Ok(Vec::new());
            }
        }
        Ok(status_files)
    }
    /// Read a single backup status file
    async fn read_status_file(&self, path: &Path) -> Result<BackupStatusToml, CollectorError> {
        let content = fs::read_to_string(path)
            .map_err(|e| CollectorError::SystemRead {
-                path: self.status_file_path.clone(),
+                path: path.to_string_lossy().to_string(),
                error: e.to_string(),
            })?;
@@ -40,66 +70,122 @@ impl BackupCollector {
                error: format!("Failed to parse backup status TOML: {}", e),
            })?;
-        Ok(Some(status))
+        Ok(status)
    }
    /// Calculate backup status from TOML status field
    fn calculate_backup_status(status_str: &str) -> Status {
        match status_str.to_lowercase().as_str() {
            "success" => Status::Ok,
            "warning" => Status::Warning,
            "failed" | "error" => Status::Critical,
            _ => Status::Unknown,
        }
    }
    /// Calculate usage status from disk usage percentage
    fn calculate_usage_status(usage_percent: f32) -> Status {
        if usage_percent < 80.0 {
            Status::Ok
        } else if usage_percent < 90.0 {
            Status::Warning
        } else {
            Status::Critical
        }
    }
    /// Convert BackupStatusToml to BackupData and populate AgentData
    async fn populate_backup_data(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
-        if let Some(backup_status) = self.read_backup_status().await? {
+        let status_files = self.scan_status_files().await?;
            // Use raw start_time string from TOML
-            // Extract disk information
+        if status_files.is_empty() {
-            let repository_disk = if let Some(disk_space) = &backup_status.disk_space {
+            debug!("No backup status files found");
                Some(BackupDiskData {
                    serial: backup_status.disk_serial_number.clone().unwrap_or_else(|| "Unknown".to_string()),
                    usage_percent: disk_space.usage_percent as f32,
                    used_gb: disk_space.used_gb as f32,
                    total_gb: disk_space.total_gb as f32,
                    wear_percent: backup_status.disk_wear_percent,
                    temperature_celsius: None, // Not available in current TOML
                })
            } else if let Some(serial) = &backup_status.disk_serial_number {
                // Fallback: create minimal disk info if we have serial but no disk_space
                Some(BackupDiskData {
                    serial: serial.clone(),
                    usage_percent: 0.0,
                    used_gb: 0.0,
                    total_gb: 0.0,
                    wear_percent: backup_status.disk_wear_percent,
                    temperature_celsius: None,
                })
            } else {
                None
            };
            // Calculate total repository size from services
            let total_size_gb = backup_status.services
                .values()
                .map(|service| service.repo_size_bytes as f32 / (1024.0 * 1024.0 * 1024.0))
                .sum::<f32>();
            let backup_data = BackupData {
                status: backup_status.status,
                total_size_gb: Some(total_size_gb),
                repository_health: Some("ok".to_string()), // Derive from status if needed
                repository_disk,
                last_backup_size_gb: None, // Not available in current TOML format
                start_time_raw: Some(backup_status.start_time),
            };
            agent_data.backup = backup_data;
        } else {
            // No backup status available - set default values
            agent_data.backup = BackupData {
-                status: "unavailable".to_string(),
+                repositories: Vec::new(),
-                total_size_gb: None,
+                repository_status: Status::Unknown,
-                repository_health: None,
+                disks: Vec::new(),
                repository_disk: None,
                last_backup_size_gb: None,
                start_time_raw: None,
            };
            return Ok(());
        }
        let mut all_repositories = HashSet::new();
        let mut disks = Vec::new();
        let mut worst_status = Status::Ok;
        for status_file in status_files {
            match self.read_status_file(&status_file).await {
                Ok(backup_status) => {
                    // Collect all service names
                    for service_name in backup_status.services.keys() {
                        all_repositories.insert(service_name.clone());
                    }
                    // Calculate backup status
                    let backup_status_enum = Self::calculate_backup_status(&backup_status.status);
                    // Calculate usage status from disk space
                    let (usage_percent, used_gb, total_gb, usage_status) = if let Some(disk_space) = &backup_status.disk_space {
                        let usage_pct = disk_space.usage_percent as f32;
                        (
                            usage_pct,
                            disk_space.used_gb as f32,
                            disk_space.total_gb as f32,
                            Self::calculate_usage_status(usage_pct),
                        )
                    } else {
                        (0.0, 0.0, 0.0, Status::Unknown)
                    };
                    // Update worst status
                    worst_status = worst_status.max(backup_status_enum).max(usage_status);
                    // Build service list for this disk
                    let services: Vec<String> = backup_status.services.keys().cloned().collect();
                    // Get min and max archive counts to detect inconsistencies
                    let archives_min: i64 = backup_status.services.values()
                        .map(|service| service.archive_count)
                        .min()
                        .unwrap_or(0);
                    let archives_max: i64 = backup_status.services.values()
                        .map(|service| service.archive_count)
                        .max()
                        .unwrap_or(0);
                    // Create disk data
                    let disk_data = BackupDiskData {
                        serial: backup_status.disk_serial_number.unwrap_or_else(|| "Unknown".to_string()),
                        product_name: backup_status.disk_product_name,
                        wear_percent: backup_status.disk_wear_percent,
                        temperature_celsius: None, // Not available in current TOML
                        last_backup_time: Some(backup_status.start_time),
                        backup_status: backup_status_enum,
                        disk_usage_percent: usage_percent,
                        disk_used_gb: used_gb,
                        disk_total_gb: total_gb,
                        usage_status,
                        services,
                        archives_min,
                        archives_max,
                    };
                    disks.push(disk_data);
                }
                Err(e) => {
                    warn!("Failed to read backup status file {:?}: {}", status_file, e);
                }
            }
        }
        let repositories: Vec<String> = all_repositories.into_iter().collect();
        agent_data.backup = BackupData {
            repositories,
            repository_status: worst_status,
            disks,
        };
        Ok(())
    }
 }
--- a/agent/src/collectors/cpu.rs
+++ b/agent/src/collectors/cpu.rs
@@ -119,36 +119,69 @@ impl CpuCollector {
        utils::parse_u64(content.trim())
    }
-    /// Collect CPU frequency and populate AgentData
-    async fn collect_frequency(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
-        // Try scaling frequency first (more accurate for current frequency)
-        if let Ok(freq) =
-            utils::read_proc_file("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")
-        {
-            if let Ok(freq_khz) = utils::parse_u64(freq.trim()) {
-                let freq_mhz = freq_khz as f32 / 1000.0;
-                agent_data.system.cpu.frequency_mhz = freq_mhz;
-                return Ok(());
-            }
-        }
+    /// Collect CPU C-state (idle depth) and populate AgentData with top 3 C-states by usage
+    async fn collect_cstate(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
+        // Read C-state usage from first CPU (representative of overall system)
+        // C-states indicate CPU idle depth: C1=light sleep, C6=deep sleep, C10=deepest
-        // Fallback: parse /proc/cpuinfo for base frequency
-        if let Ok(content) = utils::read_proc_file("/proc/cpuinfo") {
-            for line in content.lines() {
-                if line.starts_with("cpu MHz") {
-                    if let Some(freq_str) = line.split(':').nth(1) {
-                        if let Ok(freq_mhz) = utils::parse_f32(freq_str) {
-                            agent_data.system.cpu.frequency_mhz = freq_mhz;
-                            return Ok(());
+        let mut cstate_times: Vec<(String, u64)> = Vec::new();
+        let mut total_time: u64 = 0;
+
+        // Collect all C-state times from CPU0
+        for state_num in 0..=10 {
+            let time_path = format!("/sys/devices/system/cpu/cpu0/cpuidle/state{}/time", state_num);
+            let name_path = format!("/sys/devices/system/cpu/cpu0/cpuidle/state{}/name", state_num);
+
+            if let Ok(time_str) = utils::read_proc_file(&time_path) {
+                if let Ok(time) = utils::parse_u64(time_str.trim()) {
+                    if let Ok(name) = utils::read_proc_file(&name_path) {
+                        let state_name = name.trim();
+                        // Skip POLL state (not real idle)
+                        if state_name != "POLL" && time > 0 {
+                            // Extract "C" + digits pattern (C3, C10, etc.) to reduce JSON size
+                            // Handles formats like "C3_ACPI", "C10_MWAIT", etc.
+                            let clean_name = if let Some(c_pos) = state_name.find('C') {
+                                let rest = &state_name[c_pos + 1..];
+                                let digit_count = rest.chars().take_while(|c| c.is_ascii_digit()).count();
+                                if digit_count > 0 {
+                                    state_name[c_pos..c_pos + 1 + digit_count].to_string()
+                                } else {
+                                    state_name.to_string()
+                                }
+                            } else {
+                                state_name.to_string()
+                            };
+                            cstate_times.push((clean_name, time));
+                            total_time += time;
                        }
                    }
+                    break; // Only need first CPU entry
                }
+            } else {
+                // No more states available
+                break;
            }
        }
-        debug!("CPU frequency not available");
-        // Leave frequency as 0.0 if not available
+        // Sort by time descending to get top 3
+        cstate_times.sort_by(|a, b| b.1.cmp(&a.1));
+        // Calculate percentages for top 3 and populate AgentData
+        agent_data.system.cpu.cstates = cstate_times
+            .iter()
+            .take(3)
+            .map(|(name, time)| {
+                let percent = if total_time > 0 {
+                    (*time as f32 / total_time as f32) * 100.0
+                } else {
+                    0.0
+                };
+                cm_dashboard_shared::CStateInfo {
+                    name: name.clone(),
+                    percent,
+                }
+            })
+            .collect();
        Ok(())
    }
 }
@@ -165,8 +198,8 @@ impl Collector for CpuCollector {
        // Collect temperature (optional)
        self.collect_temperature(agent_data).await?;
-        // Collect frequency (optional)
+        // Collect C-state (CPU idle depth)
-        self.collect_frequency(agent_data).await?;
+        self.collect_cstate(agent_data).await?;
        let duration = start.elapsed();
        debug!("CPU collection completed in {:?}", duration);
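The name cleanup and top-3 percentage step in `collect_cstate` can be sketched in isolation. This is a minimal, std-only sketch, not the agent's code: `clean_cstate_name` and `top_cstates` are hypothetical helper names, and the sysfs reads are left out so the logic can be exercised on plain data. Note that in the hunk as shown, the `break` placed after the first successfully parsed `time` value appears to end the loop after a single state; the helper below simply takes the full list as input.

```rust
/// Reduce a cpuidle state name such as "C3_ACPI" to "C" plus its digits ("C3").
/// Names without a "C<digits>" pattern are returned unchanged.
fn clean_cstate_name(state_name: &str) -> String {
    if let Some(c_pos) = state_name.find('C') {
        let rest = &state_name[c_pos + 1..];
        let digit_count = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        if digit_count > 0 {
            return state_name[c_pos..c_pos + 1 + digit_count].to_string();
        }
    }
    state_name.to_string()
}

/// Hypothetical helper: given (name, residency time) pairs, return up to three
/// states with the largest residency, each with its share of the total in percent.
fn top_cstates(mut times: Vec<(String, u64)>) -> Vec<(String, f32)> {
    let total: u64 = times.iter().map(|(_, t)| *t).sum();
    // Largest residency first
    times.sort_by(|a, b| b.1.cmp(&a.1));
    times
        .into_iter()
        .take(3)
        .map(|(name, t)| {
            let percent = if total > 0 {
                (t as f32 / total as f32) * 100.0
            } else {
                0.0
            };
            (name, percent)
        })
        .collect()
}
```

Percentages are relative to the summed residency of all collected states, so the three values shown need not add up to 100 when more than three states are present.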
--- a/agent/src/collectors/disk.rs
+++ b/agent/src/collectors/disk.rs
@@ -3,7 +3,8 @@ use async_trait::async_trait;
 use cm_dashboard_shared::{AgentData, DriveData, FilesystemData, PoolData, HysteresisThresholds, Status};
 use crate::config::DiskConfig;
-use std::process::Command;
+use tokio::process::Command as TokioCommand;
+use std::process::Command as StdCommand;
 use std::time::Instant;
 use std::collections::HashMap;
 use tracing::debug;
@@ -114,10 +115,10 @@ impl DiskCollector {
    async fn get_mount_devices(&self) -> Result<HashMap<String, String>, CollectorError> {
        use super::run_command_with_timeout;
-        let mut cmd = Command::new("lsblk");
+        let mut cmd = TokioCommand::new("lsblk");
        cmd.args(&["-rn", "-o", "NAME,MOUNTPOINT"]);
-        let output = run_command_with_timeout(cmd, self.config.command_timeout_seconds).await
+        let output = run_command_with_timeout(cmd, 2).await
            .map_err(|e| CollectorError::SystemRead {
                path: "block devices".to_string(),
                error: e.to_string(),
@@ -189,7 +190,7 @@ impl DiskCollector {
    /// Get filesystem info for a single mount point
    fn get_filesystem_info(&self, mount_point: &str) -> Result<(u64, u64), CollectorError> {
-        let output = std::process::Command::new("timeout")
+        let output = StdCommand::new("timeout")
            .args(&["2", "df", "--block-size=1", mount_point])
            .output()
            .map_err(|e| CollectorError::SystemRead {
@@ -386,9 +387,9 @@ impl DiskCollector {
        device.to_string()
    }
-    /// Get SMART data for drives
+    /// Get SMART data for drives in parallel
    async fn get_smart_data_for_drives(&self, physical_drives: &[PhysicalDrive], mergerfs_pools: &[MergerfsPool]) -> HashMap<String, SmartData> {
-        let mut smart_data = HashMap::new();
+        use futures::future::join_all;
        // Collect all drive names
        let mut all_drives = std::collections::HashSet::new();
@@ -404,9 +405,24 @@ impl DiskCollector {
            }
        }
-        // Get SMART data for each drive
-        for drive_name in all_drives {
-            if let Ok(data) = self.get_smart_data(&drive_name).await {
+        // Collect SMART data for all drives in parallel
+        let futures: Vec<_> = all_drives
+            .iter()
+            .map(|drive_name| {
+                let drive = drive_name.clone();
+                async move {
+                    let result = self.get_smart_data(&drive).await;
+                    (drive, result)
+                }
+            })
+            .collect();
+        let results = join_all(futures).await;
+        // Build HashMap from results
+        let mut smart_data = HashMap::new();
+        for (drive_name, result) in results {
+            if let Ok(data) = result {
                smart_data.insert(drive_name, data);
            }
        }
@@ -420,7 +436,7 @@ impl DiskCollector {
        // Use direct smartctl (no sudo) - service has CAP_SYS_RAWIO and CAP_SYS_ADMIN capabilities
        // For NVMe drives, specify device type explicitly
-        let mut cmd = Command::new("smartctl");
+        let mut cmd = TokioCommand::new("smartctl");
        if drive_name.starts_with("nvme") {
            cmd.args(&["-d", "nvme", "-a", &format!("/dev/{}", drive_name)]);
        } else {
@@ -530,9 +546,6 @@ impl DiskCollector {
    /// Populate drives data into AgentData
    fn populate_drives_data(&self, physical_drives: &[PhysicalDrive], smart_data: &HashMap<String, SmartData>, agent_data: &mut AgentData) -> Result<(), CollectorError> {
-        // Clear existing drives data to prevent duplicates in cached architecture
-        agent_data.system.storage.drives.clear();
        for drive in physical_drives {
            let smart = smart_data.get(&drive.name);
@@ -570,9 +583,6 @@ impl DiskCollector {
    /// Populate pools data into AgentData
    fn populate_pools_data(&self, mergerfs_pools: &[MergerfsPool], smart_data: &HashMap<String, SmartData>, agent_data: &mut AgentData) -> Result<(), CollectorError> {
-        // Clear existing pools data to prevent duplicates in cached architecture
-        agent_data.system.storage.pools.clear();
        for pool in mergerfs_pools {
            // Calculate pool health and statuses based on member drive health
            let (pool_health, health_status, usage_status, data_drive_data, parity_drive_data) = self.calculate_pool_health(pool, smart_data);
@@ -769,7 +779,7 @@ impl DiskCollector {
    /// Get drive information for a mount path
    fn get_drive_info_for_path(&self, path: &str) -> anyhow::Result<PoolDrive> {
        // Use lsblk to find the backing device with timeout
-        let output = Command::new("timeout")
+        let output = StdCommand::new("timeout")
            .args(&["2", "lsblk", "-rn", "-o", "NAME,MOUNTPOINT"])
            .output()
            .map_err(|e| anyhow::anyhow!("Failed to run lsblk: {}", e))?;
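The fan-out pattern used for SMART collection (start one query per drive, wait for all, keep the successes) can be illustrated without an async runtime. The sketch below is an assumption-laden stand-in: it swaps `futures::future::join_all` for `std::thread::scope`, and `collect_in_parallel` and its `query` parameter are hypothetical names that take the place of the per-drive `get_smart_data` call.

```rust
use std::collections::HashMap;
use std::thread;

/// Run `query` for every drive on its own scoped thread and keep only the
/// successful results. `query` stands in for the per-drive SMART call.
fn collect_in_parallel<T, E, F>(drives: &[String], query: F) -> HashMap<String, T>
where
    T: Send,
    E: Send,
    F: Fn(&str) -> Result<T, E> + Sync,
{
    let query = &query;
    let results: Vec<(String, Result<T, E>)> = thread::scope(|scope| {
        // Start all queries first so they run concurrently
        let handles: Vec<_> = drives
            .iter()
            .map(|drive| scope.spawn(move || (drive.clone(), query(drive.as_str()))))
            .collect();
        // Then wait for every query to finish
        handles
            .into_iter()
            .map(|handle| handle.join().expect("drive query thread panicked"))
            .collect()
    });

    // Failed drives are dropped, as in the diff's `if let Ok(data) = result`
    results
        .into_iter()
        .filter_map(|(drive, result)| result.ok().map(|data| (drive, data)))
        .collect()
}
```

With this shape the total wall time is roughly that of the slowest single query rather than the sum of all queries, which is the effect the commit message describes.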
--- a/agent/src/collectors/memory.rs
+++ b/agent/src/collectors/memory.rs
@@ -97,12 +97,9 @@ impl MemoryCollector {
    /// Populate tmpfs data into AgentData
    async fn populate_tmpfs_data(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
-        // Clear existing tmpfs data to prevent duplicates in cached architecture
-        agent_data.system.memory.tmpfs.clear();
        // Discover all tmpfs mount points
        let tmpfs_mounts = self.discover_tmpfs_mounts()?;
-
+        
        if tmpfs_mounts.is_empty() {
            debug!("No tmpfs mounts found to monitor");
            return Ok(());
--- a/agent/src/collectors/mod.rs
+++ b/agent/src/collectors/mod.rs
@@ -1,8 +1,7 @@
 use async_trait::async_trait;
 use cm_dashboard_shared::{AgentData};
-use std::process::{Command, Output};
+use std::process::Output;
 use std::time::Duration;
-use tokio::time::timeout;
 pub mod backup;
 pub mod cpu;
@@ -16,16 +15,34 @@ pub mod systemd;
 pub use error::CollectorError;
 /// Run a command with a timeout to prevent blocking
-pub async fn run_command_with_timeout(mut cmd: Command, timeout_secs: u64) -> std::io::Result<Output> {
+/// Properly kills the process if timeout is exceeded
+pub async fn run_command_with_timeout(mut cmd: tokio::process::Command, timeout_secs: u64) -> std::io::Result<Output> {
+    use tokio::time::timeout;
+    use std::process::Stdio;
    let timeout_duration = Duration::from_secs(timeout_secs);
-    match timeout(timeout_duration, tokio::task::spawn_blocking(move || cmd.output())).await {
-        Ok(Ok(result)) => result,
-        Ok(Err(e)) => Err(std::io::Error::new(std::io::ErrorKind::Other, e)),
-        Err(_) => Err(std::io::Error::new(
-            std::io::ErrorKind::TimedOut,
-            format!("Command timed out after {} seconds", timeout_secs)
-        )),
+    // Configure stdio to capture output
+    cmd.stdout(Stdio::piped());
+    cmd.stderr(Stdio::piped());
+
+    let child = cmd.spawn()?;
+    let pid = child.id();
+
+    match timeout(timeout_duration, child.wait_with_output()).await {
+        Ok(result) => result,
+        Err(_) => {
+            // Timeout - force kill the process using system kill command
+            if let Some(process_id) = pid {
+                let _ = tokio::process::Command::new("kill")
+                    .args(&["-9", &process_id.to_string()])
+                    .output()
+                    .await;
+            }
+            Err(std::io::Error::new(
+                std::io::ErrorKind::TimedOut,
+                format!("Command timed out after {} seconds", timeout_secs)
+            ))
+        }
    }
 }
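The core idea of the new `run_command_with_timeout` (wait for a child process up to a deadline, then kill it rather than leave it running) can be shown with the standard library alone. The following is a rough sketch under stated assumptions, not the agent's implementation: `wait_with_deadline` is a hypothetical name, it polls `Child::try_wait` instead of awaiting a tokio future, and it returns only the exit status. Capturing piped output while polling would additionally require draining the pipes, which is omitted here.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Wait for `child` to exit, but for no longer than `limit`.
/// On timeout the child is killed and reaped, and a `TimedOut` error is returned.
fn wait_with_deadline(mut child: Child, limit: Duration) -> io::Result<ExitStatus> {
    let deadline = Instant::now() + limit;
    loop {
        // try_wait returns Ok(Some(status)) once the process has exited
        if let Some(status) = child.try_wait()? {
            return Ok(status);
        }
        if Instant::now() >= deadline {
            // Best effort: the process may have exited in the meantime
            let _ = child.kill();
            // Reap the process so it does not linger as a zombie
            let _ = child.wait();
            return Err(io::Error::new(
                io::ErrorKind::TimedOut,
                format!("Command timed out after {:?}", limit),
            ));
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

A caller would spawn the command with `std::process::Command::spawn` and pass the resulting `Child` in; the polling interval of 10 ms is an arbitrary choice for the sketch.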
--- a/agent/src/collectors/network.rs
+++ b/agent/src/collectors/network.rs
@@ -8,12 +8,12 @@ use crate::config::NetworkConfig;
 /// Network interface collector with physical/virtual classification and link status
 pub struct NetworkCollector {
-    config: NetworkConfig,
+    _config: NetworkConfig,
 }
 impl NetworkCollector {
    pub fn new(config: NetworkConfig) -> Self {
-        Self { config }
+        Self { _config: config }
    }
    /// Check if interface is physical (not virtual)
@@ -50,9 +50,8 @@ impl NetworkCollector {
    }
    /// Get the primary physical interface (the one with default route)
-    fn get_primary_physical_interface(&self) -> Option<String> {
-        let timeout_str = self.config.command_timeout_seconds.to_string();
-        match Command::new("timeout").args([&timeout_str, "ip", "route", "show", "default"]).output() {
+    fn get_primary_physical_interface() -> Option<String> {
+        match Command::new("timeout").args(["2", "ip", "route", "show", "default"]).output() {
            Ok(output) if output.status.success() => {
                let output_str = String::from_utf8_lossy(&output.stdout);
                // Parse: "default via 192.168.1.1 dev eno1 ..."
@@ -111,8 +110,7 @@ impl NetworkCollector {
        // Parse VLAN configuration
        let vlan_map = Self::parse_vlan_config();
-        let timeout_str = self.config.command_timeout_seconds.to_string();
-        match Command::new("timeout").args([&timeout_str, "ip", "-j", "addr"]).output() {
+        match Command::new("timeout").args(["2", "ip", "-j", "addr"]).output() {
            Ok(output) if output.status.success() => {
                let json_str = String::from_utf8_lossy(&output.stdout);
@@ -197,7 +195,7 @@ impl NetworkCollector {
        }
        // Assign primary physical interface as parent to virtual interfaces without explicit parent
-        let primary_interface = self.get_primary_physical_interface();
+        let primary_interface = Self::get_primary_physical_interface();
        if let Some(primary) = primary_interface {
            for interface in interfaces.iter_mut() {
                // Only assign parent to virtual interfaces that don't already have one
--- a/agent/src/collectors/systemd.rs
+++ b/agent/src/collectors/systemd.rs
@@ -4,7 +4,7 @@ use cm_dashboard_shared::{AgentData, ServiceData, SubServiceData, SubServiceMetr
 use std::process::Command;
 use std::sync::RwLock;
 use std::time::Instant;
-use tracing::{debug, warn};
+use tracing::debug;
 use super::{Collector, CollectorError};
 use crate::config::SystemdConfig;
@@ -43,9 +43,10 @@ struct ServiceCacheState {
 /// Cached service status information from systemctl list-units
 #[derive(Debug, Clone)]
 struct ServiceStatusInfo {
    load_state: String,
    active_state: String,
-    sub_state: String,
+    memory_bytes: Option<u64>,
    restart_count: Option<u32>,
    start_timestamp: Option<u64>,
 }
 impl SystemdCollector {
@@ -86,14 +87,20 @@ impl SystemdCollector {
        let mut complete_service_data = Vec::new();
        for service_name in &monitored_services {
            match self.get_service_status(service_name) {
-                Ok((active_status, _detailed_info)) => {
-                    let memory_mb = self.get_service_memory_usage(service_name).await.unwrap_or(0.0);
-                    let disk_gb = self.get_service_disk_usage(service_name).await.unwrap_or(0.0);
+                Ok(status_info) => {
                    let mut sub_services = Vec::new();
+                    // Calculate uptime if we have start timestamp
+                    let uptime_seconds = status_info.start_timestamp.and_then(|start| {
+                        let now = std::time::SystemTime::now()
+                            .duration_since(std::time::UNIX_EPOCH)
+                            .ok()?
+                            .as_secs();
+                        Some(now.saturating_sub(start))
+                    });
                    // Sub-service metrics for specific services (always include cached results)
-                    if service_name.contains("nginx") && active_status == "active" {
+                    if service_name.contains("nginx") && status_info.active_state == "active" {
                        let nginx_sites = self.get_nginx_site_metrics();
                        for (site_name, latency_ms) in nginx_sites {
                            let site_status = if latency_ms >= 0.0 && latency_ms < self.config.nginx_latency_critical_ms {
@@ -118,7 +125,7 @@ impl SystemdCollector {
                        }
                    }
-                    if service_name.contains("docker") && active_status == "active" {
+                    if service_name.contains("docker") && status_info.active_state == "active" {
                        let docker_containers = self.get_docker_containers();
                        for (container_name, container_status) in docker_containers {
                            // For now, docker containers have no additional metrics
@@ -155,11 +162,12 @@ impl SystemdCollector {
                    // Create complete service data
                    let service_data = ServiceData {
                        name: service_name.clone(),
-                        memory_mb,
-                        disk_gb,
                        user_stopped: false, // TODO: Integrate with service tracker
-                        service_status: self.calculate_service_status(service_name, &active_status),
+                        service_status: self.calculate_service_status(service_name, &status_info.active_state),
                        sub_services,
+                        memory_bytes: status_info.memory_bytes,
+                        restart_count: status_info.restart_count,
+                        uptime_seconds,
                    };
                    // Add to AgentData and cache
@@ -254,19 +262,18 @@ impl SystemdCollector {
    /// Auto-discover interesting services to monitor
    fn discover_services_internal(&self) -> Result<(Vec<String>, std::collections::HashMap<String, ServiceStatusInfo>)> {
-        // First: Get all service unit files
-        let timeout_str = self.config.command_timeout_seconds.to_string();
+        // First: Get all service unit files (with 3 second timeout)
        let unit_files_output = Command::new("timeout")
-            .args(&[&timeout_str, "systemctl", "list-unit-files", "--type=service", "--no-pager", "--plain"])
+            .args(&["3", "systemctl", "list-unit-files", "--type=service", "--no-pager", "--plain"])
            .output()?;
        if !unit_files_output.status.success() {
            return Err(anyhow::anyhow!("systemctl list-unit-files command failed"));
        }
-        // Second: Get runtime status of all units
+        // Second: Get runtime status of all units (with 3 second timeout)
        let units_status_output = Command::new("timeout")
-            .args(&[&timeout_str, "systemctl", "list-units", "--type=service", "--all", "--no-pager", "--plain"])
+            .args(&["3", "systemctl", "list-units", "--type=service", "--all", "--no-pager", "--plain"])
            .output()?;
        if !units_status_output.status.success() {
@@ -296,14 +303,13 @@ impl SystemdCollector {
            let fields: Vec<&str> = line.split_whitespace().collect();
            if fields.len() >= 4 && fields[0].ends_with(".service") {
                let service_name = fields[0].trim_end_matches(".service");
                let load_state = fields.get(1).unwrap_or(&"unknown").to_string();
                let active_state = fields.get(2).unwrap_or(&"unknown").to_string();
                let sub_state = fields.get(3).unwrap_or(&"unknown").to_string();
                status_cache.insert(service_name.to_string(), ServiceStatusInfo {
                    load_state,
                    active_state,
-                    sub_state,
+                    memory_bytes: None,
                    restart_count: None,
                    start_timestamp: None,
                });
            }
        }
@@ -312,9 +318,10 @@ impl SystemdCollector {
        for service_name in &all_service_names {
            if !status_cache.contains_key(service_name) {
                status_cache.insert(service_name.to_string(), ServiceStatusInfo {
                    load_state: "not-loaded".to_string(),
                    active_state: "inactive".to_string(),
-                    sub_state: "dead".to_string(),
+                    memory_bytes: None,
                    restart_count: None,
                    start_timestamp: None,
                });
            }
        }
@@ -346,37 +353,60 @@ impl SystemdCollector {
        Ok((services, status_cache))
    }
-    /// Get service status from cache (if available) or fallback to systemctl
-    fn get_service_status(&self, service: &str) -> Result<(String, String)> {
-        // Try to get status from cache first
-        if let Ok(state) = self.state.read() {
-            if let Some(cached_info) = state.service_status_cache.get(service) {
-                let active_status = cached_info.active_state.clone();
-                let detailed_info = format!(
-                    "LoadState={}\nActiveState={}\nSubState={}",
-                    cached_info.load_state,
-                    cached_info.active_state,
-                    cached_info.sub_state
-                );
-                return Ok((active_status, detailed_info));
+    /// Get service status with detailed metrics from systemctl
+    fn get_service_status(&self, service: &str) -> Result<ServiceStatusInfo> {
+        // Always fetch fresh data to get detailed metrics (memory, restarts, uptime)
+        // Note: Cache in service_status_cache only has basic active_state from discovery,
+        // with all detailed metrics set to None. We need fresh systemctl show data.
+
+        let output = Command::new("timeout")
+            .args(&[
+                "2",
+                "systemctl",
+                "show",
+                &format!("{}.service", service),
+                "--property=LoadState,ActiveState,SubState,MemoryCurrent,NRestarts,ExecMainStartTimestamp"
+            ])
+            .output()?;
+        let output_str = String::from_utf8(output.stdout)?;
+        // Parse properties
+        let mut active_state = String::new();
+        let mut memory_bytes = None;
+        let mut restart_count = None;
+        let mut start_timestamp = None;
+        for line in output_str.lines() {
+            if let Some(value) = line.strip_prefix("ActiveState=") {
+                active_state = value.to_string();
+            } else if let Some(value) = line.strip_prefix("MemoryCurrent=") {
+                if value != "[not set]" {
+                    memory_bytes = value.parse().ok();
+                }
+            } else if let Some(value) = line.strip_prefix("NRestarts=") {
+                restart_count = value.parse().ok();
+            } else if let Some(value) = line.strip_prefix("ExecMainStartTimestamp=") {
+                if value != "[not set]" && !value.is_empty() {
+                    // Parse timestamp to seconds since epoch
+                    if let Ok(output) = Command::new("date")
+                        .args(&["+%s", "-d", value])
+                        .output()
+                    {
+                        if let Ok(timestamp_str) = String::from_utf8(output.stdout) {
+                            start_timestamp = timestamp_str.trim().parse().ok();
+                        }
+                    }
+                }
            }
        }
-        // Fallback to systemctl if not in cache
-        let timeout_str = self.config.command_timeout_seconds.to_string();
-        let output = Command::new("timeout")
-            .args(&[&timeout_str, "systemctl", "is-active", &format!("{}.service", service)])
-            .output()?;
-
-        let active_status = String::from_utf8(output.stdout)?.trim().to_string();
-        // Get more detailed info
-        let output = Command::new("timeout")
-            .args(&[&timeout_str, "systemctl", "show", &format!("{}.service", service), "--property=LoadState,ActiveState,SubState"])
-            .output()?;
-        let detailed_info = String::from_utf8(output.stdout)?;
-        Ok((active_status, detailed_info))
+        Ok(ServiceStatusInfo {
+            active_state,
+            memory_bytes,
+            restart_count,
+            start_timestamp,
+        })
    }
    /// Check if service name matches pattern (supports wildcards like nginx*)
@@ -418,81 +448,6 @@ impl SystemdCollector {
        true
    }
-    /// Get disk usage for a specific service
-    async fn get_service_disk_usage(&self, service_name: &str) -> Result<f32, CollectorError> {
-        // Check if this service has configured directory paths
-        if let Some(dirs) = self.config.service_directories.get(service_name) {
-            // Service has configured paths - use the first accessible one
-            for dir in dirs {
-                if let Some(size) = self.get_directory_size(dir).await {
-                    return Ok(size);
-                }
-            }
-            // If configured paths failed, return 0
-            return Ok(0.0);
-        }
-        // No configured path - try to get WorkingDirectory from systemctl
-        let timeout_str = self.config.command_timeout_seconds.to_string();
-        let output = Command::new("timeout")
-            .args(&[&timeout_str, "systemctl", "show", &format!("{}.service", service_name), "--property=WorkingDirectory"])
-            .output()
-            .map_err(|e| CollectorError::SystemRead {
-                path: format!("WorkingDirectory for {}", service_name),
-                error: e.to_string(),
-            })?;
-        let output_str = String::from_utf8_lossy(&output.stdout);
-        for line in output_str.lines() {
-            if line.starts_with("WorkingDirectory=") && !line.contains("[not set]") {
-                let dir = line.strip_prefix("WorkingDirectory=").unwrap_or("");
-                if !dir.is_empty() && dir != "/" {
-                    return Ok(self.get_directory_size(dir).await.unwrap_or(0.0));
-                }
-            }
-        }
-        Ok(0.0)
-    }
-    /// Get size of a directory in GB
-    async fn get_directory_size(&self, path: &str) -> Option<f32> {
-        use super::run_command_with_timeout;
-        // Use -s (summary) and --apparent-size for speed
-        let mut cmd = Command::new("sudo");
-        cmd.args(&["du", "-s", "--apparent-size", "--block-size=1", path]);
-        let output = run_command_with_timeout(cmd, self.config.command_timeout_seconds).await.ok()?;
-        if !output.status.success() {
-            // Log permission errors for debugging but don't spam logs
-            let stderr = String::from_utf8_lossy(&output.stderr);
-            if stderr.contains("Permission denied") {
-                debug!("Permission denied accessing directory: {}", path);
-            } else if stderr.contains("timed out") {
-                warn!("Directory size check timed out for {}", path);
-            } else {
-                debug!("Failed to get size for directory {}: {}", path, stderr);
-            }
-            return None;
-        }
-        let output_str = String::from_utf8(output.stdout).ok()?;
-        let size_str = output_str.split_whitespace().next()?;
-        if let Ok(size_bytes) = size_str.parse::<u64>() {
-            let size_gb = size_bytes as f32 / (1024.0 * 1024.0 * 1024.0);
-            // Return size even if very small (minimum 0.001 GB = 1MB for visibility)
-            if size_gb > 0.0 {
-                Some(size_gb.max(0.001))
-            } else {
-                None
-            }
-        } else {
-            None
-        }
-    }
    /// Calculate service status, taking user-stopped services into account
    fn calculate_service_status(&self, service_name: &str, active_status: &str) -> Status {
        match active_status.to_lowercase().as_str() {
@@ -510,33 +465,6 @@ impl SystemdCollector {
        }
    }
-    /// Get memory usage for a specific service
-    async fn get_service_memory_usage(&self, service_name: &str) -> Result<f32, CollectorError> {
-        let output = Command::new("systemctl")
-            .args(&["show", &format!("{}.service", service_name), "--property=MemoryCurrent"])
-            .output()
-            .map_err(|e| CollectorError::SystemRead {
-                path: format!("memory usage for {}", service_name),
-                error: e.to_string(),
-            })?;
-        let output_str = String::from_utf8_lossy(&output.stdout);
-        for line in output_str.lines() {
-            if line.starts_with("MemoryCurrent=") {
-                if let Some(mem_str) = line.strip_prefix("MemoryCurrent=") {
-                    if mem_str != "[not set]" {
-                        if let Ok(memory_bytes) = mem_str.parse::<u64>() {
-                            return Ok(memory_bytes as f32 / (1024.0 * 1024.0)); // Convert to MB
-                        }
-                    }
-                }
-            }
-        }
-        Ok(0.0)
-    }
    /// Check if service collection cache should be updated
    fn should_update_cache(&self) -> bool {
        let state = self.state.read().unwrap();
@@ -789,10 +717,9 @@ impl SystemdCollector {
        let mut containers = Vec::new();
        // Check if docker is available (cm-agent user is in docker group)
-        // Use -a to show ALL containers (running and stopped)
-        let timeout_str = self.config.command_timeout_seconds.to_string();
+        // Use -a to show ALL containers (running and stopped) with 3 second timeout
        let output = Command::new("timeout")
-            .args(&[&timeout_str, "docker", "ps", "-a", "--format", "{{.Names}},{{.Status}}"])
+            .args(&["3", "docker", "ps", "-a", "--format", "{{.Names}},{{.Status}}"])
            .output();
        let output = match output {
@@ -833,10 +760,9 @@ impl SystemdCollector {
    /// Get docker images as sub-services
    fn get_docker_images(&self) -> Vec<(String, String, f32)> {
        let mut images = Vec::new();
-        // Check if docker is available (cm-agent user is in docker group)
-        let timeout_str = self.config.command_timeout_seconds.to_string();
+        // Check if docker is available (cm-agent user is in docker group) with 3 second timeout
        let output = Command::new("timeout")
-            .args(&[&timeout_str, "docker", "images", "--format", "{{.Repository}}:{{.Tag}},{{.Size}}"])
+            .args(&["3", "docker", "images", "--format", "{{.Repository}}:{{.Tag}},{{.Size}}"])
            .output();
        let output = match output {
@@ -915,9 +841,6 @@ impl SystemdCollector {
 #[async_trait]
 impl Collector for SystemdCollector {
    async fn collect_structured(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
-        // Clear existing services data to prevent duplicates in cached architecture
-        agent_data.services.clear();
        // Use cached complete data if available and fresh
        if let Some(cached_complete_services) = self.get_cached_complete_services() {
            for service_data in cached_complete_services {
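The property parsing in the new `get_service_status` boils down to splitting `KEY=VALUE` lines from `systemctl show` and converting a few of them. Below is a small std-only sketch of that step, written under assumptions: `ShowProps`, `parse_show_output`, and `uptime_seconds` are hypothetical names, and the start timestamp is kept as the raw string instead of being converted through an external `date` call as the diff does.

```rust
/// Subset of `systemctl show` properties used by the collector (hypothetical type).
#[derive(Debug, Default, PartialEq)]
struct ShowProps {
    active_state: String,
    memory_bytes: Option<u64>,
    restart_count: Option<u32>,
    start_timestamp_raw: Option<String>,
}

/// Parse `KEY=VALUE` lines as printed by `systemctl show --property=...`.
fn parse_show_output(output: &str) -> ShowProps {
    let mut props = ShowProps::default();
    for line in output.lines() {
        if let Some((key, value)) = line.split_once('=') {
            match key {
                "ActiveState" => props.active_state = value.to_string(),
                // A non-numeric value such as "[not set]" fails to parse and stays None
                "MemoryCurrent" => props.memory_bytes = value.parse().ok(),
                "NRestarts" => props.restart_count = value.parse().ok(),
                "ExecMainStartTimestamp" if !value.is_empty() && value != "[not set]" => {
                    props.start_timestamp_raw = Some(value.to_string());
                }
                _ => {}
            }
        }
    }
    props
}

/// Uptime in seconds given the current time and an optional start time,
/// both as seconds since the Unix epoch. Clamps to 0 if the clock went backwards.
fn uptime_seconds(now: u64, start: Option<u64>) -> Option<u64> {
    start.map(|s| now.saturating_sub(s))
}
```

Keeping the parser free of process spawning makes it easy to test against captured output; converting `start_timestamp_raw` to epoch seconds is left to the caller.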
--- a/agent/src/communication/mod.rs
+++ b/agent/src/communication/mod.rs
@@ -5,10 +5,9 @@ use zmq::{Context, Socket, SocketType};
 use crate::config::ZmqConfig;
-/// ZMQ communication handler for publishing metrics and receiving commands
+/// ZMQ communication handler for publishing metrics
 pub struct ZmqHandler {
    publisher: Socket,
-    command_receiver: Socket,
 }
 impl ZmqHandler {
@@ -26,20 +25,8 @@ impl ZmqHandler {
        publisher.set_sndhwm(1000)?; // High water mark for outbound messages
        publisher.set_linger(1000)?; // Linger time on close
-        // Create command receiver socket (PULL socket to receive commands from dashboard)
-        let command_receiver = context.socket(SocketType::PULL)?;
-        let cmd_bind_address = format!("tcp://{}:{}", config.bind_address, config.command_port);
-        command_receiver.bind(&cmd_bind_address)?;
-        info!("ZMQ command receiver bound to {}", cmd_bind_address);
-        // Set non-blocking mode for command receiver
-        command_receiver.set_rcvtimeo(0)?; // Non-blocking receive
-        command_receiver.set_linger(1000)?;
        Ok(Self {
            publisher,
-            command_receiver,
        })
    }
@@ -65,36 +52,4 @@ impl ZmqHandler {
        Ok(())
    }
-    /// Try to receive a command (non-blocking)
-    pub fn try_receive_command(&self) -> Result<Option<AgentCommand>> {
-        match self.command_receiver.recv_bytes(zmq::DONTWAIT) {
-            Ok(bytes) => {
-                debug!("Received command message ({} bytes)", bytes.len());
-                let command: AgentCommand = serde_json::from_slice(&bytes)
-                    .map_err(|e| anyhow::anyhow!("Failed to deserialize command: {}", e))?;
-                debug!("Parsed command: {:?}", command);
-                Ok(Some(command))
-            }
-            Err(zmq::Error::EAGAIN) => {
-                // No message available (non-blocking)
-                Ok(None)
-            }
-            Err(e) => Err(anyhow::anyhow!("ZMQ receive error: {}", e)),
-        }
-    }
 }
-/// Commands that can be sent to the agent
-#[derive(Debug, Clone, serde::Deserialize, serde::Serialize)]
-pub enum AgentCommand {
-    /// Request immediate metric collection
-    CollectNow,
-    /// Change collection interval
-    SetInterval { seconds: u64 },
-    /// Enable/disable a collector
-    ToggleCollector { name: String, enabled: bool },
-    /// Request status/health check
-    Ping,
-}
--- a/agent/src/config/mod.rs
+++ b/agent/src/config/mod.rs
@@ -13,14 +13,12 @@ pub struct AgentConfig {
    pub collectors: CollectorConfig,
    pub cache: CacheConfig,
    pub notifications: NotificationConfig,
-    pub collection_interval_seconds: u64,
 }
 /// ZMQ communication configuration
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct ZmqConfig {
    pub publisher_port: u16,
-    pub command_port: u16,
    pub bind_address: String,
    pub transmission_interval_seconds: u64,
    /// Heartbeat transmission interval in seconds for host connectivity detection
@@ -79,9 +77,6 @@ pub struct DiskConfig {
    pub temperature_critical_celsius: f32,
    pub wear_warning_percent: f32,
    pub wear_critical_percent: f32,
-    /// Command timeout in seconds for lsblk, smartctl, etc.
-    #[serde(default = "default_disk_command_timeout")]
-    pub command_timeout_seconds: u64,
 }
 /// Filesystem configuration entry
@@ -111,9 +106,6 @@ pub struct SystemdConfig {
    pub http_timeout_seconds: u64,
    pub http_connect_timeout_seconds: u64,
    pub nginx_latency_critical_ms: f32,
-    /// Command timeout in seconds for systemctl, docker, du commands
-    #[serde(default = "default_systemd_command_timeout")]
-    pub command_timeout_seconds: u64,
 }
@@ -138,9 +130,6 @@ pub struct BackupConfig {
 pub struct NetworkConfig {
    pub enabled: bool,
    pub interval_seconds: u64,
-    /// Command timeout in seconds for ip route, ip addr commands
-    #[serde(default = "default_network_command_timeout")]
-    pub command_timeout_seconds: u64,
 }
 /// Notification configuration
@@ -154,9 +143,6 @@ pub struct NotificationConfig {
    pub rate_limit_minutes: u64,
    /// Email notification batching interval in seconds (default: 60)
    pub aggregation_interval_seconds: u64,
-    /// Status check interval in seconds for detecting changes (default: 30)
-    #[serde(default = "default_notification_check_interval")]
-    pub check_interval_seconds: u64,
    /// List of metric names to exclude from email notifications
    #[serde(default)]
    pub exclude_email_metrics: Vec<String>,
@@ -170,26 +156,10 @@ fn default_heartbeat_interval_seconds() -> u64 {
    5
 }
-fn default_notification_check_interval() -> u64 {
-    30
-}
 fn default_maintenance_mode_file() -> String {
    "/tmp/cm-maintenance".to_string()
 }
-fn default_disk_command_timeout() -> u64 {
-    30
-}
-fn default_systemd_command_timeout() -> u64 {
-    15
-}
-fn default_network_command_timeout() -> u64 {
-    10
-}
 impl AgentConfig {
    pub fn from_file<P: AsRef<Path>>(path: P) -> Result<Self> {
        loader::load_config(path)
--- a/agent/src/config/validation.rs
+++ b/agent/src/config/validation.rs
@@ -7,21 +7,13 @@ pub fn validate_config(config: &AgentConfig) -> Result<()> {
        bail!("ZMQ publisher port cannot be 0");
    }
-    if config.zmq.command_port == 0 {
-        bail!("ZMQ command port cannot be 0");
-    }
-    if config.zmq.publisher_port == config.zmq.command_port {
-        bail!("ZMQ publisher and command ports cannot be the same");
-    }
    if config.zmq.bind_address.is_empty() {
        bail!("ZMQ bind address cannot be empty");
    }
-    // Validate collection interval
+    // Validate ZMQ transmission interval
-    if config.collection_interval_seconds == 0 {
+    if config.zmq.transmission_interval_seconds == 0 {
-        bail!("Collection interval cannot be 0");
+        bail!("ZMQ transmission interval cannot be 0");
    }
    // Validate CPU thresholds
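
Reviewer note: after this change the ZMQ checks reduce to three rules (non-zero publisher port, non-empty bind address, non-zero transmission interval). The sketch below restates them in isolation. `ZmqSettings` and `validate_zmq` are hypothetical names, and it returns `Result<(), String>` instead of using the `bail!` macro so that it compiles with the standard library alone.

```rust
/// Hypothetical, trimmed-down stand-in for the agent's ZMQ settings.
struct ZmqSettings {
    publisher_port: u16,
    bind_address: String,
    transmission_interval_seconds: u64,
}

/// Return the first failed check as a message, or Ok(()) if the settings are usable.
fn validate_zmq(cfg: &ZmqSettings) -> Result<(), String> {
    if cfg.publisher_port == 0 {
        return Err("ZMQ publisher port cannot be 0".to_string());
    }
    if cfg.bind_address.is_empty() {
        return Err("ZMQ bind address cannot be empty".to_string());
    }
    if cfg.transmission_interval_seconds == 0 {
        return Err("ZMQ transmission interval cannot be 0".to_string());
    }
    Ok(())
}
```

The checks run in the same order as in the diff, so the first misconfigured field is the one reported.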
--- a/dashboard/Cargo.toml
+++ b/dashboard/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "cm-dashboard"
-version = "0.1.194"
+version = "0.1.223"
 edition = "2021"
 [dependencies]
--- a/dashboard/src/ui/widgets/services.rs
+++ b/dashboard/src/ui/widgets/services.rs
@@ -28,11 +28,12 @@ pub struct ServicesWidget {
 #[derive(Clone)]
 struct ServiceInfo {
-    memory_mb: Option<f32>,
-    disk_gb: Option<f32>,
    metrics: Vec<(String, f32, Option<String>)>, // (label, value, unit)
    widget_status: Status,
    service_type: String, // "nginx_site", "container", "image", or empty for parent services
+    memory_bytes: Option<u64>,
+    restart_count: Option<u32>,
+    uptime_seconds: Option<u64>,
 }
 impl ServicesWidget {
@@ -52,8 +53,6 @@ impl ServicesWidget {
        if metric_name.starts_with("service_") {
            if let Some(end_pos) = metric_name
                .rfind("_status")
-                .or_else(|| metric_name.rfind("_memory_mb"))
-                .or_else(|| metric_name.rfind("_disk_gb"))
                .or_else(|| metric_name.rfind("_latency_ms"))
            {
                let service_part = &metric_name[8..end_pos]; // Remove "service_" prefix
@@ -76,36 +75,8 @@ impl ServicesWidget {
        None
    }
-    /// Format disk size with appropriate units (kB/MB/GB)
-    fn format_disk_size(size_gb: f32) -> String {
-        let size_mb = size_gb * 1024.0; // Convert GB to MB
-        if size_mb >= 1024.0 {
-            // Show as GB
-            format!("{:.1}GB", size_gb)
-        } else if size_mb >= 1.0 {
-            // Show as MB
-            format!("{:.0}MB", size_mb)
-        } else if size_mb >= 0.001 {
-            // Convert to kB
-            let size_kb = size_mb * 1024.0;
-            format!("{:.0}kB", size_kb)
-        } else {
-            // Show very small sizes as bytes
-            let size_bytes = size_mb * 1024.0 * 1024.0;
-            format!("{:.0}B", size_bytes)
-        }
-    }
    /// Format parent service line - returns text without icon for span formatting
    fn format_parent_service_line(&self, name: &str, info: &ServiceInfo) -> String {
-        let memory_str = info
-            .memory_mb
-            .map_or("0M".to_string(), |m| format!("{:.0}M", m));
-        let disk_str = info
-            .disk_gb
-            .map_or("0".to_string(), |d| Self::format_disk_size(d));
        // Truncate long service names to fit layout (account for icon space)
        let short_name = if name.len() > 22 {
            format!("{}...", &name[..19])
@@ -116,7 +87,7 @@ impl ServicesWidget {
        // Convert Status enum to display text
        let status_str = match info.widget_status {
            Status::Ok => "active",
-            Status::Inactive => "inactive", 
+            Status::Inactive => "inactive",
            Status::Critical => "failed",
            Status::Pending => "pending",
            Status::Warning => "warning",
@@ -124,9 +95,43 @@ impl ServicesWidget {
            Status::Offline => "offline",
        };
+        // Format memory
+        let memory_str = info.memory_bytes.map_or("-".to_string(), |bytes| {
+            let mb = bytes as f64 / (1024.0 * 1024.0);
+            if mb >= 1000.0 {
+                format!("{:.1}G", mb / 1024.0)
+            } else {
+                format!("{:.0}M", mb)
+            }
+        });
+        // Format uptime
+        let uptime_str = info.uptime_seconds.map_or("-".to_string(), |secs| {
+            let days = secs / 86400;
+            let hours = (secs % 86400) / 3600;
+            let mins = (secs % 3600) / 60;
+            if days > 0 {
+                format!("{}d{}h", days, hours)
+            } else if hours > 0 {
+                format!("{}h{}m", hours, mins)
+            } else {
+                format!("{}m", mins)
+            }
+        });
+        // Format restarts (show "!" if > 0 to indicate instability)
+        let restart_str = info.restart_count.map_or("-".to_string(), |count| {
+            if count > 0 {
+                format!("!{}", count)
+            } else {
+                "0".to_string()
+            }
+        });
        format!(
-            "{:<23} {:<10} {:<8} {:<8}",
+            "{:<23} {:<10} {:<8} {:<8} {:<5}",
-            short_name, status_str, memory_str, disk_str
+            short_name, status_str, memory_str, uptime_str, restart_str
        )
    }
@@ -309,11 +314,12 @@ impl Widget for ServicesWidget {
        for service in &agent_data.services {
            // Store parent service
            let parent_info = ServiceInfo {
-                memory_mb: Some(service.memory_mb),
-                disk_gb: Some(service.disk_gb),
                metrics: Vec::new(), // Parent services don't have custom metrics
                widget_status: service.service_status,
                service_type: String::new(), // Parent services have no type
+                memory_bytes: service.memory_bytes,
+                restart_count: service.restart_count,
+                uptime_seconds: service.uptime_seconds,
            };
            self.parent_services.insert(service.name.clone(), parent_info);
@@ -327,11 +333,12 @@ impl Widget for ServicesWidget {
                        .collect();
                    let sub_info = ServiceInfo {
-                        memory_mb: None, // Not used for sub-services
-                        disk_gb: None,   // Not used for sub-services
                        metrics,
                        widget_status: sub_service.service_status,
                        service_type: sub_service.service_type.clone(),
+                        memory_bytes: None, // Sub-services don't have individual metrics yet
+                        restart_count: None,
+                        uptime_seconds: None,
                    };
                    sub_list.push((sub_service.name.clone(), sub_info));
                }
@@ -371,23 +378,16 @@ impl ServicesWidget {
                            self.parent_services
                                .entry(parent_service)
                                .or_insert(ServiceInfo {
-                                    memory_mb: None,
-                                    disk_gb: None,
                                    metrics: Vec::new(),
                                    widget_status: Status::Unknown,
                                    service_type: String::new(),
+                                    memory_bytes: None,
+                                    restart_count: None,
+                                    uptime_seconds: None,
                                });
                        if metric.name.ends_with("_status") {
                            service_info.widget_status = metric.status;
-                        } else if metric.name.ends_with("_memory_mb") {
-                            if let Some(memory) = metric.value.as_f32() {
-                                service_info.memory_mb = Some(memory);
-                            }
-                        } else if metric.name.ends_with("_disk_gb") {
-                            if let Some(disk) = metric.value.as_f32() {
-                                service_info.disk_gb = Some(disk);
-                            }
                        }
                    }
                    Some(sub_name) => {
@@ -407,11 +407,12 @@ impl ServicesWidget {
                            sub_service_list.push((
                                sub_name.clone(),
                                ServiceInfo {
-                                    memory_mb: None,
-                                    disk_gb: None,
                                    metrics: Vec::new(),
                                    widget_status: Status::Unknown,
                                    service_type: String::new(), // Unknown type in legacy path
+                                    memory_bytes: None,
+                                    restart_count: None,
+                                    uptime_seconds: None,
                                },
                            ));
                            &mut sub_service_list.last_mut().unwrap().1
@@ -419,14 +420,6 @@ impl ServicesWidget {
                        if metric.name.ends_with("_status") {
                            sub_service_info.widget_status = metric.status;
-                        } else if metric.name.ends_with("_memory_mb") {
-                            if let Some(memory) = metric.value.as_f32() {
-                                sub_service_info.memory_mb = Some(memory);
-                            }
-                        } else if metric.name.ends_with("_disk_gb") {
-                            if let Some(disk) = metric.value.as_f32() {
-                                sub_service_info.disk_gb = Some(disk);
-                            }
                        }
                    }
                }
@@ -485,8 +478,8 @@ impl ServicesWidget {
        // Header
        let header = format!(
-            "{:<25} {:<10} {:<8} {:<8}",
+            "{:<25} {:<10} {:<8} {:<8} {:<5}",
-            "Service:", "Status:", "RAM:", "Disk:"
+            "Service:", "Status:", "RAM:", "Uptime:", "↻:"
        );
        let header_para = Paragraph::new(header).style(Typography::muted());
        frame.render_widget(header_para, content_chunks[0]);
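
Reviewer note: the new RAM, Uptime and restart columns follow three small formatting rules. The sketch below pulls them out as free functions so they can be checked with concrete inputs. The function names are hypothetical and the code is a standalone illustration, not the widget itself.

```rust
/// Memory from bytes: whole megabytes below 1000 MB, one-decimal gigabytes from there on.
fn format_memory(bytes: Option<u64>) -> String {
    bytes.map_or("-".to_string(), |b| {
        let mb = b as f64 / (1024.0 * 1024.0);
        if mb >= 1000.0 {
            format!("{:.1}G", mb / 1024.0)
        } else {
            format!("{:.0}M", mb)
        }
    })
}

/// Uptime as the two most significant units: days+hours, hours+minutes, or minutes.
fn format_uptime(seconds: Option<u64>) -> String {
    seconds.map_or("-".to_string(), |secs| {
        let days = secs / 86400;
        let hours = (secs % 86400) / 3600;
        let mins = (secs % 3600) / 60;
        if days > 0 {
            format!("{}d{}h", days, hours)
        } else if hours > 0 {
            format!("{}h{}m", hours, mins)
        } else {
            format!("{}m", mins)
        }
    })
}

/// Restart count, prefixed with "!" when non-zero to flag instability.
fn format_restarts(count: Option<u32>) -> String {
    count.map_or("-".to_string(), |c| {
        if c > 0 {
            format!("!{}", c)
        } else {
            "0".to_string()
        }
    })
}

/// One table row using the same column widths as the new format string in the diff.
fn format_row(
    name: &str,
    status: &str,
    memory_bytes: Option<u64>,
    uptime_seconds: Option<u64>,
    restart_count: Option<u32>,
) -> String {
    format!(
        "{:<23} {:<10} {:<8} {:<8} {:<5}",
        name,
        status,
        format_memory(memory_bytes),
        format_uptime(uptime_seconds),
        format_restarts(restart_count)
    )
}
```

One detail worth noticing: the switch to gigabytes happens at 1000 MB while the division uses 1024, so a service at exactly 1000 MB is shown as `1.0G`.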
--- a/dashboard/src/ui/widgets/system.rs
+++ b/dashboard/src/ui/widgets/system.rs
@@ -26,7 +26,7 @@ pub struct SystemWidget {
    cpu_load_1min: Option<f32>,
    cpu_load_5min: Option<f32>,
    cpu_load_15min: Option<f32>,
-    cpu_frequency: Option<f32>,
+    cpu_cstates: Vec<cm_dashboard_shared::CStateInfo>,
    cpu_status: Status,
    // Memory metrics
@@ -45,15 +45,9 @@ pub struct SystemWidget {
    storage_pools: Vec<StoragePool>,
    // Backup metrics
-    backup_status: String,
+    backup_repositories: Vec<String>,
-    backup_start_time_raw: Option<String>,
+    backup_repository_status: Status,
-    backup_disk_serial: Option<String>,
+    backup_disks: Vec<cm_dashboard_shared::BackupDiskData>,
-    backup_disk_usage_percent: Option<f32>,
-    backup_disk_used_gb: Option<f32>,
-    backup_disk_total_gb: Option<f32>,
-    backup_disk_wear_percent: Option<f32>,
-    backup_disk_temperature: Option<f32>,
-    backup_last_size_gb: Option<f32>,
    // Overall status
    has_data: bool,
@@ -102,7 +96,7 @@ impl SystemWidget {
            cpu_load_1min: None,
            cpu_load_5min: None,
            cpu_load_15min: None,
-            cpu_frequency: None,
+            cpu_cstates: Vec::new(),
            cpu_status: Status::Unknown,
            memory_usage_percent: None,
            memory_used_gb: None,
@@ -114,15 +108,9 @@ impl SystemWidget {
            tmp_status: Status::Unknown,
            tmpfs_mounts: Vec::new(),
            storage_pools: Vec::new(),
-            backup_status: "unknown".to_string(),
+            backup_repositories: Vec::new(),
-            backup_start_time_raw: None,
+            backup_repository_status: Status::Unknown,
-            backup_disk_serial: None,
+            backup_disks: Vec::new(),
-            backup_disk_usage_percent: None,
-            backup_disk_used_gb: None,
-            backup_disk_total_gb: None,
-            backup_disk_wear_percent: None,
-            backup_disk_temperature: None,
-            backup_last_size_gb: None,
            has_data: false,
        }
    }
@@ -137,12 +125,19 @@ impl SystemWidget {
        }
    }
-    /// Format CPU frequency
-    fn format_cpu_frequency(&self) -> String {
-        match self.cpu_frequency {
-            Some(freq) => format!("{:.0} MHz", freq),
-            None => "— MHz".to_string(),
+    /// Format CPU C-states (idle depth) with percentages
+    fn format_cpu_cstate(&self) -> String {
+        if self.cpu_cstates.is_empty() {
+            return "—".to_string();
        }
+        // Format top 3 C-states with percentages: "C10:79% C8:10% C6:8%"
+        // Agent already sends clean names (C3, C10, etc.)
+        self.cpu_cstates
+            .iter()
+            .map(|cs| format!("{}:{:.0}%", cs.name, cs.percent))
+            .collect::<Vec<_>>()
+            .join(" ")
    }
    /// Format memory usage
@@ -188,7 +183,7 @@ impl Widget for SystemWidget {
        self.cpu_load_1min = Some(cpu.load_1min);
        self.cpu_load_5min = Some(cpu.load_5min);
        self.cpu_load_15min = Some(cpu.load_15min);
-        self.cpu_frequency = Some(cpu.frequency_mhz);
+        self.cpu_cstates = cpu.cstates.clone();
        self.cpu_status = Status::Ok;
        // Extract memory data directly
@@ -214,25 +209,9 @@ impl Widget for SystemWidget {
        // Extract backup data
        let backup = &agent_data.backup;
-        self.backup_status = backup.status.clone();
+        self.backup_repositories = backup.repositories.clone();
-        self.backup_start_time_raw = backup.start_time_raw.clone();
+        self.backup_repository_status = backup.repository_status;
-        self.backup_last_size_gb = backup.last_backup_size_gb;
+        self.backup_disks = backup.disks.clone();
-        if let Some(disk) = &backup.repository_disk {
-            self.backup_disk_serial = Some(disk.serial.clone());
-            self.backup_disk_usage_percent = Some(disk.usage_percent);
-            self.backup_disk_used_gb = Some(disk.used_gb);
-            self.backup_disk_total_gb = Some(disk.total_gb);
-            self.backup_disk_wear_percent = disk.wear_percent;
-            self.backup_disk_temperature = disk.temperature_celsius;
-        } else {
-            self.backup_disk_serial = None;
-            self.backup_disk_usage_percent = None;
-            self.backup_disk_used_gb = None;
-            self.backup_disk_total_gb = None;
-            self.backup_disk_wear_percent = None;
-            self.backup_disk_temperature = None;
-        }
    }
 }
@@ -532,14 +511,36 @@ impl SystemWidget {
    fn render_backup(&self) -> Vec<Line<'_>> {
        let mut lines = Vec::new();
-        // First line: serial number with temperature and wear
+        // First section: Repository status and list
-        if let Some(serial) = &self.backup_disk_serial {
+        if !self.backup_repositories.is_empty() {
-            let truncated_serial = truncate_serial(serial);
+            let repo_text = format!("Repo: {}", self.backup_repositories.len());
+            let repo_spans = StatusIcons::create_status_spans(self.backup_repository_status, &repo_text);
+            lines.push(Line::from(repo_spans));
+            // List all repositories (sorted for consistent display)
+            let mut sorted_repos = self.backup_repositories.clone();
+            sorted_repos.sort();
+            let repo_count = sorted_repos.len();
+            for (idx, repo) in sorted_repos.iter().enumerate() {
+                let tree_char = if idx == repo_count - 1 { "└─" } else { "├─" };
+                lines.push(Line::from(vec![
+                    Span::styled(format!("  {} ", tree_char), Typography::tree()),
+                    Span::styled(repo.clone(), Typography::secondary()),
+                ]));
+            }
+        }
+        // Second section: Per-disk backup information (sorted by serial for consistent display)
+        let mut sorted_disks = self.backup_disks.clone();
+        sorted_disks.sort_by(|a, b| a.serial.cmp(&b.serial));
+        for disk in &sorted_disks {
+            let truncated_serial = truncate_serial(&disk.serial);
            let mut details = Vec::new();
-            if let Some(temp) = self.backup_disk_temperature {
+
+            if let Some(temp) = disk.temperature_celsius {
                details.push(format!("T: {}°C", temp as i32));
            }
-            if let Some(wear) = self.backup_disk_wear_percent {
+            if let Some(wear) = disk.wear_percent {
                details.push(format!("W: {}%", wear as i32));
            }
@@ -549,44 +550,40 @@ impl SystemWidget {
                truncated_serial
            };
-            let backup_status = match self.backup_status.as_str() {
-                "completed" | "success" => Status::Ok,
-                "running" => Status::Pending,
-                "failed" => Status::Critical,
-                _ => Status::Unknown,
-            };
-            let disk_spans = StatusIcons::create_status_spans(backup_status, &disk_text);
+            // Overall disk status (worst of backup and usage)
+            let disk_status = disk.backup_status.max(disk.usage_status);
+            let disk_spans = StatusIcons::create_status_spans(disk_status, &disk_text);
            lines.push(Line::from(disk_spans));
-            // Show backup time from TOML if available
-            if let Some(start_time) = &self.backup_start_time_raw {
-                let time_text = if let Some(size) = self.backup_last_size_gb {
-                    format!("Time: {} ({:.1}GB)", start_time, size)
-                } else {
-                    format!("Time: {}", start_time)
-                };
-                lines.push(Line::from(vec![
+            // Show backup time with status
+            if let Some(backup_time) = &disk.last_backup_time {
+                let time_text = format!("Backup: {}", backup_time);
+                let mut time_spans = vec![
                    Span::styled("  ├─ ", Typography::tree()),
-                    Span::styled(time_text, Typography::secondary())
-                ]));
+                ];
+                time_spans.extend(StatusIcons::create_status_spans(disk.backup_status, &time_text));
+                lines.push(Line::from(time_spans));
            }
-            // Usage information
-            if let (Some(used), Some(total), Some(usage_percent)) = (
-                self.backup_disk_used_gb,
-                self.backup_disk_total_gb,
-                self.backup_disk_usage_percent
-            ) {
-                let usage_text = format!("Usage: {:.0}% {:.0}GB/{:.0}GB", usage_percent, used, total);
-                let usage_spans = StatusIcons::create_status_spans(Status::Ok, &usage_text);
-                let mut full_spans = vec![
-                    Span::styled("  └─ ", Typography::tree()),
-                ];
-                full_spans.extend(usage_spans);
-                lines.push(Line::from(full_spans));
-            }
+            // Show usage with status and archive count
+            let archive_display = if disk.archives_min == disk.archives_max {
+                format!("{}", disk.archives_min)
+            } else {
+                format!("{}-{}", disk.archives_min, disk.archives_max)
+            };
+
+            let usage_text = format!(
+                "Usage: ({}) {:.0}% {:.0}GB/{:.0}GB",
+                archive_display,
+                disk.disk_usage_percent,
+                disk.disk_used_gb,
+                disk.disk_total_gb
+            );
+            let mut usage_spans = vec![
+                Span::styled("  └─ ", Typography::tree()),
+            ];
+            usage_spans.extend(StatusIcons::create_status_spans(disk.usage_status, &usage_text));
+            lines.push(Line::from(usage_spans));
        }
        lines
@@ -832,10 +829,10 @@ impl SystemWidget {
        );
        lines.push(Line::from(cpu_spans));
-        let freq_text = self.format_cpu_frequency();
+        let cstate_text = self.format_cpu_cstate();
        lines.push(Line::from(vec![
            Span::styled("  └─ ", Typography::tree()),
-            Span::styled(format!("Freq: {}", freq_text), Typography::secondary())
+            Span::styled(format!("C-state: {}", cstate_text), Typography::secondary())
        ]));
        // RAM section
@@ -894,7 +891,7 @@ impl SystemWidget {
        lines.extend(storage_lines);
        // Backup section (if available)
-        if self.backup_status != "unavailable" && self.backup_status != "unknown" {
+        if !self.backup_repositories.is_empty() || !self.backup_disks.is_empty() {
            lines.push(Line::from(vec![
                Span::styled("Backup:", Typography::widget_title())
            ]));
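
Reviewer note: two ideas in the new backup rendering are easy to miss in the diff. The disk header takes the worse of the backup and usage statuses via `max`, and the archive count collapses to a single number when every service agrees. The sketch below shows both in isolation. The `Status` enum here is a hypothetical three-variant stand-in whose declaration order (and therefore derived ordering) is an assumption; the real enum lives in the shared crate and may order its variants differently.

```rust
/// Hypothetical status enum. Deriving `Ord` ranks variants by declaration order,
/// so `Ok < Warning < Critical` and `max` picks the worse of two statuses.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Status {
    Ok,
    Warning,
    Critical,
}

/// Overall disk status: the worse of the backup status and the usage status.
fn overall_status(backup: Status, usage: Status) -> Status {
    backup.max(usage)
}

/// Single number when all services report the same archive count, "min-max" otherwise.
fn archive_display(archives_min: i64, archives_max: i64) -> String {
    if archives_min == archives_max {
        format!("{}", archives_min)
    } else {
        format!("{}-{}", archives_min, archives_max)
    }
}

/// Usage line text in the same shape as the new format string in the diff.
fn usage_line(archives_min: i64, archives_max: i64, percent: f32, used_gb: f32, total_gb: f32) -> String {
    format!(
        "Usage: ({}) {:.0}% {:.0}GB/{:.0}GB",
        archive_display(archives_min, archives_max),
        percent,
        used_gb,
        total_gb
    )
}
```

A range such as `10-12` in the Usage line is the signal that services on the same disk have diverging archive counts.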
--- a/shared/Cargo.toml
+++ b/shared/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "cm-dashboard-shared"
-version = "0.1.194"
+version = "0.1.223"
 edition = "2021"
 [dependencies]
--- a/shared/src/agent_data.rs
+++ b/shared/src/agent_data.rs
@@ -40,13 +40,20 @@ pub struct NetworkInterfaceData {
    pub vlan_id: Option<u16>,
 }
+/// CPU C-state usage information
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct CStateInfo {
+    pub name: String,
+    pub percent: f32,
+}
 /// CPU monitoring data
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct CpuData {
    pub load_1min: f32,
    pub load_5min: f32,
    pub load_15min: f32,
-    pub frequency_mhz: f32,
+    pub cstates: Vec<CStateInfo>, // C-state usage percentages (C1, C6, C10, etc.) - indicates CPU idle depth distribution
    pub temperature_celsius: Option<f32>,
    pub load_status: Status,
    pub temperature_status: Status,
@@ -136,11 +143,15 @@ pub struct PoolDriveData {
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct ServiceData {
    pub name: String,
-    pub memory_mb: f32,
-    pub disk_gb: f32,
    pub user_stopped: bool,
    pub service_status: Status,
    pub sub_services: Vec<SubServiceData>,
+    /// Memory usage in bytes (from MemoryCurrent)
+    pub memory_bytes: Option<u64>,
+    /// Number of service restarts (from NRestarts)
+    pub restart_count: Option<u32>,
+    /// Uptime in seconds (calculated from ExecMainStartTimestamp)
+    pub uptime_seconds: Option<u64>,
 }
 /// Sub-service data (nginx sites, docker containers, etc.)
@@ -165,23 +176,27 @@ pub struct SubServiceMetric {
 /// Backup system data
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct BackupData {
-    pub status: String,
+    pub repositories: Vec<String>,
-    pub total_size_gb: Option<f32>,
+    pub repository_status: Status,
-    pub repository_health: Option<String>,
+    pub disks: Vec<BackupDiskData>,
-    pub repository_disk: Option<BackupDiskData>,
-    pub last_backup_size_gb: Option<f32>,
-    pub start_time_raw: Option<String>,
 }
 /// Backup repository disk information
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct BackupDiskData {
    pub serial: String,
-    pub usage_percent: f32,
+    pub product_name: Option<String>,
-    pub used_gb: f32,
-    pub total_gb: f32,
    pub wear_percent: Option<f32>,
    pub temperature_celsius: Option<f32>,
+    pub last_backup_time: Option<String>,
+    pub backup_status: Status,
+    pub disk_usage_percent: f32,
+    pub disk_used_gb: f32,
+    pub disk_total_gb: f32,
+    pub usage_status: Status,
+    pub services: Vec<String>,
+    pub archives_min: i64,
+    pub archives_max: i64,
 }
 impl AgentData {
@@ -200,7 +215,7 @@ impl AgentData {
                    load_1min: 0.0,
                    load_5min: 0.0,
                    load_15min: 0.0,
-                    frequency_mhz: 0.0,
+                    cstates: Vec::new(),
                    temperature_celsius: None,
                    load_status: Status::Unknown,
                    temperature_status: Status::Unknown,
@@ -222,12 +237,9 @@ impl AgentData {
            },
            services: Vec::new(),
            backup: BackupData {
-                status: "unknown".to_string(),
+                repositories: Vec::new(),
-                total_size_gb: None,
+                repository_status: Status::Unknown,
-                repository_health: None,
+                disks: Vec::new(),
-                repository_disk: None,
-                last_backup_size_gb: None,
-                start_time_raw: None,
            },
        }
    }
Author	SHA1	Message	Date
Christoffer Martinsson	8a0e68f0e3	Fix Data_3 timeout by parallelizing SMART collection Root cause: SMART data was collected sequentially, one drive at a time. With 5 drives taking ~500ms each, total collection time was 2.5+ seconds. When disk collector runs every 1 second, this caused overlapping collections creating resource contention. The last drive (sda/Data_3) would timeout due to the drive being accessed by the previous collection. Solution: Query all drives in parallel using futures::join_all. Now all drives get their SMART data collected simultaneously with independent 3-second timeouts, eliminating contention and reducing total collection time from 2.5+ seconds to ~500ms (the slowest single drive). Benefits: - All drives complete in ~500ms instead of 2.5+ seconds - No overlapping collections causing resource contention - Each drive gets full 3-second timeout window - sda/Data_3 should now show temperature and serial number Bump version to v0.1.223	2025-11-29 23:51:43 +01:00
Christoffer Martinsson	2d653fe9ae	Fix empty Storage section by configuring stdio pipes Root cause: run_command_with_timeout() was calling cmd.spawn() without configuring stdout/stderr pipes. This caused command output to go to journald instead of being captured by wait_with_output(). The disk collector received empty output and failed silently. Solution: Configure stdout(Stdio::piped()) and stderr(Stdio::piped()) before spawning commands. This ensures wait_with_output() can properly capture command output. Fixes: Empty Storage section, lsblk output appearing in journald Bump version to v0.1.222	2025-11-29 23:25:17 +01:00
Christoffer Martinsson	caba78004e	Fix empty Storage section by properly aliasing command types v0.1.220 broke disk collector by changing the import from std::process::Command to tokio::process::Command, but lines 193 and 767 explicitly used std::process::Command::new() which silently failed. Solution: Import both as aliases (TokioCommand/StdCommand) and use appropriate type for each operation - async commands use TokioCommand with run_command_with_timeout, sync commands use StdCommand with system timeout wrapper. Fixes: Empty Storage section after v0.1.220 deployment Bump version to v0.1.221	2025-11-29 21:29:33 +01:00
Christoffer Martinsson	77bf08a978	Fix blocking smartctl commands with proper async/timeout handling - Changed disk collector to use tokio::process::Command instead of std::process::Command - Updated run_command_with_timeout to properly kill processes on timeout - Fixes issue where smartctl hangs on problematic drives (/dev/sda) freezing entire agent - Timeout now force-kills hung processes using kill -9, preventing orphaned smartctl processes This resolves the issue where Data_3 showed unknown status because smartctl was hanging indefinitely trying to read from a problematic drive, blocking the entire collector. Bump version to v0.1.220 Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 21:09:04 +01:00
Christoffer Martinsson	929870f8b6	Bump version to v0.1.219	2025-11-29 18:35:14 +01:00
Christoffer Martinsson	7aae852b7b	Bump version to v0.1.218	2025-11-29 17:59:33 +01:00
Christoffer Martinsson	40f3ff66d8	Show archive count range to detect inconsistencies - Display single number if all services have same count - Display min-max range if counts differ (indicates problem)	2025-11-29 17:59:24 +01:00
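The display rule in this commit (one number when all services agree, a min-max range when they differ) can be sketched as follows. The function name and the `-` placeholder for an empty list are assumptions made for illustration; the commit does not say what is shown when there are no services.

```rust
/// Hypothetical formatter: collapses per-service archive counts into
/// a single number, or a "min-max" range when the counts disagree.
fn format_archive_counts(counts: &[u64]) -> String {
    match (counts.iter().min(), counts.iter().max()) {
        // All services report the same count: show it once.
        (Some(min), Some(max)) if min == max => min.to_string(),
        // Counts differ, which the commit treats as a sign of a problem.
        (Some(min), Some(max)) => format!("{}-{}", min, max),
        // No services at all (assumed placeholder).
        _ => "-".to_string(),
    }
}
```

With counts `[7, 7, 7]` this yields `7`, and with `[5, 7, 6]` it yields `5-7`.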
Christoffer Martinsson	1c1beddb55	Bump version to v0.1.217 All checks were successful Build and Release / build-and-release (push) Successful in 1m20s Details	2025-11-29 17:51:13 +01:00
Christoffer Martinsson	620d1f10b6	Show archive count per service instead of total sum	2025-11-29 17:51:01 +01:00
Christoffer Martinsson	a0d571a40e	Bump version to v0.1.216 All checks were successful Build and Release / build-and-release (push) Successful in 1m19s Details	2025-11-29 17:44:12 +01:00
Christoffer Martinsson	977200fff3	Move archive count to Usage line in backup display	2025-11-29 17:44:05 +01:00
Christoffer Martinsson	d692de5f83	Bump version to v0.1.215 All checks were successful Build and Release / build-and-release (push) Successful in 1m11s Details	2025-11-29 17:41:49 +01:00
Christoffer Martinsson	f5913dbd43	Add archive count to backup disk display	2025-11-29 17:41:11 +01:00
Christoffer Martinsson	faa30a7839	Sort backup repositories and disks for stable display - Sort repositories alphabetically before rendering - Sort backup disks by serial number - Prevents display jumping between different orderings on updates - Consistent display order across refreshes Bump version to v0.1.214 Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 17:15:17 +01:00
Christoffer Martinsson	6e4a42799f	Bump version to v0.1.213 All checks were successful Build and Release / build-and-release (push) Successful in 1m22s Details	2025-11-29 16:46:16 +01:00
Christoffer Martinsson	afb8d68e03	Implement multi-disk backup support - Update BackupData structure to support multiple backup disks - Scan /var/lib/backup/status/ directory for all status files - Calculate status icons for backup and disk usage - Aggregate repository status from all disks - Update dashboard to display all backup disks with per-disk status - Display repository list with count and aggregated status	2025-11-29 16:44:50 +01:00
Christoffer Martinsson	5e08b34280	Move C-state name cleaning to agent for smaller JSON - Agent now extracts "C" + digits pattern (C3, C10) using char parsing - Removes suffixes like "_ACPI", "_MWAIT" at source - Reduces JSON payload size over ZMQ - No regex dependency - uses fast char iteration (~1μs overhead) - Robust fallback to original name if pattern not found - Dashboard simplified to use clean names directly Bump version to v0.1.212 Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 14:05:55 +01:00
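A minimal sketch of the char-based cleaning this commit describes, assuming the "C" + digits pattern is expected at the start of the name. `clean_cstate_name` is a hypothetical name, and the agent's actual parsing may differ.

```rust
/// Hypothetical helper: keeps the leading "C" + digits of a C-state name
/// (e.g. "C3_ACPI" -> "C3") and falls back to the original name otherwise.
fn clean_cstate_name(raw: &str) -> String {
    let mut chars = raw.chars();

    // The pattern must start with a literal 'C'.
    if chars.next() != Some('C') {
        return raw.to_string();
    }

    // Collect the run of ASCII digits that follows the 'C'.
    let digits: String = chars.take_while(|c| c.is_ascii_digit()).collect();

    if digits.is_empty() {
        // No digits after 'C': pattern not found, keep the original name.
        raw.to_string()
    } else {
        format!("C{}", digits)
    }
}
```

Names such as `C3_ACPI` and `C10_MWAIT` reduce to `C3` and `C10`, while a name like `POLL` passes through unchanged.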
Christoffer Martinsson	0d8284b69c	Clean C-state display to show only CX format - Strip suffixes like "_ACPI" from C-state names - Display changes from "C3_ACPI:51%" to "C3:51%" - Cleaner, more concise presentation Bump version to v0.1.211 Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 13:34:01 +01:00
Christoffer Martinsson	d84690cb3b	Move transmission interval to ZMQ config section - Changed code to use zmq.transmission_interval_seconds instead of top-level collection_interval_seconds - Removed collection_interval_seconds from AgentConfig - Updated validation to check zmq.transmission_interval_seconds - Improves config organization by grouping all ZMQ settings together Bump version to v0.1.210 Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-29 13:31:39 +01:00
Christoffer Martinsson	7c030b33d6	Show top 3 C-states with usage percentages - Changed CpuData.cstate from String to Vec<CStateInfo> - Added CStateInfo struct with name and percent fields - Collector calculates percentage for each C-state based on accumulated time - Sorts and returns top 3 C-states by usage - Dashboard displays: "C10:79% C8:10% C6:8%" Provides better visibility into CPU idle state distribution. Bump version to v0.1.209 Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-28 23:45:46 +01:00
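The percentage calculation and top-3 selection described here might look roughly like the sketch below. `CStateInfo` with `name` and `percent` fields comes from the commit message; the field types, the `top_cstates` and `format_cstates` functions, and the input shape (name plus accumulated time) are assumptions for illustration.

```rust
/// Mirrors the struct named in the commit; field types are assumed.
#[derive(Debug, Clone, PartialEq)]
struct CStateInfo {
    name: String,
    percent: f32,
}

/// Hypothetical helper: turns accumulated per-state times into percentages
/// and keeps the `limit` most-used states, highest first.
fn top_cstates(times: &[(&str, u64)], limit: usize) -> Vec<CStateInfo> {
    let total: u64 = times.iter().map(|(_, t)| *t).sum();
    if total == 0 {
        return Vec::new();
    }

    let mut states: Vec<CStateInfo> = times
        .iter()
        .map(|(name, t)| CStateInfo {
            name: name.to_string(),
            percent: *t as f32 / total as f32 * 100.0,
        })
        .collect();

    // Sort descending by share of accumulated time.
    states.sort_by(|a, b| b.percent.total_cmp(&a.percent));
    states.truncate(limit);
    states
}

/// Hypothetical dashboard formatter producing "NAME:NN%" pairs.
fn format_cstates(states: &[CStateInfo]) -> String {
    states
        .iter()
        .map(|s| format!("{}:{:.0}%", s.name, s.percent))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Given accumulated times of 790, 100, 80 and 30 for C10, C8, C6 and C1, `format_cstates(&top_cstates(&times, 3))` produces the `C10:79% C8:10% C6:8%` form quoted in the commit.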
Christoffer Martinsson	c6817537a8	Replace CPU frequency with C-state monitoring - Changed CpuData.frequency_mhz to CpuData.cstate (String) - Implemented collect_cstate() to read CPU idle depth from sysfs - Finds deepest C-state with most accumulated time (C0-C10) - Updated dashboard to display C-state instead of frequency - More accurate indicator of CPU activity vs power management Bump version to v0.1.208 Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-28 23:30:14 +01:00
Christoffer Martinsson	2189d34b16	Bump version to v0.1.207 All checks were successful Build and Release / build-and-release (push) Successful in 1m9s Details	2025-11-28 23:16:33 +01:00
Christoffer Martinsson	28cfd5758f	Fix service metrics not showing - remove cache check The service_status_cache from discovery only has active_state with all detailed metrics set to None. During collection, get_service_status() was returning cached data instead of fetching fresh systemctl show data. Now always fetch fresh data to populate memory_bytes, restart_count, and uptime_seconds properly.	2025-11-28 23:15:51 +01:00
Christoffer Martinsson	5deb8cf8d8	Bump version to v0.1.206 All checks were successful Build and Release / build-and-release (push) Successful in 1m10s Details	2025-11-28 23:07:20 +01:00
Christoffer Martinsson	0e01813ff5	Add service metrics from systemctl (memory, uptime, restarts) Shared: - Add memory_bytes, restart_count, uptime_seconds to ServiceData Agent: - Add new fields to ServiceStatusInfo struct - Fetch MemoryCurrent, NRestarts, ExecMainStartTimestamp from systemctl show - Calculate uptime from start timestamp - Parse and populate new fields in ServiceData - Remove unused load_state and sub_state fields Dashboard: - Add memory_bytes, restart_count, uptime_seconds to ServiceInfo - Update header: Service, Status, RAM, Uptime, ↻ (restarts) - Format memory as MB/GB - Format uptime as Xd Xh, Xh Xm, or Xm - Show restart count with ! prefix if > 0 to indicate instability All metrics obtained from single systemctl show call - zero overhead.	2025-11-28 23:06:13 +01:00
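The dashboard formatting rules listed in this commit (memory as MB/GB, uptime as Xd Xh, Xh Xm, or Xm, restarts prefixed with `!` when above zero) can be sketched as below. The function names, the decimal precision, and the plain `0` shown for zero restarts are assumptions; only the overall formats come from the commit message.

```rust
/// Hypothetical formatter: "Xd Xh", "Xh Xm", or "Xm" depending on magnitude.
fn format_uptime(seconds: u64) -> String {
    let days = seconds / 86_400;
    let hours = (seconds % 86_400) / 3_600;
    let minutes = (seconds % 3_600) / 60;

    if days > 0 {
        format!("{}d {}h", days, hours)
    } else if hours > 0 {
        format!("{}h {}m", hours, minutes)
    } else {
        format!("{}m", minutes)
    }
}

/// Hypothetical formatter: GB with one decimal at or above 1 GiB, else whole MB.
fn format_memory(bytes: u64) -> String {
    const MB: f64 = 1024.0 * 1024.0;
    const GB: f64 = MB * 1024.0;
    let b = bytes as f64;

    if b >= GB {
        format!("{:.1}GB", b / GB)
    } else {
        format!("{:.0}MB", b / MB)
    }
}

/// Hypothetical formatter: "!" prefix flags a service that has restarted.
fn format_restarts(count: u32) -> String {
    if count > 0 {
        format!("!{}", count)
    } else {
        "0".to_string()
    }
}
```

For example, an uptime of 90,000 seconds renders as `1d 1h`, and 512 MiB of memory renders as `512MB`.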
Christoffer Martinsson	c3c9507a42	Bump version to v0.1.205 All checks were successful Build and Release / build-and-release (push) Successful in 1m22s Details	2025-11-28 22:37:28 +01:00
Christoffer Martinsson	4d77ffe17e	Remove RAM and Disk columns from services widget header Changed header from 4 columns to 2 columns: - Before: Service, Status, RAM, Disk - After: Service, Status Matches the removal of memory_mb and disk_gb fields.	2025-11-28 22:37:14 +01:00
Christoffer Martinsson	14f74b4cac	Bump version to v0.1.204 All checks were successful Build and Release / build-and-release (push) Successful in 1m20s Details	2025-11-28 14:33:19 +01:00
Christoffer Martinsson	67b686f8c7	Remove RAM and disk collection for services Complete removal of service resource metrics: Agent: - Remove memory_mb and disk_gb fields from ServiceData struct - Remove get_service_memory_usage() method - Remove get_service_disk_usage() method - Remove get_directory_size() method - Remove unused warn import Dashboard: - Remove memory_mb and disk_gb from ServiceInfo struct - Remove memory/disk display from format_parent_service_line - Remove memory/disk parsing in legacy metric path - Remove unused format_disk_size() function Service resource metrics were slow, unreliable, and never worked properly since structured data migration. Will be handled differently in the future.	2025-11-28 14:25:12 +01:00
Christoffer Martinsson	e3996fdb84	Fix compilation errors from command receiver removal - Remove AgentCommand import from agent.rs - Remove handle_commands() method - Remove command handling from main loop - Remove command_port validation checks	2025-11-28 13:01:36 +01:00
Christoffer Martinsson	f94ca60e69	Bump version to v0.1.203 Some checks failed Build and Release / build-and-release (push) Failing after 1m36s Details	2025-11-28 12:53:56 +01:00
Christoffer Martinsson	c19ff56df8	Remove unused ZMQ command receiver (port 6131) Service control migrated to SSH, command receiver no longer needed. - Remove command_receiver Socket from ZmqHandler - Remove try_receive_command method - Remove AgentCommand enum - Remove command_port from ZmqConfig	2025-11-28 12:52:43 +01:00
Christoffer Martinsson	fe2f604703	Bump version to v0.1.202 All checks were successful Build and Release / build-and-release (push) Successful in 1m8s Details	2025-11-28 12:45:25 +01:00
Christoffer Martinsson	8bfd416327	Revert to v0.1.192 - fix agent hang issue Some checks failed Build and Release / build-and-release (push) Failing after 1m8s Details	2025-11-28 12:42:24 +01:00
Christoffer Martinsson	85c6c624fb	Revert D-Bus usage, use systemctl commands only - Remove zbus dependency from agent - Replace D-Bus Connection calls with systemctl show commands - Fix agent hang by eliminating blocking D-Bus operations - get_unit_property now uses systemctl show with property flags - Memory, disk usage, and nginx config queries use systemctl - Simpler, more reliable service monitoring	2025-11-28 12:15:04 +01:00
Christoffer Martinsson	eab3f17428	Fix agent hang by reverting service discovery to systemctl The D-Bus ListUnits call in discover_services_internal() was causing the agent to hang on startup. Root cause: - D-Bus ListUnits call with complex tuple destructuring hung indefinitely - Agent never completed first collection cycle - No collector output in logs Fix: - Revert discover_services_internal() to use systemctl list-units/list-unit-files - Keep D-Bus-based property queries (WorkingDirectory, MemoryCurrent, ExecStart) - Hybrid approach: systemctl for discovery, D-Bus for individual queries External commands still used: - systemctl list-units, list-unit-files (service discovery) - smartctl (SMART data) - sudo du (directory sizes) - nginx -T (config fallback) Version bump: 0.1.198 → 0.1.199	2025-11-28 11:57:31 +01:00
Christoffer Martinsson	7ad149bbe4	Replace all systemctl commands with zbus D-Bus API Complete migration from systemctl subprocess calls to native D-Bus communication: Removed systemctl commands: - systemctl is-active (fallback) - use D-Bus cache from ListUnits - systemctl show --property=LoadState,ActiveState,SubState - use D-Bus cache - systemctl show --property=WorkingDirectory - use D-Bus Properties.Get - systemctl show --property=MemoryCurrent - use D-Bus Properties.Get - systemctl show nginx --property=ExecStart - use D-Bus Properties.Get Implementation details: - Added get_unit_property() helper for D-Bus property access - Made get_nginx_site_metrics() async to support D-Bus calls - Made get_nginx_sites_internal() async - Made discover_nginx_sites() async - Made get_nginx_config_from_systemd() async - Fixed RwLock guard Send issues by using scoped locks Remaining external commands: - smartctl (disk.rs) - No Rust alternative for SMART data - sudo du (systemd.rs) - Directory size measurement - nginx -T (systemd.rs) - Nginx config fallback - timeout hostname (nixos.rs) - Rare fallback only Version bump: 0.1.197 → 0.1.198	2025-11-28 11:46:28 +01:00
Christoffer Martinsson	b444c88ea0	Replace external commands with native Rust APIs Significant performance improvements by eliminating subprocess spawning: - Replace 'ip' commands with rtnetlink for network interface discovery - Replace 'docker ps/images' with bollard Docker API client - Replace 'systemctl list-units' with zbus D-Bus for systemd interaction - Replace 'df' with statvfs() syscall for filesystem statistics - Replace 'lsblk' with /proc/mounts parsing Add interval-based caching to collectors: - DiskCollector now respects interval_seconds configuration - SystemdCollector now respects interval_seconds configuration - CpuCollector now respects interval_seconds configuration Remove unused command communication infrastructure: - Remove port 6131 ZMQ command receiver - Clean up unused AgentCommand types Dependencies added: - rtnetlink = "0.14" - netlink-packet-route = "0.19" - bollard = "0.17" - zbus = "4.0" - nix (fs features for statvfs)	2025-11-28 11:27:33 +01:00
Christoffer Martinsson	317cf76bd1	Bump version to v0.1.196 All checks were successful Build and Release / build-and-release (push) Successful in 1m19s Details	2025-11-27 23:16:40 +01:00
Christoffer Martinsson	0db1a165b9	Revert "Implement cached collector architecture with configurable timeouts" This reverts commit `2740de9b54`.	2025-11-27 23:12:08 +01:00
Christoffer Martinsson	3c2955376d	Revert "Fix ZMQ sender blocking - move to independent thread with try_read" This reverts commit `01e1f33b66`.	2025-11-27 23:10:55 +01:00
Christoffer Martinsson	f09ccabc7f	Revert "Fix data duplication in cached collector architecture" This reverts commit `14618c59c6`.	2025-11-27 23:09:40 +01:00
Christoffer Martinsson	43dd5a901a	Update CLAUDE.md with correct ZMQ sender architecture	2025-11-27 22:59:38 +01:00
Christoffer Martinsson	01e1f33b66	Fix ZMQ sender blocking - move to independent thread with try_read CRITICAL FIX: The previous cached collector architecture still had ZMQ sending in the main event loop, where it could block waiting for RwLock when collectors were writing. This caused the 3-8 second delays you observed. Changes: - Move ZMQ publisher to dedicated std::thread (ZMQ sockets aren't thread-safe) - Use try_read() instead of read() to avoid blocking on write locks - Send previous data if cache is locked by collector - ZMQ now sends every 2s regardless of collector timing - Remove publisher from ZmqHandler (now only handles commands) Architecture: - Collectors: Independent tokio tasks updating shared cache - ZMQ Sender: Dedicated OS thread with its own publisher socket - Main Loop: Only handles commands and notifications This ensures ZMQ transmission is NEVER blocked by slow collectors. Bump version to v0.1.195	2025-11-27 22:56:58 +01:00
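The non-blocking read with fallback to previously sent data can be sketched with `std::sync::RwLock`. The `CachedSender` type, its fields, and the use of a plain `String` payload are hypothetical; the actual agent shares its cache between tokio tasks and a ZMQ publisher thread, which this sketch leaves out.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical sender state: a shared cache written by collectors,
/// plus a copy of the last payload that was successfully read.
struct CachedSender {
    cache: Arc<RwLock<String>>,
    last_sent: String,
}

impl CachedSender {
    /// Returns the payload to transmit without ever waiting for the lock.
    fn next_payload(&mut self) -> String {
        // try_read() returns an error immediately if a writer holds the lock,
        // whereas read() would block until the collector finishes.
        if let Ok(guard) = self.cache.try_read() {
            self.last_sent = guard.clone();
        }
        // Lock was busy (or poisoned): fall back to the previous data.
        self.last_sent.clone()
    }
}
```

The design trade-off is that a payload may be one collection cycle stale, in exchange for a transmit loop whose timing no longer depends on how long a collector holds the write lock.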