Christoffer Martinsson 8a36472a3d Implement real-time process monitoring and fix UI hardcoded data

This commit addresses several key issues identified during development:

Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add intelligent process monitoring to CpuCollector using ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information

Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering to prevent self-monitoring artifacts (ps commands)
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"

Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy

UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface

Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details

2025-10-16 23:55:05 +02:00

26 KiB


CM Dashboard Agent Architecture

Overview

This document defines the architecture for the CM Dashboard Agent. The agent collects individual metrics and sends them to the dashboard via ZMQ. The dashboard decides which metrics to use in which widgets.

Core Philosophy

Individual Metrics Approach: The agent collects and transmits individual metrics (e.g., cpu_load_1min, memory_usage_percent, backup_last_run) rather than grouped metric structures. This provides maximum flexibility for dashboard widget composition.

Folder Structure

cm-dashboard/
├── agent/                         # Agent application
│   ├── Cargo.toml
│   ├── src/
│   │   ├── main.rs                    # Entry point with CLI parsing
│   │   ├── agent.rs                   # Main Agent orchestrator
│   │   ├── config/
│   │   │   ├── mod.rs                 # Configuration module exports
│   │   │   ├── loader.rs              # TOML configuration loading
│   │   │   ├── defaults.rs            # Default configuration values
│   │   │   └── validation.rs          # Configuration validation
│   │   ├── communication/
│   │   │   ├── mod.rs                 # Communication module exports
│   │   │   ├── zmq_config.rs          # ZMQ configuration structures
│   │   │   ├── zmq_handler.rs         # ZMQ socket management
│   │   │   ├── protocol.rs            # Message format definitions
│   │   │   └── error.rs               # Communication errors
│   │   ├── metrics/
│   │   │   ├── mod.rs                 # Metrics module exports
│   │   │   ├── registry.rs            # Metric name registry and types
│   │   │   ├── value.rs               # Metric value types and status
│   │   │   ├── cache.rs               # Individual metric caching
│   │   │   └── collection.rs          # Metric collection storage
│   │   ├── collectors/
│   │   │   ├── mod.rs                 # Collector trait definition
│   │   │   ├── cpu.rs                 # CPU-related metrics
│   │   │   ├── memory.rs              # Memory-related metrics
│   │   │   ├── disk.rs                # Disk usage metrics
│   │   │   ├── processes.rs           # Process-related metrics
│   │   │   ├── systemd.rs             # Systemd service metrics
│   │   │   ├── smart.rs               # Storage SMART metrics
│   │   │   ├── backup.rs              # Backup status metrics
│   │   │   ├── network.rs             # Network metrics
│   │   │   └── error.rs               # Collector errors
│   │   ├── notifications/
│   │   │   ├── mod.rs                 # Notification exports
│   │   │   ├── manager.rs             # Status change detection
│   │   │   ├── email.rs               # Email notification backend
│   │   │   └── status_tracker.rs     # Individual metric status tracking
│   │   └── utils/
│   │       ├── mod.rs                 # Utility exports
│   │       ├── system.rs              # System command utilities
│   │       ├── time.rs                # Timestamp utilities
│   │       └── discovery.rs          # Auto-discovery functions
│   ├── config/
│   │   ├── agent.example.toml         # Example configuration
│   │   └── production.toml            # Production template
│   └── tests/
│       ├── integration/               # Integration tests
│       ├── unit/                      # Unit tests by module
│       └── fixtures/                  # Test data and mocks
├── dashboard/                     # Dashboard application
│   ├── Cargo.toml
│   ├── src/
│   │   ├── main.rs                    # Entry point with CLI parsing
│   │   ├── app.rs                     # Main Dashboard application state
│   │   ├── config/
│   │   │   ├── mod.rs                 # Configuration module exports
│   │   │   ├── loader.rs              # TOML configuration loading
│   │   │   └── defaults.rs            # Default configuration values
│   │   ├── communication/
│   │   │   ├── mod.rs                 # Communication module exports
│   │   │   ├── zmq_consumer.rs        # ZMQ metric consumer
│   │   │   ├── protocol.rs            # Shared message protocol
│   │   │   └── error.rs               # Communication errors
│   │   ├── metrics/
│   │   │   ├── mod.rs                 # Metrics module exports
│   │   │   ├── store.rs               # Metric storage and retrieval
│   │   │   ├── filter.rs              # Metric filtering and selection
│   │   │   ├── history.rs             # Historical metric storage
│   │   │   └── subscription.rs        # Metric subscription management
│   │   ├── ui/
│   │   │   ├── mod.rs                 # UI module exports
│   │   │   ├── app.rs                 # Main UI application loop
│   │   │   ├── layout.rs              # Layout management
│   │   │   ├── widgets/
│   │   │   │   ├── mod.rs             # Widget exports
│   │   │   │   ├── base.rs            # Base widget trait
│   │   │   │   ├── cpu.rs             # CPU metrics widget
│   │   │   │   ├── memory.rs          # Memory metrics widget
│   │   │   │   ├── storage.rs         # Storage metrics widget
│   │   │   │   ├── services.rs        # Services metrics widget
│   │   │   │   ├── backup.rs          # Backup metrics widget
│   │   │   │   ├── hosts.rs           # Host selection widget
│   │   │   │   └── alerts.rs          # Alerts/status widget
│   │   │   ├── theme.rs               # UI theming and colors
│   │   │   └── input.rs               # Input handling
│   │   ├── hosts/
│   │   │   ├── mod.rs                 # Host management exports
│   │   │   ├── manager.rs             # Host connection management
│   │   │   ├── discovery.rs           # Host auto-discovery
│   │   │   └── connection.rs          # Individual host connections
│   │   └── utils/
│   │       ├── mod.rs                 # Utility exports
│   │       ├── formatting.rs          # Data formatting utilities
│   │       └── time.rs                # Time formatting utilities
│   ├── config/
│   │   ├── dashboard.example.toml     # Example configuration
│   │   └── hosts.example.toml         # Example host configuration
│   └── tests/
│       ├── integration/               # Integration tests
│       ├── unit/                      # Unit tests by module
│       └── fixtures/                  # Test data and mocks
├── shared/                        # Shared types and utilities
│   ├── Cargo.toml
│   ├── src/
│   │   ├── lib.rs                     # Shared library exports
│   │   ├── protocol.rs                # Shared message protocol
│   │   ├── metrics.rs                 # Shared metric types
│   │   └── error.rs                   # Shared error types
└── tests/                         # End-to-end tests
    ├── e2e/                           # End-to-end test scenarios
    └── fixtures/                      # Shared test data

Architecture Principles

1. Individual Metrics Philosophy

No Grouped Structures: Instead of SystemMetrics or BackupMetrics, we collect individual metrics:

// Good - Individual metrics
"cpu_load_1min" -> 2.5
"cpu_load_5min" -> 2.8
"cpu_temperature_celsius" -> 45.0
"memory_usage_percent" -> 78.5
"memory_total_gb" -> 32.0
"disk_root_usage_percent" -> 15.2
"service_ssh_status" -> "active"
"backup_last_run_timestamp" -> 1697123456

// Bad - Grouped structures
SystemMetrics { cpu: {...}, memory: {...} }

Dashboard Flexibility: The dashboard consumes individual metrics and decides which ones to display in each widget.
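As a rough illustration of this flexibility, the sketch below picks the metrics a widget asks for out of a flat name-to-value map. The function name `select_for_widget` and the simplification of values to `f32` are assumptions made for this example, not part of the agent or dashboard code.

```rust
use std::collections::HashMap;

/// Pick the metrics a widget asks for out of a flat name -> value map.
/// Names that the agent has not reported yet are skipped.
fn select_for_widget<'a>(
    all: &'a HashMap<String, f32>,
    wanted: &[&str],
) -> Vec<(&'a str, f32)> {
    wanted
        .iter()
        .filter_map(|name| all.get_key_value(*name))
        .map(|(k, v)| (k.as_str(), *v))
        .collect()
}
```

Because the map is flat, a new widget only needs a new list of names; nothing on the agent side has to change.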

2. Metric Definition

Each metric has:

Name: Unique identifier (e.g., cpu_load_1min)
Value: Typed value (f32, i64, String, bool)
Status: Health status (ok, warning, critical, unknown)
Timestamp: When the metric was collected
Metadata: Optional description, units, etc.

3. Module Responsibilities

Communication: ZMQ protocol and message handling
Metrics: Value types, caching, and storage
Collectors: Gather specific metrics from system
Notifications: Track status changes across all metrics
Config: Configuration loading and validation

4. Data Flow

Collectors → Individual Metrics → Cache → ZMQ → Dashboard
     ↓              ↓                ↓
Status Calc → Status Tracker → Notifications
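The lower branch of this flow (status calculation feeding a status tracker that triggers notifications) could be sketched as follows. `StatusTracker::observe` is a hypothetical helper for this example; the real tracker in notifications/status_tracker.rs may work differently.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status { Ok, Warning, Critical, Unknown }

/// Remembers the last status seen for each metric name.
#[derive(Default)]
struct StatusTracker {
    last: HashMap<String, Status>,
}

impl StatusTracker {
    /// Record a new status. Returns Some((old, new)) only when the status
    /// changed; the first observation of a metric is not reported.
    fn observe(&mut self, name: &str, new: Status) -> Option<(Status, Status)> {
        match self.last.insert(name.to_string(), new) {
            Some(old) if old != new => Some((old, new)),
            _ => None,
        }
    }
}
```

A notification manager would then act only on the `Some((old, new))` results, which keeps repeated readings with an unchanged status from producing repeated alerts.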

Metric Design Rules

1. Naming Convention

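Metrics follow a hierarchical naming scheme; components that do not apply are omitted (backup_status, for example, has no subcategory or unit):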

{category}_{subcategory}_{property}_{unit}

Examples:
cpu_load_1min
cpu_temperature_celsius  
memory_usage_percent
memory_total_gb
disk_root_usage_percent
disk_nvme0_temperature_celsius
service_ssh_status
service_ssh_memory_mb
backup_last_run_timestamp
backup_status
network_eth0_rx_bytes
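A minimal sketch of helpers that build and check names in this scheme. Both `metric_name` and `is_valid_metric_name` are hypothetical, and the validation rule (lowercase ASCII letters and digits separated by single underscores) is inferred from the examples above rather than stated by the document.

```rust
/// Join the non-empty parts of a metric name with underscores.
fn metric_name(parts: &[&str]) -> String {
    parts
        .iter()
        .filter(|p| !p.is_empty())
        .copied()
        .collect::<Vec<_>>()
        .join("_")
}

/// Accept only lowercase ASCII letters and digits separated by single
/// underscores, with no leading or trailing underscore.
fn is_valid_metric_name(name: &str) -> bool {
    !name.is_empty()
        && name.split('_').all(|seg| {
            !seg.is_empty()
                && seg.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
        })
}
```

For example, `metric_name(&["disk", "nvme0", "temperature", "celsius"])` yields `disk_nvme0_temperature_celsius`.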

2. Value Types

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum MetricValue {
    Float(f32),
    Integer(i64),
    String(String),
    Boolean(bool),
}

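// Copy and Ord are needed for widget-level aggregation (max over metric statuses).
// Derived ordering follows declaration order: Ok < Warning < Critical < Unknown.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Serialize, Deserialize)]
pub enum Status {
    Ok,
    Warning,
    Critical,
    Unknown,
}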

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Metric {
    pub name: String,
    pub value: MetricValue,
    pub status: Status,
    pub timestamp: u64,
    pub description: Option<String>,
    pub unit: Option<String>,
}

3. Collector Interface

Each collector provides individual metrics:

#[async_trait]
pub trait Collector {
    fn name(&self) -> &str;
    async fn collect(&self) -> Result<Vec<Metric>>;
}

// Example CPU collector output:
vec![
    Metric { name: "cpu_load_1min", value: Float(2.5), status: Ok, ... },
    Metric { name: "cpu_load_5min", value: Float(2.8), status: Ok, ... },
    Metric { name: "cpu_temperature_celsius", value: Float(45.0), status: Ok, ... },
]
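To make the collector output concrete, here is a sketch of how the three load metrics might be derived from a line in the Linux /proc/loadavg format. It is synchronous and returns plain (name, value) pairs instead of `Metric` structs to stay self-contained; `parse_loadavg` is a hypothetical helper, not the actual CpuCollector code.

```rust
/// Parse the first three fields of a /proc/loadavg-style line into
/// (metric name, value) pairs. Returns None if a field is missing or
/// is not a number.
fn parse_loadavg(line: &str) -> Option<Vec<(&'static str, f32)>> {
    let names = ["cpu_load_1min", "cpu_load_5min", "cpu_load_15min"];
    let mut fields = line.split_whitespace();
    let mut out = Vec::with_capacity(names.len());
    for name in names.iter() {
        let value: f32 = fields.next()?.parse().ok()?;
        out.push((*name, value));
    }
    Some(out)
}
```

Keeping the parsing separate from the file read makes it easy to test the collector with fixture strings.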

Communication Protocol

ZMQ Message Format

#[derive(Debug, Serialize, Deserialize)]
pub struct MetricMessage {
    pub hostname: String,
    pub timestamp: u64,
    pub metrics: Vec<Metric>,
}

ZMQ Configuration

#[derive(Debug, Deserialize)]
pub struct ZmqConfig {
    pub publisher_port: u16,      // Default: 6130
    pub command_port: u16,        // Default: 6131  
    pub bind_address: String,     // Default: "0.0.0.0"
    pub timeout_ms: u64,          // Default: 5000
    pub heartbeat_interval: u64,  // Default: 30000
}

Caching Strategy

Configuration-Based Individual Metric Cache

pub struct MetricCache {
    cache: HashMap<String, CachedMetric>,
    config: CacheConfig,
}

struct CachedMetric {
    metric: Metric,
    collected_at: Instant,
    access_count: u64,
    cache_tier: CacheTier,
}

#[derive(Debug, Deserialize)]
pub struct CacheConfig {
    pub enabled: bool,
    pub default_ttl_seconds: u64,
    pub max_entries: usize,
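    pub tiers: HashMap<String, CacheTier>,           // [cache.tiers.<name>]
    pub metric_assignments: HashMap<String, String>, // metric name pattern -> tier name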
}

#[derive(Debug, Deserialize, Clone)]
pub struct CacheTier {
    pub interval_seconds: u64,
    pub description: String,
}

Configuration-Based Caching Rules:

Each metric type has configurable cache intervals via config files
Cache tiers defined in configuration, not hardcoded
Individual metrics cached by name with tier-specific TTL
Cache miss triggers single metric collection
No grouped cache invalidation
Performance target: <2% CPU usage through intelligent caching
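The rules above (per-metric entries, tier-specific TTL, a miss triggering collection of a single metric) can be sketched with a small TTL cache. `TtlCache` and its methods are assumptions for illustration; values are simplified to `f32`, and the current time is passed in explicitly so the behaviour is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// A cached value plus the moment it was collected.
struct Entry {
    value: f32,
    collected_at: Instant,
}

/// Per-metric cache where each lookup applies its own TTL.
struct TtlCache {
    entries: HashMap<String, Entry>,
    default_ttl: Duration,
}

impl TtlCache {
    fn new(default_ttl: Duration) -> Self {
        TtlCache { entries: HashMap::new(), default_ttl }
    }

    fn put(&mut self, name: &str, value: f32, now: Instant) {
        self.entries.insert(name.to_string(), Entry { value, collected_at: now });
    }

    /// Return the cached value if it is younger than `ttl` (or the default
    /// TTL when `ttl` is None). A None result means the caller should
    /// re-collect just this one metric.
    fn get(&self, name: &str, ttl: Option<Duration>, now: Instant) -> Option<f32> {
        let entry = self.entries.get(name)?;
        let ttl = ttl.unwrap_or(self.default_ttl);
        if now.duration_since(entry.collected_at) < ttl {
            Some(entry.value)
        } else {
            None
        }
    }
}
```

Since entries are keyed by metric name, an expired entry affects only that metric, which matches the "no grouped cache invalidation" rule.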

Configuration System

Configuration Structure

[zmq]
publisher_port = 6130
command_port = 6131
bind_address = "0.0.0.0"
timeout_ms = 5000

[cache]
enabled = true
default_ttl_seconds = 30
max_entries = 10000

# Cache tiers for different metric types
[cache.tiers.realtime]
interval_seconds = 5
description = "High-frequency metrics (CPU load, memory usage)"

[cache.tiers.fast]
interval_seconds = 30
description = "Medium-frequency metrics (network stats, process lists)"

[cache.tiers.medium]
interval_seconds = 300
description = "Low-frequency metrics (service status, disk usage)"

[cache.tiers.slow]
interval_seconds = 900
description = "Very low-frequency metrics (SMART data, backup status)"

[cache.tiers.static]
interval_seconds = 3600
description = "Rarely changing metrics (hardware info, system capabilities)"

# Metric type to tier mapping
[cache.metric_assignments]
"cpu_load_*" = "realtime"
"memory_usage_*" = "realtime"
"service_*_cpu_percent" = "realtime"
"service_*_memory_mb" = "realtime"
"service_*_status" = "medium"
"service_*_disk_gb" = "medium"
"disk_*_temperature_celsius" = "slow"
"disk_*_wear_percent" = "slow"
"backup_*" = "slow"
"network_*" = "fast"

[collectors.cpu]
enabled = true
interval_seconds = 5
temperature_warning = 70.0
temperature_critical = 80.0
load_warning = 5.0
load_critical = 8.0

[collectors.memory]
enabled = true
interval_seconds = 5
usage_warning_percent = 80.0
usage_critical_percent = 95.0

[collectors.systemd]
enabled = true
interval_seconds = 30
services = ["ssh", "nginx", "docker", "gitea"]

[notifications]
enabled = true
smtp_host = "localhost"
smtp_port = 25
from_email = "{{hostname}}@cmtec.se"
to_email = "cm@cmtec.se"
rate_limit_minutes = 30
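The patterns under [cache.metric_assignments] imply some form of wildcard matching. The following is a minimal sketch, assuming `*` matches any (possibly empty) run of characters and that the first matching pattern wins. Both `glob_match` and `tier_for` are hypothetical names, and the precedence rule is an assumption, since TOML tables do not define an order on their own.

```rust
/// Return true if `name` matches `pattern`, where `*` stands for any
/// (possibly empty) run of characters. No other wildcards are supported.
fn glob_match(pattern: &str, name: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    if parts.len() == 1 {
        return pattern == name; // no wildcard: exact match
    }
    let (first, last) = (parts[0], parts[parts.len() - 1]);
    if name.len() < first.len() + last.len()
        || !name.starts_with(first)
        || !name.ends_with(last)
    {
        return false;
    }
    // Match the middle literals left to right between prefix and suffix.
    let mut rest = &name[first.len()..name.len() - last.len()];
    for part in &parts[1..parts.len() - 1] {
        match rest.find(*part) {
            Some(pos) => rest = &rest[pos + part.len()..],
            None => return false,
        }
    }
    true
}

/// Return the tier of the first assignment whose pattern matches `name`.
fn tier_for<'a>(assignments: &[(&str, &'a str)], name: &str) -> Option<&'a str> {
    assignments
        .iter()
        .find(|(pattern, _)| glob_match(pattern, name))
        .map(|(_, tier)| *tier)
}
```

A metric that matches no pattern returns `None` here; a caller could then fall back to `default_ttl_seconds`.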

Implementation Guidelines

1. Adding New Metrics

// 1. Define metric names in registry
pub const NETWORK_ETH0_RX_BYTES: &str = "network_eth0_rx_bytes";
pub const NETWORK_ETH0_TX_BYTES: &str = "network_eth0_tx_bytes";

// 2. Implement collector
pub struct NetworkCollector {
    config: NetworkConfig,
}

#[async_trait]
impl Collector for NetworkCollector {
    fn name(&self) -> &str {
        "network"
    }

    async fn collect(&self) -> Result<Vec<Metric>> {
        // read_rx_bytes() is a placeholder for the actual counter lookup
        let rx_bytes: i64 = self.read_rx_bytes("eth0")?;

        Ok(vec![
            Metric {
                name: NETWORK_ETH0_RX_BYTES.to_string(),
                value: MetricValue::Integer(rx_bytes),
                status: Status::Ok,
                timestamp: now(),
                description: None,
                unit: Some("bytes".to_string()),
            },
            // ... more metrics
        ])
    }
}

// 3. Register in agent
agent.register_collector(Box::new(NetworkCollector::new(config.network)));

2. Status Calculation

Each collector calculates status for its metrics:

impl CpuCollector {
    fn calculate_temperature_status(&self, temp: f32) -> Status {
        if temp >= self.config.temperature_critical {
            Status::Critical
        } else if temp >= self.config.temperature_warning {
            Status::Warning
        } else {
            Status::Ok
        }
    }
}
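The same comparison can be written as a free function that any collector could share. This is only a sketch: `threshold_status` is a hypothetical helper, and mapping NaN readings to `Unknown` is an assumption of the example rather than documented behaviour.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status { Ok, Warning, Critical, Unknown }

/// Classify a reading against warning and critical thresholds.
/// NaN readings are reported as Unknown rather than Ok.
fn threshold_status(value: f32, warning: f32, critical: f32) -> Status {
    if value.is_nan() {
        Status::Unknown
    } else if value >= critical {
        Status::Critical
    } else if value >= warning {
        Status::Warning
    } else {
        Status::Ok
    }
}
```

With the thresholds from [collectors.cpu], a reading of 45.0 is `Ok`, 70.0 is `Warning`, and 80.0 is `Critical`.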

3. Dashboard Usage

Dashboard widgets subscribe to specific metrics:

// Dashboard CPU widget
let cpu_metrics = [
    "cpu_load_1min",
    "cpu_load_5min",
    "cpu_load_15min",
    "cpu_temperature_celsius",
];

// Dashboard memory widget  
let memory_metrics = [
    "memory_usage_percent",
    "memory_total_gb",
    "memory_available_gb",
];

Dashboard Architecture

Dashboard Principles

1. UI Layout Preservation

Current UI Layout Maintained: The existing dashboard UI layout is preserved and enhanced with the new metric-centric architecture. All current widgets remain in their established positions and functionality.

Widget Enhancement, Not Replacement: Widgets are enhanced to consume individual metrics rather than grouped structures, but maintain their visual appearance and user interaction patterns.

Each widget subscribes to specific individual metrics and composes them for display:

// CPU Widget Metrics
const CPU_WIDGET_METRICS: &[&str] = &[
    "cpu_load_1min",
    "cpu_load_5min", 
    "cpu_load_15min",
    "cpu_temperature_celsius",
    "cpu_frequency_mhz",
    "cpu_usage_percent",
];

// Memory Widget Metrics
const MEMORY_WIDGET_METRICS: &[&str] = &[
    "memory_usage_percent",
    "memory_total_gb",
    "memory_available_gb",
    "memory_used_gb",
    "memory_swap_total_gb",
    "memory_swap_used_gb",
];

// Storage Widget Metrics
const STORAGE_WIDGET_METRICS: &[&str] = &[
    "disk_nvme0_temperature_celsius",
    "disk_nvme0_wear_percent",
    "disk_nvme0_spare_percent",
    "disk_nvme0_hours",
    "disk_nvme0_capacity_gb",
    "disk_nvme0_usage_gb",
    "disk_nvme0_usage_percent",
];

// Services Widget Metrics  
const SERVICES_WIDGET_METRICS: &[&str] = &[
    "service_ssh_status",
    "service_ssh_memory_mb",
    "service_ssh_cpu_percent",
    "service_nginx_status",
    "service_nginx_memory_mb",
    "service_docker_status",
    // ... per discovered service
];

// Backup Widget Metrics
const BACKUP_WIDGET_METRICS: &[&str] = &[
    "backup_last_run_timestamp",
    "backup_status",
    "backup_size_gb",
    "backup_duration_minutes",
    "backup_next_scheduled_timestamp",
];

Dashboard Communication

ZMQ Consumer Architecture

// dashboard/src/communication/zmq_consumer.rs
pub struct ZmqConsumer {
    subscriber: Socket,
    config: ZmqConfig,
    metric_filter: MetricFilter,
}

// Signatures only; method bodies are omitted.
impl ZmqConsumer {
    pub async fn subscribe_to_host(&mut self, hostname: &str) -> Result<()>
    pub async fn receive_metrics(&mut self) -> Result<Vec<Metric>>
    pub fn set_metric_filter(&mut self, filter: MetricFilter)
    pub async fn request_metrics(&self, metric_names: &[String]) -> Result<()>
}

#[derive(Debug, Clone)]
pub struct MetricFilter {
    pub include_patterns: Vec<String>,
    pub exclude_patterns: Vec<String>,
    pub hosts: Vec<String>,
}

Protocol Compatibility

The dashboard uses the same protocol as defined in the agent:

// shared/src/protocol.rs (shared between agent and dashboard)
#[derive(Debug, Serialize, Deserialize)]
pub struct MetricMessage {
    pub hostname: String,
    pub timestamp: u64,
    pub metrics: Vec<Metric>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Metric {
    pub name: String,
    pub value: MetricValue,
    pub status: Status,
    pub timestamp: u64,
    pub description: Option<String>,
    pub unit: Option<String>,
}

Dashboard Metric Management

Metric Store

// dashboard/src/metrics/store.rs
pub struct MetricStore {
    current_metrics: HashMap<String, HashMap<String, Metric>>, // host -> metric_name -> metric
    historical_metrics: HistoricalStore,
    subscriptions: SubscriptionManager,
}

// Signatures only; method bodies are omitted.
impl MetricStore {
    pub fn update_metrics(&mut self, hostname: &str, metrics: Vec<Metric>)
    pub fn get_metric(&self, hostname: &str, metric_name: &str) -> Option<&Metric>
    pub fn get_metrics_for_widget(&self, hostname: &str, widget: WidgetType) -> Vec<&Metric>
    pub fn get_hosts(&self) -> Vec<String>
    pub fn get_latest_timestamp(&self, hostname: &str) -> Option<u64>
}
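A minimal sketch of the host -> metric name -> value layout used by `current_metrics`. The `Store` type below is a simplified stand-in (values are `f32`, and there is no history or subscription handling), written only to show how updates and lookups on the nested map might behave.

```rust
use std::collections::HashMap;

/// host -> metric name -> latest value (values simplified to f32 here).
#[derive(Default)]
struct Store {
    current: HashMap<String, HashMap<String, f32>>,
}

impl Store {
    /// Merge a batch into the host's map, overwriting metrics with the same name.
    fn update(&mut self, host: &str, batch: &[(&str, f32)]) {
        let per_host = self.current.entry(host.to_string()).or_default();
        for (name, value) in batch {
            per_host.insert(name.to_string(), *value);
        }
    }

    fn get(&self, host: &str, name: &str) -> Option<f32> {
        self.current.get(host)?.get(name).copied()
    }

    /// Known hosts, sorted so the UI order is stable.
    fn hosts(&self) -> Vec<String> {
        let mut hosts: Vec<String> = self.current.keys().cloned().collect();
        hosts.sort();
        hosts
    }
}
```

An update merges into the existing per-host map, so a message carrying only a few metrics does not erase the others.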

Metric Subscription Management

// dashboard/src/metrics/subscription.rs
pub struct SubscriptionManager {
    widget_subscriptions: HashMap<WidgetType, Vec<String>>,
    active_hosts: HashSet<String>,
    metric_filters: HashMap<String, MetricFilter>,
}

// Signatures only; method bodies are omitted.
impl SubscriptionManager {
    pub fn subscribe_widget(&mut self, widget: WidgetType, metrics: &[String])
    pub fn get_required_metrics(&self) -> Vec<String>
    pub fn add_host(&mut self, hostname: String)
    pub fn remove_host(&mut self, hostname: &str)
    pub fn is_metric_needed(&self, metric_name: &str) -> bool
}
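One plausible reading of `get_required_metrics` is the deduplicated union of all widget subscriptions. The sketch below assumes exactly that; it keys widgets by `&str` instead of `WidgetType` to stay self-contained, and the sorted output is a choice made for this example.

```rust
use std::collections::{BTreeSet, HashMap};

/// Union of all widget subscriptions, deduplicated and sorted.
fn required_metrics(subs: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    let unique: BTreeSet<&str> = subs.values().flatten().copied().collect();
    unique.into_iter().map(String::from).collect()
}
```

A metric requested by several widgets appears once in the result, so it would only need to be requested from the agent once.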

// dashboard/src/ui/widgets/base.rs
pub trait Widget {
    fn widget_type(&self) -> WidgetType;
    fn required_metrics(&self) -> &[&str];
    fn update_metrics(&mut self, metrics: &HashMap<String, Metric>);
    fn render(&self, frame: &mut Frame, area: Rect);
    fn handle_input(&mut self, event: &Event) -> bool;
    fn get_status(&self) -> Status;
}

#[derive(Debug, Clone, Copy, Hash, Eq, PartialEq)]
pub enum WidgetType {
    Cpu,
    Memory, 
    Storage,
    Services,
    Backup,
    Hosts,
    Alerts,
}

// dashboard/src/ui/widgets/cpu.rs
pub struct CpuWidget {
    metrics: HashMap<String, Metric>,
    config: CpuWidgetConfig,
}

impl Widget for CpuWidget {
    fn required_metrics(&self) -> &[&str] {
        CPU_WIDGET_METRICS
    }
    
    fn update_metrics(&mut self, metrics: &HashMap<String, Metric>) {
        // Update only the metrics this widget cares about
        for &metric_name in self.required_metrics() {
            if let Some(metric) = metrics.get(metric_name) {
                self.metrics.insert(metric_name.to_string(), metric.clone());
            }
        }
    }
    
    fn render(&self, frame: &mut Frame, area: Rect) {
        // Extract specific metric values for display
        let load_1min = self.get_metric_value("cpu_load_1min").unwrap_or(0.0);
        let load_5min = self.get_metric_value("cpu_load_5min").unwrap_or(0.0);
        let temperature = self.get_metric_value("cpu_temperature_celsius");
        
        // Maintain existing UI layout and styling
        // ... render implementation preserving current appearance
    }
    
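    fn get_status(&self) -> Status {
        // Aggregate status from individual metric statuses.
        // max() and copied() require Status to implement Ord and Copy; with a
        // derived Ord, variants compare in declaration order.
        self.metrics.values()
            .map(|m| &m.status)
            .max()
            .copied()
            .unwrap_or(Status::Unknown)
    }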
}
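The `render` method above calls `get_metric_value`, which is not defined in this document. A possible shape for it is sketched here as a free function over a name -> `MetricValue` map. The numeric conversion rules (integers cast to `f32`, strings and booleans yielding `None`) are assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum MetricValue {
    Float(f32),
    Integer(i64),
    String(String),
    Boolean(bool),
}

/// Look up a metric and convert it to f32 when it is numeric.
/// Strings and booleans yield None so the widget can show a placeholder.
fn get_metric_value(metrics: &HashMap<String, MetricValue>, name: &str) -> Option<f32> {
    match metrics.get(name)? {
        MetricValue::Float(v) => Some(*v),
        MetricValue::Integer(v) => Some(*v as f32),
        MetricValue::String(_) | MetricValue::Boolean(_) => None,
    }
}
```

Returning `Option<f32>` lets callers choose between a default, as in `unwrap_or(0.0)` for load values, and an explicit "no data" display, as for temperature.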

Host Management

Multi-Host Connection Management

// dashboard/src/hosts/manager.rs
pub struct HostManager {
    connections: HashMap<String, HostConnection>,
    discovery: HostDiscovery,
    active_host: Option<String>,
    metric_store: Arc<Mutex<MetricStore>>,
}

// Signatures only; method bodies are omitted.
impl HostManager {
    pub async fn discover_hosts(&mut self) -> Result<Vec<String>>
    pub async fn connect_to_host(&mut self, hostname: &str) -> Result<()>
    pub fn disconnect_from_host(&mut self, hostname: &str)
    pub fn set_active_host(&mut self, hostname: String)
    pub fn get_active_host(&self) -> Option<&str>
    pub fn get_connected_hosts(&self) -> Vec<&str>
    pub async fn refresh_all_hosts(&mut self) -> Result<()>
}

// dashboard/src/hosts/connection.rs
pub struct HostConnection {
    hostname: String,
    zmq_consumer: ZmqConsumer,
    last_seen: Instant,
    connection_status: ConnectionStatus,
    metric_buffer: VecDeque<Metric>,
}

#[derive(Debug, Clone)]
pub enum ConnectionStatus {
    Connected,
    Connecting,
    Disconnected,
    Error(String),
}
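The `last_seen` field suggests that a connection is considered stale after some period of silence. A hedged sketch of that check follows; `refresh_status` is a hypothetical helper, and downgrading a silent host to `Disconnected` (rather than `Error`) is an assumption.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, PartialEq)]
enum ConnectionStatus {
    Connected,
    Connecting,
    Disconnected,
    Error(String),
}

/// Downgrade a Connected host to Disconnected when nothing has been
/// received within `timeout`. Other states are returned unchanged.
fn refresh_status(
    current: ConnectionStatus,
    last_seen: Instant,
    now: Instant,
    timeout: Duration,
) -> ConnectionStatus {
    match current {
        ConnectionStatus::Connected if now.duration_since(last_seen) > timeout => {
            ConnectionStatus::Disconnected
        }
        other => other,
    }
}
```

The timeout would naturally come from `connection_timeout_ms` in the dashboard configuration.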

Configuration Integration

Dashboard Configuration

# dashboard/config/dashboard.toml
[zmq]
subscriber_ports = [6130]  # Agent publisher ports to connect to for metrics
connection_timeout_ms = 15000
reconnect_interval_ms = 5000

[ui]
refresh_rate_ms = 100
theme = "default"
preserve_layout = true

[hosts]
auto_discovery = true
predefined_hosts = ["cmbox", "labbox", "simonbox", "steambox", "srv01"]
default_host = "cmbox"

[metrics]
history_retention_hours = 24
max_metrics_per_host = 10000

[widgets.cpu]
enabled = true
metrics = [
    "cpu_load_1min",
    "cpu_load_5min", 
    "cpu_load_15min",
    "cpu_temperature_celsius"
]

[widgets.memory]
enabled = true
metrics = [
    "memory_usage_percent",
    "memory_total_gb",
    "memory_available_gb"
]

[widgets.storage]
enabled = true  
metrics = [
    "disk_nvme0_temperature_celsius",
    "disk_nvme0_wear_percent",
    "disk_nvme0_usage_percent"
]

UI Layout Preservation Rules

CPU widget: Top-left position preserved
Memory widget: Top-right position preserved
Storage widget: Left-center position preserved
Services widget: Right-center position preserved
Backup widget: Bottom-right position preserved
Host navigation: Bottom status bar preserved

2. Preserve Visual Styling

Colors: Existing status colors (green, yellow, red) maintained
Borders: Current border styles and characters preserved
Text formatting: Font styles, alignment, and spacing preserved
Progress bars: Current progress bar implementations maintained

3. Maintain User Interactions

Navigation keys: ←→ for host switching preserved
Refresh key: r for manual refresh preserved
Quit key: q for exit preserved
Additional keys: All current keyboard shortcuts maintained

4. Status Display Consistency

Status aggregation: Widget-level status calculated from individual metric statuses
Color mapping: Status enum maps to existing color scheme
Status indicators: Current status display format preserved
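Status aggregation and color mapping could look roughly like the sketch below. It uses an explicit severity rank instead of relying on the enum's ordering. The `Color` enum, the gray color for `Unknown`, and the placement of `Unknown` between `Ok` and `Warning` are all assumptions of this example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status { Ok, Warning, Critical, Unknown }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Color { Green, Yellow, Red, Gray }

/// Severity rank used for aggregation. Placing Unknown between Ok and
/// Warning is an assumption of this sketch.
fn severity(status: Status) -> u8 {
    match status {
        Status::Ok => 0,
        Status::Unknown => 1,
        Status::Warning => 2,
        Status::Critical => 3,
    }
}

/// Widget-level status: the most severe status among its metrics,
/// or Unknown when the widget has no metrics yet.
fn aggregate(statuses: &[Status]) -> Status {
    statuses
        .iter()
        .copied()
        .max_by_key(|s| severity(*s))
        .unwrap_or(Status::Unknown)
}

/// Map a status to the green/yellow/red scheme; gray for Unknown is assumed.
fn color_for(status: Status) -> Color {
    match status {
        Status::Ok => Color::Green,
        Status::Warning => Color::Yellow,
        Status::Critical => Color::Red,
        Status::Unknown => Color::Gray,
    }
}
```

An explicit rank keeps the aggregation rule independent of the order in which the enum variants happen to be declared.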

Implementation Migration Strategy

Phase 1: Shared Types

Create shared/ crate with common protocol and metric types
Update both agent and dashboard to use shared types

Phase 2: Agent Migration

Implement new agent architecture with individual metrics
Maintain backward compatibility during transition

Phase 3: Dashboard Migration

Update dashboard to consume individual metrics
Preserve all existing UI layouts and interactions
Enhance widgets with new metric subscription system

Phase 4: Integration Testing

End-to-end testing with real multi-host scenarios
Performance validation and optimization
UI/UX validation to ensure no regressions

Benefits of This Architecture

Maximum Flexibility: Dashboard can compose any widget from any metrics
Easy Extension: Adding new metrics doesn't affect existing code
Granular Caching: Cache individual metrics based on collection cost
Simple Testing: Test individual metric collection in isolation
Clear Separation: Agent collects, dashboard consumes and displays
Efficient Updates: Only send changed metrics to dashboard
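The "only send changed metrics" point can be illustrated by diffing two snapshots. This is a sketch under simplifying assumptions: values are `f32`, removed metrics are ignored, and `changed_metrics` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Return the entries of `current` that are new or differ from `previous`,
/// sorted by name so the output is deterministic.
fn changed_metrics(
    previous: &HashMap<String, f32>,
    current: &HashMap<String, f32>,
) -> Vec<(String, f32)> {
    let mut changed: Vec<(String, f32)> = current
        .iter()
        .filter(|(name, value)| previous.get(*name) != Some(*value))
        .map(|(name, value)| (name.clone(), *value))
        .collect();
    changed.sort_by(|a, b| a.0.cmp(&b.0));
    changed
}
```

An agent using this approach would publish only the returned entries and keep `current` as the next `previous`.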

Future Extensions

Metric Filtering: Dashboard requests only needed metrics
Historical Storage: Store metric history for trending
Metric Aggregation: Calculate derived metrics from base metrics
Dynamic Discovery: Auto-discover new metric sources
Metric Validation: Validate metric values and ranges
