cm-dashboard/ARCHITECT.md

# CM Dashboard Agent Architecture

## Overview

This document defines the architecture for the CM Dashboard Agent. The agent collects individual metrics and sends them to the dashboard via ZMQ. The dashboard decides which metrics to use in which widgets.

## Core Philosophy

**Individual Metrics Approach**: The agent collects and transmits individual metrics (e.g., `cpu_load_1min`, `memory_usage_percent`, `backup_last_run`) rather than grouped metric structures. This provides maximum flexibility for dashboard widget composition.

## Folder Structure

```
cm-dashboard/
├── agent/                         # Agent application
│   ├── Cargo.toml
│   ├── src/
│   │   ├── main.rs                    # Entry point with CLI parsing
│   │   ├── agent.rs                   # Main Agent orchestrator
│   │   ├── config/
│   │   │   ├── mod.rs                 # Configuration module exports
│   │   │   ├── loader.rs              # TOML configuration loading
│   │   │   ├── defaults.rs            # Default configuration values
│   │   │   └── validation.rs          # Configuration validation
│   │   ├── communication/
│   │   │   ├── mod.rs                 # Communication module exports
│   │   │   ├── zmq_config.rs          # ZMQ configuration structures
│   │   │   ├── zmq_handler.rs         # ZMQ socket management
│   │   │   ├── protocol.rs            # Message format definitions
│   │   │   └── error.rs               # Communication errors
│   │   ├── metrics/
│   │   │   ├── mod.rs                 # Metrics module exports
│   │   │   ├── registry.rs            # Metric name registry and types
│   │   │   ├── value.rs               # Metric value types and status
│   │   │   ├── cache.rs               # Individual metric caching
│   │   │   └── collection.rs          # Metric collection storage
│   │   ├── collectors/
│   │   │   ├── mod.rs                 # Collector trait definition
│   │   │   ├── cpu.rs                 # CPU-related metrics
│   │   │   ├── memory.rs              # Memory-related metrics
│   │   │   ├── disk.rs                # Disk usage metrics
│   │   │   ├── processes.rs           # Process-related metrics
│   │   │   ├── systemd.rs             # Systemd service metrics
│   │   │   ├── smart.rs               # Storage SMART metrics
│   │   │   ├── backup.rs              # Backup status metrics
│   │   │   ├── network.rs             # Network metrics
│   │   │   └── error.rs               # Collector errors
│   │   ├── notifications/
│   │   │   ├── mod.rs                 # Notification exports
│   │   │   ├── manager.rs             # Status change detection
│   │   │   ├── email.rs               # Email notification backend
│   │   │   └── status_tracker.rs      # Individual metric status tracking
│   │   └── utils/
│   │       ├── mod.rs                 # Utility exports
│   │       ├── system.rs              # System command utilities
│   │       ├── time.rs                # Timestamp utilities
│   │       └── discovery.rs           # Auto-discovery functions
│   ├── config/
│   │   ├── agent.example.toml         # Example configuration
│   │   └── production.toml            # Production template
│   └── tests/
│       ├── integration/               # Integration tests
│       ├── unit/                      # Unit tests by module
│       └── fixtures/                  # Test data and mocks
├── dashboard/                     # Dashboard application
│   ├── Cargo.toml
│   ├── src/
│   │   ├── main.rs                    # Entry point with CLI parsing
│   │   ├── app.rs                     # Main Dashboard application state
│   │   ├── config/
│   │   │   ├── mod.rs                 # Configuration module exports
│   │   │   ├── loader.rs              # TOML configuration loading
│   │   │   └── defaults.rs            # Default configuration values
│   │   ├── communication/
│   │   │   ├── mod.rs                 # Communication module exports
│   │   │   ├── zmq_consumer.rs        # ZMQ metric consumer
│   │   │   ├── protocol.rs            # Shared message protocol
│   │   │   └── error.rs               # Communication errors
│   │   ├── metrics/
│   │   │   ├── mod.rs                 # Metrics module exports
│   │   │   ├── store.rs               # Metric storage and retrieval
│   │   │   ├── filter.rs              # Metric filtering and selection
│   │   │   ├── history.rs             # Historical metric storage
│   │   │   └── subscription.rs        # Metric subscription management
│   │   ├── ui/
│   │   │   ├── mod.rs                 # UI module exports
│   │   │   ├── app.rs                 # Main UI application loop
│   │   │   ├── layout.rs              # Layout management
│   │   │   ├── widgets/
│   │   │   │   ├── mod.rs             # Widget exports
│   │   │   │   ├── base.rs            # Base widget trait
│   │   │   │   ├── cpu.rs             # CPU metrics widget
│   │   │   │   ├── memory.rs          # Memory metrics widget
│   │   │   │   ├── storage.rs         # Storage metrics widget
│   │   │   │   ├── services.rs        # Services metrics widget
│   │   │   │   ├── backup.rs          # Backup metrics widget
│   │   │   │   ├── hosts.rs           # Host selection widget
│   │   │   │   └── alerts.rs          # Alerts/status widget
│   │   │   ├── theme.rs               # UI theming and colors
│   │   │   └── input.rs               # Input handling
│   │   ├── hosts/
│   │   │   ├── mod.rs                 # Host management exports
│   │   │   ├── manager.rs             # Host connection management
│   │   │   ├── discovery.rs           # Host auto-discovery
│   │   │   └── connection.rs          # Individual host connections
│   │   └── utils/
│   │       ├── mod.rs                 # Utility exports
│   │       ├── formatting.rs          # Data formatting utilities
│   │       └── time.rs                # Time formatting utilities
│   ├── config/
│   │   ├── dashboard.example.toml     # Example configuration
│   │   └── hosts.example.toml         # Example host configuration
│   └── tests/
│       ├── integration/               # Integration tests
│       ├── unit/                      # Unit tests by module
│       └── fixtures/                  # Test data and mocks
├── shared/                        # Shared types and utilities
│   ├── Cargo.toml
│   └── src/
│       ├── lib.rs                     # Shared library exports
│       ├── protocol.rs                # Shared message protocol
│       ├── metrics.rs                 # Shared metric types
│       └── error.rs                   # Shared error types
└── tests/                         # End-to-end tests
    ├── e2e/                           # End-to-end test scenarios
    └── fixtures/                      # Shared test data
```

## Architecture Principles

### 1. Individual Metrics Philosophy

**No Grouped Structures**: Instead of `SystemMetrics` or `BackupMetrics`, we collect individual metrics:

```rust
// Good - Individual metrics
"cpu_load_1min" -> 2.5
"cpu_load_5min" -> 2.8
"cpu_temperature_celsius" -> 45.0
"memory_usage_percent" -> 78.5
"memory_total_gb" -> 32.0
"disk_root_usage_percent" -> 15.2
"service_ssh_status" -> "active"
"backup_last_run_timestamp" -> 1697123456

// Bad - Grouped structures
SystemMetrics { cpu: {...}, memory: {...} }
```

**Dashboard Flexibility**: The dashboard consumes individual metrics and decides which ones to display in each widget.

### 2. Metric Definition

Each metric has:
- **Name**: Unique identifier (e.g., `cpu_load_1min`)
- **Value**: Typed value (f32, i64, String, bool)
- **Status**: Health status (ok, warning, critical, unknown)
- **Timestamp**: When the metric was collected
- **Metadata**: Optional description, units, etc.

### 3. Module Responsibilities

- **Communication**: ZMQ protocol and message handling
- **Metrics**: Value types, caching, and storage
- **Collectors**: Gather specific metrics from system
- **Notifications**: Track status changes across all metrics
- **Config**: Configuration loading and validation

### 4. Data Flow

```
Collectors → Individual Metrics → Cache → ZMQ → Dashboard
     ↓              ↓                ↓
Status Calc → Status Tracker → Notifications
```
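
The lower row of the diagram (status calculation feeding a tracker that triggers notifications) can be illustrated with a small sketch. This is a minimal, std-only illustration: `StatusTracker` and `observe` are hypothetical names chosen here, and the real tracker in `notifications/status_tracker.rs` may look quite different.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Ok,
    Warning,
    Critical,
    Unknown,
}

/// Remembers the last status seen for each individual metric name.
#[derive(Default)]
struct StatusTracker {
    last: HashMap<String, Status>,
}

impl StatusTracker {
    /// Records `status` for `name` and returns `Some((old, new))` when the
    /// status differs from the previously recorded one. The first
    /// observation of a metric is not reported as a change.
    fn observe(&mut self, name: &str, status: Status) -> Option<(Status, Status)> {
        match self.last.insert(name.to_string(), status) {
            Some(old) if old != status => Some((old, status)),
            _ => None,
        }
    }
}
```

A notification backend would then only need to act on the `Some(..)` results, which keeps change detection per metric rather than per group.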

## Metric Design Rules

### 1. Naming Convention

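Metric names are hierarchical, built from underscore-separated segments. Not every segment applies to every metric: `backup_status`, for example, has no subcategory or unit.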

```
{category}_{subcategory}_{property}_{unit}

Examples:
cpu_load_1min
cpu_temperature_celsius
memory_usage_percent
memory_total_gb
disk_root_usage_percent
disk_nvme0_temperature_celsius
service_ssh_status
service_ssh_memory_mb
backup_last_run_timestamp
backup_status
network_eth0_rx_bytes
```
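
As a rough illustration of how the convention could be checked or used programmatically, the sketch below splits a name into its category and the remaining segments, and validates the character set. Both helpers (`split_metric_name`, `is_valid_metric_name`) are hypothetical and not part of the agent; the lowercase-only rule is an assumption inferred from the examples above.

```rust
/// Splits a metric name into (category, rest) at the first underscore.
/// Returns `None` when there is no underscore or either side is empty.
fn split_metric_name(name: &str) -> Option<(&str, &str)> {
    let (category, rest) = name.split_once('_')?;
    if category.is_empty() || rest.is_empty() {
        return None;
    }
    Some((category, rest))
}

/// Checks that a name uses only lowercase ASCII letters, digits, and
/// underscores (an assumed rule, based on the example names).
fn is_valid_metric_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}
```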

### 2. Value Types

```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum MetricValue {
    Float(f32),
    Integer(i64),
    String(String),
    Boolean(bool),
}

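// `Copy` and `Ord` allow widgets to aggregate statuses with `max()`.
// The derived ordering follows declaration order, so `Unknown` ranks
// above `Critical` as declared here.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Serialize, Deserialize)]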
pub enum Status {
    Ok,
    Warning,
    Critical,
    Unknown,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Metric {
    pub name: String,
    pub value: MetricValue,
    pub status: Status,
    pub timestamp: u64,
    pub description: Option<String>,
    pub unit: Option<String>,
}
```

### 3. Collector Interface

Each collector provides individual metrics:

```rust
#[async_trait]
pub trait Collector {
    fn name(&self) -> &str;
    async fn collect(&self) -> Result<Vec<Metric>>;
}

// Example CPU collector output (pseudocode; remaining fields elided):
// vec![
//     Metric { name: "cpu_load_1min", value: Float(2.5), status: Ok, ... },
//     Metric { name: "cpu_load_5min", value: Float(2.8), status: Ok, ... },
//     Metric { name: "cpu_temperature_celsius", value: Float(45.0), status: Ok, ... },
// ]
```
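
To make the "one collector, several individual metrics" idea concrete, here is a minimal synchronous sketch that turns a `/proc/loadavg`-style line into three separate load metrics. The document does not say how the CPU collector reads load averages, so the data source is an assumption, and `LoadMetric` and `parse_loadavg` are hypothetical names used only for this example.

```rust
/// Simplified stand-in for the `Metric` struct, holding only name and value.
#[derive(Debug, Clone, PartialEq)]
struct LoadMetric {
    name: &'static str,
    value: f32,
}

/// Parses the first three whitespace-separated fields of a
/// `/proc/loadavg`-style line into individual load metrics.
/// Returns `None` if a field is missing or is not a number.
fn parse_loadavg(line: &str) -> Option<Vec<LoadMetric>> {
    let names = ["cpu_load_1min", "cpu_load_5min", "cpu_load_15min"];
    let mut fields = line.split_whitespace();
    let mut metrics = Vec::with_capacity(names.len());
    for name in names {
        let value: f32 = fields.next()?.parse().ok()?;
        metrics.push(LoadMetric { name, value });
    }
    Some(metrics)
}
```

Each element of the returned vector can be cached, sent, and displayed independently, which is the property the collector interface is meant to guarantee.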

## Communication Protocol

### ZMQ Message Format

```rust
#[derive(Debug, Serialize, Deserialize)]
pub struct MetricMessage {
    pub hostname: String,
    pub timestamp: u64,
    pub metrics: Vec<Metric>,
}
```

### ZMQ Configuration

```rust
#[derive(Debug, Deserialize)]
pub struct ZmqConfig {
    pub publisher_port: u16,      // Default: 6130
    pub command_port: u16,        // Default: 6131
    pub bind_address: String,     // Default: "0.0.0.0"
    pub timeout_ms: u64,          // Default: 5000
    pub heartbeat_interval: u64,  // Default: 30000 (unit not in the field name; milliseconds assumed)
}
```

## Caching Strategy

### Configuration-Based Individual Metric Cache

```rust
pub struct MetricCache {
    cache: HashMap<String, CachedMetric>,
    config: CacheConfig,
}

struct CachedMetric {
    metric: Metric,
    collected_at: Instant,
    access_count: u64,
    cache_tier: CacheTier,
}

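#[derive(Debug, Deserialize)]
pub struct CacheConfig {
    pub enabled: bool,
    pub default_ttl_seconds: u64,
    pub max_entries: usize,
    // Field names mirror the `[cache.tiers.*]` and
    // `[cache.metric_assignments]` tables in the configuration file.
    pub tiers: HashMap<String, CacheTier>,
    pub metric_assignments: HashMap<String, String>, // name pattern -> tier name
}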

#[derive(Debug, Deserialize, Clone)]
pub struct CacheTier {
    pub interval_seconds: u64,
    pub description: String,
}
```

**Configuration-Based Caching Rules**:
- Each metric type has configurable cache intervals via config files
- Cache tiers defined in configuration, not hardcoded
- Individual metrics cached by name with tier-specific TTL
- Cache miss triggers single metric collection
- No grouped cache invalidation
- Performance target: <2% CPU usage through intelligent caching
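
The per-metric TTL rule can be sketched as follows. This is a simplified, std-only illustration rather than the agent's `MetricCache`: `TtlCache`, `CachedValue`, and `get_fresh` are hypothetical names, values are reduced to `f32`, and the current time is passed in explicitly so the behaviour is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// One cached value with the TTL of the tier it was assigned to.
struct CachedValue {
    value: f32,
    collected_at: Instant,
    ttl: Duration,
}

/// Cache keyed by individual metric name.
#[derive(Default)]
struct TtlCache {
    entries: HashMap<String, CachedValue>,
}

impl TtlCache {
    /// Stores a freshly collected value together with its tier TTL.
    fn insert(&mut self, name: &str, value: f32, ttl: Duration, now: Instant) {
        let entry = CachedValue { value, collected_at: now, ttl };
        self.entries.insert(name.to_string(), entry);
    }

    /// Returns the cached value if it is still within its TTL at `now`.
    /// `None` means a cache miss: the caller should collect that single
    /// metric again.
    fn get_fresh(&self, name: &str, now: Instant) -> Option<f32> {
        let entry = self.entries.get(name)?;
        if now.duration_since(entry.collected_at) < entry.ttl {
            Some(entry.value)
        } else {
            None
        }
    }
}
```

Because expiry is decided per entry, refreshing one metric never invalidates its neighbours, which matches the "no grouped cache invalidation" rule.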

## Configuration System

### Configuration Structure

```toml
[zmq]
publisher_port = 6130
command_port = 6131
bind_address = "0.0.0.0"
timeout_ms = 5000

[cache]
enabled = true
default_ttl_seconds = 30
max_entries = 10000

# Cache tiers for different metric types
[cache.tiers.realtime]
interval_seconds = 5
description = "High-frequency metrics (CPU load, memory usage)"

[cache.tiers.fast]
interval_seconds = 30
description = "Medium-frequency metrics (network stats, process lists)"

[cache.tiers.medium]
interval_seconds = 300
description = "Low-frequency metrics (service status, disk usage)"

[cache.tiers.slow]
interval_seconds = 900
description = "Very low-frequency metrics (SMART data, backup status)"

[cache.tiers.static]
interval_seconds = 3600
description = "Rarely changing metrics (hardware info, system capabilities)"

# Metric type to tier mapping
[cache.metric_assignments]
"cpu_load_*" = "realtime"
"memory_usage_*" = "realtime"
"service_*_cpu_percent" = "realtime"
"service_*_memory_mb" = "realtime"
"service_*_status" = "medium"
"service_*_disk_gb" = "medium"
"disk_*_temperature_celsius" = "slow"
"disk_*_wear_percent" = "slow"
"backup_*" = "slow"
"network_*" = "fast"

[collectors.cpu]
enabled = true
interval_seconds = 5
temperature_warning = 70.0
temperature_critical = 80.0
load_warning = 5.0
load_critical = 8.0

[collectors.memory]
enabled = true
interval_seconds = 5
usage_warning_percent = 80.0
usage_critical_percent = 95.0

[collectors.systemd]
enabled = true
interval_seconds = 30
services = ["ssh", "nginx", "docker", "gitea"]

[notifications]
enabled = true
smtp_host = "localhost"
smtp_port = 25
from_email = "{{hostname}}@cmtec.se"
to_email = "cm@cmtec.se"
rate_limit_minutes = 30
```
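
The `[cache.metric_assignments]` table maps name patterns to tiers. The document does not define the wildcard semantics, so the sketch below assumes that `*` matches any run of characters (including none) and that the first matching pattern wins. `glob_match` and `tier_for` are hypothetical helpers written for this example.

```rust
/// Matches `name` against `pattern`, where each `*` stands for any
/// (possibly empty) run of characters. Assumed semantics.
fn glob_match(pattern: &str, name: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    if parts.len() == 1 {
        // No wildcard: require an exact match.
        return pattern == name;
    }
    let first = parts[0];
    let last = parts[parts.len() - 1];
    if name.len() < first.len() + last.len()
        || !name.starts_with(first)
        || !name.ends_with(last)
    {
        return false;
    }
    // Literal segments between wildcards must appear in order.
    let mut rest = &name[first.len()..name.len() - last.len()];
    for part in &parts[1..parts.len() - 1] {
        match rest.find(*part) {
            Some(idx) => rest = &rest[idx + part.len()..],
            None => return false,
        }
    }
    true
}

/// Returns the tier of the first pattern that matches `name`,
/// or `default` when no pattern matches.
fn tier_for<'a>(assignments: &[(&str, &'a str)], name: &str, default: &'a str) -> &'a str {
    assignments
        .iter()
        .find(|(pattern, _)| glob_match(pattern, name))
        .map(|(_, tier)| *tier)
        .unwrap_or(default)
}
```

With these semantics a pattern must cover the whole name, including any unit suffix, so pattern and metric names need to follow the same naming convention.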

## Implementation Guidelines

### 1. Adding New Metrics

```rust
// 1. Define metric names in registry
pub const NETWORK_ETH0_RX_BYTES: &str = "network_eth0_rx_bytes";
pub const NETWORK_ETH0_TX_BYTES: &str = "network_eth0_tx_bytes";

// 2. Implement collector
pub struct NetworkCollector {
    config: NetworkConfig,
}

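#[async_trait]
impl Collector for NetworkCollector {
    fn name(&self) -> &str {
        "network"
    }

    async fn collect(&self) -> Result<Vec<Metric>> {
        // `read_rx_bytes` is a placeholder for however the counter is read.
        let rx_bytes: i64 = self.read_rx_bytes()?;

        Ok(vec![
            Metric {
                name: NETWORK_ETH0_RX_BYTES.to_string(),
                value: MetricValue::Integer(rx_bytes),
                status: Status::Ok,
                timestamp: now(),
                description: None,
                unit: Some("bytes".to_string()),
            },
            // ... more metrics
        ])
    }
}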

// 3. Register in agent
agent.register_collector(Box::new(NetworkCollector::new(config.network)));
```

### 2. Status Calculation

Each collector calculates status for its metrics:

```rust
impl CpuCollector {
    // Threshold names match the `[collectors.cpu]` configuration keys.
    fn calculate_temperature_status(&self, temp: f32) -> Status {
        if temp >= self.config.temperature_critical {
            Status::Critical
        } else if temp >= self.config.temperature_warning {
            Status::Warning
        } else {
            Status::Ok
        }
    }
}
```

### 3. Dashboard Usage

Dashboard widgets subscribe to specific metrics:

```rust
// Dashboard CPU widget
let cpu_metrics = [
    "cpu_load_1min",
    "cpu_load_5min",
    "cpu_load_15min",
    "cpu_temperature_celsius",
];

// Dashboard memory widget
let memory_metrics = [
    "memory_usage_percent",
    "memory_total_gb",
    "memory_available_gb",
];
```

# Dashboard Architecture

## Dashboard Principles

### 1. UI Layout Preservation

**Current UI Layout Maintained**: The existing dashboard UI layout is preserved and enhanced with the new metric-centric architecture. All current widgets keep their established positions and functionality.

**Widget Enhancement, Not Replacement**: Widgets are enhanced to consume individual metrics rather than grouped structures, but maintain their visual appearance and user interaction patterns.

### 2. Metric-to-Widget Mapping

Each widget subscribes to specific individual metrics and composes them for display:

```rust
// CPU Widget Metrics
const CPU_WIDGET_METRICS: &[&str] = &[
    "cpu_load_1min",
    "cpu_load_5min",
    "cpu_load_15min",
    "cpu_temperature_celsius",
    "cpu_frequency_mhz",
    "cpu_usage_percent",
];

// Memory Widget Metrics
const MEMORY_WIDGET_METRICS: &[&str] = &[
    "memory_usage_percent",
    "memory_total_gb",
    "memory_available_gb",
    "memory_used_gb",
    "memory_swap_total_gb",
    "memory_swap_used_gb",
];

// Storage Widget Metrics
const STORAGE_WIDGET_METRICS: &[&str] = &[
    "disk_nvme0_temperature_celsius",
    "disk_nvme0_wear_percent",
    "disk_nvme0_spare_percent",
    "disk_nvme0_hours",
    "disk_nvme0_capacity_gb",
    "disk_nvme0_usage_gb",
    "disk_nvme0_usage_percent",
];

// Services Widget Metrics
const SERVICES_WIDGET_METRICS: &[&str] = &[
    "service_ssh_status",
    "service_ssh_memory_mb",
    "service_ssh_cpu_percent",
    "service_nginx_status",
    "service_nginx_memory_mb",
    "service_docker_status",
    // ... per discovered service
];

// Backup Widget Metrics
const BACKUP_WIDGET_METRICS: &[&str] = &[
    "backup_last_run_timestamp",
    "backup_status",
    "backup_size_gb",
    "backup_duration_minutes",
    "backup_next_scheduled_timestamp",
];
```

## Dashboard Communication

### ZMQ Consumer Architecture

```rust
// dashboard/src/communication/zmq_consumer.rs
pub struct ZmqConsumer {
    subscriber: Socket,
    config: ZmqConfig,
    metric_filter: MetricFilter,
}

// API sketch: `todo!()` marks bodies that are not specified here.
impl ZmqConsumer {
    pub async fn subscribe_to_host(&mut self, hostname: &str) -> Result<()> { todo!() }
    pub async fn receive_metrics(&mut self) -> Result<Vec<Metric>> { todo!() }
    pub fn set_metric_filter(&mut self, filter: MetricFilter) {
        self.metric_filter = filter;
    }
    pub async fn request_metrics(&self, metric_names: &[String]) -> Result<()> { todo!() }
}

#[derive(Debug, Clone)]
pub struct MetricFilter {
    pub include_patterns: Vec<String>,
    pub exclude_patterns: Vec<String>,
    pub hosts: Vec<String>,
}
```

### Protocol Compatibility

The dashboard uses the same protocol as defined in the agent:

```rust
// shared/src/protocol.rs (shared between agent and dashboard)
#[derive(Debug, Serialize, Deserialize)]
pub struct MetricMessage {
    pub hostname: String,
    pub timestamp: u64,
    pub metrics: Vec<Metric>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Metric {
    pub name: String,
    pub value: MetricValue,
    pub status: Status,
    pub timestamp: u64,
    pub description: Option<String>,
    pub unit: Option<String>,
}
```

## Dashboard Metric Management

### Metric Store

```rust
// dashboard/src/metrics/store.rs
pub struct MetricStore {
    current_metrics: HashMap<String, HashMap<String, Metric>>, // host -> metric_name -> metric
    historical_metrics: HistoricalStore,
    subscriptions: SubscriptionManager,
}

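// API sketch: `todo!()` marks bodies that are not specified here.
impl MetricStore {
    pub fn update_metrics(&mut self, hostname: &str, metrics: Vec<Metric>) { todo!() }
    pub fn get_metric(&self, hostname: &str, metric_name: &str) -> Option<&Metric> {
        self.current_metrics.get(hostname)?.get(metric_name)
    }
    pub fn get_metrics_for_widget(&self, hostname: &str, widget: WidgetType) -> Vec<&Metric> { todo!() }
    pub fn get_hosts(&self) -> Vec<String> {
        self.current_metrics.keys().cloned().collect()
    }
    pub fn get_latest_timestamp(&self, hostname: &str) -> Option<u64> { todo!() }
}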
```
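
The `host -> metric_name -> metric` layout can be exercised with a short, self-contained sketch. It is a simplification under stated assumptions: the `Metric` here carries only a name, an `f32` value, and a timestamp, history and subscriptions are left out, and the method bodies are one plausible reading of the signatures above rather than the actual implementation.

```rust
use std::collections::HashMap;

/// Reduced stand-in for the shared `Metric` type.
#[derive(Debug, Clone, PartialEq)]
struct Metric {
    name: String,
    value: f32,
    timestamp: u64,
}

#[derive(Default)]
struct MetricStore {
    // host -> metric_name -> latest metric
    current: HashMap<String, HashMap<String, Metric>>,
}

impl MetricStore {
    /// Inserts or overwrites each metric individually for the given host.
    fn update_metrics(&mut self, hostname: &str, metrics: Vec<Metric>) {
        let host = self.current.entry(hostname.to_string()).or_default();
        for metric in metrics {
            host.insert(metric.name.clone(), metric);
        }
    }

    /// Looks up a single metric by host and name.
    fn get_metric(&self, hostname: &str, name: &str) -> Option<&Metric> {
        self.current.get(hostname)?.get(name)
    }

    /// Newest timestamp among the host's current metrics.
    fn get_latest_timestamp(&self, hostname: &str) -> Option<u64> {
        self.current.get(hostname)?.values().map(|m| m.timestamp).max()
    }
}
```

Updating one metric leaves every other metric of the host untouched, so partial messages from an agent are handled naturally.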

### Metric Subscription Management

```rust
// dashboard/src/metrics/subscription.rs
pub struct SubscriptionManager {
    widget_subscriptions: HashMap<WidgetType, Vec<String>>,
    active_hosts: HashSet<String>,
    metric_filters: HashMap<String, MetricFilter>,
}

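// API sketch: `todo!()` marks bodies that are not specified here.
impl SubscriptionManager {
    pub fn subscribe_widget(&mut self, widget: WidgetType, metrics: &[String]) {
        self.widget_subscriptions.insert(widget, metrics.to_vec());
    }
    pub fn get_required_metrics(&self) -> Vec<String> { todo!() }
    pub fn add_host(&mut self, hostname: String) {
        self.active_hosts.insert(hostname);
    }
    pub fn remove_host(&mut self, hostname: &str) {
        self.active_hosts.remove(hostname);
    }
    pub fn is_metric_needed(&self, metric_name: &str) -> bool { todo!() }
}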
```
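
One plausible reading of `get_required_metrics` is the deduplicated union of all widget subscriptions. The sketch below shows that interpretation with std types only. It is an assumption, not the documented behaviour: hosts and filters are left out, `WidgetType` is reduced to two variants, and `subscribe_widget` takes `&[&str]` for brevity.

```rust
use std::collections::{BTreeSet, HashMap};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum WidgetType {
    Cpu,
    Memory,
}

#[derive(Default)]
struct SubscriptionManager {
    widget_subscriptions: HashMap<WidgetType, Vec<String>>,
}

impl SubscriptionManager {
    /// Replaces the metric list a widget is subscribed to.
    fn subscribe_widget(&mut self, widget: WidgetType, metrics: &[&str]) {
        let names = metrics.iter().map(|m| m.to_string()).collect();
        self.widget_subscriptions.insert(widget, names);
    }

    /// Union of all widget subscriptions, deduplicated and sorted by name.
    fn get_required_metrics(&self) -> Vec<String> {
        let unique: BTreeSet<&String> =
            self.widget_subscriptions.values().flatten().collect();
        unique.into_iter().cloned().collect()
    }

    /// True if at least one widget subscribes to `name`.
    fn is_metric_needed(&self, name: &str) -> bool {
        self.widget_subscriptions
            .values()
            .any(|metrics| metrics.iter().any(|m| m == name))
    }
}
```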

## Widget Architecture

### Base Widget Trait

```rust
// dashboard/src/ui/widgets/base.rs
pub trait Widget {
    fn widget_type(&self) -> WidgetType;
    fn required_metrics(&self) -> &[&str];
    fn update_metrics(&mut self, metrics: &HashMap<String, Metric>);
    fn render(&self, frame: &mut Frame, area: Rect);
    fn handle_input(&mut self, event: &Event) -> bool;
    fn get_status(&self) -> Status;
}

#[derive(Debug, Clone, Copy, Hash, Eq, PartialEq)]
pub enum WidgetType {
    Cpu,
    Memory,
    Storage,
    Services,
    Backup,
    Hosts,
    Alerts,
}
```

### Enhanced Widget Implementation

```rust
// dashboard/src/ui/widgets/cpu.rs
pub struct CpuWidget {
    metrics: HashMap<String, Metric>,
    config: CpuWidgetConfig,
}

impl Widget for CpuWidget {
    fn widget_type(&self) -> WidgetType {
        WidgetType::Cpu
    }

    fn required_metrics(&self) -> &[&str] {
        CPU_WIDGET_METRICS
    }

    fn update_metrics(&mut self, metrics: &HashMap<String, Metric>) {
        // Update only the metrics this widget cares about
        for &metric_name in self.required_metrics() {
            if let Some(metric) = metrics.get(metric_name) {
                self.metrics.insert(metric_name.to_string(), metric.clone());
            }
        }
    }

    fn render(&self, frame: &mut Frame, area: Rect) {
        // Extract specific metric values for display
        // (`get_metric_value` is a helper on `CpuWidget`, not shown here)
        let load_1min = self.get_metric_value("cpu_load_1min").unwrap_or(0.0);
        let load_5min = self.get_metric_value("cpu_load_5min").unwrap_or(0.0);
        let temperature = self.get_metric_value("cpu_temperature_celsius");

        // Maintain existing UI layout and styling
        // ... render implementation preserving current appearance
    }

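    fn handle_input(&mut self, _event: &Event) -> bool {
        // This sketch does not consume any input events.
        false
    }

    fn get_status(&self) -> Status {
        // Aggregate status from individual metric statuses.
        // `max()` requires `Status: Ord` and `copied()` requires `Status: Copy`.
        self.metrics.values()
            .map(|m| &m.status)
            .max()
            .copied()
            .unwrap_or(Status::Unknown)
    }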
}
```
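
The `render` method above relies on a `get_metric_value` helper that is not shown. A minimal sketch of what such a helper might do is given below, written as free functions over a plain map so it is self-contained. The conversion rules (integers cast to `f32`, strings and booleans yielding `None`) are assumptions made for this example.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum MetricValue {
    Float(f32),
    Integer(i64),
    String(String),
    Boolean(bool),
}

/// Converts numeric variants to `f32`; non-numeric variants yield `None`.
fn as_f32(value: &MetricValue) -> Option<f32> {
    match value {
        MetricValue::Float(v) => Some(*v),
        MetricValue::Integer(v) => Some(*v as f32),
        MetricValue::String(_) | MetricValue::Boolean(_) => None,
    }
}

/// Looks up a metric by name and returns its numeric value, if any.
fn get_metric_value(metrics: &HashMap<String, MetricValue>, name: &str) -> Option<f32> {
    metrics.get(name).and_then(as_f32)
}
```

Returning `Option<f32>` lets the widget distinguish "metric missing" from "metric is zero", while callers that only need a number can still fall back with `unwrap_or(0.0)`.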

## Host Management

### Multi-Host Connection Management

```rust
// dashboard/src/hosts/manager.rs
pub struct HostManager {
    connections: HashMap<String, HostConnection>,
    discovery: HostDiscovery,
    active_host: Option<String>,
    metric_store: Arc<Mutex<MetricStore>>,
}

// API sketch: `todo!()` marks bodies that are not specified here.
impl HostManager {
    pub async fn discover_hosts(&mut self) -> Result<Vec<String>> { todo!() }
    pub async fn connect_to_host(&mut self, hostname: &str) -> Result<()> { todo!() }
    pub fn disconnect_from_host(&mut self, hostname: &str) { todo!() }
    pub fn set_active_host(&mut self, hostname: String) {
        self.active_host = Some(hostname);
    }
    pub fn get_active_host(&self) -> Option<&str> {
        self.active_host.as_deref()
    }
    pub fn get_connected_hosts(&self) -> Vec<&str> { todo!() }
    pub async fn refresh_all_hosts(&mut self) -> Result<()> { todo!() }
}

// dashboard/src/hosts/connection.rs
pub struct HostConnection {
    hostname: String,
    zmq_consumer: ZmqConsumer,
    last_seen: Instant,
    connection_status: ConnectionStatus,
    metric_buffer: VecDeque<Metric>,
}

#[derive(Debug, Clone)]
pub enum ConnectionStatus {
    Connected,
    Connecting,
    Disconnected,
    Error(String),
}
```
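
The `last_seen` field suggests that a host is considered unreachable once no metrics have arrived for some time. The sketch below shows one way this could be derived. It is an assumption rather than the documented behaviour: `status_from_last_seen` is a hypothetical helper, only two of the four statuses are modelled, and the timeout is supplied by the caller.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, PartialEq)]
enum ConnectionStatus {
    Connected,
    Disconnected,
}

/// Derives a connection status from the time of the last received message.
/// A host counts as connected while the silence is no longer than `timeout`.
fn status_from_last_seen(last_seen: Instant, now: Instant, timeout: Duration) -> ConnectionStatus {
    // `saturating_duration_since` yields zero if `now` is earlier than `last_seen`.
    if now.saturating_duration_since(last_seen) <= timeout {
        ConnectionStatus::Connected
    } else {
        ConnectionStatus::Disconnected
    }
}
```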

## Configuration Integration

### Dashboard Configuration

```toml
# dashboard/config/dashboard.toml
[zmq]
subscriber_ports = [6130]  # Agent publisher ports the dashboard subscribes to
connection_timeout_ms = 15000
reconnect_interval_ms = 5000

[ui]
refresh_rate_ms = 100
theme = "default"
preserve_layout = true

[hosts]
auto_discovery = true
predefined_hosts = ["cmbox", "labbox", "simonbox", "steambox", "srv01"]
default_host = "cmbox"

[metrics]
history_retention_hours = 24
max_metrics_per_host = 10000

[widgets.cpu]
enabled = true
metrics = [
    "cpu_load_1min",
    "cpu_load_5min",
    "cpu_load_15min",
    "cpu_temperature_celsius"
]

[widgets.memory]
enabled = true
metrics = [
    "memory_usage_percent",
    "memory_total_gb",
    "memory_available_gb"
]

[widgets.storage]
enabled = true
metrics = [
    "disk_nvme0_temperature_celsius",
    "disk_nvme0_wear_percent",
    "disk_nvme0_usage_percent"
]
```

## UI Layout Preservation Rules

### 1. Maintain Current Widget Positions

- **CPU widget**: Top-left position preserved
- **Memory widget**: Top-right position preserved
- **Storage widget**: Left-center position preserved
- **Services widget**: Right-center position preserved
- **Backup widget**: Bottom-right position preserved
- **Host navigation**: Bottom status bar preserved

### 2. Preserve Visual Styling

- **Colors**: Existing status colors (green, yellow, red) maintained
- **Borders**: Current border styles and characters preserved
- **Text formatting**: Font styles, alignment, and spacing preserved
- **Progress bars**: Current progress bar implementations maintained

### 3. Maintain User Interactions

- **Navigation keys**: `←→` for host switching preserved
- **Refresh key**: `r` for manual refresh preserved
- **Quit key**: `q` for exit preserved
- **Additional keys**: All current keyboard shortcuts maintained

### 4. Status Display Consistency

- **Status aggregation**: Widget-level status calculated from individual metric statuses
- **Color mapping**: Status enum maps to existing color scheme
- **Status indicators**: Current status display format preserved
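
The color mapping can be sketched as a single exhaustive `match`. The green, yellow, and red assignments follow the status colors listed above. The color for `Unknown` is not stated in this document, so `"gray"` is an assumption, and colors are returned as plain names rather than as values of any particular UI library.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Ok,
    Warning,
    Critical,
    Unknown,
}

/// Maps a status to the name of the color used to display it.
fn status_color(status: Status) -> &'static str {
    match status {
        Status::Ok => "green",
        Status::Warning => "yellow",
        Status::Critical => "red",
        Status::Unknown => "gray", // assumed; not specified in this document
    }
}
```

Keeping the mapping in one exhaustive `match` means a new status variant cannot be added without the compiler asking for its color.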

## Implementation Migration Strategy

### Phase 1: Shared Types
1. Create `shared/` crate with common protocol and metric types
2. Update both agent and dashboard to use shared types

### Phase 2: Agent Migration
1. Implement new agent architecture with individual metrics
2. Maintain backward compatibility during transition

### Phase 3: Dashboard Migration
1. Update dashboard to consume individual metrics
2. Preserve all existing UI layouts and interactions
3. Enhance widgets with new metric subscription system

### Phase 4: Integration Testing
1. End-to-end testing with real multi-host scenarios
2. Performance validation and optimization
3. UI/UX validation to ensure no regressions

## Benefits of This Architecture

1. **Maximum Flexibility**: Dashboard can compose any widget from any metrics
2. **Easy Extension**: Adding new metrics doesn't affect existing code
3. **Granular Caching**: Cache individual metrics based on collection cost
4. **Simple Testing**: Test individual metric collection in isolation
5. **Clear Separation**: Agent collects, dashboard consumes and displays
6. **Efficient Updates**: Only send changed metrics to dashboard
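
The "only send changed metrics" benefit could be realised by diffing the current values against the last values sent. The sketch below is one possible approach under stated assumptions: `changed_metrics` is a hypothetical helper, values are reduced to `f32`, and metrics that disappear are not reported.

```rust
use std::collections::HashMap;

/// Returns the metrics in `current` that are new or whose value differs
/// from `previous`, sorted by name for deterministic output.
fn changed_metrics(
    previous: &HashMap<String, f32>,
    current: &HashMap<String, f32>,
) -> Vec<(String, f32)> {
    let mut changed: Vec<(String, f32)> = current
        .iter()
        .filter(|(name, value)| previous.get(*name) != Some(*value))
        .map(|(name, value)| (name.clone(), *value))
        .collect();
    changed.sort_by(|a, b| a.0.cmp(&b.0));
    changed
}
```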

## Future Extensions

- **Metric Filtering**: Dashboard requests only needed metrics
- **Historical Storage**: Store metric history for trending
- **Metric Aggregation**: Calculate derived metrics from base metrics
- **Dynamic Discovery**: Auto-discover new metric sources
- **Metric Validation**: Validate metric values and ranges