Compare commits

...

14 Commits

Author SHA1 Message Date
0a13cab897 Add detected IP display in dashboard Agent row
All checks were successful
Build and Release / build-and-release (push) Successful in 1m8s
Display the connection IP address that the dashboard is configured to use
for each host below the Agent version information. Shows which network
path (local/Tailscale) is being used for connections based on host
configuration.

Features:
- Display detected IP below Agent row in system widget
- Uses existing host configuration connection logic
- Shows actual IP being used for dashboard connections
2025-11-13 11:26:58 +01:00
d33ec5d225 Add Tailscale network support for host connections
All checks were successful
Build and Release / build-and-release (push) Successful in 1m31s
Implement configurable network routing for both local and Tailscale networks.
Dashboard now supports intelligent connection selection with automatic fallback
between network types. Add IP configuration fields and connection routing logic
for ZMQ and SSH operations.

Features:
- Host configuration with local and Tailscale IP addresses
- Configurable connection types (local/tailscale/auto)
- Automatic fallback between network connections
- Updated ZMQ connection logic with retry support
- SSH command routing through configured IP addresses
2025-11-13 10:08:17 +01:00
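
The fallback behaviour described in this commit can be sketched roughly as follows. This is an illustration, not the code from the change: `ConnectionType`, `HostAddresses`, `candidates`, and `connect_with_fallback` are hypothetical names, the connection preference is modelled as an enum rather than a string, and the choice that `Auto` tries the Tailscale address first is an assumption of the sketch.

```rust
/// Hypothetical typed form of the local/tailscale/auto preference.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConnectionType {
    Local,
    Tailscale,
    Auto,
}

/// Hypothetical per-host address record.
struct HostAddresses {
    local_ip: Option<String>,
    tailscale_ip: Option<String>,
    connection_type: ConnectionType,
}

impl HostAddresses {
    /// Ordered, de-duplicated addresses to try; the hostname is the last resort.
    fn candidates(&self, hostname: &str) -> Vec<String> {
        let (first, second) = match self.connection_type {
            ConnectionType::Local => (&self.local_ip, &self.tailscale_ip),
            // Assumption: Auto prefers Tailscale, then the local address.
            ConnectionType::Tailscale | ConnectionType::Auto => {
                (&self.tailscale_ip, &self.local_ip)
            }
        };
        let mut out: Vec<String> = Vec::new();
        let ordered = first
            .iter()
            .chain(second.iter())
            .cloned()
            .chain(std::iter::once(hostname.to_string()));
        for addr in ordered {
            if !out.contains(&addr) {
                out.push(addr);
            }
        }
        out
    }
}

/// Try each candidate in order; return the first one `connect` accepts.
fn connect_with_fallback<F>(candidates: &[String], mut connect: F) -> Option<String>
where
    F: FnMut(&str) -> bool,
{
    candidates.iter().find(|addr| connect(addr)).cloned()
}
```

Passing the connection attempt in as a closure keeps the ordering logic testable without a network; in a real client the closure would wrap the actual ZMQ or SSH connect call.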
d31c2384df Add configurable maintenance mode file support
All checks were successful
Build and Release / build-and-release (push) Successful in 1m32s
Implement maintenance_mode_file configuration option in NotificationConfig
to allow customizable file paths for suppressing email notifications.
Updates maintenance mode check to use configured path instead of hardcoded
/tmp/cm-maintenance file.
2025-11-10 07:48:15 +01:00
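
As a small illustration of the marker-file check this commit makes configurable, the sketch below takes the path as a parameter instead of reading it from `NotificationConfig`. The free function and its signature are assumptions made for the example; only the idea of "the file exists, so notifications are suppressed" comes from the commit.

```rust
use std::fs;
use std::path::Path;

/// Email notifications are suppressed while the marker file exists.
/// Only existence is checked; the file's contents are ignored.
fn is_maintenance_mode(marker: &Path) -> bool {
    fs::metadata(marker).is_ok()
}

/// Convenience wrapper: should a notification be sent right now?
fn should_send_email(marker: &Path) -> bool {
    !is_maintenance_mode(marker)
}
```

An operator would then create the configured file before planned work and delete it afterwards to re-enable email.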
c8db463204 Add interactive SSH terminal session functionality
All checks were successful
Build and Release / build-and-release (push) Successful in 1m31s
- Press 't' to open SSH session to current host in tmux split
- Uses 30% vertical split consistent with logs and rebuild commands
- Auto-closes tmux window when SSH session ends
- Provides direct host administration access from dashboard
- Uses same SSH configuration as rebuild operations

Version 0.1.65
2025-11-09 11:39:43 +01:00
e8e50ef9bb Replace empty panels with offline host message for better UX
All checks were successful
Build and Release / build-and-release (push) Successful in 2m33s
- Hide all system/backup/service panels when host is offline
- Show centered wake-up message with host status
- Display "Press 'w' to wake up host" if MAC address configured
- Provide clear visual indication when hosts are unreachable
- Improve user experience by removing confusing empty panels

Version 0.1.64
2025-11-08 18:28:28 +01:00
0faed9309e Improve host disconnection detection and fix notification exclusions
All checks were successful
Build and Release / build-and-release (push) Successful in 1m34s
- Add dedicated heartbeat transmission every 5 seconds independent of metric collection
- Fix host offline detection by clearing metrics for disconnected hosts
- Move exclude_email_metrics to NotificationConfig for better organization
- Add cleanup_offline_hosts method to remove stale metrics after heartbeat timeout
- Ensure offline hosts show proper status icons and visual indicators

Version 0.1.63
2025-11-08 11:33:32 +01:00
c980346d05 Fix heartbeat detection to properly detect offline hosts
All checks were successful
Build and Release / build-and-release (push) Successful in 2m34s
- Add independent heartbeat checking timer (1 second interval) separate from metric reception
- Move get_connected_hosts() call outside metric receive condition to run periodically
- Remove duplicate update_hosts() call from metric processing to avoid redundancy
- Ensure offline host detection works even when no new metrics are received
- Fix issue where hosts going offline were never detected due to conditional heartbeat check
- Heartbeat timeouts now properly detected within configured timeout + 1 second
- Bump version to 0.1.62
2025-11-07 14:27:03 +01:00
3e3d3f0c2b Fix Tab key 1-second delay by reverting ZMQ to non-blocking mode
All checks were successful
Build and Release / build-and-release (push) Successful in 1m10s
- Change receive_metrics() from blocking to DONTWAIT to prevent main loop freezing
- Eliminate 1-second ZMQ socket timeout that was blocking UI after Tab key press
- Main loop now continues immediately after immediate render instead of waiting
- Maintain heartbeat-based host detection while fixing visual responsiveness
- Fix blocking operation introduced when implementing heartbeat timeout mechanism
- Tab navigation now truly immediate without any network operation delays
- Bump version to 0.1.61
2025-11-06 12:04:49 +01:00
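
The difference between a blocking and a non-blocking receive is what this fix turns on. The sketch below shows the same pattern with a standard-library channel standing in for the ZMQ socket: `try_recv` plays the role that `zmq::DONTWAIT` plays in the change, and `drain_pending` is a hypothetical helper, not a function from the repository.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

/// Collect whatever is already queued and return immediately.
/// The caller (for example a UI loop) is never stalled waiting for data.
fn drain_pending<T>(rx: &Receiver<T>) -> Vec<T> {
    let mut items = Vec::new();
    loop {
        match rx.try_recv() {
            Ok(item) => items.push(item),
            // Nothing queued, or the sender is gone: stop without waiting.
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    items
}
```

With this shape, an empty queue costs one cheap call per loop iteration instead of a full socket timeout.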
9eb7444d56 Cache localhost hostname to eliminate Tab key sluggishness
All checks were successful
Build and Release / build-and-release (push) Successful in 2m10s
- Add cached localhost field to TuiApp struct to avoid repeated gethostname() system calls
- Initialize localhost once in constructor instead of calling gethostname() on every navigation
- Replace gethostname() calls in update_hosts() and navigate_host() with cached value
- Eliminate expensive system call bottleneck causing Tab key responsiveness issues
- Reduce Tab navigation from 2+ system calls to zero system calls (memory access only)
- Fix performance regression introduced by immediate UI refresh implementation
- Bump version to 0.1.60
2025-11-06 11:53:49 +01:00
278d1763aa Fix Tab key responsiveness with immediate UI refresh
All checks were successful
Build and Release / build-and-release (push) Successful in 2m10s
- Add immediate terminal.draw() call after input handling in main loop
- Eliminate delay between Tab key press and visual host switching
- Provide instant visual feedback for all navigation inputs
- Maintain existing metric update render cycle without duplication
- Fix UI update timing issue where changes only appeared on metric intervals
- Bump version to 0.1.59
2025-11-06 11:30:26 +01:00
f874264e13 Optimize dashboard performance for responsive Tab key navigation
All checks were successful
Build and Release / build-and-release (push) Successful in 1m32s
- Replace 6 separate filter operations with single-pass metric categorization in update_metrics
- Reduce CPU overhead from 6x to 1x work per metric update cycle
- Fix Tab key sluggishness caused by competing expensive filtering operations
- Maintain exact same functionality with significantly better performance
- Improve UI responsiveness for host switching and navigation
- Bump version to 0.1.58
2025-11-06 11:18:39 +01:00
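
A single-pass categorization of the kind this commit describes might look like the sketch below. `Category`, `categorize`, and `bucket` are hypothetical names, and the metric names in any usage are made up; the prefixes are the ones visible in the diff further down.

```rust
use std::collections::HashMap;

/// Hypothetical bucket labels for the sketch.
#[derive(Debug, PartialEq, Eq, Hash, Clone, Copy)]
enum Category {
    Cpu,
    Memory,
    Service,
    Backup,
    Disk,
    Other,
}

/// Decide the bucket for one metric name. Branch order matters:
/// `disk_tmp_` is tested before the more general `disk_` prefix.
fn categorize(name: &str) -> Category {
    if name.starts_with("cpu_")
        || name.contains("c_state_")
        || name.starts_with("process_top_")
    {
        Category::Cpu
    } else if name.starts_with("memory_") || name.starts_with("disk_tmp_") {
        Category::Memory
    } else if name.starts_with("service_") {
        Category::Service
    } else if name.starts_with("backup_") {
        Category::Backup
    } else if name.starts_with("disk_") {
        Category::Disk
    } else {
        Category::Other
    }
}

/// Group names in one pass instead of running one filter per category.
fn bucket<'a>(names: &[&'a str]) -> HashMap<Category, Vec<&'a str>> {
    let mut buckets: HashMap<Category, Vec<&'a str>> = HashMap::new();
    for &name in names {
        buckets.entry(categorize(name)).or_default().push(name);
    }
    buckets
}
```

Because an `if / else if` chain assigns each metric to exactly one bucket, a name matching two prefixes lands only in the first matching bucket, which is a behaviour worth checking when replacing independent filters.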
5f6e47ece5 Implement heartbeat-based host connectivity detection
All checks were successful
Build and Release / build-and-release (push) Successful in 2m8s
- Add agent_heartbeat metric to agent transmission for reliable host detection
- Update dashboard to track heartbeat timestamps per host instead of general metrics
- Add configurable heartbeat_timeout_seconds to dashboard ZMQ config (default 10s)
- Remove unused timeout_ms from agent config and revert to non-blocking command reception
- Remove unused heartbeat_interval_ms from agent configuration
- Host disconnect detection now uses dedicated heartbeat metrics for improved reliability
- Bump version to 0.1.57
2025-11-06 11:04:01 +01:00
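
Heartbeat-based detection with a timeout can be sketched as below. `HeartbeatTracker` and its methods are hypothetical and much simpler than the dashboard's metric store; the current time is passed in explicitly so the logic can be exercised without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical tracker: remembers when each host last sent a heartbeat.
struct HeartbeatTracker {
    last_heartbeat: HashMap<String, Instant>,
    timeout: Duration,
}

impl HeartbeatTracker {
    fn new(timeout: Duration) -> Self {
        Self {
            last_heartbeat: HashMap::new(),
            timeout,
        }
    }

    /// Record a heartbeat received from `hostname` at time `now`.
    fn record(&mut self, hostname: &str, now: Instant) {
        self.last_heartbeat.insert(hostname.to_string(), now);
    }

    /// Hosts whose last heartbeat is no older than the timeout, sorted by name.
    fn connected_hosts(&self, now: Instant) -> Vec<String> {
        let mut hosts: Vec<String> = self
            .last_heartbeat
            .iter()
            .filter(|(_, seen)| now.duration_since(**seen) <= self.timeout)
            .map(|(host, _)| host.clone())
            .collect();
        hosts.sort();
        hosts
    }
}
```

With the 10-second default mentioned in the commit and a sender that emits every few seconds, a host has to miss more than one heartbeat before it drops out of the connected list.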
0e7cf24dbb Add exclude_email_metrics configuration option
All checks were successful
Build and Release / build-and-release (push) Successful in 2m34s
- Add exclude_email_metrics field to AgentConfig for filtering email notifications
- Metrics matching excluded names skip notification processing but still appear in dashboard
- Optional field with serde(default) for backward compatibility
- Bump version to 0.1.56
2025-11-06 10:31:25 +01:00
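
The exclusion rule in this commit amounts to a name-based filter applied only on the notification path. A minimal sketch, assuming a stripped-down `Metric` type and a hypothetical helper `metrics_for_notification`:

```rust
/// Hypothetical stand-in for the agent's metric type.
struct Metric {
    name: String,
}

/// Return the metrics that should still go through notification processing.
/// Excluded metrics are only skipped here; nothing removes them from the
/// data sent to the dashboard.
fn metrics_for_notification<'a>(metrics: &'a [Metric], excluded: &[String]) -> Vec<&'a Metric> {
    metrics
        .iter()
        .filter(|m| !excluded.contains(&m.name))
        .collect()
}
```

An empty exclusion list leaves every metric in place, which matches the backward-compatible default the commit describes for configurations that omit the field.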
2d080a2f51 Implement WakeOnLAN functionality and offline status handling
All checks were successful
Build and Release / build-and-release (push) Successful in 1m35s
- Add WakeOnLAN support for offline hosts using 'w' key
- Configure MAC addresses for all infrastructure hosts
- Implement Status::Offline for disconnected hosts
- Exclude offline hosts from status aggregation to prevent false alerts
- Update versions to 0.1.55
2025-10-31 09:28:31 +01:00
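
The commit list does not show how the wake-up packet is built. For orientation only: a Wake-on-LAN "magic packet" is conventionally 6 bytes of 0xFF followed by the target MAC address repeated 16 times (102 bytes), sent as a UDP broadcast. The sketch below assumes a colon-separated MAC string; `parse_mac`, `magic_packet`, and `send_wol` are hypothetical names, and port 9 is a common convention rather than something taken from this repository.

```rust
use std::net::UdpSocket;

/// Parse a colon-separated MAC address such as "aa:bb:cc:dd:ee:ff".
fn parse_mac(mac: &str) -> Option<[u8; 6]> {
    let mut bytes = [0u8; 6];
    let mut parts = mac.split(':');
    for byte in bytes.iter_mut() {
        *byte = u8::from_str_radix(parts.next()?, 16).ok()?;
    }
    // Reject input with more than six groups.
    if parts.next().is_some() {
        return None;
    }
    Some(bytes)
}

/// Build the 102-byte magic packet: 6 x 0xFF, then the MAC 16 times.
fn magic_packet(mac: [u8; 6]) -> Vec<u8> {
    let mut packet = vec![0xFF; 6];
    for _ in 0..16 {
        packet.extend_from_slice(&mac);
    }
    packet
}

/// Broadcast the packet on the local network (assumed port: 9).
fn send_wol(mac: &str) -> std::io::Result<()> {
    let mac = parse_mac(mac).ok_or_else(|| {
        std::io::Error::new(std::io::ErrorKind::InvalidInput, "invalid MAC address")
    })?;
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.set_broadcast(true)?;
    socket.send_to(&magic_packet(mac), "255.255.255.255:9")?;
    Ok(())
}
```

A key handler for 'w' could call something like `send_wol` with the MAC address configured for the selected host.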
15 changed files with 421 additions and 98 deletions

Cargo.lock (generated), 6 changes
View File

@@ -270,7 +270,7 @@ checksum = "a1d728cc89cf3aee9ff92b05e62b19ee65a02b5702cff7d5a377e32c6ae29d8d"
 [[package]]
 name = "cm-dashboard"
-version = "0.1.54"
+version = "0.1.67"
 dependencies = [
  "anyhow",
  "chrono",
@@ -292,7 +292,7 @@ dependencies = [
 [[package]]
 name = "cm-dashboard-agent"
-version = "0.1.54"
+version = "0.1.67"
 dependencies = [
  "anyhow",
  "async-trait",
@@ -315,7 +315,7 @@ dependencies = [
 [[package]]
 name = "cm-dashboard-shared"
-version = "0.1.54"
+version = "0.1.67"
 dependencies = [
  "chrono",
  "serde",

View File

@@ -1,6 +1,6 @@
 [package]
 name = "cm-dashboard-agent"
-version = "0.1.54"
+version = "0.1.68"
 edition = "2021"
 [dependencies]

View File

@@ -78,10 +78,11 @@ impl Agent {
             info!("Initial metric collection completed - all data cached and ready");
         }
-        // Separate intervals for collection, transmission, and email notifications
+        // Separate intervals for collection, transmission, heartbeat, and email notifications
         let mut collection_interval =
             interval(Duration::from_secs(self.config.collection_interval_seconds));
         let mut transmission_interval = interval(Duration::from_secs(self.config.zmq.transmission_interval_seconds));
+        let mut heartbeat_interval = interval(Duration::from_secs(self.config.zmq.heartbeat_interval_seconds));
         let mut notification_interval = interval(Duration::from_secs(self.config.notifications.aggregation_interval_seconds));
         loop {
@@ -98,6 +99,12 @@ impl Agent {
                         error!("Failed to broadcast metrics: {}", e);
                     }
                 }
+                _ = heartbeat_interval.tick() => {
+                    // Send standalone heartbeat for host connectivity detection
+                    if let Err(e) = self.send_heartbeat().await {
+                        error!("Failed to send heartbeat: {}", e);
+                    }
+                }
                 _ = notification_interval.tick() => {
                     // Process batched email notifications (separate from dashboard updates)
                     if let Err(e) = self.host_status_manager.process_pending_notifications(&mut self.notification_manager).await {
@@ -180,6 +187,10 @@ impl Agent {
         let version_metric = self.get_agent_version_metric();
         metrics.push(version_metric);
+        // Add heartbeat metric for host connectivity detection
+        let heartbeat_metric = self.get_heartbeat_metric();
+        metrics.push(heartbeat_metric);
         // Check for user-stopped services that are now active and clear their flags
         self.clear_user_stopped_flags_for_active_services(&metrics);
@@ -201,6 +212,12 @@ impl Agent {
     async fn process_metrics(&mut self, metrics: &[Metric]) -> bool {
         let mut status_changed = false;
         for metric in metrics {
+            // Filter excluded metrics from email notification processing only
+            if self.config.notifications.exclude_email_metrics.contains(&metric.name) {
+                debug!("Excluding metric '{}' from email notification processing", metric.name);
+                continue;
+            }
             if self.host_status_manager.process_metric(metric, &mut self.notification_manager).await {
                 status_changed = true;
             }
@@ -226,6 +243,35 @@ impl Agent {
         format!("v{}", env!("CARGO_PKG_VERSION"))
     }
+    /// Create heartbeat metric for host connectivity detection
+    fn get_heartbeat_metric(&self) -> Metric {
+        use std::time::{SystemTime, UNIX_EPOCH};
+        let timestamp = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .unwrap()
+            .as_secs();
+        Metric::new(
+            "agent_heartbeat".to_string(),
+            MetricValue::Integer(timestamp as i64),
+            Status::Ok,
+        )
+    }
+    /// Send standalone heartbeat for connectivity detection
+    async fn send_heartbeat(&mut self) -> Result<()> {
+        let heartbeat_metric = self.get_heartbeat_metric();
+        let message = MetricMessage::new(
+            self.hostname.clone(),
+            vec![heartbeat_metric],
+        );
+        self.zmq_handler.publish_metrics(&message).await?;
+        debug!("Sent standalone heartbeat for connectivity detection");
+        Ok(())
+    }
     async fn handle_commands(&mut self) -> Result<()> {
         // Try to receive commands (non-blocking)
         match self.zmq_handler.try_receive_command() {

View File

@@ -66,8 +66,6 @@ impl ZmqHandler {
     }
-    /// Send heartbeat (placeholder for future use)
     /// Try to receive a command (non-blocking)
     pub fn try_receive_command(&self) -> Result<Option<AgentCommand>> {
         match self.command_receiver.recv_bytes(zmq::DONTWAIT) {

View File

@@ -25,9 +25,10 @@ pub struct ZmqConfig {
     pub publisher_port: u16,
     pub command_port: u16,
     pub bind_address: String,
-    pub timeout_ms: u64,
-    pub heartbeat_interval_ms: u64,
     pub transmission_interval_seconds: u64,
+    /// Heartbeat transmission interval in seconds for host connectivity detection
+    #[serde(default = "default_heartbeat_interval_seconds")]
+    pub heartbeat_interval_seconds: u64,
 }
 /// Collector configuration
@@ -146,9 +147,23 @@ pub struct NotificationConfig {
     pub rate_limit_minutes: u64,
     /// Email notification batching interval in seconds (default: 60)
     pub aggregation_interval_seconds: u64,
+    /// List of metric names to exclude from email notifications
+    #[serde(default)]
+    pub exclude_email_metrics: Vec<String>,
+    /// Path to maintenance mode file that suppresses email notifications when present
+    #[serde(default = "default_maintenance_mode_file")]
+    pub maintenance_mode_file: String,
 }
+fn default_heartbeat_interval_seconds() -> u64 {
+    5
+}
+fn default_maintenance_mode_file() -> String {
+    "/tmp/cm-maintenance".to_string()
+}
 impl AgentConfig {
     pub fn from_file<P: AsRef<Path>>(path: P) -> Result<Self> {
         loader::load_config(path)

View File

@@ -19,10 +19,6 @@ pub fn validate_config(config: &AgentConfig) -> Result<()> {
         bail!("ZMQ bind address cannot be empty");
     }
-    if config.zmq.timeout_ms == 0 {
-        bail!("ZMQ timeout cannot be 0");
-    }
     // Validate collection interval
     if config.collection_interval_seconds == 0 {
         bail!("Collection interval cannot be 0");

View File

@@ -59,6 +59,6 @@ impl NotificationManager {
     }
     fn is_maintenance_mode(&self) -> bool {
-        std::fs::metadata("/tmp/cm-maintenance").is_ok()
+        std::fs::metadata(&self.config.maintenance_mode_file).is_ok()
     }
 }

View File

@@ -1,6 +1,6 @@
 [package]
 name = "cm-dashboard"
-version = "0.1.54"
+version = "0.1.68"
 edition = "2021"
 [dependencies]

View File

@@ -22,7 +22,7 @@ pub struct Dashboard {
     terminal: Option<Terminal<CrosstermBackend<io::Stdout>>>,
     headless: bool,
     initial_commands_sent: std::collections::HashSet<String>,
-    _config: DashboardConfig,
+    config: DashboardConfig,
 }
 impl Dashboard {
@@ -67,11 +67,8 @@ impl Dashboard {
             }
         };
-        // Connect to configured hosts from configuration
-        let hosts: Vec<String> = config.hosts.keys().cloned().collect();
         // Try to connect to hosts but don't fail if none are available
-        match zmq_consumer.connect_to_predefined_hosts(&hosts).await {
+        match zmq_consumer.connect_to_predefined_hosts(&config.hosts).await {
             Ok(_) => info!("Successfully connected to ZMQ hosts"),
             Err(e) => {
                 warn!(
@@ -133,7 +130,7 @@ impl Dashboard {
             terminal,
             headless,
             initial_commands_sent: std::collections::HashSet::new(),
-            _config: config,
+            config,
         })
     }
@@ -149,6 +146,8 @@ impl Dashboard {
         let mut last_metrics_check = Instant::now();
         let metrics_check_interval = Duration::from_millis(100); // Check for metrics every 100ms
+        let mut last_heartbeat_check = Instant::now();
+        let heartbeat_check_interval = Duration::from_secs(1); // Check for host connectivity every 1 second
         loop {
             // Handle terminal events (keyboard input) only if not headless
@@ -191,6 +190,17 @@ impl Dashboard {
                         break;
                     }
                 }
+                // Render UI immediately after handling input for responsive feedback
+                if let Some(ref mut terminal) = self.terminal {
+                    if let Some(ref mut tui_app) = self.tui_app {
+                        if let Err(e) = terminal.draw(|frame| {
+                            tui_app.render(frame, &self.metric_store);
+                        }) {
+                            error!("Error rendering TUI after input: {}", e);
+                        }
+                    }
+                }
             }
             // Check for new metrics
@@ -243,14 +253,8 @@ impl Dashboard {
                     }
                 }
-                // Update TUI with new hosts and metrics (only if not headless)
+                // Update TUI with new metrics (only if not headless)
                 if let Some(ref mut tui_app) = self.tui_app {
-                    let connected_hosts = self
-                        .metric_store
-                        .get_connected_hosts(Duration::from_secs(30));
-                    tui_app.update_hosts(connected_hosts);
                     tui_app.update_metrics(&self.metric_store);
                 }
             }
@@ -269,6 +273,20 @@ impl Dashboard {
                 last_metrics_check = Instant::now();
             }
+            // Check for host connectivity changes (heartbeat timeouts) periodically
+            if last_heartbeat_check.elapsed() >= heartbeat_check_interval {
+                let timeout = Duration::from_secs(self.config.zmq.heartbeat_timeout_seconds);
+                // Clean up metrics for offline hosts
+                self.metric_store.cleanup_offline_hosts(timeout);
+                if let Some(ref mut tui_app) = self.tui_app {
+                    let connected_hosts = self.metric_store.get_connected_hosts(timeout);
+                    tui_app.update_hosts(connected_hosts);
+                }
+                last_heartbeat_check = Instant::now();
+            }
             // Render TUI (only if not headless)
             if !self.headless {
                 if let Some(ref mut terminal) = self.terminal {

View File

@@ -84,13 +84,13 @@ impl ZmqConsumer {
         }
     }
-    /// Connect to predefined hosts
-    pub async fn connect_to_predefined_hosts(&mut self, hosts: &[String]) -> Result<()> {
+    /// Connect to predefined hosts using their configuration
+    pub async fn connect_to_predefined_hosts(&mut self, hosts: &std::collections::HashMap<String, crate::config::HostDetails>) -> Result<()> {
         let default_port = self.config.subscriber_ports[0];
-        for hostname in hosts {
-            // Try to connect, but don't fail if some hosts are unreachable
-            if let Err(e) = self.connect_to_host(hostname, default_port).await {
+        for (hostname, host_details) in hosts {
+            // Try to connect using configured IP, but don't fail if some hosts are unreachable
+            if let Err(e) = self.connect_to_host_with_details(hostname, host_details, default_port).await {
                 warn!("Could not connect to {}: {}", hostname, e);
             }
         }
@@ -104,6 +104,29 @@ impl ZmqConsumer {
         Ok(())
     }
+    /// Connect to a host using its configuration details with fallback support
+    pub async fn connect_to_host_with_details(&mut self, hostname: &str, host_details: &crate::config::HostDetails, port: u16) -> Result<()> {
+        // Get primary connection IP
+        let primary_ip = host_details.get_connection_ip(hostname);
+        // Try primary connection
+        if let Ok(()) = self.connect_to_host(&primary_ip, port).await {
+            info!("Connected to {} via primary address: {}", hostname, primary_ip);
+            return Ok(());
+        }
+        // Try fallback IPs if primary fails
+        let fallbacks = host_details.get_fallback_ips(hostname);
+        for fallback_ip in fallbacks {
+            if let Ok(()) = self.connect_to_host(&fallback_ip, port).await {
+                info!("Connected to {} via fallback address: {}", hostname, fallback_ip);
+                return Ok(());
+            }
+        }
+        Err(anyhow::anyhow!("Failed to connect to {} using all available addresses", hostname))
+    }
     /// Receive command output from any connected agent (non-blocking)
     pub async fn receive_command_output(&mut self) -> Result<Option<CommandOutputMessage>> {
         match self.subscriber.recv_bytes(zmq::DONTWAIT) {

View File

@@ -16,12 +16,90 @@ pub struct DashboardConfig {
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct ZmqConfig {
     pub subscriber_ports: Vec<u16>,
+    /// Heartbeat timeout in seconds - hosts considered offline if no heartbeat received within this time
+    #[serde(default = "default_heartbeat_timeout_seconds")]
+    pub heartbeat_timeout_seconds: u64,
+}
+fn default_heartbeat_timeout_seconds() -> u64 {
+    10 // Default to 10 seconds - allows for multiple missed heartbeats
 }
 /// Individual host configuration details
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct HostDetails {
     pub mac_address: Option<String>,
+    /// Primary IP address (local network)
+    pub ip: Option<String>,
+    /// Tailscale network IP address
+    pub tailscale_ip: Option<String>,
+    /// Preferred connection type: "local", "tailscale", or "auto" (fallback)
+    #[serde(default = "default_connection_type")]
+    pub connection_type: String,
+}
+fn default_connection_type() -> String {
+    "auto".to_string()
+}
+impl HostDetails {
+    /// Get the preferred IP address for connection based on connection_type
+    pub fn get_connection_ip(&self, hostname: &str) -> String {
+        match self.connection_type.as_str() {
+            "tailscale" => {
+                if let Some(ref ts_ip) = self.tailscale_ip {
+                    ts_ip.clone()
+                } else {
+                    // Fallback to local IP or hostname
+                    self.ip.as_ref().unwrap_or(&hostname.to_string()).clone()
+                }
+            }
+            "local" => {
+                if let Some(ref local_ip) = self.ip {
+                    local_ip.clone()
+                } else {
+                    hostname.to_string()
+                }
+            }
+            "auto" | _ => {
+                // Try tailscale first, then local, then hostname
+                if let Some(ref ts_ip) = self.tailscale_ip {
+                    ts_ip.clone()
+                } else if let Some(ref local_ip) = self.ip {
+                    local_ip.clone()
+                } else {
+                    hostname.to_string()
+                }
+            }
+        }
+    }
+    /// Get fallback IP addresses for connection retry
+    pub fn get_fallback_ips(&self, hostname: &str) -> Vec<String> {
+        let mut fallbacks = Vec::new();
+        // Add all available IPs except the primary one
+        let primary = self.get_connection_ip(hostname);
+        if let Some(ref ts_ip) = self.tailscale_ip {
+            if ts_ip != &primary {
+                fallbacks.push(ts_ip.clone());
+            }
+        }
+        if let Some(ref local_ip) = self.ip {
+            if local_ip != &primary {
+                fallbacks.push(local_ip.clone());
+            }
+        }
+        // Always include hostname as final fallback if not already primary
+        if hostname != primary {
+            fallbacks.push(hostname.to_string());
+        }
+        fallbacks
+    }
 }
 /// System configuration

View File

@@ -11,8 +11,8 @@ pub struct MetricStore {
     current_metrics: HashMap<String, HashMap<String, Metric>>,
     /// Historical metrics for trending
     historical_metrics: HashMap<String, Vec<MetricDataPoint>>,
-    /// Last update timestamp per host
-    last_update: HashMap<String, Instant>,
+    /// Last heartbeat timestamp per host
+    last_heartbeat: HashMap<String, Instant>,
     /// Configuration
     max_metrics_per_host: usize,
     history_retention: Duration,
@@ -23,7 +23,7 @@ impl MetricStore {
         Self {
             current_metrics: HashMap::new(),
             historical_metrics: HashMap::new(),
-            last_update: HashMap::new(),
+            last_heartbeat: HashMap::new(),
             max_metrics_per_host,
             history_retention: Duration::from_secs(history_retention_hours * 3600),
         }
@@ -56,10 +56,13 @@ impl MetricStore {
             // Add to history
             host_history.push(MetricDataPoint { received_at: now });
-        }
-        // Update last update timestamp
-        self.last_update.insert(hostname.to_string(), now);
+            // Track heartbeat metrics for connectivity detection
+            if metric_name == "agent_heartbeat" {
+                self.last_heartbeat.insert(hostname.to_string(), now);
+                debug!("Updated heartbeat for host {}", hostname);
+            }
+        }
         // Get metrics count before cleanup
         let metrics_count = host_metrics.len();
@@ -88,22 +91,46 @@ impl MetricStore {
         }
     }
-    /// Get connected hosts (hosts with recent updates)
+    /// Get connected hosts (hosts with recent heartbeats)
     pub fn get_connected_hosts(&self, timeout: Duration) -> Vec<String> {
         let now = Instant::now();
-        self.last_update
+        self.last_heartbeat
             .iter()
-            .filter_map(|(hostname, &last_update)| {
-                if now.duration_since(last_update) <= timeout {
+            .filter_map(|(hostname, &last_heartbeat)| {
+                if now.duration_since(last_heartbeat) <= timeout {
                     Some(hostname.clone())
                 } else {
+                    debug!("Host {} considered offline - last heartbeat was {:?} ago",
+                           hostname, now.duration_since(last_heartbeat));
                     None
                 }
             })
             .collect()
     }
+    /// Clean up data for offline hosts
+    pub fn cleanup_offline_hosts(&mut self, timeout: Duration) {
+        let now = Instant::now();
+        let mut hosts_to_cleanup = Vec::new();
+        // Find hosts that are offline (no recent heartbeat)
+        for (hostname, &last_heartbeat) in &self.last_heartbeat {
+            if now.duration_since(last_heartbeat) > timeout {
+                hosts_to_cleanup.push(hostname.clone());
+            }
+        }
+        // Clear metrics for offline hosts
+        for hostname in hosts_to_cleanup {
+            if let Some(metrics) = self.current_metrics.remove(&hostname) {
+                info!("Cleared {} metrics for offline host: {}", metrics.len(), hostname);
+            }
+            // Keep heartbeat timestamp for reconnection detection
+            // Don't remove from last_heartbeat to track when host was last seen
+        }
+    }
     /// Cleanup old data and enforce limits
     fn cleanup_host_data(&mut self, hostname: &str) {
         let now = Instant::now();

View File

@@ -90,10 +90,13 @@ pub struct TuiApp {
user_navigated_away: bool, user_navigated_away: bool,
/// Dashboard configuration /// Dashboard configuration
config: DashboardConfig, config: DashboardConfig,
/// Cached localhost hostname to avoid repeated system calls
localhost: String,
} }
impl TuiApp { impl TuiApp {
pub fn new(config: DashboardConfig) -> Self { pub fn new(config: DashboardConfig) -> Self {
let localhost = gethostname::gethostname().to_string_lossy().to_string();
let mut app = Self { let mut app = Self {
host_widgets: HashMap::new(), host_widgets: HashMap::new(),
current_host: None, current_host: None,
@@ -102,6 +105,7 @@ impl TuiApp {
should_quit: false, should_quit: false,
user_navigated_away: false, user_navigated_away: false,
config, config,
localhost,
}; };
// Sort predefined hosts // Sort predefined hosts
@@ -131,31 +135,31 @@ impl TuiApp {
// Only update widgets if we have metrics for this host // Only update widgets if we have metrics for this host
let all_metrics = metric_store.get_metrics_for_host(&hostname); let all_metrics = metric_store.get_metrics_for_host(&hostname);
if !all_metrics.is_empty() { if !all_metrics.is_empty() {
// Get metrics first while hostname is borrowed // Single pass metric categorization for better performance
let cpu_metrics: Vec<&Metric> = all_metrics let mut cpu_metrics = Vec::new();
.iter() let mut memory_metrics = Vec::new();
.filter(|m| { let mut service_metrics = Vec::new();
m.name.starts_with("cpu_") let mut backup_metrics = Vec::new();
|| m.name.contains("c_state_") let mut nixos_metrics = Vec::new();
|| m.name.starts_with("process_top_") let mut disk_metrics = Vec::new();
})
.copied() for metric in all_metrics {
.collect(); if metric.name.starts_with("cpu_")
let memory_metrics: Vec<&Metric> = all_metrics || metric.name.contains("c_state_")
.iter() || metric.name.starts_with("process_top_") {
.filter(|m| m.name.starts_with("memory_") || m.name.starts_with("disk_tmp_")) cpu_metrics.push(metric);
.copied() } else if metric.name.starts_with("memory_") || metric.name.starts_with("disk_tmp_") {
.collect(); memory_metrics.push(metric);
let service_metrics: Vec<&Metric> = all_metrics } else if metric.name.starts_with("service_") {
.iter() service_metrics.push(metric);
.filter(|m| m.name.starts_with("service_")) } else if metric.name.starts_with("backup_") {
.copied() backup_metrics.push(metric);
.collect(); } else if metric.name == "system_nixos_build" || metric.name == "system_active_users" || metric.name == "agent_version" {
let all_backup_metrics: Vec<&Metric> = all_metrics nixos_metrics.push(metric);
.iter() } else if metric.name.starts_with("disk_") {
.filter(|m| m.name.starts_with("backup_")) disk_metrics.push(metric);
.copied() }
.collect(); }
// Clear completed transitions first // Clear completed transitions first
self.clear_completed_transitions(&hostname, &service_metrics); self.clear_completed_transitions(&hostname, &service_metrics);
@@ -166,21 +170,7 @@ impl TuiApp {
         // Collect all system metrics (CPU, memory, NixOS, disk/storage)
         let mut system_metrics = cpu_metrics;
         system_metrics.extend(memory_metrics);
-        // Add NixOS metrics - using exact matching for build display fix
-        let nixos_metrics: Vec<&Metric> = all_metrics
-            .iter()
-            .filter(|m| m.name == "system_nixos_build" || m.name == "system_active_users" || m.name == "agent_version")
-            .copied()
-            .collect();
         system_metrics.extend(nixos_metrics);
-        // Add disk/storage metrics
-        let disk_metrics: Vec<&Metric> = all_metrics
-            .iter()
-            .filter(|m| m.name.starts_with("disk_"))
-            .copied()
-            .collect();
         system_metrics.extend(disk_metrics);
         host_widgets.system_widget.update_from_metrics(&system_metrics);
@@ -189,7 +179,7 @@ impl TuiApp {
             .update_from_metrics(&service_metrics);
         host_widgets
             .backup_widget
-            .update_from_metrics(&all_backup_metrics);
+            .update_from_metrics(&backup_metrics);
         host_widgets.last_update = Some(Instant::now());
     }
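The loop introduced above replaces several independent `filter` passes with a single `if`/`else if` chain, so each metric lands in at most one bucket and branch order matters. Under the old filters a `disk_tmp_` metric matched both the memory filter and the general `disk_` filter; in the chain it is claimed by the memory branch first. A minimal sketch of that ordering, assuming a hypothetical `Category` enum and `categorize` helper that are not part of the repository:

```rust
/// Hypothetical bucket labels; the real code pushes into separate Vecs instead.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Cpu,
    Memory,
    Service,
    Backup,
    NixOs,
    Disk,
}

/// Mirrors the branch order of the loop in the diff: the first matching branch wins.
pub fn categorize(name: &str) -> Option<Category> {
    if name.starts_with("cpu_") || name.contains("c_state_") || name.starts_with("process_top_") {
        Some(Category::Cpu)
    } else if name.starts_with("memory_") || name.starts_with("disk_tmp_") {
        // Checked before the general "disk_" prefix on purpose.
        Some(Category::Memory)
    } else if name.starts_with("service_") {
        Some(Category::Service)
    } else if name.starts_with("backup_") {
        Some(Category::Backup)
    } else if matches!(name, "system_nixos_build" | "system_active_users" | "agent_version") {
        Some(Category::NixOs)
    } else if name.starts_with("disk_") {
        Some(Category::Disk)
    } else {
        // Metrics matching no branch are ignored, as in the loop above.
        None
    }
}
```

Moving the `disk_` branch above the `disk_tmp_` check would silently reroute tmpfs metrics to the disk bucket, which is the main thing to watch when extending the chain.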
@@ -221,13 +211,12 @@ impl TuiApp {
         self.available_hosts = all_hosts;
         // Get the current hostname (localhost) for auto-selection
-        let localhost = gethostname::gethostname().to_string_lossy().to_string();
         if !self.available_hosts.is_empty() {
-            if self.available_hosts.contains(&localhost) && !self.user_navigated_away {
+            if self.available_hosts.contains(&self.localhost) && !self.user_navigated_away {
                 // Localhost is available and user hasn't navigated away - switch to it
-                self.current_host = Some(localhost.clone());
+                self.current_host = Some(self.localhost.clone());
                 // Find the actual index of localhost in the sorted list
-                self.host_index = self.available_hosts.iter().position(|h| h == &localhost).unwrap_or(0);
+                self.host_index = self.available_hosts.iter().position(|h| h == &self.localhost).unwrap_or(0);
             } else if self.current_host.is_none() {
                 // No current host - select first available (which is localhost if available)
                 self.current_host = Some(self.available_hosts[0].clone());
@@ -262,12 +251,14 @@ impl TuiApp {
             KeyCode::Char('r') => {
                 // System rebuild command - works on any panel for current host
                 if let Some(hostname) = self.current_host.clone() {
+                    let connection_ip = self.get_connection_ip(&hostname);
                     // Create command that shows logo, rebuilds, and waits for user input
                     let logo_and_rebuild = format!(
-                        "bash -c 'cat << \"EOF\"\nNixOS System Rebuild\nTarget: {}\n\nEOF\nssh -tt {}@{} \"bash -ic {}\"\necho\necho \"========================================\"\necho \"Rebuild completed. Press any key to close...\"\necho \"========================================\"\nread -n 1 -s\nexit'",
+                        "bash -c 'cat << \"EOF\"\nNixOS System Rebuild\nTarget: {} ({})\n\nEOF\nssh -tt {}@{} \"bash -ic {}\"\necho\necho \"========================================\"\necho \"Rebuild completed. Press any key to close...\"\necho \"========================================\"\nread -n 1 -s\nexit'",
                         hostname,
+                        connection_ip,
                         self.config.ssh.rebuild_user,
-                        hostname,
+                        connection_ip,
                         self.config.ssh.rebuild_alias
                     );
@@ -300,10 +291,11 @@ impl TuiApp {
             KeyCode::Char('J') => {
                 // Show service logs via journalctl in tmux split window
                 if let (Some(service_name), Some(hostname)) = (self.get_selected_service(), self.current_host.clone()) {
+                    let connection_ip = self.get_connection_ip(&hostname);
                     let journalctl_command = format!(
                         "bash -c \"ssh -tt {}@{} 'sudo journalctl -u {}.service -f --no-pager -n 50'; exit\"",
                         self.config.ssh.rebuild_user,
-                        hostname,
+                        connection_ip,
                         service_name
                     );
@@ -323,10 +315,11 @@ impl TuiApp {
                     // Check if this service has a custom log file configured
                     if let Some(host_logs) = self.config.service_logs.get(&hostname) {
                         if let Some(log_config) = host_logs.iter().find(|config| config.service_name == service_name) {
+                            let connection_ip = self.get_connection_ip(&hostname);
                             let tail_command = format!(
                                 "bash -c \"ssh -tt {}@{} 'sudo tail -n 50 -f {}'; exit\"",
                                 self.config.ssh.rebuild_user,
-                                hostname,
+                                connection_ip,
                                 log_config.log_file_path
                             );
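The three SSH hunks above swap `hostname` for the value returned by `get_connection_ip`, which (per its definition further down in this diff) falls back to the hostname itself when the host has no entry in the configuration. The real `HostDetails::get_connection_ip` is not shown in this diff, so the sketch below assumes a hypothetical `ip: Option<String>` field purely to illustrate the two fallback levels:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the project's host configuration entry.
pub struct HostDetails {
    /// Assumed field; the real struct reportedly carries local and Tailscale addresses.
    pub ip: Option<String>,
}

impl HostDetails {
    /// Assumed behavior: use the configured address, else the hostname.
    pub fn get_connection_ip(&self, hostname: &str) -> String {
        self.ip.clone().unwrap_or_else(|| hostname.to_string())
    }
}

/// Same shape as `TuiApp::get_connection_ip` in the diff, written as a free function.
pub fn connection_ip(hosts: &HashMap<String, HostDetails>, hostname: &str) -> String {
    match hosts.get(hostname) {
        Some(details) => details.get_connection_ip(hostname),
        // Unknown host: let SSH resolve the hostname as before the change.
        None => hostname.to_string(),
    }
}
```

Because the fallback is the plain hostname, hosts without any configured address keep the pre-change behavior.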
@@ -376,6 +369,26 @@ impl TuiApp {
                     }
                 }
             }
+            KeyCode::Char('t') => {
+                // Open SSH terminal session in tmux window
+                if let Some(hostname) = self.current_host.clone() {
+                    let connection_ip = self.get_connection_ip(&hostname);
+                    let ssh_command = format!(
+                        "ssh -tt {}@{}",
+                        self.config.ssh.rebuild_user,
+                        connection_ip
+                    );
+                    std::process::Command::new("tmux")
+                        .arg("split-window")
+                        .arg("-v")
+                        .arg("-p")
+                        .arg("30") // Use 30% like other commands
+                        .arg(&ssh_command)
+                        .spawn()
+                        .ok(); // Ignore errors, tmux will handle them
+                }
+            }
             KeyCode::Tab => {
                 // Tab cycles to next host
                 self.navigate_host(1);
@@ -424,9 +437,8 @@ impl TuiApp {
         self.current_host = Some(self.available_hosts[self.host_index].clone());
         // Check if user navigated away from localhost
-        let localhost = gethostname::gethostname().to_string_lossy().to_string();
         if let Some(ref current) = self.current_host {
-            if current != &localhost {
+            if current != &self.localhost {
                 self.user_navigated_away = true;
             } else {
                 self.user_navigated_away = false; // User navigated back to localhost
@@ -570,6 +582,21 @@ impl TuiApp {
             ])
             .split(main_chunks[1]); // main_chunks[1] is now the content area (between title and statusbar)
+        // Check if current host is offline
+        let current_host_offline = if let Some(hostname) = self.current_host.clone() {
+            self.calculate_host_status(&hostname, metric_store) == Status::Offline
+        } else {
+            true // No host selected is considered offline
+        };
+        // If host is offline, render wake-up message instead of panels
+        if current_host_offline {
+            self.render_offline_host_message(frame, main_chunks[1]);
+            self.render_btop_title(frame, main_chunks[0], metric_store);
+            self.render_statusbar(frame, main_chunks[2]);
+            return;
+        }
         // Check if backup panel should be shown
         let show_backup = if let Some(hostname) = self.current_host.clone() {
             let host_widgets = self.get_or_create_host_widgets(&hostname);
@@ -637,11 +664,14 @@ impl TuiApp {
             return;
         }
-        // Calculate worst-case status across all hosts
+        // Calculate worst-case status across all hosts (excluding offline)
         let mut worst_status = Status::Ok;
         for host in &self.available_hosts {
             let host_status = self.calculate_host_status(host, metric_store);
-            worst_status = Status::aggregate(&[worst_status, host_status]);
+            // Don't include offline hosts in status aggregation
+            if host_status != Status::Offline {
+                worst_status = Status::aggregate(&[worst_status, host_status]);
+            }
         }
         // Use the worst status color as background
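The change above keeps offline hosts from dominating the title-bar color: they are skipped before aggregation, so the result reflects only hosts that are actually reporting. A minimal sketch of the same idea, assuming a hypothetical four-variant `Status` ordered by severity and using `Iterator::max` in place of the project's `Status::aggregate`, whose implementation is not part of this diff:

```rust
/// Hypothetical status type; only `Ok` and `Offline` are confirmed by the diff.
/// Variant order doubles as severity order through the derived `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Ok,
    Warning,
    Critical,
    Offline,
}

/// Worst status among hosts that are not offline; `Ok` if none remain.
pub fn worst_online_status(statuses: &[Status]) -> Status {
    statuses
        .iter()
        .copied()
        .filter(|s| *s != Status::Offline) // skip offline hosts, as in the diff
        .max()
        .unwrap_or(Status::Ok)
}
```

One consequence worth noting: when every host is offline the aggregate stays `Ok`, which matches the loop in the diff since `worst_status` starts at `Status::Ok`.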
@@ -799,8 +829,10 @@ impl TuiApp {
                 let host_widgets = self.get_or_create_host_widgets(&hostname);
                 host_widgets.system_scroll_offset
             };
+            // Clone the config to avoid borrowing issues
+            let config = self.config.clone();
             let host_widgets = self.get_or_create_host_widgets(&hostname);
-            host_widgets.system_widget.render_with_scroll(frame, inner_area, scroll_offset, &hostname);
+            host_widgets.system_widget.render_with_scroll(frame, inner_area, scroll_offset, &hostname, Some(&config));
         }
     }
@@ -820,7 +852,87 @@ impl TuiApp {
         }
     }
+    /// Render offline host message with wake-up option
+    fn render_offline_host_message(&self, frame: &mut Frame, area: Rect) {
+        use ratatui::layout::Alignment;
+        use ratatui::style::Modifier;
+        use ratatui::text::{Line, Span};
+        use ratatui::widgets::{Block, Borders, Paragraph};
+        // Get hostname for message
+        let hostname = self.current_host.as_ref()
+            .map(|h| h.as_str())
+            .unwrap_or("Unknown");
+        // Check if host has MAC address for wake-on-LAN
+        let has_mac = self.current_host.as_ref()
+            .and_then(|hostname| self.config.hosts.get(hostname))
+            .and_then(|details| details.mac_address.as_ref())
+            .is_some();
+        // Create message content
+        let mut lines = vec![
+            Line::from(Span::styled(
+                format!("Host '{}' is offline", hostname),
+                Style::default().fg(Theme::muted_text()).add_modifier(Modifier::BOLD),
+            )),
+            Line::from(""),
+        ];
+        if has_mac {
+            lines.push(Line::from(Span::styled(
+                "Press 'w' to wake up host",
+                Style::default().fg(Theme::primary_text()).add_modifier(Modifier::BOLD),
+            )));
+        } else {
+            lines.push(Line::from(Span::styled(
+                "No MAC address configured - cannot wake up",
+                Style::default().fg(Theme::muted_text()),
+            )));
+        }
+        // Create centered message
+        let message = Paragraph::new(lines)
+            .block(Block::default()
+                .borders(Borders::ALL)
+                .border_style(Style::default().fg(Theme::muted_text()))
+                .title(" Offline Host ")
+                .title_style(Style::default().fg(Theme::muted_text()).add_modifier(Modifier::BOLD)))
+            .style(Style::default().bg(Theme::background()).fg(Theme::primary_text()))
+            .alignment(Alignment::Center);
+        // Center the message in the available area
+        let popup_area = ratatui::layout::Layout::default()
+            .direction(Direction::Vertical)
+            .constraints([
+                Constraint::Percentage(40),
+                Constraint::Length(6),
+                Constraint::Percentage(40),
+            ])
+            .split(area)[1];
+        let popup_area = ratatui::layout::Layout::default()
+            .direction(Direction::Horizontal)
+            .constraints([
+                Constraint::Percentage(25),
+                Constraint::Percentage(50),
+                Constraint::Percentage(25),
+            ])
+            .split(popup_area)[1];
+        frame.render_widget(message, popup_area);
+    }
     /// Parse MAC address string (e.g., "AA:BB:CC:DD:EE:FF") to [u8; 6]
+    /// Get the connection IP for a hostname based on host configuration
+    fn get_connection_ip(&self, hostname: &str) -> String {
+        if let Some(host_details) = self.config.hosts.get(hostname) {
+            host_details.get_connection_ip(hostname)
+        } else {
+            hostname.to_string()
+        }
+    }
     fn parse_mac_address(mac_str: &str) -> Result<[u8; 6], &'static str> {
         let parts: Vec<&str> = mac_str.split(':').collect();
         if parts.len() != 6 {
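The hunk ends before the rest of `parse_mac_address` is visible. One way such a parser could continue is sketched below; this is not the project's actual code, and the function name `parse_mac_address_sketch`, the error strings, and the strict two-hex-digit check are all assumptions made for illustration:

```rust
/// Hypothetical completion of a colon-separated MAC parser.
/// Accepts exactly six two-digit hexadecimal groups, e.g. "AA:BB:CC:DD:EE:FF".
pub fn parse_mac_address_sketch(mac_str: &str) -> Result<[u8; 6], &'static str> {
    let parts: Vec<&str> = mac_str.split(':').collect();
    if parts.len() != 6 {
        return Err("MAC address must have 6 colon-separated parts");
    }
    let mut mac = [0u8; 6];
    for (i, part) in parts.iter().enumerate() {
        // Reject groups like "F", "FFF", or "+F" before converting.
        if part.len() != 2 || !part.chars().all(|c| c.is_ascii_hexdigit()) {
            return Err("each MAC address part must be two hex digits");
        }
        mac[i] = u8::from_str_radix(part, 16).map_err(|_| "invalid hex byte in MAC address")?;
    }
    Ok(mac)
}
```

The explicit digit check matters because `u8::from_str_radix` on its own also accepts a leading `+` sign.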


@@ -439,7 +439,7 @@ impl Widget for SystemWidget {
 impl SystemWidget {
     /// Render with scroll offset support
-    pub fn render_with_scroll(&mut self, frame: &mut Frame, area: Rect, scroll_offset: usize, hostname: &str) {
+    pub fn render_with_scroll(&mut self, frame: &mut Frame, area: Rect, scroll_offset: usize, hostname: &str, config: Option<&crate::config::DashboardConfig>) {
         let mut lines = Vec::new();
         // NixOS section
@@ -457,6 +457,16 @@ impl SystemWidget {
             Span::styled(format!("Agent: {}", agent_version_text), Typography::secondary())
         ]));
+        // Display detected connection IP
+        if let Some(config) = config {
+            if let Some(host_details) = config.hosts.get(hostname) {
+                let detected_ip = host_details.get_connection_ip(hostname);
+                lines.push(Line::from(vec![
+                    Span::styled(format!("IP: {}", detected_ip), Typography::secondary())
+                ]));
+            }
+        }
         // CPU section
         lines.push(Line::from(vec![


@@ -1,6 +1,6 @@
 [package]
 name = "cm-dashboard-shared"
-version = "0.1.54"
+version = "0.1.68"
 edition = "2021"
 [dependencies]