Compare commits

...

9 Commits

Author SHA1 Message Date
2d080a2f51 Implement WakeOnLAN functionality and offline status handling
All checks were successful
Build and Release / build-and-release (push) Successful in 1m35s
- Add WakeOnLAN support for offline hosts using 'w' key
- Configure MAC addresses for all infrastructure hosts
- Implement Status::Offline for disconnected hosts
- Exclude offline hosts from status aggregation to prevent false alerts
- Update versions to 0.1.55
2025-10-31 09:28:31 +01:00
6179bd51a7 Implement WakeOnLAN functionality with simplified configuration
All checks were successful
Build and Release / build-and-release (push) Successful in 2m32s
- Add Status::Offline enum variant for disconnected hosts
- All configured hosts now always visible showing offline status when disconnected
- Add WakeOnLAN support using wake-on-lan Rust crate
- Implement w key binding to wake offline hosts with MAC addresses
- Simplify configuration to single [hosts] section with MAC addresses only
- Change critical status icon from ◯ to ! for better visibility
- Add proper MAC address parsing and error handling
- Silent WakeOnLAN operation with logging for success/failure

Configuration format:
[hosts]
hostname = { mac_address = "AA:BB:CC:DD:EE:FF" }
2025-10-31 09:03:01 +01:00
57de4c366a Bump version to 0.1.53
All checks were successful
Build and Release / build-and-release (push) Successful in 2m10s
2025-10-30 17:00:39 +01:00
e18778e962 Fix string syntax error in rebuild command
- Replace raw string with escaped string to fix compilation error
- Maintain same functionality with proper string formatting
2025-10-30 16:59:41 +01:00
e4469a0ebf Replace tmux popups with split windows for better log navigation
Some checks failed
Build and Release / build-and-release (push) Failing after 1m9s
- Change J/L log commands from popups to split windows for scrolling support
- Change rebuild command from popup to split window with consistent 30% height
- Add auto-close behavior with bash -c "command; exit" wrapper for logs
- Add "press any key to close" prompt with visual separators for rebuild
- Enable proper tmux copy mode and navigation in all split windows

Users can now scroll through logs, copy text, and resize windows while
maintaining clean auto-close behavior for all operations.
2025-10-30 15:30:58 +01:00
6fedf4c7fc Add sudo support and line count to log viewing commands
All checks were successful
Build and Release / build-and-release (push) Successful in 1m12s
- Add sudo to journalctl command for proper systemd log access
- Add sudo to tail command for system log file access
- Add -n 50 to tail command to match journalctl behavior
- Both J and L keys now show last 50 lines before following

Ensures consistent behavior and proper permissions for all log viewing.
2025-10-30 13:26:04 +01:00
3f6dffa66e Add custom service log file support with L key
All checks were successful
Build and Release / build-and-release (push) Successful in 2m7s
- Add ServiceLogConfig structure for per-host service log paths
- Implement L key handler for custom log file viewing via tmux popup
- Update dashboard config to support service_logs HashMap
- Add tail -f command execution over SSH for real-time log streaming
- Update status line to show L: Custom shortcut
- Document configuration format in CLAUDE.md

Each service can now have custom log file paths configured per host,
accessible via L key with same tmux popup interface as journalctl.
2025-10-30 13:12:36 +01:00
1b64fbde3d Fix tmux popup title flag for service logs feature
All checks were successful
Build and Release / build-and-release (push) Successful in 1m47s
Fix journalctl popup that was failing with 'can't find session' error:

Issue Resolution:
- Change tmux display-popup flag from -t to -T for setting popup title
- -t flag was incorrectly trying to target a session named 'Logs: servicename'
- -T flag correctly sets the popup window title

The J key (Shift+j) service logs feature now works properly, opening
an 80% tmux popup with journalctl -f for real-time log viewing.

Bump version to v0.1.49
2025-10-30 12:42:58 +01:00
4f4c3b0d6e Improve notification behavior during startup and recovery
All checks were successful
Build and Release / build-and-release (push) Successful in 2m9s
Fix notification issues for better operational experience:

Startup Notification Suppression:
- Suppress notifications for transitions from Status::Unknown during agent/server startup
- Prevents notification spam when services transition from Unknown to Warning/Critical on restart
- Only real status changes (not initial discovery) trigger notifications
- Maintains alerting for actual service state changes after startup

Recovery Notification Refinement:
- Recovery notifications only sent when ALL services reach OK status
- Individual service recoveries suppressed if other services still have problems
- Ensures recovery notifications indicate complete system health restoration
- Prevents premature celebration when partial recoveries occur

Result: Clean startup experience without false alerts and meaningful recovery
notifications that truly indicate full system health restoration.

Bump version to v0.1.48
2025-10-30 12:35:23 +01:00
13 changed files with 186 additions and 57 deletions

View File

@@ -20,12 +20,28 @@ A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure.
- Persistent storage survives agent restarts
- Automatic flag clearing when services are restarted via dashboard
### Custom Service Logs
- Configure service-specific log file paths per host in dashboard config
- Press `L` on any service to view custom log files via `tail -f`
- Configuration format in dashboard config:
```toml
[service_logs]
hostname1 = [
{ service_name = "nginx", log_file_path = "/var/log/nginx/access.log" },
{ service_name = "app", log_file_path = "/var/log/myapp/app.log" }
]
hostname2 = [
{ service_name = "database", log_file_path = "/var/log/postgres/postgres.log" }
]
```
### Service Management
- **Direct Control**: Arrow keys (↑↓) or vim keys (j/k) navigate services
- **Service Actions**:
- `s` - Start service (sends UserStart command)
- `S` - Stop service (sends UserStop command)
- `J` - Show service logs (journalctl in tmux popup)
- `L` - Show custom log files (tail -f custom paths in tmux popup)
- `R` - Rebuild current host
- **Visual Status**: Green ● (active), Yellow ◐ (inactive), Red ◯ (failed)
- **Transitional Icons**: Blue arrows during operations
@@ -34,6 +50,7 @@ A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure.
- **Tab**: Switch between hosts
- **↑↓ or j/k**: Select services
- **J**: Show service logs (journalctl)
- **L**: Show custom log files
- **q**: Quit dashboard
## Core Architecture Principles

13
Cargo.lock generated
View File

@@ -270,7 +270,7 @@ checksum = "a1d728cc89cf3aee9ff92b05e62b19ee65a02b5702cff7d5a377e32c6ae29d8d"
[[package]]
name = "cm-dashboard"
version = "0.1.46"
version = "0.1.55"
dependencies = [
"anyhow",
"chrono",
@@ -286,12 +286,13 @@ dependencies = [
"toml",
"tracing",
"tracing-subscriber",
"wake-on-lan",
"zmq",
]
[[package]]
name = "cm-dashboard-agent"
version = "0.1.46"
version = "0.1.55"
dependencies = [
"anyhow",
"async-trait",
@@ -314,7 +315,7 @@ dependencies = [
[[package]]
name = "cm-dashboard-shared"
version = "0.1.46"
version = "0.1.55"
dependencies = [
"chrono",
"serde",
@@ -2064,6 +2065,12 @@ version = "0.9.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a"
[[package]]
name = "wake-on-lan"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1ccf60b60ad7e5b1b37372c5134cbcab4db0706c231d212e0c643a077462bc8f"
[[package]]
name = "walkdir"
version = "2.5.0"

View File

@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard-agent"
version = "0.1.47"
version = "0.1.55"
edition = "2021"
[dependencies]

View File

@@ -140,6 +140,7 @@ impl Collector for BackupCollector {
Status::Warning => "warning".to_string(),
Status::Critical => "critical".to_string(),
Status::Unknown => "unknown".to_string(),
Status::Offline => "offline".to_string(),
}),
status: overall_status,
timestamp,
@@ -202,6 +203,7 @@ impl Collector for BackupCollector {
Status::Warning => "warning".to_string(),
Status::Critical => "critical".to_string(),
Status::Unknown => "unknown".to_string(),
Status::Offline => "offline".to_string(),
}),
status: service_status,
timestamp,

View File

@@ -272,11 +272,13 @@ impl HostStatusManager {
/// Check if a status change is significant enough for notification
fn is_significant_change(&self, old_status: Status, new_status: Status) -> bool {
match (old_status, new_status) {
// Always notify on problems
// Don't notify on transitions from Unknown (startup/restart scenario)
(Status::Unknown, _) => false,
// Always notify on problems (but not from Unknown)
(_, Status::Warning) | (_, Status::Critical) => true,
// Only notify on recovery if it's from a problem state to OK and all services are OK
(Status::Warning | Status::Critical, Status::Ok) => self.current_host_status == Status::Ok,
// Don't notify on startup or other transitions
// Don't notify on other transitions
_ => false,
}
}
@@ -374,8 +376,8 @@ impl HostStatusManager {
details.push('\n');
}
// Show recoveries
if !recovery_changes.is_empty() {
// Show recoveries only if host status is now OK (all services recovered)
if !recovery_changes.is_empty() && aggregated.host_status_final == Status::Ok {
details.push_str(&format!("✅ RECOVERIES ({}):\n", recovery_changes.len()));
for change in recovery_changes {
details.push_str(&format!(" {}\n", change));

View File

@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard"
version = "0.1.47"
version = "0.1.55"
edition = "2021"
[dependencies]
@@ -18,4 +18,5 @@ tracing-subscriber = { workspace = true }
ratatui = { workspace = true }
crossterm = { workspace = true }
toml = { workspace = true }
gethostname = { workspace = true }
gethostname = { workspace = true }
wake-on-lan = "0.2"

View File

@@ -67,8 +67,8 @@ impl Dashboard {
}
};
// Connect to predefined hosts from configuration
let hosts = config.hosts.predefined_hosts.clone();
// Connect to configured hosts from configuration
let hosts: Vec<String> = config.hosts.keys().cloned().collect();
// Try to connect to hosts but don't fail if none are available
match zmq_consumer.connect_to_predefined_hosts(&hosts).await {

View File

@@ -6,9 +6,10 @@ use std::path::Path;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DashboardConfig {
pub zmq: ZmqConfig,
pub hosts: HostsConfig,
pub hosts: std::collections::HashMap<String, HostDetails>,
pub system: SystemConfig,
pub ssh: SshConfig,
pub service_logs: std::collections::HashMap<String, Vec<ServiceLogConfig>>,
}
/// ZMQ consumer configuration
@@ -17,10 +18,10 @@ pub struct ZmqConfig {
pub subscriber_ports: Vec<u16>,
}
/// Hosts configuration
/// Individual host configuration details
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct HostsConfig {
pub predefined_hosts: Vec<String>,
pub struct HostDetails {
pub mac_address: Option<String>,
}
/// System configuration
@@ -39,6 +40,13 @@ pub struct SshConfig {
pub rebuild_alias: String,
}
/// Service log file configuration per host
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ServiceLogConfig {
pub service_name: String,
pub log_file_path: String,
}
impl DashboardConfig {
pub fn load_from_file<P: AsRef<Path>>(path: P) -> Result<Self> {
let path = path.as_ref();
@@ -60,8 +68,3 @@ impl Default for ZmqConfig {
}
}
impl Default for HostsConfig {
fn default() -> Self {
panic!("Dashboard configuration must be loaded from file - no hardcoded defaults allowed")
}
}

View File

@@ -9,6 +9,7 @@ use ratatui::{
use std::collections::HashMap;
use std::time::Instant;
use tracing::info;
use wake_on_lan::MagicPacket;
pub mod theme;
pub mod widgets;
@@ -93,15 +94,25 @@ pub struct TuiApp {
impl TuiApp {
pub fn new(config: DashboardConfig) -> Self {
Self {
let mut app = Self {
host_widgets: HashMap::new(),
current_host: None,
available_hosts: Vec::new(),
available_hosts: config.hosts.keys().cloned().collect(),
host_index: 0,
should_quit: false,
user_navigated_away: false,
config,
};
// Sort predefined hosts
app.available_hosts.sort();
// Initialize with first host if available
if !app.available_hosts.is_empty() {
app.current_host = Some(app.available_hosts[0].clone());
}
app
}
/// Get or create host widgets for the given hostname
@@ -186,21 +197,28 @@ impl TuiApp {
}
/// Update available hosts with localhost prioritization
pub fn update_hosts(&mut self, hosts: Vec<String>) {
// Sort hosts alphabetically
let mut sorted_hosts = hosts.clone();
pub fn update_hosts(&mut self, discovered_hosts: Vec<String>) {
// Start with configured hosts (always visible)
let mut all_hosts: Vec<String> = self.config.hosts.keys().cloned().collect();
// Add any discovered hosts that aren't already configured
for host in discovered_hosts {
if !all_hosts.contains(&host) {
all_hosts.push(host);
}
}
// Keep hosts that have pending transitions even if they're offline
for (hostname, host_widgets) in &self.host_widgets {
if !host_widgets.pending_service_transitions.is_empty() {
if !sorted_hosts.contains(hostname) {
sorted_hosts.push(hostname.clone());
if !all_hosts.contains(hostname) {
all_hosts.push(hostname.clone());
}
}
}
sorted_hosts.sort();
self.available_hosts = sorted_hosts;
all_hosts.sort();
self.available_hosts = all_hosts;
// Get the current hostname (localhost) for auto-selection
let localhost = gethostname::gethostname().to_string_lossy().to_string();
@@ -244,14 +262,9 @@ impl TuiApp {
KeyCode::Char('r') => {
// System rebuild command - works on any panel for current host
if let Some(hostname) = self.current_host.clone() {
// Create command that shows CM Dashboard logo and then rebuilds
// Create command that shows logo, rebuilds, and waits for user input
let logo_and_rebuild = format!(
r"cat << 'EOF'
NixOS System Rebuild
Target: {}
EOF
ssh -tt {}@{} 'bash -ic {}'",
"bash -c 'cat << \"EOF\"\nNixOS System Rebuild\nTarget: {}\n\nEOF\nssh -tt {}@{} \"bash -ic {}\"\necho\necho \"========================================\"\necho \"Rebuild completed. Press any key to close...\"\necho \"========================================\"\nread -n 1 -s\nexit'",
hostname,
self.config.ssh.rebuild_user,
hostname,
@@ -259,11 +272,10 @@ ssh -tt {}@{} 'bash -ic {}'",
);
std::process::Command::new("tmux")
.arg("display-popup")
.arg("-w")
.arg("80%")
.arg("-h")
.arg("80%")
.arg("split-window")
.arg("-v")
.arg("-p")
.arg("30")
.arg(&logo_and_rebuild)
.spawn()
.ok(); // Ignore errors, tmux will handle them
@@ -286,28 +298,50 @@ ssh -tt {}@{} 'bash -ic {}'",
}
}
KeyCode::Char('J') => {
// Show service logs via journalctl in tmux popup
// Show service logs via journalctl in tmux split window
if let (Some(service_name), Some(hostname)) = (self.get_selected_service(), self.current_host.clone()) {
let journalctl_command = format!(
"ssh -tt {}@{} 'journalctl -u {}.service -f --no-pager -n 50'",
"bash -c \"ssh -tt {}@{} 'sudo journalctl -u {}.service -f --no-pager -n 50'; exit\"",
self.config.ssh.rebuild_user,
hostname,
service_name
);
std::process::Command::new("tmux")
.arg("display-popup")
.arg("-w")
.arg("80%")
.arg("-h")
.arg("80%")
.arg("-t")
.arg(format!("Logs: {}", service_name))
.arg("split-window")
.arg("-v")
.arg("-p")
.arg("30")
.arg(&journalctl_command)
.spawn()
.ok(); // Ignore errors, tmux will handle them
}
}
KeyCode::Char('L') => {
// Show custom service log file in tmux split window
if let (Some(service_name), Some(hostname)) = (self.get_selected_service(), self.current_host.clone()) {
// Check if this service has a custom log file configured
if let Some(host_logs) = self.config.service_logs.get(&hostname) {
if let Some(log_config) = host_logs.iter().find(|config| config.service_name == service_name) {
let tail_command = format!(
"bash -c \"ssh -tt {}@{} 'sudo tail -n 50 -f {}'; exit\"",
self.config.ssh.rebuild_user,
hostname,
log_config.log_file_path
);
std::process::Command::new("tmux")
.arg("split-window")
.arg("-v")
.arg("-p")
.arg("30")
.arg(&tail_command)
.spawn()
.ok(); // Ignore errors, tmux will handle them
}
}
}
}
KeyCode::Char('b') => {
// Trigger backup
if let Some(hostname) = self.current_host.clone() {
@@ -315,6 +349,33 @@ ssh -tt {}@{} 'bash -ic {}'",
return Ok(Some(UiCommand::TriggerBackup { hostname }));
}
}
KeyCode::Char('w') => {
// Wake on LAN for offline hosts
if let Some(hostname) = self.current_host.clone() {
// Check if host has MAC address configured
if let Some(host_details) = self.config.hosts.get(&hostname) {
if let Some(mac_address) = &host_details.mac_address {
// Parse MAC address and send WoL packet
let mac_bytes = Self::parse_mac_address(mac_address);
match mac_bytes {
Ok(mac) => {
match MagicPacket::new(&mac).send() {
Ok(_) => {
info!("WakeOnLAN packet sent successfully to {} ({})", hostname, mac_address);
}
Err(e) => {
tracing::error!("Failed to send WakeOnLAN packet to {}: {}", hostname, e);
}
}
}
Err(_) => {
tracing::error!("Invalid MAC address format for {}: {}", hostname, mac_address);
}
}
}
}
}
}
KeyCode::Tab => {
// Tab cycles to next host
self.navigate_host(1);
@@ -576,11 +637,14 @@ ssh -tt {}@{} 'bash -ic {}'",
return;
}
// Calculate worst-case status across all hosts
// Calculate worst-case status across all hosts (excluding offline)
let mut worst_status = Status::Ok;
for host in &self.available_hosts {
let host_status = self.calculate_host_status(host, metric_store);
worst_status = Status::aggregate(&[worst_status, host_status]);
// Don't include offline hosts in status aggregation
if host_status != Status::Offline {
worst_status = Status::aggregate(&[worst_status, host_status]);
}
}
// Use the worst status color as background
@@ -658,7 +722,7 @@ ssh -tt {}@{} 'bash -ic {}'",
let metrics = metric_store.get_metrics_for_host(hostname);
if metrics.is_empty() {
return Status::Unknown;
return Status::Offline;
}
// First check if we have the aggregated host status summary from the agent
@@ -678,7 +742,8 @@ ssh -tt {}@{} 'bash -ic {}'",
Status::Warning => has_warning = true,
Status::Pending => has_pending = true,
Status::Ok => ok_count += 1,
Status::Unknown => {} // Ignore unknown for aggregation
Status::Unknown => {}, // Ignore unknown for aggregation
Status::Offline => {}, // Ignore offline for aggregation
}
}
@@ -718,6 +783,8 @@ ssh -tt {}@{} 'bash -ic {}'",
shortcuts.push("r: Rebuild".to_string());
shortcuts.push("s/S: Start/Stop".to_string());
shortcuts.push("J: Logs".to_string());
shortcuts.push("L: Custom".to_string());
shortcuts.push("w: Wake".to_string());
// Always show quit
shortcuts.push("q: Quit".to_string());
@@ -756,5 +823,20 @@ ssh -tt {}@{} 'bash -ic {}'",
}
}
/// Parse MAC address string (e.g., "AA:BB:CC:DD:EE:FF") to [u8; 6]
fn parse_mac_address(mac_str: &str) -> Result<[u8; 6], &'static str> {
let parts: Vec<&str> = mac_str.split(':').collect();
if parts.len() != 6 {
return Err("MAC address must have 6 parts separated by colons");
}
let mut mac = [0u8; 6];
for (i, part) in parts.iter().enumerate() {
match u8::from_str_radix(part, 16) {
Ok(byte) => mac[i] = byte,
Err(_) => return Err("Invalid hexadecimal byte in MAC address"),
}
}
Ok(mac)
}
}

View File

@@ -147,6 +147,7 @@ impl Theme {
Status::Warning => Self::warning(),
Status::Critical => Self::error(),
Status::Unknown => Self::muted_text(),
Status::Offline => Self::muted_text(), // Dark gray for offline
}
}
@@ -244,8 +245,9 @@ impl StatusIcons {
Status::Ok => "",
Status::Pending => "", // Hollow circle for pending
Status::Warning => "",
Status::Critical => "",
Status::Critical => "!",
Status::Unknown => "?",
Status::Offline => "", // Empty circle for offline
}
}
@@ -258,6 +260,7 @@ impl StatusIcons {
Status::Warning => Theme::warning(), // Yellow
Status::Critical => Theme::error(), // Red
Status::Unknown => Theme::muted_text(), // Gray
Status::Offline => Theme::muted_text(), // Dark gray for offline
};
vec![

View File

@@ -146,6 +146,7 @@ impl ServicesWidget {
Status::Warning => Theme::warning(),
Status::Critical => Theme::error(),
Status::Unknown => Theme::muted_text(),
Status::Offline => Theme::muted_text(),
};
(icon.to_string(), info.status.clone(), status_color)

View File

@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard-shared"
version = "0.1.47"
version = "0.1.55"
edition = "2021"
[dependencies]

View File

@@ -87,6 +87,7 @@ pub enum Status {
Warning,
Critical,
Unknown,
Offline,
}
impl Status {
@@ -190,6 +191,16 @@ impl HysteresisThresholds {
Status::Ok
}
}
Status::Offline => {
// Host coming back online, use normal thresholds like first measurement
if value >= self.critical_high {
Status::Critical
} else if value >= self.warning_high {
Status::Warning
} else {
Status::Ok
}
}
}
}
}