Compare commits

..

4 Commits

Author SHA1 Message Date
4f4c3b0d6e Improve notification behavior during startup and recovery
All checks were successful
Build and Release / build-and-release (push) Successful in 2m9s
Fix notification issues for better operational experience:

Startup Notification Suppression:
- Suppress notifications for transitions from Status::Unknown during agent/server startup
- Prevents notification spam when services transition from Unknown to Warning/Critical on restart
- Only real status changes (not initial discovery) trigger notifications
- Maintains alerting for actual service state changes after startup

Recovery Notification Refinement:
- Recovery notifications only sent when ALL services reach OK status
- Individual service recoveries suppressed if other services still have problems
- Ensures recovery notifications indicate complete system health restoration
- Prevents premature celebration when partial recoveries occur

Result: Clean startup experience without false alerts and meaningful recovery
notifications that truly indicate full system health restoration.

Bump version to v0.1.48
2025-10-30 12:35:23 +01:00
bd20f0cae1 Fix user-stopped flag timing and service transition handling
All checks were successful
Build and Release / build-and-release (push) Successful in 2m9s
Correct user-stopped service behavior during startup transitions:

User-Stopped Flag Timing Fix:
- Clear user-stopped flag only when service actually becomes active, not when start command succeeds
- Remove premature flag clearing from service control handler
- Add automatic flag clearing when service status metrics show active state
- Services retain user-stopped status during activating/transitioning states

Service Transition Handling:
- User-stopped services in activating state now report Status::OK instead of Status::Pending
- Prevents host warnings during legitimate service startup transitions
- Maintains accurate status reporting throughout service lifecycle
- Failed service starts preserve user-stopped flags correctly

Journalctl Popup Fix:
- Fix terminal corruption when using J key for service logs
- Correct command quoting to prevent tmux popup interference
- Stable popup display without dashboard interface corruption

Result: Clean service startup experience with no false warnings and proper
user-stopped tracking throughout the entire service lifecycle.

Bump version to v0.1.47
2025-10-30 12:05:54 +01:00
11c9a5f9d2 Add service logs feature and improve tmux popup sizing
All checks were successful
Build and Release / build-and-release (push) Successful in 2m9s
New Features:
- Add journalctl service logs viewer via Shift+J key
- Opens tmux popup with real-time log streaming using journalctl -f
- Shows last 50 lines and follows new log entries for selected service
- Popup titled 'Logs: service.service' for clear context

Improvements:
- Increase tmux popup size to 80% width and height for better readability
- Applies to both rebuild (R) and logs (J) popups
- Compact status line text to fit new J: Logs shortcut
- Updated documentation with new key binding

Navigation Updates:
- J: Show service logs (journalctl in tmux popup)
- Status line: Tab: Host • ↑↓/jk: Select • r: Rebuild • s/S: Start/Stop • J: Logs • q: Quit

Bump version to v0.1.46
2025-10-30 11:21:14 +01:00
aeae60146d Fix user-stopped service display and flag timing issues
All checks were successful
Build and Release / build-and-release (push) Successful in 2m10s
Improve user-stopped service tracking behavior:

Service Display Fix:
- Services widget now shows actual systemctl status (active/inactive)
- Use info.status instead of hardcoded text based on widget_status
- User-stopped services correctly display 'inactive' with green OK icon
- Prevents misleading 'active' display for stopped services

User-Stopped Flag Timing Fix:
- Clear user-stopped flag AFTER successful service start, not when command sent
- Prevents warnings during service startup transition period
- Service remains Status::OK during 'activating' state for user-stopped services
- Flag only cleared when systemctl start command actually succeeds
- Failed start attempts preserve user-stopped flag

Result: Clean service state tracking with accurate display and no false alerts
during intentional user operations.

Bump version to v0.1.45
2025-10-30 11:11:39 +01:00
11 changed files with 103 additions and 31 deletions

View File

@@ -25,6 +25,7 @@ A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure.
- **Service Actions**: - **Service Actions**:
- `s` - Start service (sends UserStart command) - `s` - Start service (sends UserStart command)
- `S` - Stop service (sends UserStop command) - `S` - Stop service (sends UserStop command)
- `J` - Show service logs (journalctl in tmux popup)
- `R` - Rebuild current host - `R` - Rebuild current host
- **Visual Status**: Green ● (active), Yellow ◐ (inactive), Red ◯ (failed) - **Visual Status**: Green ● (active), Yellow ◐ (inactive), Red ◯ (failed)
- **Transitional Icons**: Blue arrows during operations - **Transitional Icons**: Blue arrows during operations
@@ -32,6 +33,7 @@ A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure.
### Navigation ### Navigation
- **Tab**: Switch between hosts - **Tab**: Switch between hosts
- **↑↓ or j/k**: Select services - **↑↓ or j/k**: Select services
- **J**: Show service logs (journalctl)
- **q**: Quit dashboard - **q**: Quit dashboard
## Core Architecture Principles ## Core Architecture Principles

6
Cargo.lock generated
View File

@@ -270,7 +270,7 @@ checksum = "a1d728cc89cf3aee9ff92b05e62b19ee65a02b5702cff7d5a377e32c6ae29d8d"
[[package]] [[package]]
name = "cm-dashboard" name = "cm-dashboard"
version = "0.1.43" version = "0.1.47"
dependencies = [ dependencies = [
"anyhow", "anyhow",
"chrono", "chrono",
@@ -291,7 +291,7 @@ dependencies = [
[[package]] [[package]]
name = "cm-dashboard-agent" name = "cm-dashboard-agent"
version = "0.1.43" version = "0.1.47"
dependencies = [ dependencies = [
"anyhow", "anyhow",
"async-trait", "async-trait",
@@ -314,7 +314,7 @@ dependencies = [
[[package]] [[package]]
name = "cm-dashboard-shared" name = "cm-dashboard-shared"
version = "0.1.43" version = "0.1.47"
dependencies = [ dependencies = [
"chrono", "chrono",
"serde", "serde",

View File

@@ -87,6 +87,7 @@ cm-dashboard • ● cmbox ● srv01 ● srv02 ● steambox
- **↑↓ or j/k**: Navigate services - **↑↓ or j/k**: Navigate services
- **s**: Start selected service (UserStart) - **s**: Start selected service (UserStart)
- **S**: Stop selected service (UserStop) - **S**: Stop selected service (UserStop)
- **J**: Show service logs (journalctl in tmux popup)
- **R**: Rebuild current host - **R**: Rebuild current host
- **q**: Quit - **q**: Quit

View File

@@ -1,6 +1,6 @@
[package] [package]
name = "cm-dashboard-agent" name = "cm-dashboard-agent"
version = "0.1.44" version = "0.1.48"
edition = "2021" edition = "2021"
[dependencies] [dependencies]

View File

@@ -180,6 +180,9 @@ impl Agent {
let version_metric = self.get_agent_version_metric(); let version_metric = self.get_agent_version_metric();
metrics.push(version_metric); metrics.push(version_metric);
// Check for user-stopped services that are now active and clear their flags
self.clear_user_stopped_flags_for_active_services(&metrics);
if metrics.is_empty() { if metrics.is_empty() {
debug!("No metrics to broadcast"); debug!("No metrics to broadcast");
return Ok(()); return Ok(());
@@ -288,7 +291,7 @@ impl Agent {
info!("Executing systemctl {} {} (user action: {})", action_str, service_name, is_user_action); info!("Executing systemctl {} {} (user action: {})", action_str, service_name, is_user_action);
// Handle user-stopped service tracking before systemctl execution // Handle user-stopped service tracking before systemctl execution (stop only)
match action { match action {
ServiceAction::UserStop => { ServiceAction::UserStop => {
info!("Marking service '{}' as user-stopped", service_name); info!("Marking service '{}' as user-stopped", service_name);
@@ -299,15 +302,6 @@ impl Agent {
UserStoppedServiceTracker::update_global(&self.service_tracker); UserStoppedServiceTracker::update_global(&self.service_tracker);
} }
} }
ServiceAction::UserStart => {
info!("Clearing user-stopped flag for service '{}'", service_name);
if let Err(e) = self.service_tracker.clear_user_stopped(service_name) {
error!("Failed to clear user-stopped flag: {}", e);
} else {
// Sync to global tracker
UserStoppedServiceTracker::update_global(&self.service_tracker);
}
}
_ => {} _ => {}
} }
@@ -323,6 +317,9 @@ impl Agent {
if !output.stdout.is_empty() { if !output.stdout.is_empty() {
debug!("stdout: {}", String::from_utf8_lossy(&output.stdout)); debug!("stdout: {}", String::from_utf8_lossy(&output.stdout));
} }
// Note: User-stopped flag will be cleared by systemd collector
// when service actually reaches 'active' state, not here
} else { } else {
let stderr = String::from_utf8_lossy(&output.stderr); let stderr = String::from_utf8_lossy(&output.stderr);
error!("Service {} {} failed: {}", service_name, action_str, stderr); error!("Service {} {} failed: {}", service_name, action_str, stderr);
@@ -342,4 +339,33 @@ impl Agent {
Ok(()) Ok(())
} }
/// Check metrics for user-stopped services that are now active and clear their flags
fn clear_user_stopped_flags_for_active_services(&mut self, metrics: &[Metric]) {
for metric in metrics {
// Look for service status metrics that are active
if metric.name.starts_with("service_") && metric.name.ends_with("_status") {
if let MetricValue::String(status) = &metric.value {
if status == "active" {
// Extract service name from metric name (service_nginx_status -> nginx)
let service_name = metric.name
.strip_prefix("service_")
.and_then(|s| s.strip_suffix("_status"))
.unwrap_or("");
if !service_name.is_empty() && UserStoppedServiceTracker::is_service_user_stopped(service_name) {
info!("Service '{}' is now active - clearing user-stopped flag", service_name);
if let Err(e) = self.service_tracker.clear_user_stopped(service_name) {
error!("Failed to clear user-stopped flag for '{}': {}", service_name, e);
} else {
// Sync to global tracker
UserStoppedServiceTracker::update_global(&self.service_tracker);
debug!("Cleared user-stopped flag for service '{}'", service_name);
}
}
}
}
}
}
}
} }

View File

@@ -357,7 +357,15 @@ impl SystemdCollector {
/// Calculate service status, taking user-stopped services into account /// Calculate service status, taking user-stopped services into account
fn calculate_service_status(&self, service_name: &str, active_status: &str) -> Status { fn calculate_service_status(&self, service_name: &str, active_status: &str) -> Status {
match active_status.to_lowercase().as_str() { match active_status.to_lowercase().as_str() {
"active" => Status::Ok, "active" => {
// If service is now active and was marked as user-stopped, clear the flag
if UserStoppedServiceTracker::is_service_user_stopped(service_name) {
debug!("Service '{}' is now active - clearing user-stopped flag", service_name);
// Note: We can't directly clear here because this is a read-only context
// The agent will need to handle this differently
}
Status::Ok
},
"inactive" | "dead" => { "inactive" | "dead" => {
// Check if this service was stopped by user action // Check if this service was stopped by user action
if UserStoppedServiceTracker::is_service_user_stopped(service_name) { if UserStoppedServiceTracker::is_service_user_stopped(service_name) {
@@ -368,7 +376,15 @@ impl SystemdCollector {
} }
}, },
"failed" | "error" => Status::Critical, "failed" | "error" => Status::Critical,
"activating" | "deactivating" | "reloading" | "start" | "stop" | "restart" => Status::Pending, "activating" | "deactivating" | "reloading" | "start" | "stop" | "restart" => {
// For user-stopped services that are transitioning, keep them as OK during transition
if UserStoppedServiceTracker::is_service_user_stopped(service_name) {
debug!("Service '{}' is transitioning but was user-stopped - treating as OK", service_name);
Status::Ok
} else {
Status::Pending
}
},
_ => Status::Unknown, _ => Status::Unknown,
} }
} }

View File

@@ -272,11 +272,13 @@ impl HostStatusManager {
/// Check if a status change is significant enough for notification /// Check if a status change is significant enough for notification
fn is_significant_change(&self, old_status: Status, new_status: Status) -> bool { fn is_significant_change(&self, old_status: Status, new_status: Status) -> bool {
match (old_status, new_status) { match (old_status, new_status) {
// Always notify on problems // Don't notify on transitions from Unknown (startup/restart scenario)
(Status::Unknown, _) => false,
// Always notify on problems (but not from Unknown)
(_, Status::Warning) | (_, Status::Critical) => true, (_, Status::Warning) | (_, Status::Critical) => true,
// Only notify on recovery if it's from a problem state to OK and all services are OK // Only notify on recovery if it's from a problem state to OK and all services are OK
(Status::Warning | Status::Critical, Status::Ok) => self.current_host_status == Status::Ok, (Status::Warning | Status::Critical, Status::Ok) => self.current_host_status == Status::Ok,
// Don't notify on startup or other transitions // Don't notify on other transitions
_ => false, _ => false,
} }
} }
@@ -374,8 +376,8 @@ impl HostStatusManager {
details.push('\n'); details.push('\n');
} }
// Show recoveries // Show recoveries only if host status is now OK (all services recovered)
if !recovery_changes.is_empty() { if !recovery_changes.is_empty() && aggregated.host_status_final == Status::Ok {
details.push_str(&format!("✅ RECOVERIES ({}):\n", recovery_changes.len())); details.push_str(&format!("✅ RECOVERIES ({}):\n", recovery_changes.len()));
for change in recovery_changes { for change in recovery_changes {
details.push_str(&format!(" {}\n", change)); details.push_str(&format!(" {}\n", change));

View File

@@ -1,6 +1,6 @@
[package] [package]
name = "cm-dashboard" name = "cm-dashboard"
version = "0.1.44" version = "0.1.48"
edition = "2021" edition = "2021"
[dependencies] [dependencies]

View File

@@ -260,6 +260,10 @@ ssh -tt {}@{} 'bash -ic {}'",
std::process::Command::new("tmux") std::process::Command::new("tmux")
.arg("display-popup") .arg("display-popup")
.arg("-w")
.arg("80%")
.arg("-h")
.arg("80%")
.arg(&logo_and_rebuild) .arg(&logo_and_rebuild)
.spawn() .spawn()
.ok(); // Ignore errors, tmux will handle them .ok(); // Ignore errors, tmux will handle them
@@ -281,6 +285,29 @@ ssh -tt {}@{} 'bash -ic {}'",
} }
} }
} }
KeyCode::Char('J') => {
// Show service logs via journalctl in tmux popup
if let (Some(service_name), Some(hostname)) = (self.get_selected_service(), self.current_host.clone()) {
let journalctl_command = format!(
"ssh -tt {}@{} 'journalctl -u {}.service -f --no-pager -n 50'",
self.config.ssh.rebuild_user,
hostname,
service_name
);
std::process::Command::new("tmux")
.arg("display-popup")
.arg("-w")
.arg("80%")
.arg("-h")
.arg("80%")
.arg("-t")
.arg(format!("Logs: {}", service_name))
.arg(&journalctl_command)
.spawn()
.ok(); // Ignore errors, tmux will handle them
}
}
KeyCode::Char('b') => { KeyCode::Char('b') => {
// Trigger backup // Trigger backup
if let Some(hostname) = self.current_host.clone() { if let Some(hostname) = self.current_host.clone() {
@@ -686,10 +713,11 @@ ssh -tt {}@{} 'bash -ic {}'",
let mut shortcuts = Vec::new(); let mut shortcuts = Vec::new();
// Global shortcuts // Global shortcuts
shortcuts.push("Tab: Switch Host".to_string()); shortcuts.push("Tab: Host".to_string());
shortcuts.push("↑↓/jk: Select Service".to_string()); shortcuts.push("↑↓/jk: Select".to_string());
shortcuts.push("r: Rebuild Host".to_string()); shortcuts.push("r: Rebuild".to_string());
shortcuts.push("s/S: Start/Stop Service".to_string()); shortcuts.push("s/S: Start/Stop".to_string());
shortcuts.push("J: Logs".to_string());
// Always show quit // Always show quit
shortcuts.push("q: Quit".to_string()); shortcuts.push("q: Quit".to_string());

View File

@@ -113,13 +113,10 @@ impl ServicesWidget {
name.to_string() name.to_string()
}; };
// Parent services always show active/inactive status // Parent services always show actual systemctl status
let status_str = match info.widget_status { let status_str = match info.widget_status {
Status::Ok => "active".to_string(),
Status::Pending => "pending".to_string(), Status::Pending => "pending".to_string(),
Status::Warning => "inactive".to_string(), _ => info.status.clone(), // Use actual status from agent (active/inactive/failed)
Status::Critical => "failed".to_string(),
Status::Unknown => "unknown".to_string(),
}; };
format!( format!(

View File

@@ -1,6 +1,6 @@
[package] [package]
name = "cm-dashboard-shared" name = "cm-dashboard-shared"
version = "0.1.44" version = "0.1.48"
edition = "2021" edition = "2021"
[dependencies] [dependencies]