Compare commits

9 Commits

Author SHA1 Message Date
bc94f75328 Enable real-time output streaming for nixos-rebuild command
All checks were successful
Build and Release / build-and-release (push) Successful in 1m24s
- Replace simulated progress messages with actual stdout/stderr capture
- Stream all nixos-rebuild output line-by-line to terminal popup
- Show transparent build process including downloads, compilation, and activation
- Maintain real-time visibility into complete rebuild process
2025-10-26 13:00:53 +01:00
b6da71b7e7 Implement real-time terminal popup for system rebuild operations
All checks were successful
Build and Release / build-and-release (push) Successful in 1m21s
- Add terminal popup UI component with 80% screen coverage and terminal styling
- Extend ZMQ protocol with CommandOutputMessage for streaming output
- Implement real-time output streaming in agent system rebuild handler
- Add keyboard controls (ESC/Q to close, ↑↓ to scroll) for popup interaction
- Fix system panel Build display to show actual NixOS build instead of config hash
- Update service filters in README with wildcard patterns for better matching
- Add periodic progress updates during nixos-rebuild execution
- Integrate command output handling in dashboard main loop
2025-10-26 11:39:03 +01:00
aaf7edfbce Implement cross-host agent version comparison
- MetricStore tracks agent versions from all hosts
- Detects version mismatches using most common version as reference
- Dashboard logs warnings for hosts with outdated agents
- Foundation for visual version mismatch indicators in UI
- Helps identify deployment inconsistencies across infrastructure
2025-10-26 10:42:26 +01:00
bb72c42726 Add agent version reporting and display
- Agent reports version via agent_version metric using nix store hash
- Dashboard displays agent version in system widget
- Foundation for cross-host version comparison
- Both agent -V and dashboard show versions
2025-10-26 10:38:20 +01:00
af5f96ce2f Fix sed command in automated NixOS update workflow
All checks were successful
Build and Release / build-and-release (push) Successful in 1m23s
- Use pipe delimiter instead of forward slash to avoid conflicts
- Should fix 'number option to s command may not be zero' error
- More robust regex pattern matching
2025-10-26 01:13:58 +02:00
8dffe18a23 Improve SATA SSD wear level calculation
Some checks failed
Build and Release / build-and-release (push) Failing after 1m24s
- Support multiple SATA SSD wear attributes (SSD_Life_Left, Media_Wearout_Indicator, etc.)
- Handle manufacturer differences in wear reporting
- Proper parsing of SMART table format with VALUE column
- Covers Samsung, Intel, Crucial and other common SSD types
- NVMe Percentage Used support maintained
2025-10-25 22:32:09 +02:00
0c544753f9 Move SMART configuration into disk config
- Consolidate SMART thresholds into DiskConfig structure
- Remove separate SmartConfig - disk collector handles all drive data
- Update NixOS configuration to use disk.temperature_* settings
- Remove hardcoded temperature thresholds in disk collector
- Logical grouping: disk collector owns all disk/drive configuration
2025-10-25 22:29:26 +02:00
c8e26b9bac Remove redundant smart collector - consolidate SMART into disk collector
- Remove separate smart collector implementation
- Disk collector already handles SMART data for drives
- Eliminates duplicate smartctl calls causing performance issues
- SMART functionality remains in logical place with disk monitoring
- Fixes infinite smartctl loop issue
2025-10-25 22:25:22 +02:00
60ef712fac Fix hash conversion in NixOS update workflow
All checks were successful
Build and Release / build-and-release (push) Successful in 2m38s
- Replace xxd with Python for hex to base64 conversion
- Use standard tools available in GitHub Actions runners
- Should fix hash conversion error in automated workflow
2025-10-25 17:24:37 +02:00
17 changed files with 662 additions and 274 deletions


@@ -108,12 +108,13 @@ jobs:
# Download tarball to get correct hash
curl -L -o cm-dashboard.tar.gz "$TARBALL_URL"
# Convert sha256 hex to base64 for Nix hash format using Python
NEW_HASH=$(sha256sum cm-dashboard.tar.gz | cut -d' ' -f1)
NIX_HASH="sha256-$(echo -n $NEW_HASH | xxd -r -p | base64)"
NIX_HASH="sha256-$(python3 -c "import base64, binascii; print(base64.b64encode(binascii.unhexlify('$NEW_HASH')).decode())")"
# Update the NixOS configuration
sed -i "s/version = \"v[^\"]*\"/version = \"$VERSION\"/" hosts/common/cm-dashboard.nix
sed -i "s/sha256 = \"sha256-[^\"]*\"/sha256 = \"$NIX_HASH\"/" hosts/common/cm-dashboard.nix
sed -i "s|version = \"v[^\"]*\"|version = \"$VERSION\"|" hosts/common/cm-dashboard.nix
sed -i "s|sha256 = \"sha256-[^\"]*\"|sha256 = \"$NIX_HASH\"|" hosts/common/cm-dashboard.nix
# Commit and push changes
git config user.name "Gitea Actions"
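
Note on the hash conversion in this workflow: the step turns the hex digest printed by `sha256sum` into the `sha256-<base64>` form used in the Nix file. Standard base64 output can contain `/`, which is presumably why the `sed` expressions switch to `|` as the delimiter. The sketch below restates the same conversion in Rust for reference only; the workflow itself uses the Python one-liner shown above, and `decode_hex`, `encode_base64` and `nix_sri_from_hex` are hypothetical helper names, not part of the repository.

```rust
/// Decode a hex string (upper or lower case) into raw bytes.
/// Returns None for odd lengths or non-hex characters.
fn decode_hex(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 || !hex.is_ascii() {
        return None;
    }
    hex.as_bytes()
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

/// Standard (RFC 4648) base64 with '=' padding.
fn encode_base64(bytes: &[u8]) -> String {
    const ALPHABET: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Convert a 64-character SHA-256 hex digest into the `sha256-<base64>` form.
pub fn nix_sri_from_hex(hex: &str) -> Option<String> {
    let bytes = decode_hex(hex.trim())?;
    if bytes.len() != 32 {
        return None; // not a SHA-256 digest
    }
    Some(format!("sha256-{}", encode_base64(&bytes)))
}
```

Returning `Option` keeps a malformed digest from being written into the configuration file; the caller can abort the update instead.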


@@ -28,18 +28,21 @@ All keyboard navigation and service selection features successfully implemented:
- **Smart Panel Switching**: Only cycles through panels with data (backup panel conditional)
- **Scroll Support**: All panels support content scrolling with proper overflow indicators
**Current Status - October 25, 2025:**
**Current Status - October 26, 2025:**
- All keyboard navigation features working correctly ✅
- Service selection cursor implemented with focus-aware highlighting ✅
- Panel scrolling fixed for System, Services, and Backup panels ✅
- Build display working: "Build: 25.05.20251004.3bcc93c" ✅
- Configuration hash display: Currently shows git hash, needs to be fixed ❌
- Agent version display working: "Agent: 3kvc03nd" ✅
- Cross-host version comparison implemented ✅
- Automated binary release system working ✅
- SMART data consolidated into disk collector ✅
**Target Layout:**
**Current Layout:**
```
NixOS:
Build: 25.05.20251004.3bcc93c
Config: d8ivwiar # Should show nix store hash (8 chars) from deployed system
Agent: 3kvc03nd # Shows agent version (nix store hash)
Active users: cm, simon
CPU:
● Load: 0.02 0.31 0.86 • 3000MHz
@@ -55,7 +58,8 @@ Storage:
**System panel layout fully implemented with blue tree symbols ✅**
**Tree symbols now use consistent blue theming across all panels ✅**
**Overflow handling restored for all widgets ("... and X more") ✅**
**Agent hash display working correctly ✅**
**Agent version display working correctly ✅**
**Cross-host version comparison logging warnings ✅**
### Current Keyboard Navigation Implementation


@@ -152,10 +152,13 @@ interval_seconds = 10
memory_warning_mb = 1000.0
memory_critical_mb = 2000.0
service_name_filters = [
"nginx", "postgresql", "redis", "docker", "sshd"
"nginx*", "postgresql*", "redis*", "docker*", "sshd*",
"gitea*", "immich*", "haasp*", "mosquitto*", "mysql*",
"unifi*", "vaultwarden*"
]
excluded_services = [
"nginx-config-reload", "sshd-keygen"
"nginx-config-reload", "sshd-keygen", "systemd-",
"getty@", "user@", "dbus-", "NetworkManager-"
]
[notifications]


@@ -9,7 +9,7 @@ use crate::config::AgentConfig;
use crate::metrics::MetricCollectionManager;
use crate::notifications::NotificationManager;
use crate::status::HostStatusManager;
use cm_dashboard_shared::{Metric, MetricMessage};
use cm_dashboard_shared::{CommandOutputMessage, Metric, MetricMessage, MetricValue, Status};
pub struct Agent {
hostname: String,
@@ -162,6 +162,10 @@ impl Agent {
let host_status_metric = self.host_status_manager.get_host_status_metric();
metrics.push(host_status_metric);
// Add agent version metric for cross-host version comparison
let version_metric = self.get_agent_version_metric();
metrics.push(version_metric);
if metrics.is_empty() {
debug!("No metrics to broadcast");
return Ok(());
@@ -183,6 +187,39 @@ impl Agent {
}
}
/// Create agent version metric for cross-host version comparison
fn get_agent_version_metric(&self) -> Metric {
// Get version from executable path (same logic as main.rs get_version)
let version = self.get_agent_version();
Metric::new(
"agent_version".to_string(),
MetricValue::String(version),
Status::Ok,
)
}
/// Get agent version from executable path
fn get_agent_version(&self) -> String {
match std::env::current_exe() {
Ok(exe_path) => {
let exe_str = exe_path.to_string_lossy();
// Extract Nix store hash from path
if let Some(hash_part) = exe_str.strip_prefix("/nix/store/") {
if let Some(hash) = hash_part.split('-').next() {
if hash.len() >= 8 {
return hash[..8].to_string();
}
}
}
"unknown".to_string()
},
Err(_) => "unknown".to_string()
}
}
async fn handle_commands(&mut self) -> Result<()> {
// Try to receive commands (non-blocking)
match self.zmq_handler.try_receive_command() {
@@ -281,32 +318,94 @@ impl Agent {
Ok(())
}
/// Handle NixOS system rebuild commands with git clone approach
/// Handle NixOS system rebuild commands with real-time output streaming
async fn handle_system_rebuild(&self, git_url: &str, git_branch: &str, working_dir: &str, api_key_file: Option<&str>) -> Result<()> {
info!("Starting NixOS system rebuild: {} @ {} -> {}", git_url, git_branch, working_dir);
let command_id = format!("rebuild_{}", chrono::Utc::now().timestamp());
// Send initial status
self.send_command_output(&command_id, "SystemRebuild", "Starting NixOS system rebuild...").await?;
// Enable maintenance mode before rebuild
let maintenance_file = "/tmp/cm-maintenance";
if let Err(e) = tokio::fs::File::create(maintenance_file).await {
error!("Failed to create maintenance mode file: {}", e);
self.send_command_output(&command_id, "SystemRebuild", &format!("Warning: Failed to create maintenance mode file: {}", e)).await?;
} else {
info!("Maintenance mode enabled");
self.send_command_output(&command_id, "SystemRebuild", "Maintenance mode enabled").await?;
}
// Clone or update repository
let git_result = self.ensure_git_repository(git_url, git_branch, working_dir, api_key_file).await;
self.send_command_output(&command_id, "SystemRebuild", "Cloning/updating git repository...").await?;
let git_result = self.ensure_git_repository_with_output(&command_id, git_url, git_branch, working_dir, api_key_file).await;
// Execute nixos-rebuild if git operation succeeded - run detached but log output
let rebuild_result = if git_result.is_ok() {
info!("Git repository ready, executing nixos-rebuild in detached mode");
let log_file = std::fs::OpenOptions::new()
.create(true)
.append(true)
.open("/var/log/cm-dashboard/nixos-rebuild.log")
.map_err(|e| anyhow::anyhow!("Failed to open rebuild log: {}", e))?;
if git_result.is_err() {
self.send_command_output(&command_id, "SystemRebuild", &format!("Git operation failed: {:?}", git_result)).await?;
self.send_command_output_complete(&command_id, "SystemRebuild").await?;
return git_result;
}
tokio::process::Command::new("nohup")
.arg("sudo")
self.send_command_output(&command_id, "SystemRebuild", "Git repository ready, starting nixos-rebuild...").await?;
// Execute nixos-rebuild with real-time output streaming
let rebuild_result = self.execute_nixos_rebuild_with_streaming(&command_id, working_dir).await;
// Always try to remove maintenance mode file
if let Err(e) = tokio::fs::remove_file(maintenance_file).await {
if e.kind() != std::io::ErrorKind::NotFound {
self.send_command_output(&command_id, "SystemRebuild", &format!("Warning: Failed to remove maintenance mode file: {}", e)).await?;
}
} else {
self.send_command_output(&command_id, "SystemRebuild", "Maintenance mode disabled").await?;
}
// Handle rebuild result
match rebuild_result {
Ok(()) => {
self.send_command_output(&command_id, "SystemRebuild", "✓ NixOS rebuild completed successfully!").await?;
}
Err(e) => {
self.send_command_output(&command_id, "SystemRebuild", &format!("✗ NixOS rebuild failed: {}", e)).await?;
}
}
// Signal completion
self.send_command_output_complete(&command_id, "SystemRebuild").await?;
info!("System rebuild streaming completed");
Ok(())
}
/// Send command output line to dashboard
async fn send_command_output(&self, command_id: &str, command_type: &str, output_line: &str) -> Result<()> {
let message = CommandOutputMessage::new(
self.hostname.clone(),
command_id.to_string(),
command_type.to_string(),
output_line.to_string(),
false,
);
self.zmq_handler.publish_command_output(&message).await
}
/// Send command completion signal to dashboard
async fn send_command_output_complete(&self, command_id: &str, command_type: &str) -> Result<()> {
let message = CommandOutputMessage::new(
self.hostname.clone(),
command_id.to_string(),
command_type.to_string(),
"Command completed".to_string(),
true,
);
self.zmq_handler.publish_command_output(&message).await
}
/// Execute nixos-rebuild with real-time output streaming
async fn execute_nixos_rebuild_with_streaming(&self, command_id: &str, working_dir: &str) -> Result<()> {
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio::process::Command;
let mut child = Command::new("sudo")
.arg("/run/current-system/sw/bin/nixos-rebuild")
.arg("switch")
.arg("--option")
@@ -315,37 +414,69 @@ impl Agent {
.arg("--flake")
.arg(".")
.current_dir(working_dir)
.stdin(std::process::Stdio::null())
.stdout(std::process::Stdio::from(log_file.try_clone().unwrap()))
.stderr(std::process::Stdio::from(log_file))
.spawn()
} else {
return git_result.and_then(|_| unreachable!());
};
.stdout(std::process::Stdio::piped())
.stderr(std::process::Stdio::piped())
.spawn()?;
// Always try to remove maintenance mode file
if let Err(e) = tokio::fs::remove_file(maintenance_file).await {
if e.kind() != std::io::ErrorKind::NotFound {
error!("Failed to remove maintenance mode file: {}", e);
}
} else {
info!("Maintenance mode disabled");
}
// Get stdout and stderr handles
let stdout = child.stdout.take().expect("Failed to get stdout");
let stderr = child.stderr.take().expect("Failed to get stderr");
// Check rebuild start result
match rebuild_result {
Ok(_child) => {
info!("NixOS rebuild started successfully in background");
// Don't wait for completion to avoid agent being killed during rebuild
// Create readers for both streams
let stdout_reader = BufReader::new(stdout);
let stderr_reader = BufReader::new(stderr);
let mut stdout_lines = stdout_reader.lines();
let mut stderr_lines = stderr_reader.lines();
// Stream output lines in real-time
loop {
tokio::select! {
// Read from stdout
line = stdout_lines.next_line() => {
match line {
Ok(Some(line)) => {
self.send_command_output(command_id, "SystemRebuild", &line).await?;
}
Ok(None) => {
// stdout closed
}
Err(e) => {
error!("Failed to start nixos-rebuild: {}", e);
return Err(anyhow::anyhow!("Failed to start nixos-rebuild: {}", e));
self.send_command_output(command_id, "SystemRebuild", &format!("stdout error: {}", e)).await?;
}
}
}
// Read from stderr
line = stderr_lines.next_line() => {
match line {
Ok(Some(line)) => {
self.send_command_output(command_id, "SystemRebuild", &line).await?;
}
Ok(None) => {
// stderr closed
}
Err(e) => {
self.send_command_output(command_id, "SystemRebuild", &format!("stderr error: {}", e)).await?;
}
}
}
// Wait for process completion
result = child.wait() => {
let status = result?;
if status.success() {
return Ok(());
} else {
return Err(anyhow::anyhow!("nixos-rebuild exited with status: {}", status));
}
}
}
}
}
info!("System rebuild completed, triggering metric refresh");
Ok(())
/// Ensure git repository with output streaming
async fn ensure_git_repository_with_output(&self, command_id: &str, git_url: &str, git_branch: &str, working_dir: &str, api_key_file: Option<&str>) -> Result<()> {
// This is a simplified version - we can enhance this later with git output streaming
self.ensure_git_repository(git_url, git_branch, working_dir, api_key_file).await
}
/// Ensure git repository is cloned and up to date with force clone approach
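
Note on the streaming loop: the `tokio::select!` loop in `execute_nixos_rebuild_with_streaming` races the two line readers against `child.wait()`. Two things may be worth checking there. Once a pipe reaches EOF, `next_line()` keeps resolving to `Ok(None)` immediately, so the loop can spin until the process exits. And if `child.wait()` resolves first, lines still buffered in the pipes are not forwarded. The sketch below is not the agent's code. It is a std-only illustration (threads and a channel instead of tokio; `run_streaming` and `Source` are hypothetical names) of draining both pipes to EOF before collecting the exit status.

```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};
use std::sync::mpsc;
use std::thread;

/// Which pipe a line came from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Stdout,
    Stderr,
}

/// Run `command`, pass every output line to `on_line` as it arrives,
/// and return whether the process exited successfully.
pub fn run_streaming<F>(mut command: Command, mut on_line: F) -> std::io::Result<bool>
where
    F: FnMut(Source, &str),
{
    let mut child = command
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    let stdout = child.stdout.take().expect("stdout was configured as piped");
    let stderr = child.stderr.take().expect("stderr was configured as piped");

    let (tx_out, rx) = mpsc::channel::<(Source, String)>();
    let tx_err = tx_out.clone();

    // One reader thread per pipe; each stops at EOF or on the first read error.
    let out_thread = thread::spawn(move || {
        for line in BufReader::new(stdout).lines().map_while(Result::ok) {
            if tx_out.send((Source::Stdout, line)).is_err() {
                break;
            }
        }
    });
    let err_thread = thread::spawn(move || {
        for line in BufReader::new(stderr).lines().map_while(Result::ok) {
            if tx_err.send((Source::Stderr, line)).is_err() {
                break;
            }
        }
    });

    // The receiver ends only after both senders are dropped,
    // that is, after both pipes have been read to EOF.
    for (source, line) in rx {
        on_line(source, &line);
    }

    out_thread.join().ok();
    err_thread.join().ok();

    // Collect the exit status only after all output has been forwarded.
    Ok(child.wait()?.success())
}
```

In the async version, the equivalent would be to stop polling a reader once it has returned `Ok(None)` and to call `child.wait()` only after both readers are done.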


@@ -41,11 +41,11 @@ pub struct DiskCollector {
impl DiskCollector {
pub fn new(config: DiskConfig) -> Self {
// Create hysteresis thresholds for disk temperature
// Create hysteresis thresholds for disk temperature from config
let temperature_thresholds = HysteresisThresholds::with_custom_gaps(
60.0, // warning at 60°C
config.temperature_warning_celsius,
5.0, // 5°C gap for recovery
70.0, // critical at 70°C
config.temperature_critical_celsius,
5.0, // 5°C gap for recovery
);
@@ -219,18 +219,12 @@ impl DiskCollector {
}
/// Parse wear level from SMART output (SSD wear leveling)
/// Supports both NVMe and SATA SSD wear indicators
fn parse_wear_level_from_smart(&self, smart_output: &str) -> Option<f32> {
for line in smart_output.lines() {
// Look for wear leveling indicators
if line.contains("Wear_Leveling_Count") || line.contains("Media_Wearout_Indicator") {
let parts: Vec<&str> = line.split_whitespace().collect();
if parts.len() >= 10 {
if let Ok(wear) = parts[9].parse::<f32>() {
return Some(100.0 - wear); // Convert to percentage used
}
}
}
// NVMe drives might show percentage used directly
let line = line.trim();
// NVMe drives - direct percentage used
if line.contains("Percentage Used:") {
if let Some(wear_part) = line.split("Percentage Used:").nth(1) {
if let Some(wear_str) = wear_part.split('%').next() {
@@ -240,6 +234,38 @@ impl DiskCollector {
}
}
}
// SATA SSD attributes - parse SMART table format
// Format: ID ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
let parts: Vec<&str> = line.split_whitespace().collect();
if parts.len() >= 10 {
// SSD Life Left / Percent Lifetime Remaining (higher = less wear)
if line.contains("SSD_Life_Left") || line.contains("Percent_Lifetime_Remain") {
if let Ok(remaining) = parts[3].parse::<f32>() { // VALUE column
return Some(100.0 - remaining); // Convert remaining to used
}
}
// Media Wearout Indicator (lower = more wear, normalize to 0-100)
if line.contains("Media_Wearout_Indicator") {
if let Ok(remaining) = parts[3].parse::<f32>() { // VALUE column
return Some(100.0 - remaining); // Convert remaining to used
}
}
// Wear Leveling Count (higher = less wear, but varies by manufacturer)
if line.contains("Wear_Leveling_Count") {
if let Ok(wear_count) = parts[3].parse::<f32>() { // VALUE column
// Most SSDs: 100 = new, decreases with wear
if wear_count <= 100.0 {
return Some(100.0 - wear_count);
}
}
}
// Total LBAs Written - calculate against typical endurance if available
// This is more complex and manufacturer-specific, so we skip for now
}
}
None
}
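
Note on the wear parsing: the change above moves from the last column (`RAW_VALUE`, index 9) to the normalized `VALUE` column (index 3) of the `smartctl -A` attribute table. A minimal standalone sketch of that per-line logic follows, for illustration only. `wear_percent_used` is a hypothetical free function rather than the collector's method, and it matches the attribute name against the second column instead of using `line.contains`, which is an assumption about what is safest, not what the commit does.

```rust
/// Return the estimated "percent used" for one line of smartctl output,
/// or None if the line carries no recognised wear attribute.
pub fn wear_percent_used(line: &str) -> Option<f32> {
    let line = line.trim();

    // NVMe health log: "Percentage Used:   3%"
    if let Some(rest) = line.strip_prefix("Percentage Used:") {
        return rest.trim().trim_end_matches('%').parse::<f32>().ok();
    }

    // SATA attribute table columns:
    // ID ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    let parts: Vec<&str> = line.split_whitespace().collect();
    if parts.len() < 10 {
        return None;
    }

    // Attributes whose normalized VALUE counts down from 100 as the drive wears.
    const REMAINING_LIFE_ATTRIBUTES: [&str; 4] = [
        "SSD_Life_Left",
        "Percent_Lifetime_Remain",
        "Media_Wearout_Indicator",
        "Wear_Leveling_Count",
    ];
    if !REMAINING_LIFE_ATTRIBUTES.contains(&parts[1]) {
        return None;
    }

    // VALUE column (index 3), e.g. "098"; ignore values outside 0..=100.
    let value = parts[3].parse::<f32>().ok()?;
    if (0.0..=100.0).contains(&value) {
        Some(100.0 - value)
    } else {
        None
    }
}
```

A caller would apply this to each line of the output and take the first `Some`, which mirrors the early `return` in the diff.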


@@ -8,7 +8,6 @@ pub mod disk;
pub mod error;
pub mod memory;
pub mod nixos;
pub mod smart;
pub mod systemd;
pub use error::CollectorError;


@@ -1,178 +0,0 @@
use async_trait::async_trait;
use cm_dashboard_shared::{Metric, MetricValue, Status, StatusTracker};
use std::process::Stdio;
use tokio::process::Command;
use tracing::{debug, warn};
use super::{Collector, CollectorError};
pub struct SmartCollector;
impl SmartCollector {
/// Get list of storage devices to monitor
async fn get_devices(&self) -> Result<Vec<String>, CollectorError> {
let output = Command::new("lsblk")
.args(["-d", "-n", "-o", "NAME,TYPE"])
.stdout(Stdio::piped())
.stderr(Stdio::null())
.output()
.await
.map_err(|e| CollectorError::SystemRead {
path: "lsblk".to_string(),
error: e.to_string()
})?;
if !output.status.success() {
return Ok(Vec::new()); // Return empty if lsblk fails
}
let stdout = String::from_utf8_lossy(&output.stdout);
let mut devices = Vec::new();
for line in stdout.lines() {
let parts: Vec<&str> = line.split_whitespace().collect();
if parts.len() >= 2 && parts[1] == "disk" {
let device_name = parts[0];
if device_name.starts_with("nvme") || device_name.starts_with("sd") {
devices.push(format!("/dev/{}", device_name));
}
}
}
Ok(devices)
}
/// Collect SMART data for a single device
async fn collect_device_smart(&self, device: &str, status_tracker: &mut StatusTracker) -> Result<Vec<Metric>, CollectorError> {
debug!("Collecting SMART data for device: {}", device);
let output = Command::new("sudo")
.args(["smartctl", "-H", "-A", device]) // Health and attributes only
.stdout(Stdio::piped())
.stderr(Stdio::null())
.output()
.await
.map_err(|e| CollectorError::SystemRead {
path: "lsblk".to_string(),
error: e.to_string()
})?;
if !output.status.success() {
warn!("smartctl failed for device: {}", device);
return Ok(Vec::new());
}
let stdout = String::from_utf8_lossy(&output.stdout);
self.parse_smart_output(device, &stdout, status_tracker)
}
/// Parse smartctl output and create metrics
fn parse_smart_output(&self, device: &str, output: &str, status_tracker: &mut StatusTracker) -> Result<Vec<Metric>, CollectorError> {
let mut metrics = Vec::new();
let device_name = device.trim_start_matches("/dev/");
let mut health_ok = true;
let mut temperature: Option<f32> = None;
for line in output.lines() {
let line = line.trim();
// Parse health status
if line.contains("SMART overall-health self-assessment") {
if line.contains("FAILED") {
health_ok = false;
}
}
// Parse temperature from various formats
if (line.contains("Temperature") || line.contains("Airflow_Temperature")) && temperature.is_none() {
if let Some(temp) = self.extract_temperature(line) {
temperature = Some(temp);
}
}
}
// Create health metric
let health_status = if health_ok {
Status::Ok
} else {
Status::Critical
};
metrics.push(Metric::new(
format!("smart_health_{}", device_name),
MetricValue::String(if health_ok { "PASSED".to_string() } else { "FAILED".to_string() }),
health_status,
));
// Create temperature metric if available
if let Some(temp) = temperature {
let temp_status = if temp >= 70.0 {
Status::Critical
} else if temp >= 60.0 {
Status::Warning
} else {
Status::Ok
};
metrics.push(Metric::new(
format!("smart_temperature_{}", device_name),
MetricValue::Float(temp),
temp_status,
).with_unit("celsius".to_string()));
}
debug!("Collected {} SMART metrics for {}", metrics.len(), device);
Ok(metrics)
}
/// Extract temperature value from smartctl output line
fn extract_temperature(&self, line: &str) -> Option<f32> {
let parts: Vec<&str> = line.split_whitespace().collect();
for (i, part) in parts.iter().enumerate() {
if let Ok(temp) = part.parse::<f32>() {
// Check if this looks like a temperature value (reasonable range)
if temp > 0.0 && temp < 150.0 {
// Check context around the number
if i + 1 < parts.len() {
let next = parts[i + 1].to_lowercase();
if next.contains("celsius") || next.contains("°c") || next == "c" {
return Some(temp);
}
}
// For SMART attribute lines, temperature is often the 10th column
if parts.len() >= 10 && (line.contains("Temperature") || line.contains("Airflow_Temperature")) {
return Some(temp);
}
}
}
}
None
}
}
#[async_trait]
impl Collector for SmartCollector {
async fn collect(&self, status_tracker: &mut StatusTracker) -> Result<Vec<Metric>, CollectorError> {
debug!("Starting SMART data collection");
let devices = self.get_devices().await?;
let mut all_metrics = Vec::new();
for device in devices {
match self.collect_device_smart(&device, status_tracker).await {
Ok(mut metrics) => {
all_metrics.append(&mut metrics);
}
Err(e) => {
warn!("Failed to collect SMART data for {}: {}", device, e);
// Continue with other devices
}
}
}
debug!("Collected {} total SMART metrics", all_metrics.len());
Ok(all_metrics)
}
}


@@ -1,5 +1,5 @@
use anyhow::Result;
use cm_dashboard_shared::{MessageEnvelope, MetricMessage};
use cm_dashboard_shared::{CommandOutputMessage, MessageEnvelope, MetricMessage};
use tracing::{debug, info};
use zmq::{Context, Socket, SocketType};
@@ -65,6 +65,24 @@ impl ZmqHandler {
Ok(())
}
/// Publish command output message via ZMQ
pub async fn publish_command_output(&self, message: &CommandOutputMessage) -> Result<()> {
debug!(
"Publishing command output for host {} (command: {}): {}",
message.hostname,
message.command_type,
message.output_line
);
let envelope = MessageEnvelope::command_output(message.clone())?;
let serialized = serde_json::to_vec(&envelope)?;
self.publisher.send(&serialized, 0)?;
debug!("Command output published successfully");
Ok(())
}
/// Send heartbeat (placeholder for future use)
/// Try to receive a command (non-blocking)


@@ -36,7 +36,6 @@ pub struct CollectorConfig {
pub memory: MemoryConfig,
pub disk: DiskConfig,
pub systemd: SystemdConfig,
pub smart: SmartConfig,
pub backup: BackupConfig,
pub network: NetworkConfig,
pub nixos: NixOSConfig,
@@ -75,6 +74,11 @@ pub struct DiskConfig {
pub usage_critical_percent: f32,
/// Filesystem configurations
pub filesystems: Vec<FilesystemConfig>,
/// SMART monitoring thresholds
pub temperature_warning_celsius: f32,
pub temperature_critical_celsius: f32,
pub wear_warning_percent: f32,
pub wear_critical_percent: f32,
}
/// Filesystem configuration entry
@@ -102,16 +106,6 @@ pub struct SystemdConfig {
pub host_user_mapping: String,
}
/// SMART collector configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SmartConfig {
pub enabled: bool,
pub interval_seconds: u64,
pub temperature_warning_celsius: f32,
pub temperature_critical_celsius: f32,
pub wear_warning_percent: f32,
pub wear_critical_percent: f32,
}
/// NixOS collector configuration
#[derive(Debug, Clone, Serialize, Deserialize)]


@@ -13,10 +13,26 @@ mod status;
use agent::Agent;
/// Get version showing cm-dashboard-agent package hash for easy deployment verification
fn get_version() -> &'static str {
// Get the path of the current executable
let exe_path = std::env::current_exe().expect("Failed to get executable path");
let exe_str = exe_path.to_string_lossy();
// Extract Nix store hash from path like /nix/store/HASH-cm-dashboard-v0.1.8/bin/cm-dashboard-agent
let hash_part = exe_str.strip_prefix("/nix/store/").expect("Not a nix store path");
let hash = hash_part.split('-').next().expect("Invalid nix store path format");
assert!(hash.len() >= 8, "Hash too short");
// Return first 8 characters of nix store hash
let short_hash = hash[..8].to_string();
Box::leak(short_hash.into_boxed_str())
}
#[derive(Parser)]
#[command(name = "cm-dashboard-agent")]
#[command(about = "CM Dashboard metrics agent with individual metric collection")]
#[command(version)]
#[command(version = get_version())]
struct Cli {
/// Increase logging verbosity (-v, -vv)
#[arg(short, long, action = clap::ArgAction::Count)]


@@ -4,7 +4,7 @@ use tracing::{error, info};
use crate::collectors::{
backup::BackupCollector, cpu::CpuCollector, disk::DiskCollector, memory::MemoryCollector,
nixos::NixOSCollector, smart::SmartCollector, systemd::SystemdCollector, Collector,
nixos::NixOSCollector, systemd::SystemdCollector, Collector,
};
use crate::config::{AgentConfig, CollectorConfig};
@@ -61,14 +61,6 @@ impl MetricCollectionManager {
info!("BENCHMARK: Backup collector only");
}
}
Some("smart") => {
// SMART collector only
if config.smart.enabled {
let smart_collector = SmartCollector;
collectors.push(Box::new(smart_collector));
info!("BENCHMARK: SMART collector only");
}
}
Some("none") => {
// No collectors - test agent loop only
info!("BENCHMARK: No collectors enabled");
@@ -110,11 +102,6 @@ impl MetricCollectionManager {
info!("NixOS collector initialized");
}
if config.smart.enabled {
let smart_collector = SmartCollector;
collectors.push(Box::new(smart_collector));
info!("SMART collector initialized");
}
}
}


@@ -236,6 +236,13 @@ impl Dashboard {
self.metric_store
.update_metrics(&metric_message.hostname, metric_message.metrics);
// Check for agent version mismatches across hosts
if let Some((current_version, outdated_hosts)) = self.metric_store.get_version_mismatches() {
for outdated_host in &outdated_hosts {
warn!("Host {} has outdated agent version (current: {})", outdated_host, current_version);
}
}
// Update TUI with new hosts and metrics (only if not headless)
if let Some(ref mut tui_app) = self.tui_app {
let mut connected_hosts = self
@@ -261,6 +268,26 @@ impl Dashboard {
tui_app.update_metrics(&self.metric_store);
}
}
// Also check for command output messages
if let Ok(Some(cmd_output)) = self.zmq_consumer.receive_command_output().await {
debug!(
"Received command output from {}: {}",
cmd_output.hostname,
cmd_output.output_line
);
// Forward to TUI if not headless
if let Some(ref mut tui_app) = self.tui_app {
tui_app.add_terminal_output(&cmd_output.hostname, cmd_output.output_line);
// Close popup when command completes
if cmd_output.is_complete {
tui_app.close_terminal_popup(&cmd_output.hostname);
}
}
}
last_metrics_check = Instant::now();
}


@@ -1,5 +1,5 @@
use anyhow::Result;
use cm_dashboard_shared::{MessageEnvelope, MessageType, MetricMessage};
use cm_dashboard_shared::{CommandOutputMessage, MessageEnvelope, MessageType, MetricMessage};
use tracing::{debug, error, info, warn};
use zmq::{Context, Socket, SocketType};
@@ -103,6 +103,43 @@ impl ZmqConsumer {
Ok(())
}
/// Receive command output from any connected agent (non-blocking)
pub async fn receive_command_output(&mut self) -> Result<Option<CommandOutputMessage>> {
match self.subscriber.recv_bytes(zmq::DONTWAIT) {
Ok(data) => {
// Deserialize envelope
let envelope: MessageEnvelope = serde_json::from_slice(&data)
.map_err(|e| anyhow::anyhow!("Failed to deserialize envelope: {}", e))?;
// Check message type
match envelope.message_type {
MessageType::CommandOutput => {
let cmd_output = envelope
.decode_command_output()
.map_err(|e| anyhow::anyhow!("Failed to decode command output: {}", e))?;
debug!(
"Received command output from {}: {}",
cmd_output.hostname,
cmd_output.output_line
);
Ok(Some(cmd_output))
}
_ => Ok(None), // Not a command output message
}
}
Err(zmq::Error::EAGAIN) => {
// No message available (non-blocking mode)
Ok(None)
}
Err(e) => {
error!("ZMQ receive error: {}", e);
Err(anyhow::anyhow!("ZMQ receive error: {}", e))
}
}
}
/// Receive metrics from any connected agent (non-blocking)
pub async fn receive_metrics(&mut self) -> Result<Option<MetricMessage>> {
match self.subscriber.recv_bytes(zmq::DONTWAIT) {
@@ -132,6 +169,10 @@ impl ZmqConsumer {
debug!("Received heartbeat");
Ok(None) // Don't return heartbeats as metrics
}
MessageType::CommandOutput => {
debug!("Received command output (will be handled by receive_command_output)");
Ok(None) // Command output handled by separate method
}
_ => {
debug!("Received non-metrics message: {:?}", envelope.message_type);
Ok(None)
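
Note on the two receive paths: `receive_command_output` and `receive_metrics` both call `recv_bytes` on the same subscriber socket, and each returns `Ok(None)` for message types it does not own. If that reading is right, a metrics message picked up by `receive_command_output` (or the reverse) is consumed and dropped. The sketch below shows the alternative of a single receive path that dispatches on the message type. It is an illustration under stated assumptions: a `VecDeque` stands in for the socket, and `Envelope`, `Drained` and `drain_socket` are hypothetical types, not the `cm_dashboard_shared` API.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for the wire envelope.
#[derive(Debug, Clone, PartialEq)]
pub enum Envelope {
    Metrics { hostname: String },
    CommandOutput { hostname: String, output_line: String },
    Heartbeat,
}

/// Everything read from the socket in one pass, sorted by kind.
#[derive(Debug, Default, PartialEq)]
pub struct Drained {
    pub metric_hosts: Vec<String>,
    pub command_lines: Vec<(String, String)>,
}

/// Read every pending message exactly once and route it by type,
/// so no message is discarded by a reader that was looking for another kind.
pub fn drain_socket(socket: &mut VecDeque<Envelope>) -> Drained {
    let mut drained = Drained::default();
    while let Some(envelope) = socket.pop_front() {
        match envelope {
            Envelope::Metrics { hostname } => drained.metric_hosts.push(hostname),
            Envelope::CommandOutput { hostname, output_line } => {
                drained.command_lines.push((hostname, output_line));
            }
            Envelope::Heartbeat => {} // nothing to forward
        }
    }
    drained
}
```

The dashboard loop would then update the metric store and the terminal popup from the one `Drained` value, regardless of the order in which messages arrived.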


@@ -124,4 +124,52 @@ impl MetricStore {
}
}
}
/// Get agent versions from all hosts for cross-host comparison
pub fn get_agent_versions(&self) -> HashMap<String, String> {
let mut versions = HashMap::new();
for (hostname, metrics) in &self.current_metrics {
if let Some(version_metric) = metrics.get("agent_version") {
if let cm_dashboard_shared::MetricValue::String(version) = &version_metric.value {
versions.insert(hostname.clone(), version.clone());
}
}
}
versions
}
/// Check for agent version mismatches across hosts
pub fn get_version_mismatches(&self) -> Option<(String, Vec<String>)> {
let versions = self.get_agent_versions();
if versions.len() < 2 {
return None; // Need at least 2 hosts to compare
}
// Find the most common version (assume it's the "current" version)
let mut version_counts = HashMap::new();
for version in versions.values() {
*version_counts.entry(version.clone()).or_insert(0) += 1;
}
let most_common_version = version_counts
.iter()
.max_by_key(|(_, count)| *count)
.map(|(version, _)| version.clone())?;
// Find hosts with different versions
let outdated_hosts: Vec<String> = versions
.iter()
.filter(|(_, version)| *version != &most_common_version)
.map(|(hostname, _)| hostname.clone())
.collect();
if outdated_hosts.is_empty() {
None
} else {
Some((most_common_version, outdated_hosts))
}
}
}
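
Note on the reference version: `get_version_mismatches` picks the most common version with `max_by_key` over a `HashMap`, so when two versions are equally common (for example two hosts on different builds) the chosen reference depends on iteration order and may differ between runs. A minimal sketch with a deterministic tie-break follows. `version_mismatches` is a hypothetical free function over a plain hostname-to-version map, and preferring the lexicographically smaller version on ties is an arbitrary assumption made only to keep the result stable.

```rust
use std::collections::HashMap;

/// Given hostname -> agent version, return the reference version and the
/// sorted list of hosts that differ from it, or None if all hosts agree.
pub fn version_mismatches(
    versions: &HashMap<String, String>,
) -> Option<(String, Vec<String>)> {
    if versions.len() < 2 {
        return None; // need at least two hosts to compare
    }

    // Count how many hosts report each version.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for version in versions.values() {
        *counts.entry(version.as_str()).or_insert(0) += 1;
    }

    // Highest count wins; ties go to the lexicographically smaller version.
    let reference = counts
        .iter()
        .max_by(|a, b| a.1.cmp(b.1).then_with(|| b.0.cmp(a.0)))
        .map(|(version, _)| version.to_string())?;

    let mut outdated: Vec<String> = versions
        .iter()
        .filter(|(_, version)| version.as_str() != reference)
        .map(|(hostname, _)| hostname.clone())
        .collect();
    outdated.sort(); // stable order for logging

    if outdated.is_empty() {
        None
    } else {
        Some((reference, outdated))
    }
}
```

Because the check runs on every metrics update, a stable reference and a sorted host list also keep the repeated warnings identical from one pass to the next.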


@@ -92,6 +92,51 @@ impl HostWidgets {
}
}
/// Terminal popup for streaming command output
#[derive(Clone)]
pub struct TerminalPopup {
/// Is the popup currently visible
pub visible: bool,
/// Command being executed
pub command_type: CommandType,
/// Target hostname
pub hostname: String,
/// Target service/operation name
pub target: String,
/// Output lines collected so far
pub output_lines: Vec<String>,
/// Scroll offset for the output
pub scroll_offset: usize,
/// Start time of the operation
pub start_time: Instant,
}
impl TerminalPopup {
pub fn new(command_type: CommandType, hostname: String, target: String) -> Self {
Self {
visible: true,
command_type,
hostname,
target,
output_lines: Vec::new(),
scroll_offset: 0,
start_time: Instant::now(),
}
}
pub fn add_output_line(&mut self, line: String) {
self.output_lines.push(line);
// Auto-scroll to bottom when new content arrives
if self.output_lines.len() > 20 {
self.scroll_offset = self.output_lines.len().saturating_sub(20);
}
}
pub fn close(&mut self) {
self.visible = false;
}
}
/// Main TUI application
pub struct TuiApp {
/// Widget states per host (hostname -> HostWidgets)
@@ -108,6 +153,8 @@ pub struct TuiApp {
should_quit: bool,
/// Track if user manually navigated away from localhost
user_navigated_away: bool,
/// Terminal popup for streaming command output
terminal_popup: Option<TerminalPopup>,
}
impl TuiApp {
@@ -120,6 +167,7 @@ impl TuiApp {
focused_panel: PanelType::System, // Start with System panel focused
should_quit: false,
user_navigated_away: false,
terminal_popup: None,
}
}
@@ -250,6 +298,38 @@ impl TuiApp {
/// Handle keyboard input
pub fn handle_input(&mut self, event: Event) -> Result<Option<UiCommand>> {
if let Event::Key(key) = event {
// If terminal popup is visible, handle popup-specific keys first
if let Some(ref mut popup) = self.terminal_popup {
if popup.visible {
match key.code {
KeyCode::Esc | KeyCode::Char('q') => {
popup.close();
self.terminal_popup = None;
return Ok(None);
}
KeyCode::Up => {
popup.scroll_offset = popup.scroll_offset.saturating_sub(1);
return Ok(None);
}
KeyCode::Down => {
let max_scroll = popup.output_lines.len().saturating_sub(20);
popup.scroll_offset = (popup.scroll_offset + 1).min(max_scroll);
return Ok(None);
}
_ => return Ok(None), // Consume all other keys when popup is open
}
}
}
match key.code {
KeyCode::Char('q') => {
self.should_quit = true;
@@ -266,6 +346,12 @@ impl TuiApp {
// System rebuild command
if let Some(hostname) = self.current_host.clone() {
self.start_command(&hostname, CommandType::SystemRebuild, hostname.clone());
// Open terminal popup for real-time output
self.terminal_popup = Some(TerminalPopup::new(
CommandType::SystemRebuild,
hostname.clone(),
"NixOS Rebuild".to_string()
));
return Ok(Some(UiCommand::SystemRebuild { hostname }));
}
}
@@ -473,6 +559,25 @@ impl TuiApp {
}
}
/// Add output line to terminal popup
pub fn add_terminal_output(&mut self, hostname: &str, line: String) {
if let Some(ref mut popup) = self.terminal_popup {
if popup.hostname == hostname && popup.visible {
popup.add_output_line(line);
}
}
}
/// Close terminal popup for a specific hostname
pub fn close_terminal_popup(&mut self, hostname: &str) {
if let Some(ref mut popup) = self.terminal_popup {
if popup.hostname == hostname {
popup.close();
self.terminal_popup = None;
}
}
}
/// Check for rebuild completion by detecting agent hash changes
pub fn check_rebuild_completion(&mut self, metric_store: &MetricStore) {
let mut hosts_to_complete = Vec::new();
@@ -636,6 +741,13 @@ impl TuiApp {
// Render statusbar at the bottom
self.render_statusbar(frame, main_chunks[2]); // main_chunks[2] is the statusbar area
// Render terminal popup on top of everything else
if let Some(ref popup) = self.terminal_popup {
if popup.visible {
self.render_terminal_popup(frame, size, popup);
}
}
}
/// Render btop-style minimal title with host status colors
@@ -835,4 +947,112 @@ impl TuiApp {
}
}
/// Render terminal popup with streaming output
fn render_terminal_popup(&self, frame: &mut Frame, area: Rect, popup: &TerminalPopup) {
use ratatui::{
style::{Color, Modifier, Style},
text::{Line, Span},
widgets::{Block, Borders, Clear, Paragraph, Wrap},
};
// Calculate popup size (80% of screen, centered)
let popup_width = area.width * 80 / 100;
let popup_height = area.height * 80 / 100;
let popup_x = (area.width - popup_width) / 2;
let popup_y = (area.height - popup_height) / 2;
let popup_area = Rect {
x: popup_x,
y: popup_y,
width: popup_width,
height: popup_height,
};
// Clear background
frame.render_widget(Clear, popup_area);
// Create terminal-style block
let title = format!(" {}: {} ({:.1}s) ",
popup.hostname,
popup.target,
popup.start_time.elapsed().as_secs_f32()
);
let block = Block::default()
.title(title)
.borders(Borders::ALL)
.border_style(Style::default().fg(Color::Cyan))
.style(Style::default().bg(Color::Black));
let inner_area = block.inner(popup_area);
frame.render_widget(block, popup_area);
// Render output content
let available_height = inner_area.height as usize;
let total_lines = popup.output_lines.len();
// Calculate which lines to show based on scroll offset
let start_line = popup.scroll_offset;
let end_line = (start_line + available_height).min(total_lines);
let visible_lines: Vec<Line> = popup.output_lines[start_line..end_line]
.iter()
.map(|line| {
// Style output lines with terminal colors
if line.contains("error") || line.contains("Error") || line.contains("failed") {
Line::from(Span::styled(line.clone(), Style::default().fg(Color::Red)))
} else if line.contains("warning") || line.contains("Warning") {
Line::from(Span::styled(line.clone(), Style::default().fg(Color::Yellow)))
} else if line.contains("building") || line.contains("Building") {
Line::from(Span::styled(line.clone(), Style::default().fg(Color::Blue)))
} else if line.contains("success") || line.contains("completed") {
Line::from(Span::styled(line.clone(), Style::default().fg(Color::Green)))
} else {
Line::from(Span::styled(line.clone(), Style::default().fg(Color::White)))
}
})
.collect();
let content = Paragraph::new(visible_lines)
.wrap(Wrap { trim: false })
.style(Style::default().bg(Color::Black));
frame.render_widget(content, inner_area);
// Render scroll indicator if needed
if total_lines > available_height {
let scroll_info = format!(" {}% ",
if total_lines > 0 {
(end_line * 100) / total_lines
} else {
100
}
);
let scroll_area = Rect {
x: popup_area.x + popup_area.width - scroll_info.len() as u16 - 1,
y: popup_area.y + popup_area.height - 1,
width: scroll_info.len() as u16,
height: 1,
};
let scroll_widget = Paragraph::new(scroll_info)
.style(Style::default().fg(Color::Cyan).bg(Color::Black));
frame.render_widget(scroll_widget, scroll_area);
}
// Instructions at bottom
let instructions = " ESC/Q: Close • ↑↓: Scroll ";
let instructions_area = Rect {
x: popup_area.x + 1,
y: popup_area.y + popup_area.height - 1,
// Count characters, not bytes: the hint text contains multi-byte glyphs
width: instructions.chars().count() as u16,
height: 1,
};
let instructions_widget = Paragraph::new(instructions)
.style(Style::default().fg(Color::Gray).bg(Color::Black));
frame.render_widget(instructions_widget, instructions_area);
}
}
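
The popup uses a fixed 20-line window in `add_output_line` and in the `Down` handler, while `render_terminal_popup` slices by the real `available_height`. On terminals where the inner area is not 20 rows, auto-scroll and rendering can therefore disagree. As a rough sketch of one way to keep them aligned, the hypothetical helper below (`visible_window` is not part of the diff) clamps the offset against the actual viewport height and returns the slice bounds to render.

```rust
/// Hypothetical helper: clamp `scroll_offset` to the valid range for the
/// given viewport and return the `(start, end)` bounds of the visible lines.
/// `end` is exclusive, so the result can be used as `lines[start..end]`.
pub fn visible_window(
    total_lines: usize,
    scroll_offset: usize,
    viewport_height: usize,
) -> (usize, usize) {
    // Largest offset that still fills the viewport (0 if everything fits).
    let max_scroll = total_lines.saturating_sub(viewport_height);
    let start = scroll_offset.min(max_scroll);
    let end = (start + viewport_height).min(total_lines);
    (start, end)
}
```

Because `start` never exceeds `end`, slicing with these bounds cannot panic, even if a stale offset is larger than the current line count.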

@@ -339,9 +339,9 @@ impl Widget for SystemWidget {
self.active_users = Some(users.clone());
}
}
-"system_agent_hash" => {
-if let MetricValue::String(hash) = &metric.value {
-self.agent_hash = Some(hash.clone());
+"agent_version" => {
+if let MetricValue::String(version) = &metric.value {
+self.agent_hash = Some(version.clone());
}
}
@@ -422,9 +422,19 @@ impl SystemWidget {
Span::styled("NixOS:", Typography::widget_title())
]));
let build_text = self.nixos_build.as_deref().unwrap_or("unknown");
lines.push(Line::from(vec![
Span::styled(format!("Build: {}", build_text), Typography::secondary())
]));
let config_text = self.config_hash.as_deref().unwrap_or("unknown");
lines.push(Line::from(vec![
-Span::styled(format!("Build: {}", config_text), Typography::secondary())
+Span::styled(format!("Config: {}", config_text), Typography::secondary())
]));
let users_text = self.active_users.as_deref().unwrap_or("unknown");
lines.push(Line::from(vec![
Span::styled(format!("Active users: {}", users_text), Typography::secondary())
]));
let agent_hash_text = self.agent_hash.as_deref().unwrap_or("unknown");

@@ -9,6 +9,17 @@ pub struct MetricMessage {
pub metrics: Vec<Metric>,
}
/// Command output streaming message
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CommandOutputMessage {
pub hostname: String,
pub command_id: String,
pub command_type: String,
pub output_line: String,
pub is_complete: bool,
pub timestamp: u64,
}
impl MetricMessage {
pub fn new(hostname: String, metrics: Vec<Metric>) -> Self {
Self {
@@ -19,6 +30,19 @@ impl MetricMessage {
}
}
impl CommandOutputMessage {
pub fn new(hostname: String, command_id: String, command_type: String, output_line: String, is_complete: bool) -> Self {
Self {
hostname,
command_id,
command_type,
output_line,
is_complete,
timestamp: chrono::Utc::now().timestamp() as u64,
}
}
}
/// Commands that can be sent from dashboard to agent
#[derive(Debug, Serialize, Deserialize)]
pub enum Command {
@@ -55,6 +79,7 @@ pub enum MessageType {
Metrics,
Command,
CommandResponse,
CommandOutput,
Heartbeat,
}
@@ -80,6 +105,13 @@ impl MessageEnvelope {
})
}
pub fn command_output(message: CommandOutputMessage) -> Result<Self, crate::SharedError> {
Ok(Self {
message_type: MessageType::CommandOutput,
payload: serde_json::to_vec(&message)?,
})
}
pub fn heartbeat() -> Result<Self, crate::SharedError> {
Ok(Self {
message_type: MessageType::Heartbeat,
@@ -113,4 +145,13 @@ impl MessageEnvelope {
}),
}
}
pub fn decode_command_output(&self) -> Result<CommandOutputMessage, crate::SharedError> {
match self.message_type {
MessageType::CommandOutput => Ok(serde_json::from_slice(&self.payload)?),
_ => Err(crate::SharedError::Protocol {
message: "Expected command output message".to_string(),
}),
}
}
}
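
The diff defines `CommandOutputMessage` with `hostname`, `command_id`, and `is_complete`, but the consumer side is not shown in this section. Below is a minimal, stdlib-only sketch of how a receiver might use those fields. The `OutputSession` type and its `accept` method are hypothetical, the message struct is a simplified stand-in without the serde derives, and skipping an empty line on the final message is an assumption about how completion is signalled.

```rust
/// Simplified stand-in for the diff's `CommandOutputMessage`
/// (serde derives, `command_type`, and `timestamp` are omitted here).
#[derive(Debug, Clone)]
pub struct CommandOutputMessage {
    pub hostname: String,
    pub command_id: String,
    pub output_line: String,
    pub is_complete: bool,
}

/// Hypothetical consumer state for a single running command.
#[derive(Debug)]
pub struct OutputSession {
    hostname: String,
    command_id: String,
    pub lines: Vec<String>,
    pub finished: bool,
}

impl OutputSession {
    pub fn new(hostname: &str, command_id: &str) -> Self {
        Self {
            hostname: hostname.to_string(),
            command_id: command_id.to_string(),
            lines: Vec::new(),
            finished: false,
        }
    }

    /// Returns true if the message belonged to this session and was applied.
    pub fn accept(&mut self, msg: &CommandOutputMessage) -> bool {
        // Ignore output from other hosts or commands, and anything after completion.
        if self.finished || msg.hostname != self.hostname || msg.command_id != self.command_id {
            return false;
        }
        // Assumption: a final message with an empty line is only a completion marker.
        if !(msg.is_complete && msg.output_line.is_empty()) {
            self.lines.push(msg.output_line.clone());
        }
        self.finished = msg.is_complete;
        true
    }
}
```

Matching on `command_id` as well as `hostname` would keep output from an earlier rebuild on the same host out of a newly opened popup, which the hostname-only check in `add_terminal_output` does not guarantee.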