Simplifies host connection configuration by removing the tailscale_ip field,
connection_type preferences, and the fallback retry logic. Connections now use
only the ip field, falling back to the hostname. Also eliminates the blocking
TCP connectivity tests that interfered with heartbeat processing.
This resolves intermittent host lost/found issues by removing the
connection retry timeouts that blocked the ZMQ message processing loop.
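A rough sketch of the simplified endpoint resolution (struct and field names are assumptions, not the actual code):

```rust
/// Host entry after the simplification: no tailscale_ip, no
/// connection_type, no retry settings.
struct HostConfig {
    hostname: String,
    ip: Option<String>,
}

impl HostConfig {
    /// Resolve the ZMQ endpoint without any blocking TCP probe.
    fn zmq_endpoint(&self, port: u16) -> String {
        let target = self.ip.as_deref().unwrap_or(self.hostname.as_str());
        format!("tcp://{}:{}", target, port)
    }
}
```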
- Add dedicated heartbeat transmission every 5 seconds independent of metric collection
- Fix host offline detection by clearing metrics for disconnected hosts
- Move exclude_email_metrics to NotificationConfig for better organization
- Add cleanup_offline_hosts method to remove stale metrics after heartbeat timeout
- Ensure offline hosts show proper status icons and visual indicators
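Sketch of the dedicated heartbeat tick described above (illustrative only; the real transmitter publishes the heartbeat over ZMQ):

```rust
use std::{thread, time::Duration};

/// Dedicated 5-second heartbeat loop, decoupled from collector intervals.
/// `send_heartbeat` stands in for the agent's heartbeat transmission.
fn spawn_heartbeat(send_heartbeat: impl Fn() + Send + 'static) -> thread::JoinHandle<()> {
    thread::spawn(move || loop {
        send_heartbeat();
        thread::sleep(Duration::from_secs(5));
    })
}
```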
Version 0.1.63
- Add agent_heartbeat metric to agent transmission for reliable host detection
- Update dashboard to track heartbeat timestamps per host instead of general metrics
- Add configurable heartbeat_timeout_seconds to dashboard ZMQ config (default 10s)
- Remove unused timeout_ms from agent config and revert to non-blocking command reception
- Remove unused heartbeat_interval_ms from agent configuration
- Host disconnect detection now uses dedicated heartbeat metrics for improved reliability
- Bump version to 0.1.57
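Sketch of the dashboard-side heartbeat tracking described above (names are illustrative; only the timestamps matter for offline detection):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Per-host heartbeat tracking with the configurable timeout
/// (heartbeat_timeout_seconds, default 10s).
struct HeartbeatTracker {
    last_seen: HashMap<String, Instant>,
    timeout: Duration,
}

impl HeartbeatTracker {
    /// Called whenever an agent_heartbeat metric arrives for a host.
    fn record(&mut self, host: &str) {
        self.last_seen.insert(host.to_string(), Instant::now());
    }

    /// Hosts whose heartbeat is older than the timeout are treated as
    /// offline, and their cached metrics can then be cleared.
    fn offline_hosts(&self) -> Vec<String> {
        self.last_seen
            .iter()
            .filter(|(_, seen)| seen.elapsed() > self.timeout)
            .map(|(host, _)| host.clone())
            .collect()
    }
}
```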
- Add exclude_email_metrics field to AgentConfig for filtering email notifications
- Metrics matching excluded names skip notification processing but still appear in dashboard
- Optional field with serde(default) for backward compatibility
- Bump version to 0.1.56
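Sketch of how the exclude_email_metrics field above could look (the field type is an assumption):

```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct AgentConfig {
    /// Metric names to exclude from email notifications; serde(default)
    /// keeps older config files without the field parsing cleanly.
    #[serde(default)]
    exclude_email_metrics: Vec<String>,
    // remaining fields omitted
}

fn should_notify(config: &AgentConfig, metric_name: &str) -> bool {
    // Excluded metrics still reach the dashboard; only email
    // notification processing is skipped.
    !config.exclude_email_metrics.iter().any(|m| m.as_str() == metric_name)
}
```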
Correct user-stopped service behavior during startup transitions:
User-Stopped Flag Timing Fix:
- Clear user-stopped flag only when service actually becomes active, not when start command succeeds
- Remove premature flag clearing from service control handler
- Add automatic flag clearing when service status metrics show active state
- Services retain user-stopped status during activating/transitioning states
Service Transition Handling:
- User-stopped services in activating state now report Status::Ok instead of Status::Pending
- Prevents host warnings during legitimate service startup transitions
- Maintains accurate status reporting throughout service lifecycle
- Failed service starts preserve user-stopped flags correctly
Journalctl Popup Fix:
- Fix terminal corruption when using J key for service logs
- Correct command quoting to prevent tmux popup interference
- Stable popup display without dashboard interface corruption
Result: Clean service startup experience with no false warnings and proper
user-stopped tracking throughout the entire service lifecycle.
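Sketch of the status-driven flag clearing (tracker shape and state strings are illustrative):

```rust
use std::collections::HashSet;

/// Minimal stand-in for the agent's service tracker.
struct ServiceTracker {
    user_stopped: HashSet<String>,
}

impl ServiceTracker {
    /// The flag is cleared only when systemd reports the unit active, so
    /// units in `activating` keep their user-stopped status and continue
    /// to report Status::Ok instead of raising a host warning.
    fn on_service_state(&mut self, service: &str, systemd_state: &str) {
        if systemd_state == "active" {
            self.user_stopped.remove(service);
        }
        // `activating`, `inactive`, `failed`: leave the flag untouched.
    }
}
```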
Bump version to v0.1.47
Improve user-stopped service tracking behavior:
Service Display Fix:
- Services widget now shows actual systemctl status (active/inactive)
- Use info.status instead of hardcoded text based on widget_status
- User-stopped services correctly display 'inactive' with green OK icon
- Prevents misleading 'active' display for stopped services
User-Stopped Flag Timing Fix:
- Clear user-stopped flag AFTER successful service start, not when command sent
- Prevents warnings during service startup transition period
- Service remains Status::Ok during 'activating' state for user-stopped services
- Flag only cleared when systemctl start command actually succeeds
- Failed start attempts preserve user-stopped flag
Result: Clean service state tracking with accurate display and no false alerts
during intentional user operations.
Bump version to v0.1.45
Service stop/start operations were failing because systemctl commands
were missing the .service suffix. This caused the new user-stopped
tracking feature to mark services but not actually control them.
Changes:
- Add .service suffix to systemctl commands in service control handler
- Matches pattern used throughout systemd collector
- Fixes service start/stop functionality via dashboard
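Sketch of the corrected invocation (sudo handling and error reporting omitted):

```rust
use std::process::Command;

/// Append the explicit `.service` suffix, matching the pattern used by
/// the systemd collector.
fn control_service(action: &str, name: &str) -> std::io::Result<bool> {
    let unit = format!("{}.service", name);
    let status = Command::new("systemctl").arg(action).arg(&unit).status()?;
    Ok(status.success())
}
```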
Clean up legacy documentation:
- Remove outdated TODO.md, AGENTS.md, and test files
- Update CLAUDE.md with current architecture and rules only
- Comprehensive README.md rewrite with technical documentation
- Document user-stopped service tracking feature
Bump version to v0.1.44
Add comprehensive tracking for services stopped via dashboard to prevent
false alerts when users intentionally stop services.
Features:
- User-stopped services report Status::Ok instead of Warning
- Persistent storage survives agent restarts
- Dashboard sends UserStart/UserStop commands
- Agent tracks and syncs user-stopped state globally
- Systemd collector respects user-stopped flags
Implementation:
- New service_tracker module with persistent JSON storage
- Enhanced ServiceAction enum with UserStart/UserStop variants
- Global singleton tracker accessible by collectors
- Service status logic updated to check user-stopped flag
- Dashboard version now uses CARGO_PKG_VERSION automatically
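Sketch of the persistent store, assuming serde_json and illustrative names; the real module wraps this in a global singleton:

```rust
use std::collections::HashSet;
use std::fs;
use std::path::Path;

#[derive(Default, serde::Serialize, serde::Deserialize)]
struct UserStoppedState {
    services: HashSet<String>,
}

impl UserStoppedState {
    /// Survives agent restarts: missing or unreadable files fall back to
    /// an empty state instead of failing.
    fn load(path: &Path) -> Self {
        fs::read_to_string(path)
            .ok()
            .and_then(|raw| serde_json::from_str(&raw).ok())
            .unwrap_or_default()
    }

    fn save(&self, path: &Path) -> std::io::Result<()> {
        let json = serde_json::to_string_pretty(self).expect("state is serializable");
        fs::write(path, json)
    }
}
```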
Bump version to v0.1.43
Simplified keyboard controls by removing service restart functionality:
- Removed 'r' key restart functionality from Services panel
- Made 'R' key always trigger system rebuild regardless of focused panel
- Updated context shortcuts to show 'R: Rebuild Host' globally
- Removed all ServiceRestart enum variants and associated code:
- UiCommand::ServiceRestart
- CommandType::ServiceRestart
- ServiceAction::Restart
- Cleaned up pending transition logic to only handle Start/Stop commands
The 'R' key now consistently rebuilds the current host from any panel,
while 's' and 'S' continue to handle service start/stop in Services panel.
- Fix /tmp usage status to use proper thresholds instead of hardcoded Ok status
- Fix wear level status to use configurable thresholds instead of hardcoded values
- Add dedicated tmp_status field to SystemWidget for proper /tmp status display
- Remove host-level hourglass icon during service operations
- Implement immediate service status updates after start/stop/restart commands
- Remove active users display and collection from NixOS section
- Fix immediate host status aggregation transmission to dashboard
- Separate dashboard updates from email notifications for immediate status aggregation
- Add metric caching to MetricCollectionManager for instant dashboard updates
- Dashboard now receives cached data every 1 second instead of waiting for collection intervals
- Fix transmission to use cached metrics rather than triggering fresh collection
- Email notifications maintain separate 60-second batching interval
- Update configurable email notification aggregation interval
- Fix email notification aggregation to send batched notifications instead of individual emails
- Fix startup data collection to properly process initial status without triggering change notifications
- Maintain event-driven transmission while preserving aggregated notification batching
- Update version from 0.1.19 to 0.1.20 across all components
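Sketch of the metric caching mentioned above (MetricCollectionManager cache plus 1-second cached transmission); names are illustrative: collectors write into the cache on their own schedules while the transmission tick reads a cheap snapshot.

```rust
use std::collections::HashMap;

struct Metric {
    name: String,
    value: f64,
}

#[derive(Default)]
struct MetricCache {
    latest: HashMap<String, Metric>,
}

impl MetricCache {
    /// Collectors update the cache whenever they run.
    fn update(&mut self, metric: Metric) {
        self.latest.insert(metric.name.clone(), metric);
    }

    /// The transmitter calls this every second; no collection is triggered.
    fn snapshot(&self) -> Vec<&Metric> {
        self.latest.values().collect()
    }
}
```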
Major architectural improvements:
CORE CHANGES:
- Remove notification_interval_seconds - status aggregation now immediate
- Status calculation moved to collection phase instead of transmission
- Event-driven transmission triggers immediately on status changes
- Dual transmission strategy: immediate on change + periodic backup
- Real-time notifications without batching delays
TECHNICAL IMPROVEMENTS:
- process_metric() now returns bool indicating status change
- Immediate ZMQ broadcast when status changes detected
- Status aggregation happens during metric collection, not later
- Legacy get_nixos_build_info() method removed (unused)
- All compilation warnings fixed
BEHAVIOR CHANGES:
- Critical alerts sent instantly instead of waiting for intervals
- Dashboard receives real-time status updates
- Notifications triggered immediately on status transitions
- Periodic backup transmission every 1s acts as a heartbeat
This provides much more responsive monitoring with instant alerting
while maintaining the reliability of periodic transmission as backup.
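Roughly how the change-detection return value drives immediate transmission (types are stand-ins, not the actual code):

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Status { Ok, Warning, Critical }

#[derive(Default)]
struct HostState {
    per_metric: HashMap<String, Status>,
    aggregate: Option<Status>,
}

impl HostState {
    /// Aggregation happens here, during collection. A `true` return tells
    /// the caller to broadcast over ZMQ immediately; the periodic 1s
    /// transmission stays in place as the heartbeat/backup path.
    fn process_metric(&mut self, name: &str, status: Status) -> bool {
        self.per_metric.insert(name.to_string(), status);
        let new_aggregate = self.per_metric.values().copied().max();
        let changed = new_aggregate != self.aggregate;
        self.aggregate = new_aggregate;
        changed
    }
}
```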
- Implement per-collector interval timing respecting NixOS config
- Remove all hardcoded timeout/interval values and make configurable
- Add tmux session requirement check for TUI mode (bypassed for headless)
- Update agent to send config hash in Build field instead of nixos version
- Add nginx check interval, HTTP timeouts, and ZMQ transmission interval configs
- Update NixOS configuration with new configurable values
Breaking changes:
- Build field now shows nix store config hash (8 chars) instead of nixos version
- All intervals now follow individual collector configuration instead of global
New configuration fields:
- systemd.nginx_check_interval_seconds
- systemd.http_timeout_seconds
- systemd.http_connect_timeout_seconds
- zmq.transmission_interval_seconds
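As Rust config structs, the new fields above might look like the following (types are assumptions; the NixOS module generates the matching TOML):

```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct SystemdConfig {
    nginx_check_interval_seconds: u64,
    http_timeout_seconds: u64,
    http_connect_timeout_seconds: u64,
    // existing collector settings omitted
}

#[derive(Deserialize)]
struct ZmqConfig {
    transmission_interval_seconds: u64,
    // endpoint settings omitted
}
```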
- Remove all SystemRebuild command infrastructure from agent and dashboard
- Replace with direct tmux popup execution: ssh {user}@{host} {alias}
- Add configurable SSH user and rebuild alias in dashboard config
- Eliminate agent process crashes during rebuilds
- Simplify architecture by removing ZMQ command streaming complexity
- Clean up all related dead code and fix compilation warnings
Benefits:
- Process isolation: rebuild runs independently via SSH
- Crash resilience: agent/dashboard can restart without affecting rebuilds
- Configuration flexibility: SSH user and alias configurable per deployment
- Operational simplicity: standard tmux popup interface
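Sketch of the popup invocation (tmux display-popup flags and command shape are assumptions; the real dashboard substitutes the configured SSH user and rebuild alias):

```rust
use std::process::Command;

/// Open a tmux popup that runs the rebuild alias on the target host.
fn open_rebuild_popup(user: &str, host: &str, alias: &str) -> std::io::Result<()> {
    let remote = format!("ssh {}@{} {}", user, host, alias);
    Command::new("tmux")
        .args(["display-popup", "-E", remote.as_str()])
        .status()
        .map(|_| ())
}
```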
- Add cm-rebuild systemd service for process isolation
- Add sudo permissions for service control and journal access
- Remove verbose flag for cleaner output
- Ensures reliable rebuild operations without agent crashes
- Remove auto-close behavior from terminal popup for manual review
- Fix system panel to show correct NixOS section layout
- Add missing Active users line after Agent version
- Switch agent version from nix store hash to actual version number (v0.1.11)
- Display full version string without truncation for clear version tracking
- Replace simulated progress messages with actual stdout/stderr capture
- Stream all nixos-rebuild output line-by-line to terminal popup
- Show transparent build process including downloads, compilation, and activation
- Maintain real-time visibility into complete rebuild process
- Add terminal popup UI component with 80% screen coverage and terminal styling
- Extend ZMQ protocol with CommandOutputMessage for streaming output
- Implement real-time output streaming in agent system rebuild handler
- Add keyboard controls (ESC/Q to close, ↑↓ to scroll) for popup interaction
- Fix system panel Build display to show actual NixOS build instead of config hash
- Update service filters in README with wildcard patterns for better matching
- Add periodic progress updates during nixos-rebuild execution
- Integrate command output handling in dashboard main loop
- Agent reports version via agent_version metric using nix store hash
- Dashboard displays agent version in system widget
- Foundation for cross-host version comparison
- Both agent -V and dashboard show versions
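Sketch of the line-by-line rebuild output streaming described above; the real agent wraps each line in a CommandOutputMessage over ZMQ, a callback stands in here, and stderr would be drained the same way on a second thread:

```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};

fn stream_rebuild(mut on_line: impl FnMut(String)) -> std::io::Result<()> {
    let mut child = Command::new("nixos-rebuild")
        .arg("switch")
        .stdout(Stdio::piped())
        .spawn()?;

    if let Some(stdout) = child.stdout.take() {
        for line in BufReader::new(stdout).lines() {
            on_line(line?); // forward immediately, no buffering of the whole build
        }
    }
    child.wait()?;
    Ok(())
}
```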
Redirect nixos-rebuild stdout/stderr to /var/log/cm-dashboard/nixos-rebuild.log
while keeping the process detached. This allows monitoring rebuild progress
and debugging why cargo builds in /tmp aren't visible when the agent runs.
Use: tail -f /var/log/cm-dashboard/nixos-rebuild.log to monitor progress.
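Sketch of the detached spawn with both streams redirected to that log file (privilege handling omitted):

```rust
use std::fs::OpenOptions;
use std::process::{Command, Stdio};

fn spawn_detached_rebuild() -> std::io::Result<()> {
    let log = OpenOptions::new()
        .create(true)
        .append(true)
        .open("/var/log/cm-dashboard/nixos-rebuild.log")?;

    Command::new("nohup")
        .args(["nixos-rebuild", "switch"])
        .stdout(Stdio::from(log.try_clone()?))
        .stderr(Stdio::from(log))
        .spawn()?; // not waited on: the agent returns success immediately
    Ok(())
}
```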
Run nixos-rebuild with nohup in the background to prevent the agent
from killing itself during a system rebuild. The rebuild process
now runs independently, allowing the agent to return success
immediately and avoid crashes during binary updates.
This fixes the issue where the agent would crash during the rebuild
and restart with the old binary due to a missing daemon-reload.
Fixes kernel namespace sandboxing issues when running as a systemd service.
The --no-sandbox flag disables Nix build sandboxing, which requires
kernel namespaces that are not available in restricted service environments.
Remove the sudo -u cm wrapper that was causing a git repository ownership
mismatch. cm-agent now runs nixos-rebuild directly as root, avoiding the
ownership conflict between cm-agent (which performs the git clone) and the cm user.
Updated sudo rules to allow cm-agent -> root nixos-rebuild access.
Use explicit /run/current-system/sw/bin/nixos-rebuild path instead of
'nixos-rebuild' command to match sudo rules exactly. This resolves
'command not allowed' errors when the command resolves to nix store paths.
- Add nixos_config_api_key_file option to NixOS configuration
- Support reading API token from file for private repositories
- Automatically inject token into HTTPS URLs (https://token@host/repo.git)
- Graceful fallback to original URL if key file missing/empty
- Default key file location: /var/lib/cm-dashboard/git-api-key
Usage: echo 'your-api-token' | sudo tee /var/lib/cm-dashboard/git-api-key
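Sketch of the injection logic, following the behavior described above:

```rust
use std::fs;

/// Splice the token into an https:// URL; fall back to the original URL
/// if the key file is missing or empty.
fn with_api_token(url: &str, key_file: &str) -> String {
    let token = fs::read_to_string(key_file)
        .map(|s| s.trim().to_string())
        .unwrap_or_default();
    if token.is_empty() || !url.starts_with("https://") {
        return url.to_string();
    }
    url.replacen("https://", &format!("https://{}@", token), 1)
}
```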
Replace direct directory access with git clone/pull approach:
- Add git configuration options (url, branch, working_dir) to NixOS module
- Update SystemConfig and AgentCommand to use git parameters
- Implement ensure_git_repository() method for clone/pull operations
- Agent clones nixosbox to /var/lib/cm-dashboard/nixos-config
- Maintains security while solving permission denied issues
The agent now manages its own copy of the configuration without
needing access to /home/cm directory.
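Sketch of the clone-or-pull step behind ensure_git_repository() (error handling trimmed):

```rust
use std::path::Path;
use std::process::Command;

fn ensure_git_repository(url: &str, branch: &str, working_dir: &Path) -> std::io::Result<bool> {
    let status = if working_dir.join(".git").exists() {
        // Existing checkout: fast-forward to the configured branch.
        Command::new("git")
            .current_dir(working_dir)
            .args(["pull", "origin", branch])
            .status()?
    } else {
        // First run: clone into the agent-owned working directory.
        Command::new("git")
            .args(["clone", "--branch", branch, url])
            .arg(working_dir)
            .status()?
    };
    Ok(status.success())
}
```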
- Add sudo permissions for systemctl and nixos-rebuild commands
- Use sudo in agent command execution for proper privileges
- Fix backup collector to handle missing status files gracefully
- Eliminate backup error spam when no backup system is configured
This implements the core functionality for executing remote commands through
the dashboard and providing real-time visual feedback to users.
Key Features:
- Remote service control (start/stop/restart) via existing keyboard shortcuts
- System rebuild command with maintenance mode integration
- Real-time visual feedback with service status transitions
- ZMQ command protocol extension for service and system operations
Implementation Details:
- Extended AgentCommand enum with ServiceControl and SystemRebuild variants
- Added agent-side handlers for systemctl and nixos-rebuild execution
- Implemented command status tracking system for visual feedback
- Enhanced services widget to show progress states (⏳ restarting)
- Integrated command execution with existing keyboard navigation
Keyboard Controls:
- Services Panel: Space (start/stop), R (restart)
- System Panel: R (nixos-rebuild switch)
- Backup Panel: B (trigger backup)
Technical Architecture:
- Command flow: UI → Dashboard → ZMQ → Agent → systemctl/nixos-rebuild
- Status tracking: InProgress/Success/Failed states with visual indicators
- Maintenance mode: Automatic /tmp/cm-maintenance file management
- Service feedback: Icon transitions (● → ⏳ → ● with status text)
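Sketch of the protocol extension (variant payloads are assumptions); the dashboard serializes these over ZMQ and the agent maps them to systemctl / nixos-rebuild calls:

```rust
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
enum AgentCommand {
    ServiceControl { service: String, action: ServiceAction },
    SystemRebuild,
}

#[derive(Serialize, Deserialize)]
enum ServiceAction {
    Start,
    Stop,
    Restart,
}

/// Per-command feedback states backing the widget's icon transitions.
#[derive(Serialize, Deserialize)]
enum CommandStatus {
    InProgress,
    Success,
    Failed,
}
```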
- Remove all Default implementations from agent configuration structs
- Make configuration file required for agent startup
- Update NixOS module to generate complete agent.toml configuration
- Add comprehensive configuration options to NixOS module including:
- Service include/exclude patterns for systemd collector
- All thresholds and intervals
- ZMQ communication settings
- Notification and cache configuration
- Agent now fails fast if no configuration provided
- Eliminates configuration drift between defaults and NixOS settings
Add comprehensive hysteresis support to prevent status oscillation near
threshold boundaries while maintaining responsive alerting.
Key Features:
- HysteresisThresholds with configurable upper/lower limits
- StatusTracker for per-metric status history
- Default gaps: CPU load 10%, memory 5%, disk temp 5°C
Updated Components:
- CPU load collector (5-minute average with hysteresis)
- Memory usage collector (percentage-based thresholds)
- Disk temperature collector (SMART data monitoring)
- All collectors updated to support StatusTracker interface
Cache Interval Adjustments:
- Service status: 60s → 10s (faster response)
- Disk usage: 300s → 60s (more frequent checks)
- Backup status: 900s → 60s (quicker updates)
- SMART data: moved to 600s tier (10 minutes)
Architecture:
- Individual metric status calculation in collectors
- Centralized StatusTracker in MetricCollectionManager
- Status aggregation preserved in dashboard widgets
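Sketch of the hysteresis evaluation described above (field names are assumptions): a metric only escalates above the upper limit and only recovers below the lower limit, so values hovering near a single threshold stop flipping status.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Status { Ok, Warning }

struct HysteresisThresholds {
    upper: f64, // escalate when the value rises above this
    lower: f64, // recover when the value falls back below this (upper minus gap)
}

impl HysteresisThresholds {
    fn evaluate(&self, value: f64, previous: Status) -> Status {
        match previous {
            Status::Ok if value > self.upper => Status::Warning,
            Status::Warning if value < self.lower => Status::Ok,
            unchanged => unchanged,
        }
    }
}
```

With the default CPU load gap of 10%, for example, a load oscillating just around the warning threshold stays in its current state until it clearly crosses the other limit.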
- Add BackupCollector for reading TOML status files with disk space metrics
- Implement BackupWidget with disk usage display and service status details
- Fix backup script disk space parsing by adding missing capture_output=True
- Update backup widget to show actual disk usage instead of repository size
- Fix timestamp parsing to use backup completion time instead of start time
- Resolve timezone issues by using UTC timestamps in backup script
- Add disk identification metrics (product name, serial number) to backup status
- Enhance UI layout with proper backup monitoring integration
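Sketch of the status-file parsing, assuming the toml crate and illustrative field names; missing files simply yield no backup data:

```rust
use serde::Deserialize;
use std::path::Path;

#[derive(Deserialize)]
struct BackupStatus {
    completed_at: String,         // backup completion time (UTC)
    disk_used_bytes: u64,
    disk_total_bytes: u64,
    disk_product: Option<String>, // disk identification
    disk_serial: Option<String>,
}

fn read_backup_status(path: &Path) -> Option<BackupStatus> {
    let raw = std::fs::read_to_string(path).ok()?;
    toml::from_str(&raw).ok()
}
```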
This commit addresses several key issues identified during development:
Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add intelligent process monitoring to CpuCollector using ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information
Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering to prevent self-monitoring artifacts (ps commands)
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"
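Sketch of the ps-based lookup behind collect_top_cpu_process() (column handling simplified, self-filtering shown only crudely):

```rust
use std::process::Command;

fn top_cpu_process() -> Option<String> {
    let output = Command::new("ps")
        .args(["-eo", "pid,comm,%cpu", "--sort=-%cpu", "--no-headers"])
        .output()
        .ok()?;
    String::from_utf8_lossy(&output.stdout)
        .lines()
        .map(str::trim)
        .find(|line| !line.contains("ps")) // skip the monitoring command itself
        .map(|line| {
            let mut parts = line.split_whitespace();
            let pid = parts.next().unwrap_or("?");
            let name = parts.next().unwrap_or("?");
            let cpu = parts.next().unwrap_or("?");
            format!("{} (PID {}) {}%", name, pid, cpu)
        })
}
```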
Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy
UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface
Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details
Replaced system-wide disk usage with accurate per-service tracking by scanning
service-specific directories. Services like sshd now correctly show minimal
disk usage instead of misleading system totals.
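Sketch of the per-service measurement: du runs on the unit's own directory (the unit-to-directory mapping is project-specific and omitted here).

```rust
use std::path::Path;
use std::process::Command;

/// Size of a service's directory in bytes via `du -sb`.
fn service_disk_usage_bytes(dir: &Path) -> Option<u64> {
    let output = Command::new("du").arg("-sb").arg(dir).output().ok()?;
    String::from_utf8_lossy(&output.stdout)
        .split_whitespace()
        .next()?
        .parse()
        .ok()
}
```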
- Rename storage widget and add drive capacity/usage columns
- Move host display to main dashboard title for cleaner layout
- Replace separate alert displays with color-coded row highlighting
- Add per-service disk usage collection using du command
- Update services widget formatting to handle small disk values
- Restructure into workspace with dedicated agent and dashboard packages