- Replace complex SSH command patterns with simple script calls
- Create service-manage script for start/stop operations with proper logging
- Create rebuild script equivalent to rebuild_git alias with user feedback
- Update dashboard to use unified command pattern: sudo service-manage, sudo rebuild
- Simplify backup to use service management: service-manage start borgbackup
- Configure sudoers with wildcards for Nix store path compatibility
- Remove cmtec references from script names for better genericity
- Update version to 0.1.90
- Use bash -c to properly execute service start/stop command sequences
- Ensure SSH session stays alive for user input prompt
- Fix escaping issues with nested quotes in commands
- Update version to 0.1.89
- Move 'Press any key to close...' prompt inside SSH session
- Ensure tmux popup stays open until user manually closes
- Maintain consistent behavior with other SSH commands
- Update version to 0.1.88
- Add sudo to pkill commands to resolve permission errors when killing journalctl processes
- Fix service stop command timing to show logs during shutdown process
- Add sleep delays to ensure log visibility before cleanup
- Update version to 0.1.87
- Apply uniform pattern to all SSH commands: informational text + command + exit prompt
- Remove exit prompt from logging commands (J/L keys) that run continuously with -f flag
- Simplify rebuild and backup commands to match service command pattern
- Update version to 0.1.86
- Remove header text from start/stop commands for cleaner output
- Add automatic log termination when service reaches target state
- Start command auto-closes when service becomes active
- Stop command auto-closes when service becomes inactive
- Simplify SSH command structure by removing bash -c wrapper
- Version bump to 0.1.85
- Update stop command to use background systemctl with immediate log following
- Use same approach as start command for consistent real-time log viewing
- Version bump to 0.1.84
- Update service start to use background systemctl start with immediate log following
- Implement `sudo systemctl start service & sudo journalctl -fu service --since="1 second ago"`
- Remove buffering issues that prevented real-time log streaming
- Version bump to 0.1.82
- Remove complex background process monitoring that was buffering output
- Use direct journalctl -fu command for immediate real-time log streaming
- Eliminate monitoring loop that was killing log stream when service became active
- User now controls log following duration with Ctrl+C
- Fixes buffering issues that prevented seeing ark server startup logs in real-time
- Remove scroll offset fields from HostWidgets struct
- Replace scrolling with simple "X more below" indicators in all widgets
- Remove user-stopped service tracking from agent (now uses SSH control)
- Inactive services now consistently show Status::Inactive with empty circles
- Simplify widget render methods by removing scroll parameters
- Clean up unused imports and legacy scrolling infrastructure
- Fix journalctl command to use -fu for proper log following
- Move Status::Inactive to lowest priority in enum (before Ok)
- Status aggregation now prefers Ok over Inactive in mixed scenarios
- Title bar stays green when mixing active and inactive services
- Inactive services still show gray icons but don't affect overall status
- Ensures healthy systems with stopped services maintain green status
- Reorder Status enum variants to fix aggregation priority
- Status::Inactive now has same priority as Status::Ok in aggregation
- Prevents inactive services from causing gray title bar
- Title bar stays green when system has only active and inactive services
- Only Unknown/Offline/Pending/Warning/Critical statuses affect title color
- Add new Status::Inactive variant to enum for better service state representation
- Agent now assigns Status::Inactive instead of Status::Warning for inactive services
- Dashboard displays inactive services with empty circle (○) icon in gray color
- User-stopped services still show as Status::Ok with green filled circle
- Inactive services treated as OK for host status aggregation
- Improves visual clarity between active (●), inactive (○), and warning (◐) states
- Service start now follows logs in real-time until service becomes active
- Automatically stops log following when systemctl reports service as active
- Eliminates need for manual Ctrl+C to exit log stream
- Shows final service status after startup completes
- Background monitoring loop checks service state every second
- Remove all transitional icon infrastructure (CommandType, pending transitions)
- Clean up ZMQ command system remnants after SSH migration
- Add real-time log streaming for service start operations
- Show final logs and status for service stop operations
- Fix compilation warnings by removing unused methods
- Simplify UI architecture with pure SSH-based service control
Replace ZMQ-based service start/stop commands with SSH execution in tmux
popups. This provides better user feedback with real-time systemctl output
while eliminating blocking operations from the main message processing loop.
Changes:
- Service start/stop now use SSH with progress display
- Added backup functionality with 'B' key
- Preserved transitional icons (↑/↓) for immediate visual feedback
- Removed all ZMQ service control commands and handlers
- Updated configuration to include backup_alias setting
- All operations (rebuild, backup, services) now use consistent SSH interface
This ensures stable heartbeat processing while providing superior user
experience with live command output and service status feedback.
Eliminates automatic CollectNow command sending on host connection which
was blocking the main message processing loop for up to 5 seconds per
command. Since agents transmit cached data every 2 seconds anyway, the
CollectNow optimization provided minimal benefit while causing heartbeat
detection issues. Also removes unused send_command wrapper method.
This should completely resolve intermittent host connection dropping.
Eliminates test_tcp_connectivity function that was causing 5-10 second
startup delays. ZMQ connections are non-blocking and we rely entirely
on heartbeat mechanism for connectivity detection. This restores fast
dashboard startup time.
Update release.yml to use correct path hosts/services/cm-dashboard.nix
instead of hosts/common/cm-dashboard.nix. Also update documentation
in CLAUDE.md and README.md to reflect the correct file location.
Simplifies host connection configuration by removing tailscale_ip field,
connection_type preferences, and fallback retry logic. Now uses only the
ip field or hostname as fallback. Eliminates blocking TCP connectivity
tests that interfered with heartbeat processing.
This resolves intermittent host lost/found issues by removing the
connection retry timeouts that blocked the ZMQ message processing loop.
Update the auto connection type logic to try local network connections
before falling back to Tailscale. This provides better performance by
using faster local connections when available while maintaining Tailscale
as a reliable fallback.
Changes:
- Auto connection priority: local → tailscale → hostname (was tailscale → local)
- Fallback retry order updated to match new priority
- Supports omitting IP field in config for hosts without static local IP
Display the connection IP address that the dashboard is configured to use
for each host below the Agent version information. Shows which network
path (local/Tailscale) is being used for connections based on host
configuration.
Features:
- Display detected IP below Agent row in system widget
- Uses existing host configuration connection logic
- Shows actual IP being used for dashboard connections
Implement configurable network routing for both local and Tailscale networks.
Dashboard now supports intelligent connection selection with automatic fallback
between network types. Add IP configuration fields and connection routing logic
for ZMQ and SSH operations.
Features:
- Host configuration with local and Tailscale IP addresses
- Configurable connection types (local/tailscale/auto)
- Automatic fallback between network connections
- Updated ZMQ connection logic with retry support
- SSH command routing through configured IP addresses
Implement maintenance_mode_file configuration option in NotificationConfig
to allow customizable file paths for suppressing email notifications.
Updates maintenance mode check to use configured path instead of hardcoded
/tmp/cm-maintenance file.
- Press 't' to open SSH session to current host in tmux split
- Uses 30% vertical split consistent with logs and rebuild commands
- Auto-closes tmux window when SSH session ends
- Provides direct host administration access from dashboard
- Uses same SSH configuration as rebuild operations
Version 0.1.65
- Hide all system/backup/service panels when host is offline
- Show centered wake-up message with host status
- Display "Press 'w' to wake up host" if MAC address configured
- Provide clear visual indication when hosts are unreachable
- Improve user experience by removing confusing empty panels
Version 0.1.64
- Add dedicated heartbeat transmission every 5 seconds independent of metric collection
- Fix host offline detection by clearing metrics for disconnected hosts
- Move exclude_email_metrics to NotificationConfig for better organization
- Add cleanup_offline_hosts method to remove stale metrics after heartbeat timeout
- Ensure offline hosts show proper status icons and visual indicators
Version 0.1.63
- Add independent heartbeat checking timer (1 second interval) separate from metric reception
- Move get_connected_hosts() call outside metric receive condition to run periodically
- Remove duplicate update_hosts() call from metric processing to avoid redundancy
- Ensure offline host detection works even when no new metrics are received
- Fix issue where hosts going offline were never detected due to conditional heartbeat check
- Heartbeat timeouts now properly detected within configured timeout + 1 second
- Bump version to 0.1.62
- Change receive_metrics() from blocking to DONTWAIT to prevent main loop freezing
- Eliminate 1-second ZMQ socket timeout that was blocking UI after Tab key press
- Main loop now continues immediately after immediate render instead of waiting
- Maintain heartbeat-based host detection while fixing visual responsiveness
- Fix blocking operation introduced when implementing heartbeat timeout mechanism
- Tab navigation now truly immediate without any network operation delays
- Bump version to 0.1.61
- Add cached localhost field to TuiApp struct to avoid repeated gethostname() system calls
- Initialize localhost once in constructor instead of calling gethostname() on every navigation
- Replace gethostname() calls in update_hosts() and navigate_host() with cached value
- Eliminate expensive system call bottleneck causing Tab key responsiveness issues
- Reduce Tab navigation from 2+ system calls to zero system calls (memory access only)
- Fix performance regression introduced by immediate UI refresh implementation
- Bump version to 0.1.60
- Add immediate terminal.draw() call after input handling in main loop
- Eliminate delay between Tab key press and visual host switching
- Provide instant visual feedback for all navigation inputs
- Maintain existing metric update render cycle without duplication
- Fix UI update timing issue where changes only appeared on metric intervals
- Bump version to 0.1.59
- Replace 6 separate filter operations with single-pass metric categorization in update_metrics
- Reduce CPU overhead from 6x to 1x work per metric update cycle
- Fix Tab key sluggishness caused by competing expensive filtering operations
- Maintain exact same functionality with significantly better performance
- Improve UI responsiveness for host switching and navigation
- Bump version to 0.1.58
- Add agent_heartbeat metric to agent transmission for reliable host detection
- Update dashboard to track heartbeat timestamps per host instead of general metrics
- Add configurable heartbeat_timeout_seconds to dashboard ZMQ config (default 10s)
- Remove unused timeout_ms from agent config and revert to non-blocking command reception
- Remove unused heartbeat_interval_ms from agent configuration
- Host disconnect detection now uses dedicated heartbeat metrics for improved reliability
- Bump version to 0.1.57
- Add exclude_email_metrics field to AgentConfig for filtering email notifications
- Metrics matching excluded names skip notification processing but still appear in dashboard
- Optional field with serde(default) for backward compatibility
- Bump version to 0.1.56
- Add WakeOnLAN support for offline hosts using 'w' key
- Configure MAC addresses for all infrastructure hosts
- Implement Status::Offline for disconnected hosts
- Exclude offline hosts from status aggregation to prevent false alerts
- Update versions to 0.1.55
- Add Status::Offline enum variant for disconnected hosts
- All configured hosts now always visible showing offline status when disconnected
- Add WakeOnLAN support using wake-on-lan Rust crate
- Implement w key binding to wake offline hosts with MAC addresses
- Simplify configuration to single [hosts] section with MAC addresses only
- Change critical status icon from ◯ to ! for better visibility
- Add proper MAC address parsing and error handling
- Silent WakeOnLAN operation with logging for success/failure
Configuration format:
[hosts]
hostname = { mac_address = "AA:BB:CC:DD:EE:FF" }
- Change J/L log commands from popups to split windows for scrolling support
- Change rebuild command from popup to split window with consistent 30% height
- Add auto-close behavior with bash -c "command; exit" wrapper for logs
- Add "press any key to close" prompt with visual separators for rebuild
- Enable proper tmux copy mode and navigation in all split windows
Users can now scroll through logs, copy text, and resize windows while
maintaining clean auto-close behavior for all operations.
- Add sudo to journalctl command for proper systemd log access
- Add sudo to tail command for system log file access
- Add -n 50 to tail command to match journalctl behavior
- Both J and L keys now show last 50 lines before following
Ensures consistent behavior and proper permissions for all log viewing.
- Add ServiceLogConfig structure for per-host service log paths
- Implement L key handler for custom log file viewing via tmux popup
- Update dashboard config to support service_logs HashMap
- Add tail -f command execution over SSH for real-time log streaming
- Update status line to show L: Custom shortcut
- Document configuration format in CLAUDE.md
Each service can now have custom log file paths configured per host,
accessible via L key with same tmux popup interface as journalctl.
Fix journalctl popup that was failing with 'can't find session' error:
Issue Resolution:
- Change tmux display-popup flag from -t to -T for setting popup title
- -t flag was incorrectly trying to target a session named 'Logs: servicename'
- -T flag correctly sets the popup window title
The J key (Shift+j) service logs feature now works properly, opening
an 80% tmux popup with journalctl -f for real-time log viewing.
Bump version to v0.1.49
Fix notification issues for better operational experience:
Startup Notification Suppression:
- Suppress notifications for transitions from Status::Unknown during agent/server startup
- Prevents notification spam when services transition from Unknown to Warning/Critical on restart
- Only real status changes (not initial discovery) trigger notifications
- Maintains alerting for actual service state changes after startup
Recovery Notification Refinement:
- Recovery notifications only sent when ALL services reach OK status
- Individual service recoveries suppressed if other services still have problems
- Ensures recovery notifications indicate complete system health restoration
- Prevents premature celebration when partial recoveries occur
Result: Clean startup experience without false alerts and meaningful recovery
notifications that truly indicate full system health restoration.
Bump version to v0.1.48
Correct user-stopped service behavior during startup transitions:
User-Stopped Flag Timing Fix:
- Clear user-stopped flag only when service actually becomes active, not when start command succeeds
- Remove premature flag clearing from service control handler
- Add automatic flag clearing when service status metrics show active state
- Services retain user-stopped status during activating/transitioning states
Service Transition Handling:
- User-stopped services in activating state now report Status::OK instead of Status::Pending
- Prevents host warnings during legitimate service startup transitions
- Maintains accurate status reporting throughout service lifecycle
- Failed service starts preserve user-stopped flags correctly
Journalctl Popup Fix:
- Fix terminal corruption when using J key for service logs
- Correct command quoting to prevent tmux popup interference
- Stable popup display without dashboard interface corruption
Result: Clean service startup experience with no false warnings and proper
user-stopped tracking throughout the entire service lifecycle.
Bump version to v0.1.47
New Features:
- Add journalctl service logs viewer via Shift+J key
- Opens tmux popup with real-time log streaming using journalctl -f
- Shows last 50 lines and follows new log entries for selected service
- Popup titled 'Logs: service.service' for clear context
Improvements:
- Increase tmux popup size to 80% width and height for better readability
- Applies to both rebuild (R) and logs (J) popups
- Compact status line text to fit new J: Logs shortcut
- Updated documentation with new key binding
Navigation Updates:
- J: Show service logs (journalctl in tmux popup)
- Status line: Tab: Host • ↑↓/jk: Select • r: Rebuild • s/S: Start/Stop • J: Logs • q: Quit
Bump version to v0.1.46
Improve user-stopped service tracking behavior:
Service Display Fix:
- Services widget now shows actual systemctl status (active/inactive)
- Use info.status instead of hardcoded text based on widget_status
- User-stopped services correctly display 'inactive' with green OK icon
- Prevents misleading 'active' display for stopped services
User-Stopped Flag Timing Fix:
- Clear user-stopped flag AFTER successful service start, not when command sent
- Prevents warnings during service startup transition period
- Service remains Status::OK during 'activating' state for user-stopped services
- Flag only cleared when systemctl start command actually succeeds
- Failed start attempts preserve user-stopped flag
Result: Clean service state tracking with accurate display and no false alerts
during intentional user operations.
Bump version to v0.1.45
Service stop/start operations were failing because systemctl commands
were missing the .service suffix. This caused the new user-stopped
tracking feature to mark services but not actually control them.
Changes:
- Add .service suffix to systemctl commands in service control handler
- Matches pattern used throughout systemd collector
- Fixes service start/stop functionality via dashboard
Clean up legacy documentation:
- Remove outdated TODO.md, AGENTS.md, and test files
- Update CLAUDE.md with current architecture and rules only
- Comprehensive README.md rewrite with technical documentation
- Document user-stopped service tracking feature
Bump version to v0.1.44
Add comprehensive tracking for services stopped via dashboard to prevent
false alerts when users intentionally stop services.
Features:
- User-stopped services report Status::Ok instead of Warning
- Persistent storage survives agent restarts
- Dashboard sends UserStart/UserStop commands
- Agent tracks and syncs user-stopped state globally
- Systemd collector respects user-stopped flags
Implementation:
- New service_tracker module with persistent JSON storage
- Enhanced ServiceAction enum with UserStart/UserStop variants
- Global singleton tracker accessible by collectors
- Service status logic updated to check user-stopped flag
- Dashboard version now uses CARGO_PKG_VERSION automatically
Bump version to v0.1.43