Changed service discovery from 'systemctl list-units --all' to 'systemctl list-unit-files'
to ensure ALL service unit files are discovered, including services that have never been started.
Changes:
- Updated systemctl command to use list-unit-files instead of list-units --all
- Modified parsing logic to handle the unit file format (2 fields per line instead of the 4 parsed from list-units)
- Set placeholder values in the discovery cache; actual runtime status is fetched during collection
- This ensures all configured services (like inactive ARK servers) appear in dashboard
The issue was that list-units --all only shows services systemd has loaded/attempted to load,
but list-unit-files shows ALL service unit files regardless of their runtime state.
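A minimal sketch of the two-field parsing described above, assuming the command is run as `systemctl list-unit-files --type=service --no-legend --no-pager`. The `DiscoveredService` type and `parse_unit_files` function are hypothetical names for illustration, not the project's actual code.

```rust
/// One discovered service unit file (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub struct DiscoveredService {
    /// Unit name without the ".service" suffix.
    pub name: String,
    /// Unit file state as printed by systemctl, e.g. "enabled" or "disabled".
    pub file_state: String,
}

/// Parse `systemctl list-unit-files` output.
/// Only the first two whitespace-separated fields are used; newer systemd
/// versions print a third (preset) column, which is ignored here.
pub fn parse_unit_files(output: &str) -> Vec<DiscoveredService> {
    output
        .lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            let unit = fields.next()?;
            let state = fields.next()?;
            // Skip anything that is not a service unit (sockets, timers, ...).
            let name = unit.strip_suffix(".service")?;
            Some(DiscoveredService {
                name: name.to_string(),
                file_state: state.to_string(),
            })
        })
        .collect()
}
```

Runtime status (active/inactive) is deliberately absent from this struct, matching the note that the discovery cache only holds placeholders until collection runs.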
Simplified keyboard controls by removing service restart functionality:
- Removed 'r' key restart functionality from Services panel
- Made 'R' key always trigger system rebuild regardless of focused panel
- Updated context shortcuts to show 'R: Rebuild Host' globally
- Removed all ServiceRestart enum variants and associated code:
  - UiCommand::ServiceRestart
  - CommandType::ServiceRestart
  - ServiceAction::Restart
- Cleaned up pending transition logic to only handle Start/Stop commands
The 'R' key now consistently rebuilds the current host from any panel,
while 's' and 'S' continue to handle service start/stop in Services panel.
Resolved issues with transitional service icons not being properly visible:
- Removed 3-second timeout that was clearing pending transitions prematurely
- Fixed selection highlighting disappearing when transitional icons appeared
- Implemented conditional coloring for transitional icons:
  - Blue when service is not selected
  - Dark background color when service is selected (for visibility against blue selection)
- Transitions now persist until actual service status changes occur
Both selection highlighting and transitional icons are now visible simultaneously.
The root cause of transitional service icons not showing was that service names
were stored as raw names (e.g., "sshd") in pending_transitions but looked up
against formatted display lines (e.g., "sshd active 1M ").
Changes:
- Modified display_lines structure to include both formatted text and raw service names
- Updated rendering loop to use raw service names for pending transition lookups
- Fixed get_selected_service() method to use the new tuple structure
- Transitional icons (↑ ↓ ↻) should now appear correctly when pressing s/S/r keys
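The key mismatch can be illustrated with a small sketch. It assumes display lines are stored as (formatted text, raw name) tuples, as the change list says; the function names, the column width, and the use of a `HashMap<String, char>` for pending icons are assumptions made for this example only.

```rust
use std::collections::HashMap;

/// Pair each formatted display line with the raw service name it was built from.
/// `services` holds (raw_name, status) pairs; the column width is arbitrary.
pub fn build_display_lines(services: &[(&str, &str)]) -> Vec<(String, String)> {
    services
        .iter()
        .map(|(name, status)| (format!("{:<12} {}", name, status), name.to_string()))
        .collect()
}

/// Prefix a line with its transitional icon, looked up by the raw name.
/// Looking up by `formatted` instead would never match, which was the bug.
pub fn render_line(formatted: &str, raw_name: &str, pending: &HashMap<String, char>) -> String {
    match pending.get(raw_name) {
        Some(icon) => format!("{} {}", icon, formatted),
        None => format!("  {}", formatted),
    }
}
```

Keeping the raw name next to the formatted text means the selection helper can also return the raw name directly instead of re-parsing the display string.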
- Force sshd service to always show "↑ starting" for debugging
- Test if basic directional arrow rendering works in services widget
- Temporary debug change to isolate rendering vs logic issues
- Will help determine if problem is in pending transitions or rendering
If arrow appears: pending transitions logic issue
If no arrow: basic rendering path issue
- Store pending transitions even for redundant commands (e.g., starting an already-active service)
- Add 3-second timeout for redundant command visual feedback
- Include timestamp in pending transitions to enable timeout clearing
- Show directional arrows immediately regardless of command validation result
- Fix core issue where state validation prevented visual feedback storage
Now pressing s/S/r always shows immediate directional arrows, even for
redundant operations, providing consistent visual feedback to users.
- Prevent selection highlighting when service has pending transition
- Allow directional arrows (↑ ↓ ↻) to show through on selected services
- Fix core issue where selection styling was overwriting transitional icons
- Transitional icons now properly visible during service command execution
The selection highlighting logic now skips services with pending transitions,
ensuring that directional arrows are visible when executing service commands.
- Replace timeout-based command status with pending service transitions
- Show immediate directional arrows when pressing service commands (↑ ↓ ↻)
- Track original service status and command type for each pending operation
- Automatically clear transitional icons when real status updates arrive
- Remove unused TerminalPopup and CommandStatus infrastructure
- Simplify visual feedback system using state-based approach
Service commands now provide instant visual feedback that persists until
the actual service state changes, eliminating timing issues and improving UX.
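A rough sketch of the state-based approach, assuming each pending entry stores the command and the status seen when the key was pressed. `PendingTransitions`, `ServiceCommand`, and the method names are hypothetical and only illustrate the idea.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ServiceCommand {
    Start,
    Stop,
    Restart,
}

/// What was requested and what the service looked like at that moment.
pub struct PendingTransition {
    pub command: ServiceCommand,
    pub original_status: String,
}

#[derive(Default)]
pub struct PendingTransitions {
    map: HashMap<String, PendingTransition>,
}

impl PendingTransitions {
    /// Record a command as soon as the key is pressed.
    pub fn record(&mut self, service: &str, command: ServiceCommand, current_status: &str) {
        self.map.insert(
            service.to_string(),
            PendingTransition { command, original_status: current_status.to_string() },
        );
    }

    /// Icon to draw for a service, if a transition is pending.
    pub fn icon(&self, service: &str) -> Option<char> {
        self.map.get(service).map(|p| match p.command {
            ServiceCommand::Start => '↑',
            ServiceCommand::Stop => '↓',
            ServiceCommand::Restart => '↻',
        })
    }

    /// Clear the entry only when the reported status differs from the original.
    pub fn on_status_update(&mut self, service: &str, new_status: &str) {
        let changed = self
            .map
            .get(service)
            .map_or(false, |p| p.original_status != new_status);
        if changed {
            self.map.remove(service);
        }
    }
}
```

One limitation of this simple comparison: a restart of an already-active service ends in the same status it started from, so a real implementation would need another signal to clear that entry.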
- Add service state detection before executing start/stop/restart commands
- Prevent redundant operations (start active services, stop inactive services)
- Show immediate directional arrows for command feedback (↑ starting, ↓ stopping, ↻ restarting)
- Add get_service_status() method to ServicesWidget for state access
- Remove unused TerminalPopup code and dangling methods
- Clean up warnings and unused code throughout codebase
Service commands now validate current state and provide instant UX feedback while
preserving existing status icons and colors during transitions.
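The redundancy check could look roughly like the sketch below. It assumes the status string is the systemd active state ("active" / "inactive"); `Action`, `is_redundant`, and `feedback_label` are illustrative names, not the widget's real API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Start,
    Stop,
    Restart,
}

/// True when the command would not change anything given the current status.
pub fn is_redundant(action: Action, status: &str) -> bool {
    match action {
        Action::Start => status == "active",
        Action::Stop => status == "inactive",
        // A restart is always meaningful in this sketch.
        Action::Restart => false,
    }
}

/// Label shown next to the service while the command is in flight.
pub fn feedback_label(action: Action) -> &'static str {
    match action {
        Action::Start => "↑ starting",
        Action::Stop => "↓ stopping",
        Action::Restart => "↻ restarting",
    }
}
```

A caller would typically check `is_redundant` before dispatching the command and only then show the label from `feedback_label`.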
- Fix /tmp usage status to use proper thresholds instead of hardcoded Ok status
- Fix wear level status to use configurable thresholds instead of hardcoded values
- Add dedicated tmp_status field to SystemWidget for proper /tmp status display
- Remove host-level hourglass icon during service operations
- Implement immediate service status updates after start/stop/restart commands
- Remove active users display and collection from NixOS section
- Fix immediate host status aggregation transmission to dashboard
- Separate dashboard updates from email notifications for immediate status aggregation
- Add metric caching to MetricCollectionManager for instant dashboard updates
- Dashboard now receives cached data every 1 second instead of waiting for collection intervals
- Fix transmission to use cached metrics rather than triggering fresh collection
- Email notifications maintain separate 60-second batching interval
- Update configurable email notification aggregation interval
- Fix email notification aggregation to send batched notifications instead of individual emails
- Fix startup data collection to properly process initial status without triggering change notifications
- Maintain event-driven transmission while preserving aggregated notification batching
- Update version from 0.1.19 to 0.1.20 across all components
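The batched email behaviour listed above might be sketched as follows. This is an assumption-laden illustration: `NotificationBatcher` and its methods are hypothetical, and the caller passes in the current time so the logic stays testable without sleeping.

```rust
use std::time::{Duration, Instant};

/// Collects status-change messages and releases them as one batch per interval.
pub struct NotificationBatcher {
    interval: Duration,
    last_flush: Instant,
    pending: Vec<String>,
}

impl NotificationBatcher {
    pub fn new(interval: Duration, now: Instant) -> Self {
        Self { interval, last_flush: now, pending: Vec::new() }
    }

    /// Queue a message; nothing is sent yet.
    pub fn push(&mut self, message: impl Into<String>) {
        self.pending.push(message.into());
    }

    /// Return one combined body when the interval has elapsed and
    /// at least one message is waiting; otherwise return None.
    pub fn flush_if_due(&mut self, now: Instant) -> Option<String> {
        let elapsed = now.saturating_duration_since(self.last_flush);
        if self.pending.is_empty() || elapsed < self.interval {
            return None;
        }
        self.last_flush = now;
        Some(std::mem::take(&mut self.pending).join("\n"))
    }
}
```

Dashboard transmission would run on its own faster loop and never go through this batcher, which is how the two paths stay separate.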
Major architectural improvements:

CORE CHANGES:
- Remove notification_interval_seconds - status aggregation now immediate
- Status calculation moved to collection phase instead of transmission
- Event-driven transmission triggers immediately on status changes
- Dual transmission strategy: immediate on change + periodic backup
- Real-time notifications without batching delays

TECHNICAL IMPROVEMENTS:
- process_metric() now returns bool indicating status change
- Immediate ZMQ broadcast when status changes detected
- Status aggregation happens during metric collection, not later
- Legacy get_nixos_build_info() method removed (unused)
- All compilation warnings fixed

BEHAVIOR CHANGES:
- Critical alerts sent instantly instead of waiting for intervals
- Dashboard receives real-time status updates
- Notifications triggered immediately on status transitions
- Backup periodic transmission every 1s ensures heartbeat

This provides much more responsive monitoring with instant alerting
while maintaining the reliability of periodic transmission as backup.
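A minimal sketch of a `process_metric()` that reports status changes, so the caller can broadcast immediately. The `MetricStore` and `Status` types are assumptions for illustration; treating the first observation of a metric as "no change" is also an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Variant order matters: derived Ord makes Critical the highest status.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

#[derive(Default)]
pub struct MetricStore {
    statuses: HashMap<String, Status>,
}

impl MetricStore {
    /// Store the latest status and return true only when it differs from
    /// a previously stored status for the same metric.
    pub fn process_metric(&mut self, name: &str, status: Status) -> bool {
        let previous = self.statuses.insert(name.to_string(), status);
        matches!(previous, Some(p) if p != status)
    }

    /// Aggregate host status: the worst status among all metrics.
    pub fn host_status(&self) -> Status {
        self.statuses.values().copied().max().unwrap_or(Status::Ok)
    }
}
```

In the collection loop, a `true` return value would trigger the immediate broadcast, while the periodic 1s transmission continues regardless as a heartbeat.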
- Implement per-collector interval timing respecting NixOS config
- Remove all hardcoded timeout/interval values and make configurable
- Add tmux session requirement check for TUI mode (bypassed for headless)
- Update agent to send config hash in Build field instead of nixos version
- Add nginx check interval, HTTP timeouts, and ZMQ transmission interval configs
- Update NixOS configuration with new configurable values

Breaking changes:
- Build field now shows nix store config hash (8 chars) instead of nixos version
- All intervals now follow individual collector configuration instead of global

New configuration fields:
- systemd.nginx_check_interval_seconds
- systemd.http_timeout_seconds
- systemd.http_connect_timeout_seconds
- zmq.transmission_interval_seconds
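Per-collector interval timing could be sketched like this, assuming each collector carries its own interval read from configuration. `TimedCollector` and `due_collectors` are hypothetical names, and the interval values used in any example are made up.

```rust
use std::time::{Duration, Instant};

/// A collector with its own schedule (hypothetical type for illustration).
pub struct TimedCollector {
    pub name: String,
    pub interval: Duration,
    last_run: Option<Instant>,
}

impl TimedCollector {
    pub fn new(name: &str, interval_seconds: u64) -> Self {
        Self {
            name: name.to_string(),
            interval: Duration::from_secs(interval_seconds),
            last_run: None,
        }
    }

    /// Due on first use, then whenever its own interval has elapsed.
    pub fn is_due(&self, now: Instant) -> bool {
        match self.last_run {
            None => true,
            Some(t) => now.saturating_duration_since(t) >= self.interval,
        }
    }

    pub fn mark_run(&mut self, now: Instant) {
        self.last_run = Some(now);
    }
}

/// Names of all collectors that should run at `now`.
pub fn due_collectors(collectors: &[TimedCollector], now: Instant) -> Vec<&str> {
    collectors
        .iter()
        .filter(|c| c.is_due(now))
        .map(|c| c.name.as_str())
        .collect()
}
```

With this shape there is no global tick that decides when everything runs; the main loop only asks which collectors are due.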
- Add cm-rebuild systemd service for process isolation
- Add sudo permissions for service control and journal access
- Remove verbose flag for cleaner output
- Ensures reliable rebuild operations without agent crashes
- Update agent, dashboard, and shared package versions from 0.1.0 to 0.1.11
- Ensures agent version reporting shows correct v0.1.11 instead of v0.1.0
- Synchronize package versions with git tag for consistent version tracking
This commit addresses several key issues identified during development:

Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add intelligent process monitoring to CpuCollector using ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information

Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering to prevent self-monitoring artifacts (ps commands)
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"
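A sketch of how ps output could be parsed for the top CPU process, assuming the collector runs something like `ps -eo pid,pcpu,comm --no-headers` (procps). The `TopProcess` type and function names are illustrative and not taken from agent/src/collectors/cpu.rs.

```rust
#[derive(Debug, PartialEq)]
pub struct TopProcess {
    pub pid: u32,
    pub cpu_percent: f32,
    pub command: String,
}

/// Parse one "PID %CPU COMMAND" line; malformed lines yield None.
fn parse_ps_line(line: &str) -> Option<TopProcess> {
    let mut fields = line.split_whitespace();
    let pid: u32 = fields.next()?.parse().ok()?;
    let cpu_percent: f32 = fields.next()?.parse().ok()?;
    let command = fields.next()?.to_string();
    // Reject values such as "nan" or "inf", which parse as f32 but are not usable.
    if !cpu_percent.is_finite() {
        return None;
    }
    Some(TopProcess { pid, cpu_percent, command })
}

/// Pick the process with the highest CPU usage, ignoring the ps process
/// itself so the monitor does not report its own sampling command.
pub fn parse_top_cpu_process(ps_output: &str) -> Option<TopProcess> {
    ps_output
        .lines()
        .filter_map(parse_ps_line)
        .filter(|p| p.command != "ps")
        .max_by(|a, b| a.cpu_percent.total_cmp(&b.cpu_percent))
}

/// Format in the style shown on the dashboard.
pub fn format_top_process(p: &TopProcess) -> String {
    format!("{} (PID {}) {:.1}%", p.command, p.pid, p.cpu_percent)
}
```

The same parsing would work for a RAM variant by requesting a memory column from ps instead of `pcpu`.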

Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy

UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface

Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details