This commit addresses several key issues identified during development:
Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add top-process monitoring to CpuCollector using the ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information
Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering so the collector's own ps invocations are not reported as top processes
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"
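
A minimal sketch of the ps-based selection described above, not the actual
collect_top_cpu_process() code. It assumes the text comes from something like
`ps -eo pid,pcpu,comm --no-headers`; the exact invocation, the ProcessSample
type, and the helper names are hypothetical.

```rust
/// One parsed row of `ps` output (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct ProcessSample {
    pub pid: u32,
    pub cpu_percent: f32,
    pub command: String,
}

/// Parse a single "PID %CPU COMMAND" row; header or malformed rows yield None.
fn parse_line(line: &str) -> Option<ProcessSample> {
    let mut fields = line.split_whitespace();
    let pid: u32 = fields.next()?.parse().ok()?;
    let cpu_percent: f32 = fields.next()?.parse().ok()?;
    let command = fields.collect::<Vec<_>>().join(" ");
    // Validation: reject empty commands and non-finite or negative percentages.
    if command.is_empty() || !cpu_percent.is_finite() || cpu_percent < 0.0 {
        return None;
    }
    Some(ProcessSample { pid, cpu_percent, command })
}

/// Return the row with the highest CPU percentage, skipping `ps` itself so
/// the monitoring command never shows up as the top process.
pub fn top_cpu_process(ps_output: &str) -> Option<ProcessSample> {
    ps_output
        .lines()
        .filter_map(parse_line)
        .filter(|p| p.command != "ps")
        .max_by(|a, b| a.cpu_percent.total_cmp(&b.cpu_percent))
}

/// Format a sample in the style shown on the dashboard.
pub fn format_process(p: &ProcessSample) -> String {
    format!("{} (PID {}) {:.1}%", p.command, p.pid, p.cpu_percent)
}
```

The same shape would apply to the top RAM process by parsing a memory column
instead of the CPU column.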
Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy
UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface
Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details

- Remove 'r' key handler that caused a hang on refresh
- Remove RefreshRequested event and check_refresh_request method
- Remove send_refresh_commands function and ZMQ command protocol
- Remove refresh_requested field from App struct
- Clean up status line text (refresh -> tick)
The refresh functionality caused the dashboard to become unresponsive
when the 'r' key was pressed. This commit removes all refresh-related
code to fix the issue.

Implement a ZMQ command protocol for dashboard-to-agent communication:
- Agents listen on port 6131 for REQ/REP commands
- Dashboard sends "refresh" command when 'r' key is pressed
- Agents force immediate collection of all metrics via force_refresh_all()
- Fresh data is broadcast immediately to dashboard
- Updated help text to show "r: Refresh all metrics"
Also includes metric-level caching architecture foundation for future
granular control over individual metric update frequencies.
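
A rough sketch of the agent-side handling of the "refresh" command, kept free
of the ZMQ crate so it stands alone. In a REQ/REP exchange every request must
receive exactly one reply, so the handler always returns a reply string. The
Collector trait, handle_command(), and the reply texts are assumptions for
illustration; only the "refresh" command and force_refresh_all() are named in
this commit.

```rust
/// Hypothetical interface for anything the agent can re-collect on demand.
pub trait Collector {
    fn force_refresh(&mut self);
}

/// Force an immediate collection on every collector.
pub fn force_refresh_all<C: Collector>(collectors: &mut [C]) {
    for collector in collectors.iter_mut() {
        collector.force_refresh();
    }
}

/// Map one request payload to one reply, as REQ/REP requires.
/// The reply strings are placeholders, not the real wire format.
pub fn handle_command<C: Collector>(request: &str, collectors: &mut [C]) -> &'static str {
    match request.trim() {
        "refresh" => {
            force_refresh_all(collectors);
            "ok"
        }
        _ => "unknown command",
    }
}
```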
- Add DEFAULT_HOSTS constant in config.rs for centralized host management
- Update ZMQ endpoint generation to connect to all configured hosts
- Implement graceful connection handling for unreachable endpoints
- Dashboard now auto-discovers and connects to available agents on cmbox, labbox, simonbox, steambox, srv01
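
A sketch of how a DEFAULT_HOSTS constant could drive endpoint generation and
tolerate unreachable hosts. The host names come from this commit; the port
parameter, the tcp:// endpoint format, and the connect callback are
assumptions, and no real socket is opened here.

```rust
/// Predefined host list, as named in this commit.
pub const DEFAULT_HOSTS: &[&str] = &["cmbox", "labbox", "simonbox", "steambox", "srv01"];

/// Build one endpoint string per host. The port is passed in because the
/// commit does not state which port the dashboard subscribes on.
pub fn zmq_endpoints(hosts: &[&str], port: u16) -> Vec<String> {
    hosts
        .iter()
        .map(|host| format!("tcp://{}:{}", host, port))
        .collect()
}

/// Try every endpoint and keep the ones that connect. A failure is logged
/// and skipped instead of aborting, so one unreachable host does not block
/// the rest. `connect` stands in for the real socket call.
pub fn connect_available<F>(endpoints: &[String], mut connect: F) -> Vec<String>
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut connected = Vec::new();
    for endpoint in endpoints {
        match connect(endpoint) {
            Ok(()) => connected.push(endpoint.clone()),
            Err(err) => eprintln!("skipping {}: {}", endpoint, err),
        }
    }
    connected
}
```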
- Storage widget: Restructure with Name/Temp/Wear/Usage columns, SMART details as descriptions
- Host navigation: Cycle only through connected hosts; skip disconnected ones
- Auto-discovery: Skip config files, use predefined CMTEC host list
- Maintenance mode: Suppress notifications during backup via /tmp/cm-maintenance file
- CPU thresholds: Update to warning ≥9.0, critical ≥10.0 for production use
- Agent-dashboard separation: Agent provides descriptions, dashboard displays only
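
A sketch of the threshold and maintenance-mode rules listed above. The
threshold values and the /tmp/cm-maintenance path come from this commit; the
Status enum and function names are hypothetical, and the commit does not say
which CPU metric the thresholds apply to, so the value is treated as an
opaque number.

```rust
use std::path::Path;

/// Hypothetical status levels for a metric.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

pub const CPU_WARNING: f32 = 9.0;
pub const CPU_CRITICAL: f32 = 10.0;
pub const MAINTENANCE_FILE: &str = "/tmp/cm-maintenance";

/// Classify a CPU reading: warning at >= 9.0, critical at >= 10.0.
pub fn cpu_status(value: f32) -> Status {
    if value >= CPU_CRITICAL {
        Status::Critical
    } else if value >= CPU_WARNING {
        Status::Warning
    } else {
        Status::Ok
    }
}

/// Notify only for non-Ok statuses, and never while the maintenance marker
/// file exists (for example during a backup).
pub fn should_notify(status: Status, maintenance_marker: &Path) -> bool {
    status != Status::Ok && !maintenance_marker.exists()
}
```

In use, the agent would call should_notify(status, Path::new(MAINTENANCE_FILE))
before sending a notification.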
Replaced system-wide disk usage with accurate per-service tracking by scanning
service-specific directories. Services like sshd now correctly show minimal
disk usage instead of misleading system totals.

- Rename storage widget and add drive capacity/usage columns
- Move host display to main dashboard title for cleaner layout
- Replace separate alert displays with color-coded row highlighting
- Add per-service disk usage collection using du command
- Update services widget formatting to handle small disk values
- Restructure into workspace with dedicated agent and dashboard packages
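
A sketch of the du-based per-service disk collection and of formatting small
values. It assumes `du -sk <dir>` (summary, in KiB); the actual flags, the
function names, and the K/M/G display format are assumptions rather than the
widget's real output.

```rust
use std::process::Command;

/// Extract the size from `du -sk` output such as "48\t/etc/ssh".
/// Error text like "du: cannot read directory" does not parse and yields None.
pub fn parse_du_kib(du_output: &str) -> Option<u64> {
    let line = du_output.lines().next()?;
    line.split_whitespace().next()?.parse().ok()
}

/// Run `du -sk` on one service directory and return its size in KiB.
/// Returns None if du cannot be started or prints nothing parseable.
pub fn du_kib(dir: &str) -> Option<u64> {
    let output = Command::new("du").arg("-sk").arg(dir).output().ok()?;
    parse_du_kib(&String::from_utf8_lossy(&output.stdout))
}

/// Render a KiB count so small services show "48K" rather than "0.0G".
pub fn format_disk(kib: u64) -> String {
    const MIB: f64 = 1024.0; // KiB per MiB
    const GIB: f64 = 1024.0 * 1024.0; // KiB per GiB
    let k = kib as f64;
    if k >= GIB {
        format!("{:.1}G", k / GIB)
    } else if k >= MIB {
        format!("{:.1}M", k / MIB)
    } else {
        format!("{}K", kib)
    }
}
```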