5 Commits

Author SHA1 Message Date
66a79574e0 Implement comprehensive monitoring improvements
- Add full email notifications with lettre and Stockholm timezone
- Add status persistence to prevent notification spam on restart
- Change nginx monitoring to check backend proxy_pass URLs instead of frontend domains
- Increase nginx site timeout to 10 seconds for backend health checks
- Fix cache intervals: disk (5min), backup (10min), systemd (30s), cpu/memory (5s)
- Remove rate limiting for immediate notifications on all status changes
- Store metric status in /var/lib/cm-dashboard/last-status.json
2025-10-20 14:32:44 +02:00
e7200fb1b0 Implement UUID-based disk detection for CMTEC infrastructure
Replace df-based auto-discovery with UUID-based detection using NixOS
hardware configuration data. Each host now has predefined filesystem
configurations with predictable metric names.

- Add FilesystemConfig struct with UUID, mount point, and filesystem type
- Remove auto_discover and devices fields from DiskConfig
- Add host-specific UUID defaults for cmbox, srv01, srv02, simonbox, steambox
- Remove legacy get_mounted_disks() df-based detection method
- Update DiskCollector to use UUID resolution via /dev/disk/by-uuid/
- Generate predictable metric names: disk_root_*, disk_boot_*, etc.
- Maintain fallback for labbox/wslbox (no UUIDs configured yet)

Provides consistent metric names across reboots and reliable detection
aligned with NixOS deployments without dependency on mount order.
2025-10-20 09:50:10 +02:00
7f85a6436e Clean up unused imports and fix build warnings
- Remove unused imports (Duration, HashMap, SharedError, DateTime, etc.)
- Fix unused variables by prefixing with underscore
- Remove redundant dashboard.toml config file
- Update theme imports to use only needed components
- Maintain all functionality while reducing warnings
- Add srv02 to predefined hosts configuration
- Remove unused broadcast_command methods
2025-10-18 23:12:07 +02:00
125111ee99 Implement comprehensive backup monitoring and fix timestamp issues
- Add BackupCollector for reading TOML status files with disk space metrics
- Implement BackupWidget with disk usage display and service status details
- Fix backup script disk space parsing by adding missing capture_output=True
- Update backup widget to show actual disk usage instead of repository size
- Fix timestamp parsing to use backup completion time instead of start time
- Resolve timezone issues by using UTC timestamps in backup script
- Add disk identification metrics (product name, serial number) to backup status
- Enhance UI layout with proper backup monitoring integration
2025-10-18 18:33:41 +02:00
8a36472a3d Implement real-time process monitoring and fix UI hardcoded data
This commit addresses several key issues identified during development:

Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add intelligent process monitoring to CpuCollector using ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information

Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering to prevent self-monitoring artifacts (ps commands)
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"

Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy

UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface

Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details
2025-10-16 23:55:05 +02:00