165 Commits

Author SHA1 Message Date
b1bff4857b Update versions to 0.1.17 and fix backup panel visibility
All checks were successful
Build and Release / build-and-release (push) Successful in 1m16s
- Update all Cargo.toml versions to 0.1.17
- Fix backup panel to only show when meaningful data exists
- Hide backup panel when no backup configured
2025-10-27 18:50:20 +01:00
69892a2d84 Implement systemd service approach for nixos-rebuild operations
Some checks failed
Build and Release / build-and-release (push) Failing after 1m58s
- Add cm-rebuild systemd service for process isolation
- Add sudo permissions for service control and journal access
- Remove verbose flag for cleaner output
- Ensures reliable rebuild operations without agent crashes
2025-10-26 23:18:09 +01:00
a928d73134 Update Cargo.toml versions to 0.1.11
All checks were successful
Build and Release / build-and-release (push) Successful in 3m4s
- Update agent, dashboard, and shared package versions from 0.1.0 to 0.1.11
- Ensures agent version reporting shows correct v0.1.11 instead of v0.1.0
- Synchronize package versions with git tag for consistent version tracking
2025-10-26 14:12:03 +01:00
b6da71b7e7 Implement real-time terminal popup for system rebuild operations
All checks were successful
Build and Release / build-and-release (push) Successful in 1m21s
- Add terminal popup UI component with 80% screen coverage and terminal styling
- Extend ZMQ protocol with CommandOutputMessage for streaming output
- Implement real-time output streaming in agent system rebuild handler
- Add keyboard controls (ESC/Q to close, ↑↓ to scroll) for popup interaction
- Fix system panel Build display to show actual NixOS build instead of config hash
- Update service filters in README with wildcard patterns for better matching
- Add periodic progress updates during nixos-rebuild execution
- Integrate command output handling in dashboard main loop
2025-10-26 11:39:03 +01:00
a08670071c Implement simple persistent cache with automatic saving on status changes 2025-10-21 20:12:19 +02:00
d80f2ce811 Remove unused cache tiers system 2025-10-21 18:43:46 +02:00
41208aa2a0 Implement status aggregation with notification batching 2025-10-21 18:12:42 +02:00
00a8ed3da2 Implement hysteresis for metric status changes to prevent flapping
Add comprehensive hysteresis support to prevent status oscillation near
threshold boundaries while maintaining responsive alerting.

Key Features:
- HysteresisThresholds with configurable upper/lower limits
- StatusTracker for per-metric status history
- Default gaps: CPU load 10%, memory 5%, disk temp 5°C

Updated Components:
- CPU load collector (5-minute average with hysteresis)
- Memory usage collector (percentage-based thresholds)
- Disk temperature collector (SMART data monitoring)
- All collectors updated to support StatusTracker interface

Cache Interval Adjustments:
- Service status: 60s → 10s (faster response)
- Disk usage: 300s → 60s (more frequent checks)
- Backup status: 900s → 60s (quicker updates)
- SMART data: moved to 600s tier (10 minutes)

Architecture:
- Individual metric status calculation in collectors
- Centralized StatusTracker in MetricCollectionManager
- Status aggregation preserved in dashboard widgets
2025-10-20 18:45:41 +02:00
7f85a6436e Clean up unused imports and fix build warnings
- Remove unused imports (Duration, HashMap, SharedError, DateTime, etc.)
- Fix unused variables by prefixing with underscore
- Remove redundant dashboard.toml config file
- Update theme imports to use only needed components
- Maintain all functionality while reducing warnings
- Add srv02 to predefined hosts configuration
- Remove unused broadcast_command methods
2025-10-18 23:12:07 +02:00
dcca5bbea3 Fix cache tier test to match actual configuration
- Update test expectations from 5s to 2s intervals for realtime tier
- Fix comment to reflect actual 2s interval instead of outdated 5s reference
- All tests now pass correctly
2025-10-18 18:44:13 +02:00
8a36472a3d Implement real-time process monitoring and fix UI hardcoded data
This commit addresses several key issues identified during development:

Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add intelligent process monitoring to CpuCollector using ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information

Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering to prevent self-monitoring artifacts (ps commands)
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"

Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy

UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface

Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details
2025-10-16 23:55:05 +02:00
1b572c5c1d Implement intelligent caching system for optimal CPU performance
Replace traditional 5-second polling with tiered collection strategy:
- RealTime (5s): CPU load, memory usage
- Medium (5min): Service status, disk usage
- Slow (15min): SMART data, backup status

Key improvements:
- Reduce CPU usage from 9.5% to <2%
- Cache warming for instant dashboard responsiveness
- Background refresh at 80% of tier intervals
- Thread-safe cache with automatic cleanup

Remove legacy polling code - smart caching is now the default and only mode.
Agent startup enhanced with parallel cache population for immediate data availability.

Architecture: SmartCache + CachedCollector + tiered CollectionScheduler
2025-10-15 11:21:36 +02:00
57b676ad25 Testing 2025-10-13 00:16:24 +02:00
2581435b10 Implement per-service disk usage monitoring
Replaced system-wide disk usage with accurate per-service tracking by scanning
service-specific directories. Services like sshd now correctly show minimal
disk usage instead of misleading system totals.

- Rename storage widget and add drive capacity/usage columns
- Move host display to main dashboard title for cleaner layout
- Replace separate alert displays with color-coded row highlighting
- Add per-service disk usage collection using du command
- Update services widget formatting to handle small disk values
- Restructure into workspace with dedicated agent and dashboard packages
2025-10-11 22:59:16 +02:00
82afe3d4f1 Restructure into workspace with dashboard and agent 2025-10-11 14:19:05 +02:00