- Remove Active users line from NixOS section
- Add status icons to CPU and RAM sections
- Restructure layout with proper tree symbols and spacing
- Add empty lines between sections for better readability
- Remove Storage header, show each filesystem as top-level item
- Fix tree indentation to match specification
- Show CPU load averages with frequency as a sub-item
- Show RAM usage with /tmp as a sub-item that has its own status icon
- Add mount_point field to StoragePool struct
- Create mapping from pool names to mount points
- Update display to show user-friendly mount points (/, /mnt/steampool)
- Keep device detection for SMART data (temperature, wear)
- Resolves disk name confusion on different hosts
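
A minimal sketch of the pool-name-to-mount-point mapping described above. Only `StoragePool`, `mount_point`, `/` and `/mnt/steampool` come from these notes; the pool names, the `devices` field, and the helper names are assumptions for illustration.

```rust
use std::collections::HashMap;

/// A storage pool as shown in the dashboard (field set is illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct StoragePool {
    pub name: String,
    /// User-facing path such as "/" or "/mnt/steampool".
    pub mount_point: String,
    /// Underlying devices, still kept for SMART lookups (temperature, wear).
    pub devices: Vec<String>,
}

/// Hypothetical mapping from pool names to mount points.
pub fn mount_point_map() -> HashMap<&'static str, &'static str> {
    HashMap::from([("root", "/"), ("steampool", "/mnt/steampool")])
}

/// Build a pool, falling back to the pool name when no mount point is known.
pub fn build_pool(name: &str, devices: Vec<String>) -> StoragePool {
    let mount_point = match mount_point_map().get(name) {
        Some(mp) => mp.to_string(),
        None => name.to_string(),
    };
    StoragePool {
        name: name.to_string(),
        mount_point,
        devices,
    }
}
```

Displaying `mount_point` rather than the device name keeps the label stable across hosts whose disks enumerate differently.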
Implement agent version tracking to diagnose deployment issues:
- Add get_agent_hash() method to extract Nix store hash from executable path
- Collect system_agent_hash metric in NixOS collector
- Display "Agent Hash" in system panel under NixOS section
- Update metric filtering to include agent hash
This helps identify which version of the agent is actually running
when troubleshooting deployment or metric collection issues.
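
A rough sketch of how the hash extraction could work, assuming the executable lives under a path of the form `/nix/store/<hash>-<name>/bin/<exe>`. The helper `agent_hash_from_path` and the "unknown" fallback are assumptions, not the actual implementation.

```rust
use std::path::Path;

/// Extract the store hash from a path like
/// `/nix/store/<hash>-<name>/bin/<exe>`. Returns None for other paths.
pub fn agent_hash_from_path(exe: &Path) -> Option<String> {
    let rest = exe.strip_prefix("/nix/store").ok()?;
    // First component after the store prefix is "<hash>-<name>".
    let first = rest.components().next()?.as_os_str().to_str()?;
    let (hash, _name) = first.split_once('-')?;
    if hash.is_empty() {
        None
    } else {
        Some(hash.to_string())
    }
}

/// Hash of the currently running executable, or "unknown" (assumed fallback).
pub fn get_agent_hash() -> String {
    std::env::current_exe()
        .ok()
        .and_then(|p| agent_hash_from_path(&p))
        .unwrap_or_else(|| "unknown".to_string())
}
```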
Update metric filtering to use exact metric names instead of prefix matching.
This resolves the issue where build version showed 'unknown' despite agent
correctly collecting the metric.
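
For illustration, exact-name filtering might look like the sketch below. The allow-list contents are hypothetical (only `system_agent_hash` is named in these notes), as are the function names.

```rust
/// Hypothetical allow-list of metric names the system widget consumes.
const SYSTEM_METRICS: &[&str] = &["system_nixos_build", "system_agent_hash"];

/// True only when `name` matches an allowed metric name exactly.
pub fn is_system_metric(name: &str) -> bool {
    SYSTEM_METRICS.contains(&name)
}

/// Keep only the metrics whose names are on the allow-list.
pub fn filter_exact<'a>(names: &[&'a str]) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|n| is_system_metric(n))
        .collect()
}
```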
- Replace the version display with a build string in the format 'hash dd/mm/yy H:M:S'
- Parse nixos-version output to extract short hash and format date
- Update system widget to display 'Build:' instead of 'Version:'
- Remove version/build_date fields in favor of single build string
- Follow TODO.md specification for NixOS section layout
- Add memory_tmp_usage_percent, memory_tmp_used_gb, memory_tmp_total_gb metric parsing
- Fix tmpfs display showing as —% —GB/—GB in dashboard
- System widget now properly receives and displays tmpfs metrics from memory collector
- Remove /tmp autodetection from disk collector (57 lines removed)
- Add tmpfs monitoring to memory collector with get_tmpfs_metrics() method
- Generate memory_tmp_* metrics for proper RAM-based tmpfs monitoring
- Fix type annotations in tmpfs parsing for compilation
- System widget now correctly displays tmpfs usage in RAM section
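
A small sketch of how the three `memory_tmp_*` values could be derived from raw byte counts. The function name, the tuple representation, and the use of 1024^3 bytes per GB are assumptions; how the byte counts are obtained is left out.

```rust
/// Assumed conversion factor (binary gigabytes).
const BYTES_PER_GB: f32 = 1024.0 * 1024.0 * 1024.0;

/// Turn tmpfs byte counts into (metric name, value) pairs.
pub fn tmpfs_metrics(used_bytes: u64, total_bytes: u64) -> Vec<(&'static str, f32)> {
    let used_gb = used_bytes as f32 / BYTES_PER_GB;
    let total_gb = total_bytes as f32 / BYTES_PER_GB;
    // Guard against a zero-sized filesystem to avoid dividing by zero.
    let percent = if total_bytes == 0 {
        0.0
    } else {
        used_bytes as f32 / total_bytes as f32 * 100.0
    };
    vec![
        ("memory_tmp_usage_percent", percent),
        ("memory_tmp_used_gb", used_gb),
        ("memory_tmp_total_gb", total_gb),
    ]
}
```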
- Create NixOS collector for version and active users detection
- Add SystemWidget combining all system information in TODO.md layout
- Replace separate CPU/Memory widgets with unified system display
- Add tree structure for storage with drive temperature/wear info
- Support NixOS version, active users, load averages, memory usage
- Follow exact decimal formatting from specification
- Fix duplicate storage pool issue by clearing cache on agent startup
- Change storage pool header text to normal color for better readability
- Improve services panel tree icons with proper └─ symbols for last items
- Ensure fresh metrics data on each agent restart
Add comprehensive hysteresis support to prevent status oscillation near
threshold boundaries while maintaining responsive alerting.
Key Features:
- HysteresisThresholds with configurable upper/lower limits
- StatusTracker for per-metric status history
- Default gaps: CPU load 10%, memory 5%, disk temp 5°C
Updated Components:
- CPU load collector (5-minute average with hysteresis)
- Memory usage collector (percentage-based thresholds)
- Disk temperature collector (SMART data monitoring)
- All collectors updated to support StatusTracker interface
Cache Interval Adjustments:
- Service status: 60s → 10s (faster response)
- Disk usage: 300s → 60s (more frequent checks)
- Backup status: 900s → 60s (quicker updates)
- SMART data: moved to 600s tier (10 minutes)
Architecture:
- Individual metric status calculation in collectors
- Centralized StatusTracker in MetricCollectionManager
- Status aggregation preserved in dashboard widgets
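
The hysteresis idea can be sketched as follows. `HysteresisThresholds` and `StatusTracker` are the names used above, but the fields, the two-state `Status` enum, and the method signatures are assumptions made for a compact example.

```rust
use std::collections::HashMap;

/// Simplified two-state status (the real set of states may be larger).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
}

/// Enter Warning at or above `upper`; return to Ok only below `lower`.
#[derive(Debug, Clone, Copy)]
pub struct HysteresisThresholds {
    pub upper: f32,
    pub lower: f32,
}

impl HysteresisThresholds {
    /// Decide the next status from the new value and the previous status.
    pub fn evaluate(&self, value: f32, previous: Status) -> Status {
        match previous {
            Status::Ok if value >= self.upper => Status::Warning,
            Status::Warning if value < self.lower => Status::Ok,
            // Inside the gap (or no threshold crossed): keep the old status.
            _ => previous,
        }
    }
}

/// Remembers the last status per metric name.
#[derive(Default)]
pub struct StatusTracker {
    last: HashMap<String, Status>,
}

impl StatusTracker {
    /// Evaluate `value` against `t`, store and return the resulting status.
    pub fn update(&mut self, metric: &str, value: f32, t: &HysteresisThresholds) -> Status {
        let previous = self.last.get(metric).copied().unwrap_or(Status::Ok);
        let next = t.evaluate(value, previous);
        self.last.insert(metric.to_string(), next);
        next
    }
}
```

With `upper: 80.0` and `lower: 75.0` (a 5% gap), a value that wobbles between 78 and 81 flips to Warning once and stays there until it drops below 75.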
- Add support for both proxied and static nginx sites
- Proxied sites show 'P' prefix and check backend URLs
- Static sites check external HTTPS URLs
- Fix services panel column alignment for main services
- Keep 10-second timeout for all site checks
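
One possible way to model the two site kinds, shown only as a sketch. The `NginxSite` type, its fields, and the exact label format are hypothetical; the 'P' prefix, the backend versus external HTTPS check, and the 10-second timeout come from the notes above.

```rust
use std::time::Duration;

/// Timeout applied to every site check.
pub const SITE_CHECK_TIMEOUT: Duration = Duration::from_secs(10);

/// Hypothetical model of an nginx site entry.
#[derive(Debug, Clone, PartialEq)]
pub enum NginxSite {
    Proxied { server_name: String, backend_url: String },
    Static { server_name: String },
}

impl NginxSite {
    /// URL to probe: the backend for proxied sites, external HTTPS otherwise.
    pub fn check_url(&self) -> String {
        match self {
            NginxSite::Proxied { backend_url, .. } => backend_url.clone(),
            NginxSite::Static { server_name } => format!("https://{server_name}"),
        }
    }

    /// Display label; proxied sites get a 'P' prefix.
    pub fn label(&self) -> String {
        match self {
            NginxSite::Proxied { server_name, .. } => format!("P {server_name}"),
            NginxSite::Static { server_name } => server_name.clone(),
        }
    }
}
```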
- Add has_data() method to BackupWidget to check if backup metrics exist
- Modify dashboard layout to conditionally show backup panel only when data exists
- When no backup data: system panel takes full left side height
- When backup data exists: system and backup panels share left side equally
Prevents empty backup panel from taking up screen space unnecessarily.
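
The conditional layout can be sketched without any TUI dependency. `BackupWidget` and `has_data()` are named above; the fields and the `left_column_split` helper are assumptions for illustration.

```rust
/// Simplified backup widget state (fields are illustrative).
#[derive(Default)]
pub struct BackupWidget {
    pub last_run_timestamp: Option<u64>,
    pub total_size_gb: Option<f32>,
}

impl BackupWidget {
    /// True once any backup metric has been received.
    pub fn has_data(&self) -> bool {
        self.last_run_timestamp.is_some() || self.total_size_gb.is_some()
    }
}

/// Left-column heights in percent as (system panel, backup panel).
pub fn left_column_split(backup: &BackupWidget) -> (u16, u16) {
    if backup.has_data() {
        (50, 50)
    } else {
        (100, 0)
    }
}
```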
Removed unused widget subscription system, cache utilities, error variants,
theme functions, and struct fields. Replaced subscription-based widgets
with direct metric filtering. Build now completes with zero warnings.
Resolves widget data persistence issue where switching hosts left stale data
from the previous host displayed in widgets.
Key improvements:
- Add Clone derives to all widget structs (CpuWidget, MemoryWidget,
  ServicesWidget, BackupWidget)
- Create HostWidgets struct to cache widget states per hostname
- Update TuiApp with HashMap<String, HostWidgets> for per-host storage
- Fix borrowing issues by cloning hostname before mutable self borrow
- Implement instant widget state restoration when switching hosts
Switching hosts with the Tab key now shows each host's own cached widget
data, with nothing carried over from the previously selected host.
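
A condensed sketch of the per-host cache. `HostWidgets`, `TuiApp`, `CpuWidget` and the `HashMap<String, HostWidgets>` come from the notes above; the fields and method names are assumptions, and only one widget is shown.

```rust
use std::collections::HashMap;

/// Stand-in for a real widget; only one field for brevity.
#[derive(Clone, Default, Debug, PartialEq)]
pub struct CpuWidget {
    pub load_1min: Option<f32>,
}

/// All widget state belonging to a single host.
#[derive(Clone, Default, Debug, PartialEq)]
pub struct HostWidgets {
    pub cpu: CpuWidget,
}

#[derive(Default)]
pub struct TuiApp {
    pub hosts: Vec<String>,
    pub current: usize,
    pub host_widgets: HashMap<String, HostWidgets>,
}

impl TuiApp {
    /// Widgets for the selected host, created empty on first access.
    pub fn current_widgets(&mut self) -> Option<&mut HostWidgets> {
        // Clone the hostname: entry() needs an owned key, and the clone
        // means no borrow of `self.hosts` is held while the map is mutated.
        let hostname = self.hosts.get(self.current)?.clone();
        Some(self.host_widgets.entry(hostname).or_default())
    }

    /// Advance to the next host (what the Tab key would trigger).
    pub fn next_host(&mut self) {
        if !self.hosts.is_empty() {
            self.current = (self.current + 1) % self.hosts.len();
        }
    }
}
```

Because each host has its own `HostWidgets` entry, switching back to a host restores its last state immediately instead of showing another host's values.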
- Add BackupCollector for reading TOML status files with disk space metrics
- Implement BackupWidget with disk usage display and service status details
- Fix backup script disk space parsing by adding missing capture_output=True
- Update backup widget to show actual disk usage instead of repository size
- Fix timestamp parsing to use backup completion time instead of start time
- Resolve timezone issues by using UTC timestamps in backup script
- Add disk identification metrics (product name, serial number) to backup status
- Enhance UI layout with proper backup monitoring integration
This commit addresses several key issues identified during development:
Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add process monitoring to CpuCollector using the ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information
Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering to prevent self-monitoring artifacts (ps commands)
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"
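
A sketch of the ps-based selection, assuming input in the shape produced by `ps -eo pid,comm,pcpu` (one `<pid> <comm> <%cpu>` row per line). The `TopProcess` type and function names are hypothetical and do not mirror `collect_top_cpu_process()`.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct TopProcess {
    pub pid: u32,
    pub name: String,
    pub cpu_percent: f32,
}

/// Pick the process with the highest CPU usage from ps-style output.
/// Rows that fail to parse (the header, names containing spaces) are skipped.
pub fn top_cpu_process(ps_output: &str) -> Option<TopProcess> {
    ps_output
        .lines()
        .filter_map(|line| {
            let mut parts = line.split_whitespace();
            let pid = parts.next()?.parse().ok()?;
            let name = parts.next()?.to_string();
            let cpu_percent: f32 = parts.next()?.parse().ok()?;
            Some(TopProcess { pid, name, cpu_percent })
        })
        // Drop the ps process itself so the collector does not report its own probe.
        .filter(|p| p.name != "ps" && p.cpu_percent.is_finite())
        .max_by(|a, b| a.cpu_percent.total_cmp(&b.cpu_percent))
}

/// Format as shown in the dashboard, e.g. "name (PID 123) 4.5%".
pub fn format_top(p: &TopProcess) -> String {
    format!("{} (PID {}) {:.1}%", p.name, p.pid, p.cpu_percent)
}
```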
Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy
UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface
Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details