Updated the dashboard system widget to properly display MergerFS storage
pools in the exact format described in CLAUDE.md:
- Pool header showing "mergerfs (2+1):" format
- Total usage line: "├─ Total: ● 63% 2355.2GB/3686.4GB"
- Data Disks section with tree structure
- Individual drive entries: "│ ├─ ● sdb T: 24°C W: 5%"
- Parity drives section: "├─ Parity: ● sdc T: 24°C W: 5%"
- Mount point footer: "└─ Mount: /srv/media"
The dashboard now processes both data_drives and parity_drives arrays from
the agent data correctly and renders the complete MergerFS pool hierarchy
with proper status indicators, temperatures, and wear levels.
Storage display now matches the enhanced tree structure format specified
in documentation with correct Unicode tree characters and spacing.
Replace timestamp parsing with direct display of start_time from backup TOML file to ensure timestamp always appears regardless of format. Remove empty line spacing above backup section for compact layout.
Changes:
- Remove parsed timestamp fields and use raw start_time string from TOML
- Display backup time directly from TOML file without parsing
- Remove blank line above backup section for tighter layout
- Simplify BackupData structure by removing last_run and next_scheduled fields
Version bump to v0.1.150
Replace standalone backup widget with compact backup section in system widget displaying disk serial, temperature, wear level, timing, and usage information.
Changes:
- Remove standalone backup widget and integrate into system widget
- Update backup collector to read TOML format from backup script
- Add BackupDiskData structure with serial, usage, temperature, wear fields
- Implement compact backup display matching specification format
- Add time formatting utilities for backup timing display
- Update backup data extraction from TOML with disk space parsing
Version bump to v0.1.149
Fully restored CM Dashboard as a complete monitoring system with working
status evaluation and email notifications.
COMPLETED PHASES:
✅ Phase 1: Fixed storage display issues
- Use lsblk instead of findmnt (eliminates /nix/store bind mount)
- Fixed NVMe SMART parsing (Temperature: and Percentage Used:)
- Added sudo to smartctl for permissions
- Consistent filesystem and tmpfs sorting
✅ Phase 2a: Fixed missing NixOS build information
- Added build_version field to AgentData
- NixOS collector now populates build info
- Dashboard shows actual build instead of "unknown"
✅ Phase 2b: Restored status evaluation system
- Added status fields to all structured data types
- CPU: load and temperature status evaluation
- Memory: usage status evaluation
- Storage: temperature, health, and filesystem usage status
- All collectors now use their threshold configurations
✅ Phase 3: Restored notification system
- Status change detection between collection cycles
- Email alerts on status degradation (OK→Warning/Critical)
- Detailed notification content with metric values
- Full NotificationManager integration
CORE FUNCTIONALITY RESTORED:
- Real-time monitoring with proper status evaluation
- Email notifications on threshold violations
- Correct storage display (nvme0n1 T: 28°C W: 1%)
- Complete status-aware infrastructure monitoring
- Dashboard is now a monitoring system, not just data viewer
The CM Dashboard monitoring system is fully operational.
Implements clean structured data collection eliminating all string metric
parsing bugs. Collectors now populate AgentData directly with type-safe
field access.
Key improvements:
- Mount points preserved correctly (/ and /boot instead of root/boot)
- Tmpfs discovery added to memory collector
- Temperature data flows as typed f32 fields
- Zero string parsing overhead
- Complete removal of MetricCollectionManager bridge
- Direct ZMQ transmission of structured JSON
All functionality maintained: service tracking, notifications, status
evaluation, and multi-host monitoring.
Update storage display to match CLAUDE.md specification:
- Show drive temp/wear on main line: nvme0n1 T: 25°C W: 4%
- Display individual filesystems as sub-items: /: 55% 250.5GB/456.4GB
- Remove Total usage line in favor of filesystem breakdown
Clean up code warnings:
- Remove unused heartbeat methods and fields
- Remove unused backup widget fields and methods
- Add allow attributes for legacy methods
Remove parentheses from drive temperature/wear display to match the
hierarchical format specified in documentation. Drive details now show
directly with status icons as 'nvme0n1 T: 25°C W: 4%' format.
Replace fragile string-based metrics with type-safe JSON data structures.
Agent converts all metrics to structured data, dashboard processes typed fields.
Changes:
- Add AgentData struct with CPU, memory, storage, services, backup fields
- Replace string parsing with direct field access throughout system
- Maintain UI compatibility via temporary metric bridge conversion
- Fix NVMe temperature display and eliminate string parsing bugs
- Update protocol to support structured data transmission over ZMQ
- Comprehensive metric type coverage: CPU, memory, storage, services, backup
Version bump to 0.1.131
- Display wear percentage in storage headers for single physical drives
- Remove redundant drive type indicators, show wear data instead
- Fix wear metric parsing for physical drives (underscore count issue)
- Add NVMe temperature parsing support (Temperature: format)
- Add raw metrics debugging functionality for troubleshooting
- Clean up physical drive display to remove redundant information
- Display actual drive name (e.g., nvme0n1) instead of mount point for physical drives
- Fix health status parsing for physical drives to show proper status icons
- Update pool name extraction to handle disk_{drive}_health metrics correctly
- Improve storage widget rendering for physical drive identification
- Eliminate hardcoded mappings like 'root' -> '/' and 'steampool' -> '/mnt/steampool'
- Use device names directly for physical drives
- Rely on mount_point metrics from agent for actual mount paths
- Implement zero-configuration architecture as specified in CLAUDE.md
- Remove fallback logic that could extract incorrect pool names
- Simplify pool suffix matching to use explicit arrays
- Ensure only valid metric patterns create pools
- Use actual device names (sdb, sdc) instead of data_0, parity_0
- Fix physical drive naming to show device names instead of mount points
- Update pool name extraction to handle new device-based naming
- Ensure Drive: line shows temperature and wear data for physical drives
- Add SnapRAID parity drive detection to mergerfs discovery
- Remove Pool Status health line as discussed
- Update drive display to always show wear data when available
- Include /mnt/parity drives as part of mergerfs pool structure
- Restructure storage rendering logic to prevent drive duplication
- Use specific mergerfs check instead of generic multi-drive condition
- Ensure drives only appear once under organized data/parity sections
- Update extract_pool_name to handle data_/parity_ drive metrics correctly
- Fix extract_drive_name to parse mergerfs drive roles properly
- Prevent srv_media_data from being parsed as separate pool
- Improve pool name extraction in dashboard parsing
- Use consistent mergerfs pool naming in agent
- Add mount_point metric parsing to use actual mount paths
- Fix pool consolidation to prevent duplicate entries
1. Add missing _fs_ filter to usage_percent parsing in dashboard
2. Fix agent to use calculated fs_status instead of hardcoded Status::Ok
This completes the disk collector auto-discovery by ensuring filesystem
usage percentages and status indicators display correctly.
1. Prevent filesystem _fs_ metrics from overwriting pool totals
2. Fix filesystem name extraction to properly parse boot/root names
This resolves both the pool total display (showing 0.1GB instead of 220GB)
and individual filesystem display (showing —% —GB/—GB).
Prevent string slicing panic in extract_filesystem_metric when
parsing individual filesystem metrics. This resolves the issue
where filesystem entries show —% —GB/—GB instead of actual usage.
Fixed critical bug where dashboard crashed with 'begin <= end' slice error
when parsing disk metrics with new naming format. Added bounds checking
to prevent invalid string slicing operations.
- Fixed extract_pool_name string slicing bounds check
- Removed ineffective panic handling that caused infinite loop
- Dashboard now handles new disk collector metrics correctly
Added comprehensive error handling to storage metrics parsing to prevent
dashboard crashes when encountering unexpected metric formats or parsing
errors. Dashboard now continues gracefully with empty storage display
instead of crashing, improving reliability during metric format changes.
- Wrapped storage metric parsing in panic recovery
- Added logging for metric parsing failures
- Dashboard shows empty storage on errors instead of crashing
- Ensures dashboard remains functional during agent updates
- Make filesystem display more forgiving - show partial data if available
- Will display usage% even if GB values are missing, or vice versa
- This should help identify which specific metrics aren't being populated
- Debug version to identify filesystem data population issues
- Fix extract_filesystem_metric() to handle multi-underscore metric names correctly
- Parse known metric suffixes (usage_percent, mount_point, available_gb, etc.)
- Prevent incorrect parsing like boot_mount_point -> fs_name='boot_mount', metric_type='point'
- Should now correctly show /boot and / instead of /boot/mount and /root/mount
- Allow filesystem entries to be created with any metric, not just mount_point
- Ensure filesystem children appear under physical drive pools
- Improve mount point fallback logic for better compatibility
- Fix extract_pool_name() to handle filesystem metrics (_fs_) correctly
- Prevent individual filesystem pools (nvme0n1_fs_boot, nvme0n1_fs_root) from being created
- Fix incorrect mount point names (was showing /root/mount instead of /)
- Only create filesystem entries when receiving mount_point metrics
- Add available_gb field to FileSystem struct for proper available space handling
- Ensure filesystem children show correct usage data instead of —% —GB/—GB
- Implement filesystem children display under physical drive pools
- Agent generates individual filesystem metrics for each mount point
- Dashboard parses filesystem metrics and displays as tree children
- Add filesystem usage, total, and available space metrics
- Support target format: drive info + filesystem children hierarchy
- Fix compilation warnings by properly using available_bytes calculation
- Add support for mergerfs pool grouping with data and parity disk separation
- Implement pool health monitoring (healthy/degraded/critical status)
- Create hierarchical tree view for multi-disk storage arrays
- Add automatic pool type detection and member disk association
- Maintain backward compatibility for single disk configurations
- Support future extension for RAID and ZFS pool types
- Add disk wear percentage collection from SMART data in backup script
- Add backup_disk_wear_percent metric to backup collector with thresholds
- Display wear percentage in backup widget disk section
- Fix storage section overflow handling to use consistent "X more below" logic
- Update maintenance mode to return pending status instead of unknown
- Remove scroll offset fields from HostWidgets struct
- Replace scrolling with simple "X more below" indicators in all widgets
- Remove user-stopped service tracking from agent (now uses SSH control)
- Inactive services now consistently show Status::Inactive with empty circles
- Simplify widget render methods by removing scroll parameters
- Clean up unused imports and legacy scrolling infrastructure
- Fix journalctl command to use -fu for proper log following
Display the connection IP address that the dashboard is configured to use
for each host below the Agent version information. Shows which network
path (local/Tailscale) is being used for connections based on host
configuration.
Features:
- Display detected IP below Agent row in system widget
- Uses existing host configuration connection logic
- Shows actual IP being used for dashboard connections
- Change system panel title from 'NixOS:' to 'NixOS hostname:'
- Make main dashboard title 'cm-dashboard' bold in top bar
- Remove unused Typography::title() function to fix warnings
- Update SystemWidget::render_with_scroll to accept hostname parameter
- Update version to 0.1.41 in all Cargo.toml files and dashboard code
- Fix disk drive name extraction for mount points with underscores (e.g., /mnt/steampool)
- Replace confusing "1" and "2" drive names with proper device names like "sda1", "sda2"
- Update title bar with blue background and dark text styling
- Right-align host list in title bar while keeping "cm-dashboard" on left
- Bump version to v0.1.35
- Fix /tmp usage status to use proper thresholds instead of hardcoded Ok status
- Fix wear level status to use configurable thresholds instead of hardcoded values
- Add dedicated tmp_status field to SystemWidget for proper /tmp status display
- Remove host-level hourglass icon during service operations
- Implement immediate service status updates after start/stop/restart commands
- Remove active users display and collection from NixOS section
- Fix immediate host status aggregation transmission to dashboard
- Remove all SystemRebuild command infrastructure from agent and dashboard
- Replace with direct tmux popup execution: ssh {user}@{host} {alias}
- Add configurable SSH user and rebuild alias in dashboard config
- Eliminate agent process crashes during rebuilds
- Simplify architecture by removing ZMQ command streaming complexity
- Clean up all related dead code and fix compilation warnings
Benefits:
- Process isolation: rebuild runs independently via SSH
- Crash resilience: agent/dashboard can restart without affecting rebuilds
- Configuration flexibility: SSH user and alias configurable per deployment
- Operational simplicity: standard tmux popup interface
- Remove auto-close behavior from terminal popup for manual review
- Fix system panel to show correct NixOS section layout
- Add missing Active users line after Agent version
- Switch agent version from nix store hash to actual version number (v0.1.11)
- Display full version string without truncation for clear version tracking
- Add terminal popup UI component with 80% screen coverage and terminal styling
- Extend ZMQ protocol with CommandOutputMessage for streaming output
- Implement real-time output streaming in agent system rebuild handler
- Add keyboard controls (ESC/Q to close, ↑↓ to scroll) for popup interaction
- Fix system panel Build display to show actual NixOS build instead of config hash
- Update service filters in README with wildcard patterns for better matching
- Add periodic progress updates during nixos-rebuild execution
- Integrate command output handling in dashboard main loop
- Agent reports version via agent_version metric using nix store hash
- Dashboard displays agent version in system widget
- Foundation for cross-host version comparison
- Both agent -V and dashboard show versions
- Remove Config field completely from NixOS section
- Build: now shows NixOS system hash (from /run/current-system)
- Agent: shows cm-dashboard package hash (first 8 chars)
Build and Agent now display different hashes as intended.
- Add automatic timeout mechanism (5 minutes for rebuilds, 30 seconds for services)
- Implement agent hash change detection for rebuild completion
- Add visual feedback states: blue ↻ (in progress), green ✓ (success), red ✗ (failed)
- Clear status automatically after timeout or completion
- Fix command status lifecycle management
- Remove Network panel from navigation cycle
- Fix system panel scrolling to work in both directions
- Add complete scroll support to Services and Backup panels
- Update panel cycling to System → Services → Backup only
- Enhance scroll indicators with proper bounds checking
- Clean up unused Network panel code and references
Resolves issues with non-functional up/down scrolling and
mystery network panel appearing during navigation.
- Add Typography::tree() style using blue Theme::highlight() color
- Update system, backup, and services widgets to use consistent blue tree styling
- Centralizes tree color management in theme module for easy maintenance
Restore "... and X more" indicators when panel content doesn't fit:
- System widget: Shows overflow for storage pools when area is too small
- Backup widget: Shows overflow for repository list when area is too small
- Maintains consistent formatting with existing services widget overflow
Update all tree symbols (└─, ├─) in system and backup widgets to use
Typography::secondary() style instead of raw text for consistent
text coloring throughout the interface.
Backup widget:
- Restructure to match new layout specification
- Add section headers: Latest backup, Disk, Repos
- Show timestamp with status icon and duration as sub-item
- Display disk info with product name, S/N, and usage in tree structure
- List repositories with archive count and size
- Remove old render methods and unused imports
System widget:
- Hide (Single) storage type label for cleaner display