Docker images now display with distinctive 🐋 whale icon in blue (highlight color) instead of status icons. This provides clear visual identification that these are docker images while not implying operational status.
Changes:
- Rename docker images from 'image_node:18...' to 'I node:18...' for conciseness
- Change image status from 'active' to 'inactive' for neutral informational display
- Images now show with gray empty circle ○ instead of green filled circle ●
Docker images are static artifacts without meaningful operational status, so using inactive status provides neutral gray display that won't trigger alerts or affect service status aggregation.
Fixes random host disconnections caused by blocking operations preventing timely ZMQ packet transmission.
Changes:
- Add run_command_with_timeout() wrapper using tokio for async command execution
- Apply 10s timeout to smartctl (prevents 30+ second hangs on failing drives)
- Apply 5s timeout to du, lsblk, systemctl list commands
- Apply 3s timeout to systemctl show/is-active, df, ip commands
- Apply 2s timeout to hostname command
- Use system 'timeout' command for sync operations where async not needed
Critical fixes:
- smartctl: Failing drives could block for 30+ seconds per drive
- du: Large directories (Docker, PostgreSQL) could block 10-30+ seconds
- systemctl/docker: Commands could block indefinitely during system issues
With 1-second collection interval and 10-second heartbeat timeout, any blocking operation >10s causes false "host offline" alerts. These timeouts ensure collection completes quickly even during system degradation.
Agent changes:
- Added debug logging to Docker images collection function
- Log when Docker sub-services are being collected for a service
- Log count of containers and images found
- Log total sub-services added
- Show command failure details instead of silently returning empty vec
This will help diagnose why Docker images aren't showing up as sub-services
on some hosts. The logs will show if the docker commands are failing or if
the collection is working but data isn't being transmitted properly.
Updated to version 0.1.175
Agent changes:
- Added get_docker_images() function to list all Docker images
- Use docker images to show stored images with repository:tag and size
- Display images as sub-services under docker service with size in parentheses
- Skip dangling images (<none>:<none>)
- Images shown with active status (always present when listed)
Example display:
● docker active 139M 1MB
├─ ● docker_gitea active
├─ ○ docker_old-app inactive
├─ ● image_nginx:latest (142MB)
├─ ● image_postgres:15 (379MB)
└─ ● image_gitea:latest (256MB)
Updated to version 0.1.174
Agent changes:
- Use docker ps -a to show ALL containers (running and stopped)
- Map container status: Up -> active, Exited/Created -> inactive, other -> failed
- Display Docker containers as sub-services under the docker service
- Each container shown with proper status indicator
Example display:
● docker active 139M 1MB
├─ ● docker_gitea active
├─ ○ docker_old-app inactive
└─ ● docker_immich active
Updated to version 0.1.173
Agent changes:
- Changed docker ps to docker ps -a to show ALL containers (running and stopped)
- Map container status: Up -> active, Exited/Created -> inactive, other -> failed
- Display Docker containers as individual top-level services instead of sub-services
- Each container shown as "docker_{container_name}" in service list
This provides better visibility of all containers and their status directly in the
services panel, making it easier to see stopped containers at a glance.
Updated to version 0.1.172
Dashboard changes:
- Sort child interfaces under physical NICs with VLANs first (by VLAN ID ascending)
- Non-VLAN virtual interfaces sorted alphabetically by name
- Applied same sorting to both nested children and standalone virtual interfaces
Example output order:
- wan (vlan 5)
- lan (vlan 30)
- isolan (vlan 32)
- seclan (vlan 35)
- br-48df2d79b46f
- docker0
- tailscale0
Updated to version 0.1.171
Agent changes:
- Parse /proc/net/vlan/config to extract VLAN IDs for interfaces
- Detect primary physical interface via default route
- Auto-assign primary interface as parent for virtual interfaces without explicit parent
- Added vlan_id field to NetworkInterfaceData
Dashboard changes:
- Display VLAN ID in format "interface (vlan X): IP"
- Show VLAN IDs for both nested and standalone virtual interfaces
This ensures virtual interfaces (docker0, tailscale0, etc.) are properly nested
under the primary physical NIC, and VLAN interfaces show their IDs.
Updated to version 0.1.170
Agent changes:
- Filter out ifb* interfaces from network display
- Parse @parent notation for VLAN interfaces (e.g., lan@enp0s31f6)
- Show physical interfaces even without IP addresses
- Only filter virtual interfaces that have no IPs
- Extract parent interface relationships for proper nesting
Dashboard changes:
- Nest VLAN/child interfaces under their physical parent
- Show physical NICs with status icons even when down
- Display child interfaces grouped under parent interface
- Keep standalone virtual interfaces at root level
Updated to version 0.1.169
Reordered display sections in system widget:
- Network section now appears after RAM and tmpfs mounts
- Improves logical grouping by placing network info between memory and storage
- Updated to version 0.1.168
Nest IP addresses under physical interface names. Show physical interfaces with status icon on header line. Virtual interfaces show inline with compressed IPs.
Format:
● eno1:
├─ ip: 192.168.30.105
└─ tailscale0: 100.125.108.16
Version bump to 0.1.166
Move network collection from NixOS collector to dedicated NetworkCollector. Add link status detection for physical interfaces (up/down). Group interfaces by physical/virtual, show status icons for physical NICs only. Down interfaces show as Inactive instead of Critical.
Version bump to 0.1.165
Network interfaces now display without status icons since there's no meaningful status to show. Just shows interface name and IP addresses with subnet compression.
Version bump to 0.1.164
Compress IPv4 addresses from same subnet to save space. Shows first IP in full (192.168.30.1) and subsequent IPs in same subnet with only last octet (100, 142).
Version bump to 0.1.163
Extend NixOS collector to gather network interfaces using ip command JSON output. Display all interfaces with IPv4 and IPv6 addresses in Network section above CPU metrics. Filters out loopback and link-local addresses.
Version bump to 0.1.161
Update systemd collector to use sudo for docker ps command to resolve
permission issues when cm-agent user lacks docker group membership.
This ensures Docker containers are properly discovered and displayed
as sub-services under the docker service.
Version: 0.1.160
- Support Temperature_Case attribute for Intel SSDs
- Support Media_Wearout_Indicator attribute for wear percentage
- Parse wear value from column 3 (VALUE) for Media_Wearout_Indicator
- Fixes temperature and wear display for Intel PHLA847000FL512DGN drives
- Fix physical drive serial number display in dashboard
- Improve pool health calculation for arrays with multiple disks
- Support proper tree symbols for multiple parity drives
- Read git commit hash from /var/lib/cm-dashboard/git-commit for Build display
- Fix NVMe serial number parsing to handle whitespace variations
- Move mount point to MergerFS header, remove drive count
- Restructure data drives to same level as parity with Data_1, Data_2 labels
- Remove "Total:" label from pool usage line
- Update parity to use closing tree symbol as last item
- Add serial_number field to DriveData structure
- Collect serial numbers from SMART data for all drives
- Display truncated serial numbers (last 8 chars) in dashboard
- Fix parity drive label to show status icon before "Parity:"
- Fix mount point label styling to match other labels
Updates disk collector and dashboard to show drive serial numbers
instead of device names (sdX) for MergerFS data/parity drives.
Agent extracts serial numbers from SMART data and dashboard
displays them when available, falling back to device names.
Updated the dashboard system widget to properly display MergerFS storage
pools in the exact format described in CLAUDE.md:
- Pool header showing "mergerfs (2+1):" format
- Total usage line: "├─ Total: ● 63% 2355.2GB/3686.4GB"
- Data Disks section with tree structure
- Individual drive entries: "│ ├─ ● sdb T: 24°C W: 5%"
- Parity drives section: "├─ Parity: ● sdc T: 24°C W: 5%"
- Mount point footer: "└─ Mount: /srv/media"
The dashboard now processes both data_drives and parity_drives arrays from
the agent data correctly and renders the complete MergerFS pool hierarchy
with proper status indicators, temperatures, and wear levels.
Storage display now matches the enhanced tree structure format specified
in documentation with correct Unicode tree characters and spacing.
Updated the disk collector to include all missing functionality from the
previous string-based implementation while working with the new structured
JSON data architecture:
- MergerFS pool discovery from /proc/mounts parsing
- SnapRAID parity drive detection via mount path heuristics
- Drive categorization (data vs parity) based on path analysis
- Numeric mergerfs reference resolution (1:2 -> /mnt/disk paths)
- Pool health calculation based on member drive SMART status
- Complete SMART data integration for temperatures and wear levels
- Proper exclusion of pool member drives from physical drive grouping
The implementation replicates the exact logic from the old code while
adapting to structured AgentData output format. All mergerfs and snapraid
monitoring capabilities are fully restored.
Replace timestamp parsing with direct display of start_time from backup TOML file to ensure timestamp always appears regardless of format. Remove empty line spacing above backup section for compact layout.
Changes:
- Remove parsed timestamp fields and use raw start_time string from TOML
- Display backup time directly from TOML file without parsing
- Remove blank line above backup section for tighter layout
- Simplify BackupData structure by removing last_run and next_scheduled fields
Version bump to v0.1.150
Replace standalone backup widget with compact backup section in system widget displaying disk serial, temperature, wear level, timing, and usage information.
Changes:
- Remove standalone backup widget and integrate into system widget
- Update backup collector to read TOML format from backup script
- Add BackupDiskData structure with serial, usage, temperature, wear fields
- Implement compact backup display matching specification format
- Add time formatting utilities for backup timing display
- Update backup data extraction from TOML with disk space parsing
Version bump to v0.1.149
Resolves nginx sites appearing only briefly during collection cycles by implementing proper caching of complete service data including sub-services.
Changes:
- Add cached_service_data field to store complete ServiceData with sub-services
- Modify collection logic to cache full service objects instead of basic ServiceInfo
- Update cache retrieval to use complete cached data preserving nginx site metrics
- Eliminate flickering of nginx sites between collection cycles
Version bump to v0.1.148
- Remove nginx_ prefix from site names in hierarchical structure
- Fix get_nginx_site_metrics to call correct internal method
- Implement same caching functionality as old working version
- Sites now stay visible continuously with 30s latency updates
- Preserve cached results between refresh cycles
- Remove duplicate status string fields from ServiceData and SubServiceData
- Use only Status enum as single source of truth for service status
- Agent calculates Status enum using calculate_service_status()
- Dashboard converts Status enum to display text for UI
- Implement flexible metrics system for sub-services with label/value/unit
- Fix status icon/text mismatches (inactive services now show gray circles)
- Ensure perfect alignment between service icons and status text
- Add nginx site metrics caching with configurable intervals matching original
- Implement complex nginx config parsing with brace counting and redirect detection
- Replace curl with reqwest HTTP client for proper timeout and redirect handling
- Fix docker container parsing to use comma format with proper status mapping
- Add sudo to directory size command for permission handling
- Change nginx URLs to use https protocol matching original
- Add advanced NixOS ExecStart parsing for argv[] format support
- Add nginx -T fallback functionality for config discovery
- Implement proper server block parsing with domain validation and brace tracking
- Add get_service_memory function matching original signature
All functionality now matches pre-refactor implementation architecture.
Phase 1 fixes for storage display:
- Replace findmnt with lsblk to eliminate bind mount issues (/nix/store)
- Add sudo to smartctl commands for permission access
- Fix NVMe SMART parsing for Temperature: and Percentage Used: fields
- Use dynamic version from CARGO_PKG_VERSION instead of hardcoded strings
Storage display should now show correct mount points and temperature/wear.
Status evaluation and notifications still need restoration in subsequent phases.
Implements clean structured data collection eliminating all string metric
parsing bugs. Collectors now populate AgentData directly with type-safe
field access.
Key improvements:
- Mount points preserved correctly (/ and /boot instead of root/boot)
- Tmpfs discovery added to memory collector
- Temperature data flows as typed f32 fields
- Zero string parsing overhead
- Complete removal of MetricCollectionManager bridge
- Direct ZMQ transmission of structured JSON
All functionality maintained: service tracking, notifications, status
evaluation, and multi-host monitoring.
Update storage display to match CLAUDE.md specification:
- Show drive temp/wear on main line: nvme0n1 T: 25°C W: 4%
- Display individual filesystems as sub-items: /: 55% 250.5GB/456.4GB
- Remove Total usage line in favor of filesystem breakdown
Clean up code warnings:
- Remove unused heartbeat methods and fields
- Remove unused backup widget fields and methods
- Add allow attributes for legacy methods
Remove parentheses from drive temperature/wear display to match the
hierarchical format specified in documentation. Drive details now show
directly with status icons as 'nvme0n1 T: 25°C W: 4%' format.
Agent heartbeat was sending empty AgentData every few seconds, causing
dashboard to display zero values for all metrics intermittently. Since
agent already transmits complete data every 1 second, heartbeat is
redundant. Dashboard will detect offline hosts via data timestamps.