Compare commits

..

43 Commits

Author SHA1 Message Date
bd22ce265b Use direct smartctl with CAP_SYS_RAWIO instead of sudo
All checks were successful
Build and Release / build-and-release (push) Successful in 1m9s
2025-11-27 13:22:13 +01:00
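For context, the capability-based replacement for sudo is normally granted in the agent's systemd unit. A minimal sketch, using standard systemd directives; the actual unit shipped by this repo is not part of this compare, so the paths, group, and hardening set shown here are assumptions:

```ini
[Service]
User=cm-agent
ExecStart=/usr/bin/cm-dashboard-agent   # assumed path
# Grant raw-device ioctl access so smartctl works without sudo
AmbientCapabilities=CAP_SYS_RAWIO
CapabilityBoundingSet=CAP_SYS_RAWIO
# The capability does not bypass file permissions; the user still needs
# read access to /dev/sd* and /dev/nvme* (commonly via the disk group)
SupplementaryGroups=disk
```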
bbc8b7b1cb Add info-level logging for SMART data collection debugging
All checks were successful
Build and Release / build-and-release (push) Successful in 1m19s
2025-11-27 13:15:53 +01:00
5dd8cadef3 Remove debug logging from Docker collection code
All checks were successful
Build and Release / build-and-release (push) Successful in 1m19s
2025-11-27 12:50:20 +01:00
fefe30ec51 Remove sudo from docker commands - use docker group membership instead
All checks were successful
Build and Release / build-and-release (push) Successful in 1m19s
Agent changes:
- Changed docker ps and docker images commands to run without sudo
- cm-agent user is already in docker group, so sudo is not needed
- Fixes "unable to change to root gid: Operation not permitted" error
- Systemd security restrictions were blocking sudo gid changes

This fixes Docker container and image collection on systems with
systemd security hardening enabled.

Updated to version 0.1.178
2025-11-27 12:35:38 +01:00
fb40cce748 Add stderr logging for Docker images command failure
All checks were successful
Build and Release / build-and-release (push) Successful in 1m9s
Agent changes:
- Log stderr output when docker images command fails
- This will show the actual error message (e.g., permission denied, docker not found)
- Helps diagnose why docker images collection is failing

Updated to version 0.1.177
2025-11-27 12:28:55 +01:00
eaa057b284 Change Docker collection logging from debug to info level
All checks were successful
Build and Release / build-and-release (push) Successful in 1m10s
Agent changes:
- Changed debug!() to info!() for Docker collection logs
- This allows logs to show with default RUST_LOG=info setting
- Added info import to tracing use statement

Now logs will be visible in journalctl without needing to change log level:
- "Collecting Docker sub-services for service: docker"
- "Found X Docker containers"
- "Found X Docker images"
- "Total Docker sub-services added: X"

Updated to version 0.1.176
2025-11-27 12:18:17 +01:00
f23a1b5cec Add debug logging for Docker container and image collection
All checks were successful
Build and Release / build-and-release (push) Successful in 1m10s
Agent changes:
- Added debug logging to Docker images collection function
- Log when Docker sub-services are being collected for a service
- Log count of containers and images found
- Log total sub-services added
- Show command failure details instead of silently returning empty vec

This will help diagnose why Docker images aren't showing up as sub-services
on some hosts. The logs will show if the docker commands are failing or if
the collection is working but data isn't being transmitted properly.

Updated to version 0.1.175
2025-11-27 12:04:51 +01:00
3f98f68b51 Show Docker images as sub-services under docker service
All checks were successful
Build and Release / build-and-release (push) Successful in 1m23s
Agent changes:
- Added get_docker_images() function to list all Docker images
- Use docker images to show stored images with repository:tag and size
- Display images as sub-services under docker service with size in parentheses
- Skip dangling images (<none>:<none>)
- Images shown with active status (always present when listed)

Example display:
● docker                      active     139M     1MB
  ├─ ● docker_gitea           active
  ├─ ○ docker_old-app         inactive
  ├─ ● image_nginx:latest     (142MB)
  ├─ ● image_postgres:15      (379MB)
  └─ ● image_gitea:latest     (256MB)

Updated to version 0.1.174
2025-11-27 11:43:35 +01:00
3d38a7a984 Show all Docker containers as sub-services with active/inactive status
All checks were successful
Build and Release / build-and-release (push) Successful in 1m9s
Agent changes:
- Use docker ps -a to show ALL containers (running and stopped)
- Map container status: Up -> active, Exited/Created -> inactive, other -> failed
- Display Docker containers as sub-services under the docker service
- Each container shown with proper status indicator

Example display:
● docker                 active     139M     1MB
  ├─ ● docker_gitea      active
  ├─ ○ docker_old-app    inactive
  └─ ● docker_immich     active

Updated to version 0.1.173
2025-11-27 10:56:15 +01:00
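A minimal sketch of the status mapping this commit describes, assuming `docker ps -a` is called with a comma-separated `--format` (the collector's exact format string and types are not shown in this compare):

```rust
use std::process::Command;

#[derive(Debug)]
enum ContainerStatus {
    Active,   // "Up ..."
    Inactive, // "Exited ..." or "Created ..."
    Failed,   // anything else
}

// Hypothetical helper mirroring the Up/Exited/Created mapping above.
fn collect_containers() -> Vec<(String, ContainerStatus)> {
    let output = match Command::new("docker")
        .args(["ps", "-a", "--format", "{{.Names}},{{.Status}}"])
        .output()
    {
        Ok(o) if o.status.success() => o,
        _ => return Vec::new(), // docker missing or not permitted
    };
    String::from_utf8_lossy(&output.stdout)
        .lines()
        .filter_map(|line| {
            let (name, status) = line.split_once(',')?;
            let mapped = if status.starts_with("Up") {
                ContainerStatus::Active
            } else if status.starts_with("Exited") || status.starts_with("Created") {
                ContainerStatus::Inactive
            } else {
                ContainerStatus::Failed
            };
            Some((name.to_string(), mapped))
        })
        .collect()
}
```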
b0ee0242bd Show all Docker containers as top-level services with active/inactive status
All checks were successful
Build and Release / build-and-release (push) Successful in 1m20s
Agent changes:
- Changed docker ps to docker ps -a to show ALL containers (running and stopped)
- Map container status: Up -> active, Exited/Created -> inactive, other -> failed
- Display Docker containers as individual top-level services instead of sub-services
- Each container shown as "docker_{container_name}" in service list

This provides better visibility of all containers and their status directly in the
services panel, making it easier to see stopped containers at a glance.

Updated to version 0.1.172
2025-11-27 10:51:47 +01:00
8f9e9eabca Sort virtual interfaces: VLANs first by ID, then alphabetically
All checks were successful
Build and Release / build-and-release (push) Successful in 1m32s
Dashboard changes:
- Sort child interfaces under physical NICs with VLANs first (by VLAN ID ascending)
- Non-VLAN virtual interfaces sorted alphabetically by name
- Applied same sorting to both nested children and standalone virtual interfaces

Example output order:
- wan (vlan 5)
- lan (vlan 30)
- isolan (vlan 32)
- seclan (vlan 35)
- br-48df2d79b46f
- docker0
- tailscale0

Updated to version 0.1.171
2025-11-27 10:12:59 +01:00
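The described ordering reduces to a single comparator: VLANs sort before non-VLANs and by ID among themselves, everything else alphabetically. A sketch with a hypothetical child-interface type (the dashboard's actual type differs):

```rust
use std::cmp::Ordering;

// Hypothetical record standing in for the dashboard's interface data.
struct ChildIface {
    name: String,
    vlan_id: Option<u16>,
}

fn sort_children(children: &mut [ChildIface]) {
    children.sort_by(|a, b| match (a.vlan_id, b.vlan_id) {
        (Some(x), Some(y)) => x.cmp(&y),     // VLANs by ID ascending
        (Some(_), None) => Ordering::Less,   // VLANs before non-VLANs
        (None, Some(_)) => Ordering::Greater,
        (None, None) => a.name.cmp(&b.name), // others alphabetically
    });
}
```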
937f4ad427 Add VLAN ID display and smart parent assignment for virtual interfaces
All checks were successful
Build and Release / build-and-release (push) Successful in 1m43s
Agent changes:
- Parse /proc/net/vlan/config to extract VLAN IDs for interfaces
- Detect primary physical interface via default route
- Auto-assign primary interface as parent for virtual interfaces without explicit parent
- Added vlan_id field to NetworkInterfaceData

Dashboard changes:
- Display VLAN ID in format "interface (vlan X): IP"
- Show VLAN IDs for both nested and standalone virtual interfaces

This ensures virtual interfaces (docker0, tailscale0, etc.) are properly nested
under the primary physical NIC, and VLAN interfaces show their IDs.

Updated to version 0.1.170
2025-11-27 09:52:45 +01:00
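For reference, the kernel's 8021q module exposes VLAN mappings in /proc/net/vlan/config as two header lines followed by `name | VLAN ID | parent` rows, which is why the parser (see the network collector diff below) skips two lines and splits on `|`. An illustrative sample, with interface names taken from the example elsewhere in this log:

```text
VLAN Dev name    | VLAN ID
Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
wan            | 5   | enp0s31f6
lan            | 30  | enp0s31f6
```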
8aefab83ae Fix network interface display for VLANs and physical NICs
All checks were successful
Build and Release / build-and-release (push) Successful in 1m11s
Agent changes:
- Filter out ifb* interfaces from network display
- Parse @parent notation for VLAN interfaces (e.g., lan@enp0s31f6)
- Show physical interfaces even without IP addresses
- Only filter virtual interfaces that have no IPs
- Extract parent interface relationships for proper nesting

Dashboard changes:
- Nest VLAN/child interfaces under their physical parent
- Show physical NICs with status icons even when down
- Display child interfaces grouped under parent interface
- Keep standalone virtual interfaces at root level

Updated to version 0.1.169
2025-11-26 23:47:16 +01:00
748a9f3a3b Move Network section below RAM in system widget
All checks were successful
Build and Release / build-and-release (push) Successful in 1m11s
Reordered display sections in system widget:
- Network section now appears after RAM and tmpfs mounts
- Improves logical grouping by placing network info between memory and storage
- Updated to version 0.1.168
2025-11-26 23:23:56 +01:00
5c6b11c794 Filter out network interfaces without IP addresses
All checks were successful
Build and Release / build-and-release (push) Successful in 1m9s
Remove interfaces such as ifb0 and dummy devices that have no IPs. Only show interfaces with at least one IPv4 or IPv6 address.

Version bump to 0.1.167
2025-11-26 19:19:21 +01:00

9f0aa5f806 Update network display format to match CLAUDE.md specification
All checks were successful
Build and Release / build-and-release (push) Successful in 1m38s
Nest IP addresses under physical interface names. Show physical interfaces with status icon on header line. Virtual interfaces show inline with compressed IPs.

Format:
● eno1:
  ├─ ip: 192.168.30.105
  └─ tailscale0: 100.125.108.16

Version bump to 0.1.166
2025-11-26 19:13:28 +01:00
fc247bd0ad Create dedicated network collector with physical/virtual interface grouping
All checks were successful
Build and Release / build-and-release (push) Successful in 1m43s
Move network collection from NixOS collector to dedicated NetworkCollector. Add link status detection for physical interfaces (up/down). Group interfaces by physical/virtual, show status icons for physical NICs only. Down interfaces show as Inactive instead of Critical.

Version bump to 0.1.165
2025-11-26 19:02:50 +01:00
00fe8c28ab Remove status icon from network interface display
All checks were successful
Build and Release / build-and-release (push) Successful in 1m20s
Network interfaces now display without status icons since there's no meaningful status to show. Just shows interface name and IP addresses with subnet compression.

Version bump to 0.1.164
2025-11-26 18:15:01 +01:00
fbbb4a4cfb Add subnet compression for IP address display
All checks were successful
Build and Release / build-and-release (push) Successful in 1m8s
Compress IPv4 addresses from same subnet to save space. Shows first IP in full (192.168.30.1) and subsequent IPs in same subnet with only last octet (100, 142).

Version bump to 0.1.163
2025-11-26 18:10:08 +01:00
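A sketch of the compression, assuming "same subnet" means the same first three octets (/24); the real code may derive the prefix from the interface's netmask instead:

```rust
// compress_ipv4(&["192.168.30.1".into(), "192.168.30.100".into(), "192.168.30.142".into()])
// -> "192.168.30.1, 100, 142"
fn compress_ipv4(addrs: &[String]) -> String {
    let mut parts: Vec<String> = Vec::new();
    let mut current_prefix: Option<&str> = None;
    for addr in addrs {
        match (addr.rsplit_once('.'), current_prefix) {
            // Same /24 as the previous address: keep only the last octet.
            (Some((prefix, last)), Some(prev)) if prefix == prev => {
                parts.push(last.to_string());
            }
            (split, _) => {
                parts.push(addr.clone());
                current_prefix = split.map(|(p, _)| p);
            }
        }
    }
    parts.join(", ")
}
```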
53e1d8bbce Version bump to 0.1.162
All checks were successful
Build and Release / build-and-release (push) Successful in 1m44s
2025-11-26 18:01:31 +01:00
1b9fecea98 Fix nixosbox file path in release workflow
Some checks failed
Build and Release / build-and-release (push) Has been cancelled
Correct path from hosts/services/cm-dashboard.nix to services/cm-dashboard.nix
2025-11-26 17:55:28 +01:00
b7ffeaced5 Add network interface collection and display
Some checks failed
Build and Release / build-and-release (push) Failing after 1m32s
Extend NixOS collector to gather network interfaces using ip command JSON output. Display all interfaces with IPv4 and IPv6 addresses in Network section above CPU metrics. Filters out loopback and link-local addresses.

Version bump to 0.1.161
2025-11-26 17:41:35 +01:00
3858309a5d Fix Docker container detection with sudo permissions
Some checks failed
Build and Release / build-and-release (push) Failing after 1m19s
Update systemd collector to use sudo for docker ps command to resolve
permission issues when cm-agent user lacks docker group membership.
This ensures Docker containers are properly discovered and displayed
as sub-services under the docker service.

Version: 0.1.160
2025-11-25 12:40:27 +01:00
df104bf940 Remove debug prints and unused code
All checks were successful
Build and Release / build-and-release (push) Successful in 1m19s
- Remove all debug println statements
- Remove unused service_tracker module
- Remove unused struct fields and methods
- Remove empty placeholder files (cpu.rs, memory.rs, defaults.rs)
- Fix all compiler warnings
- Clean build with zero warnings

Version bump to 0.1.159
2025-11-25 12:19:04 +01:00
d5ce36ee18 Add support for additional SMART attributes
All checks were successful
Build and Release / build-and-release (push) Successful in 1m30s
- Support Temperature_Case attribute for Intel SSDs
- Support Media_Wearout_Indicator attribute for wear percentage
- Parse wear value from column 3 (VALUE) for Media_Wearout_Indicator
- Fixes temperature and wear display for Intel PHLA847000FL512DGN drives
2025-11-25 11:53:08 +01:00
4f80701671 Fix NVMe serial display and improve pool health logic
All checks were successful
Build and Release / build-and-release (push) Successful in 1m20s
- Fix physical drive serial number display in dashboard
- Improve pool health calculation for arrays with multiple disks
- Support proper tree symbols for multiple parity drives
- Read git commit hash from /var/lib/cm-dashboard/git-commit for Build display
2025-11-25 11:44:20 +01:00
267654fda4 Improve NVMe serial parsing and restructure MergerFS display
All checks were successful
Build and Release / build-and-release (push) Successful in 1m25s
- Fix NVMe serial number parsing to handle whitespace variations
- Move mount point to MergerFS header, remove drive count
- Restructure data drives to same level as parity with Data_1, Data_2 labels
- Remove "Total:" label from pool usage line
- Update parity to use closing tree symbol as last item
2025-11-25 11:28:54 +01:00
dc1105eefe Display disk serial numbers instead of device names
All checks were successful
Build and Release / build-and-release (push) Successful in 1m18s
- Add serial_number field to DriveData structure
- Collect serial numbers from SMART data for all drives
- Display truncated serial numbers (last 8 chars) in dashboard
- Fix parity drive label to show status icon before "Parity:"
- Fix mount point label styling to match other labels
2025-11-25 11:06:54 +01:00
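The truncation amounts to keeping the final eight characters of the serial string. A sketch, assuming ASCII serials (which SMART serial numbers are), so byte indexing is safe:

```rust
// short_serial("WD-WCC7K1234567") -> "K1234567"
fn short_serial(serial: &str) -> &str {
    &serial[serial.len().saturating_sub(8)..]
}
```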
c9d12793ef Replace device names with serial numbers in MergerFS pool display
All checks were successful
Build and Release / build-and-release (push) Successful in 1m19s
Updates disk collector and dashboard to show drive serial numbers
instead of device names (sdX) for MergerFS data/parity drives.
Agent extracts serial numbers from SMART data and dashboard
displays them when available, falling back to device names.
2025-11-25 10:30:37 +01:00
8f80015273 Fix dashboard storage pool label styling
All checks were successful
Build and Release / build-and-release (push) Successful in 1m20s
Replace non-existent Typography::primary() with Typography::secondary() for
MergerFS pool labels following existing UI patterns.
2025-11-25 10:16:26 +01:00
7a95a9d762 Add MergerFS pool display to dashboard matching CLAUDE.md format
All checks were successful
Build and Release / build-and-release (push) Successful in 2m32s
Updated the dashboard system widget to properly display MergerFS storage
pools in the exact format described in CLAUDE.md:

- Pool header showing "mergerfs (2+1):" format
- Total usage line: "├─ Total: ● 63% 2355.2GB/3686.4GB"
- Data Disks section with tree structure
- Individual drive entries: "│  ├─ ● sdb T: 24°C W: 5%"
- Parity drives section: "├─ Parity: ● sdc T: 24°C W: 5%"
- Mount point footer: "└─ Mount: /srv/media"

The dashboard now processes both data_drives and parity_drives arrays from
the agent data correctly and renders the complete MergerFS pool hierarchy
with proper status indicators, temperatures, and wear levels.

Storage display now matches the enhanced tree structure format specified
in documentation with correct Unicode tree characters and spacing.
2025-11-25 09:12:13 +01:00
7b11db990c Restore complete MergerFS and SnapRAID functionality to disk collector
All checks were successful
Build and Release / build-and-release (push) Successful in 1m17s
Updated the disk collector to include all missing functionality from the
previous string-based implementation while working with the new structured
JSON data architecture:

- MergerFS pool discovery from /proc/mounts parsing
- SnapRAID parity drive detection via mount path heuristics
- Drive categorization (data vs parity) based on path analysis
- Numeric mergerfs reference resolution (1:2 -> /mnt/disk paths)
- Pool health calculation based on member drive SMART status
- Complete SMART data integration for temperatures and wear levels
- Proper exclusion of pool member drives from physical drive grouping

The implementation replicates the exact logic from the old code while
adapting to structured AgentData output format. All mergerfs and snapraid
monitoring capabilities are fully restored.
2025-11-25 08:37:32 +01:00
67b59e9551 Simplify backup timestamp display with raw TOML format and remove spacing
All checks were successful
Build and Release / build-and-release (push) Successful in 1m41s
Replace timestamp parsing with direct display of start_time from backup TOML file to ensure timestamp always appears regardless of format. Remove empty line spacing above backup section for compact layout.

Changes:
- Remove parsed timestamp fields and use raw start_time string from TOML
- Display backup time directly from TOML file without parsing
- Remove blank line above backup section for tighter layout
- Simplify BackupData structure by removing last_run and next_scheduled fields

Version bump to v0.1.150
2025-11-25 00:08:36 +01:00
da37e28b6a Integrate backup metrics into system widget with enhanced disk monitoring
All checks were successful
Build and Release / build-and-release (push) Successful in 2m5s
Replace standalone backup widget with compact backup section in system widget displaying disk serial, temperature, wear level, timing, and usage information.

Changes:
- Remove standalone backup widget and integrate into system widget
- Update backup collector to read TOML format from backup script
- Add BackupDiskData structure with serial, usage, temperature, wear fields
- Implement compact backup display matching specification format
- Add time formatting utilities for backup timing display
- Update backup data extraction from TOML with disk space parsing

Version bump to v0.1.149
2025-11-24 23:55:35 +01:00
d89b3ac881 Fix nginx sub-services persistent caching with complete service data storage
All checks were successful
Build and Release / build-and-release (push) Successful in 1m17s
Resolves nginx sites appearing only briefly during collection cycles by implementing proper caching of complete service data including sub-services.

Changes:
- Add cached_service_data field to store complete ServiceData with sub-services
- Modify collection logic to cache full service objects instead of basic ServiceInfo
- Update cache retrieval to use complete cached data preserving nginx site metrics
- Eliminate flickering of nginx sites between collection cycles

Version bump to v0.1.148
2025-11-24 23:24:00 +01:00
7f26991609 Fix nginx sub-services flickering with persistent caching
All checks were successful
Build and Release / build-and-release (push) Successful in 1m19s
- Remove nginx_ prefix from site names in hierarchical structure
- Fix get_nginx_site_metrics to call correct internal method
- Implement same caching functionality as old working version
- Sites now stay visible continuously with 30s latency updates
- Preserve cached results between refresh cycles
2025-11-24 23:01:51 +01:00
75ec190b93 Fix service status icon mismatch with single source of truth architecture
All checks were successful
Build and Release / build-and-release (push) Successful in 1m8s
- Remove duplicate status string fields from ServiceData and SubServiceData
- Use only Status enum as single source of truth for service status
- Agent calculates Status enum using calculate_service_status()
- Dashboard converts Status enum to display text for UI
- Implement flexible metrics system for sub-services with label/value/unit
- Fix status icon/text mismatches (inactive services now show gray circles)
- Ensure perfect alignment between service icons and status text
2025-11-24 22:43:22 +01:00
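A sketch of the enum-to-display conversion this commit describes; the variant names match the shared Status enum visible elsewhere in this diff, while the icon/text pairs are inferred from the example displays above:

```rust
use cm_dashboard_shared::Status;

// Inferred rendering: filled dot for active, hollow gray circle for inactive.
fn status_display(status: &Status) -> (&'static str, &'static str) {
    match status {
        Status::Ok => ("●", "active"),
        Status::Inactive => ("○", "inactive"),
        Status::Warning => ("●", "warning"),
        Status::Critical => ("●", "failed"),
        _ => ("?", "unknown"),
    }
}
```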
eb892096d9 Complete systemd collector restoration matching original architecture
All checks were successful
Build and Release / build-and-release (push) Successful in 2m8s
- Add nginx site metrics caching with configurable intervals matching original
- Implement complex nginx config parsing with brace counting and redirect detection
- Replace curl with reqwest HTTP client for proper timeout and redirect handling
- Fix docker container parsing to use comma format with proper status mapping
- Add sudo to directory size command for permission handling
- Change nginx URLs to use https protocol matching original
- Add advanced NixOS ExecStart parsing for argv[] format support
- Add nginx -T fallback functionality for config discovery
- Implement proper server block parsing with domain validation and brace tracking
- Add get_service_memory function matching original signature

All functionality now matches pre-refactor implementation architecture.
2025-11-24 22:02:15 +01:00
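A sketch of the reqwest-based latency probe, assuming a tokio runtime; the real collector's URL construction, redirect limits, and error handling are not shown in this compare:

```rust
use std::time::{Duration, Instant};

// Hypothetical per-site check; returns latency in milliseconds.
async fn site_latency_ms(domain: &str) -> Option<u64> {
    let client = reqwest::Client::builder()
        .timeout(Duration::from_secs(5)) // assumed timeout
        .redirect(reqwest::redirect::Policy::limited(5))
        .build()
        .ok()?;
    let start = Instant::now();
    // The commit switched these probes to https URLs.
    client.get(format!("https://{}", domain)).send().await.ok()?;
    Some(start.elapsed().as_millis() as u64)
}
```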
c006625a3f Restore complete systemd collector functionality
All checks were successful
Build and Release / build-and-release (push) Successful in 2m7s
- Enhanced directory size logic with minimum 0.001GB visibility and permission error logging
- Added nginx site monitoring with latency checks and NixOS config discovery
- Added docker container monitoring as sub-services
- Integrated sub-service collection for active nginx and docker services
- All missing features from original implementation now restored
2025-11-24 21:51:42 +01:00
dcd5fff8c1 Update version to v0.1.143
All checks were successful
Build and Release / build-and-release (push) Successful in 1m16s
2025-11-24 21:43:01 +01:00
9357e5f2a8 Properly restore systemd collector with original architecture
Some checks failed
Build and Release / build-and-release (push) Failing after 1m16s
- Restore service discovery caching with configurable intervals
- Add excluded services filtering logic
- Implement complete wildcard pattern matching (*prefix, suffix*, glob)
- Add ServiceStatusInfo caching from systemctl commands
- Restore cached service status retrieval to avoid repeated systemctl calls
- Add proper systemctl command error handling

All functionality now matches pre-refactor implementation.
2025-11-24 21:36:15 +01:00
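A minimal matcher covering the pattern shapes named above (`*suffix`, `prefix*`, and a single inner `*`); the repository's actual implementation may differ:

```rust
fn matches_pattern(pattern: &str, name: &str) -> bool {
    if let Some(suffix) = pattern.strip_prefix('*') {
        name.ends_with(suffix)
    } else if let Some(prefix) = pattern.strip_suffix('*') {
        name.starts_with(prefix)
    } else if let Some((pre, post)) = pattern.split_once('*') {
        // Simple one-star glob: "foo*bar"
        name.len() >= pre.len() + post.len()
            && name.starts_with(pre)
            && name.ends_with(post)
    } else {
        pattern == name
    }
}

// matches_pattern("gitea*", "gitea-runner") == true
// matches_pattern("*timer", "backup.timer")  == true
```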
d164c1da5f Add missing service_status field to ServiceData
All checks were successful
Build and Release / build-and-release (push) Successful in 1m19s
2025-11-24 21:20:09 +01:00
b120f95f8a Restore service discovery and disk usage calculation
Some checks failed
Build and Release / build-and-release (push) Failing after 1m2s
Fixes missing services and 0B disk usage issues by restoring:
- Wildcard pattern matching for service filters (gitea*, redis*)
- Service disk usage calculation from directories and WorkingDirectory
- Proper Status::Inactive for inactive services

Services now properly discovered and show actual disk usage.
2025-11-24 20:25:08 +01:00
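A sketch of directory-based usage measurement with GNU `du`; reading WorkingDirectory from the unit and the sudo handling mentioned in later commits are omitted here:

```rust
use std::process::Command;

// Hypothetical helper: size of a service's data directory in GB.
fn dir_size_gb(path: &str) -> Option<f32> {
    // -s: summarize, -b: bytes (GNU du)
    let output = Command::new("du").args(["-sb", path]).output().ok()?;
    if !output.status.success() {
        return None; // e.g., permission denied
    }
    let stdout = String::from_utf8_lossy(&output.stdout);
    let bytes: u64 = stdout.split_whitespace().next()?.parse().ok()?;
    Some(bytes as f32 / (1024.0 * 1024.0 * 1024.0))
}
```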
29 changed files with 2556 additions and 2057 deletions


@@ -113,13 +113,13 @@ jobs:
NIX_HASH="sha256-$(python3 -c "import base64, binascii; print(base64.b64encode(binascii.unhexlify('$NEW_HASH')).decode())")"
# Update the NixOS configuration
sed -i "s|version = \"v[^\"]*\"|version = \"$VERSION\"|" hosts/services/cm-dashboard.nix
sed -i "s|sha256 = \"sha256-[^\"]*\"|sha256 = \"$NIX_HASH\"|" hosts/services/cm-dashboard.nix
sed -i "s|version = \"v[^\"]*\"|version = \"$VERSION\"|" services/cm-dashboard.nix
sed -i "s|sha256 = \"sha256-[^\"]*\"|sha256 = \"$NIX_HASH\"|" services/cm-dashboard.nix
# Commit and push changes
git config user.name "Gitea Actions"
git config user.email "actions@gitea.cmtec.se"
git add hosts/services/cm-dashboard.nix
git add services/cm-dashboard.nix
git commit -m "Auto-update cm-dashboard to $VERSION
- Update version to $VERSION with automated release

CLAUDE.md

@@ -304,27 +304,33 @@ exclude_fs_types = ["tmpfs", "devtmpfs", "sysfs", "proc"]
### Display Format
```
Network:
● eno1:
├─ ip: 192.168.30.105
└─ tailscale0: 100.125.108.16
● eno2:
└─ ip: 192.168.32.105
CPU:
● Load: 0.23 0.21 0.13
└─ Freq: 1048 MHz
RAM:
● Usage: 25% 5.8GB/23.3GB
├─ ● /tmp: 2% 0.5GB/2GB
└─ ● /var/tmp: 0% 0GB/1.0GB
Storage:
mergerfs (2+1):
├─ Total: ● 63% 2355.2GB/3686.4GB
├─ Data Disks:
│ ├─ ● sdb T: 24°C W: 5%
│ └─ ● sdd T: 27°C W: 5%
├─ Parity: ● sdc T: 24°C W: 5%
└─ Mount: /srv/media
-● nvme0n1 T: 25C W: 4%
+844B9A25 T: 25C W: 4%
├─ ● /: 55% 250.5GB/456.4GB
└─ ● /boot: 26% 0.3GB/1.0GB
● mergerfs /srv/media:
├─ ● 63% 2355.2GB/3686.4GB
├─ ● Data_1: WDZQ8H8D T: 28°C
├─ ● Data_2: GGA04461 T: 28°C
└─ ● Parity: WDZS8RY0 T: 29°C
Backup:
● WD-WCC7K1234567 T: 32°C W: 12%
├─ Last: 2h ago (12.3GB)
├─ Next: in 22h
└─ ● Usage: 45% 678GB/1.5TB
```
## Important Communication Guidelines
@@ -355,91 +361,6 @@ Keep responses concise and focused. Avoid extensive implementation summaries unl
- ✅ "Restructure storage widget with improved layout"
- ✅ "Update CPU thresholds to production values"
## Completed Architecture Migration (v0.1.131)
## Complete Fix Plan (v0.1.140)
**🎯 Goal: Fix ALL Issues - Display AND Core Functionality**
### Current Broken State (v0.1.139)
**❌ What's Broken:**
```
✅ Data Collection: Agent collects structured data correctly
❌ Storage Display: Shows wrong mount points, missing temperature/wear
❌ Status Evaluation: Everything shows "OK" regardless of actual values
❌ Notifications: Not working - can't send alerts when systems fail
❌ Thresholds: Not being evaluated (CPU load, memory usage, disk temperature)
```
**Root Cause:**
During atomic migration, I removed core monitoring functionality and only fixed data collection, making the dashboard useless as a monitoring tool.
### Complete Fix Plan - Do Everything Right
#### Phase 1: Fix Storage Display (CURRENT)
- ✅ Use `lsblk` instead of `findmnt` (eliminates `/nix/store` bind mount issue)
- ✅ Add `sudo smartctl` for permissions
- ✅ Fix NVMe SMART parsing (`Temperature:` and `Percentage Used:`)
- 🔄 Test that dashboard shows: `● nvme0n1 T: 28°C W: 1%` correctly
#### Phase 2: Restore Status Evaluation System
- **CPU Status**: Evaluate load averages against thresholds → Status::Warning/Critical
- **Memory Status**: Evaluate usage_percent against thresholds → Status::Warning/Critical
- **Storage Status**: Evaluate temperature & usage against thresholds → Status::Warning/Critical
- **Service Status**: Evaluate service states → Status::Warning if inactive
- **Overall Host Status**: Aggregate component statuses → host-level status
#### Phase 3: Restore Notification System
- **Status Change Detection**: Track when component status changes from OK→Warning/Critical
- **Email Notifications**: Send alerts when status degrades
- **Notification Rate Limiting**: Prevent spam (existing logic)
- **Maintenance Mode**: Honor `/tmp/cm-maintenance` to suppress alerts
- **Batched Notifications**: Group multiple alerts into single email
#### Phase 4: Integration & Testing
- **AgentData Status Fields**: Add status fields to structured data
- **Dashboard Status Display**: Show colored indicators based on actual status
- **End-to-End Testing**: Verify alerts fire when thresholds exceeded
- **Verify All Thresholds**: CPU load, memory usage, disk temperature, service states
### Target Architecture (CORRECT)
**Complete Flow:**
```
Collectors → AgentData → StatusEvaluator → Notifications
↘ ↗
ZMQ → Dashboard → Status Display
```
**Key Components:**
1. **Collectors**: Populate AgentData with raw metrics
2. **StatusEvaluator**: Apply thresholds to AgentData → Status enum values
3. **Notifications**: Send emails on status changes (OK→Warning/Critical)
4. **Dashboard**: Display data with correct status colors/indicators
### Implementation Rules
**MUST COMPLETE ALL:**
- Fix storage display to show correct mount points and temperature
- Restore working status evaluation (thresholds → Status enum)
- Restore working notifications (email alerts on status changes)
- Test that monitoring actually works (alerts fire when appropriate)
**NO SHORTCUTS:**
- Don't commit partial fixes
- Don't claim functionality works when it doesn't
- Test every component thoroughly
- Keep existing configuration and thresholds working
**Success Criteria:**
- Dashboard shows `● nvme0n1 T: 28°C W: 1%` format
- High CPU load triggers Warning status and email alert
- High memory usage triggers Warning status and email alert
- High disk temperature triggers Warning status and email alert
- Failed services trigger Warning status and email alert
- Maintenance mode suppresses notifications as expected
## Implementation Rules
1. **Agent Status Authority**: Agent calculates status for each metric using thresholds

Cargo.lock (generated)

@@ -279,7 +279,7 @@ checksum = "a1d728cc89cf3aee9ff92b05e62b19ee65a02b5702cff7d5a377e32c6ae29d8d"
[[package]]
name = "cm-dashboard"
version = "0.1.140"
version = "0.1.180"
dependencies = [
"anyhow",
"chrono",
@@ -301,7 +301,7 @@ dependencies = [
[[package]]
name = "cm-dashboard-agent"
version = "0.1.140"
version = "0.1.180"
dependencies = [
"anyhow",
"async-trait",
@@ -324,7 +324,7 @@ dependencies = [
[[package]]
name = "cm-dashboard-shared"
version = "0.1.140"
version = "0.1.180"
dependencies = [
"chrono",
"serde",


@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard-agent"
version = "0.1.141"
version = "0.1.181"
edition = "2021"
[dependencies]


@@ -12,11 +12,11 @@ use crate::collectors::{
cpu::CpuCollector,
disk::DiskCollector,
memory::MemoryCollector,
network::NetworkCollector,
nixos::NixOSCollector,
systemd::SystemdCollector,
};
use crate::notifications::NotificationManager;
-use crate::service_tracker::UserStoppedServiceTracker;
use cm_dashboard_shared::AgentData;
pub struct Agent {
@@ -25,7 +25,6 @@ pub struct Agent {
zmq_handler: ZmqHandler,
collectors: Vec<Box<dyn Collector>>,
notification_manager: NotificationManager,
-service_tracker: UserStoppedServiceTracker,
previous_status: Option<SystemStatus>,
}
@@ -79,7 +78,11 @@ impl Agent {
if config.collectors.backup.enabled {
collectors.push(Box::new(BackupCollector::new()));
}
if config.collectors.network.enabled {
collectors.push(Box::new(NetworkCollector::new(config.collectors.network.clone())));
}
if config.collectors.nixos.enabled {
collectors.push(Box::new(NixOSCollector::new(config.collectors.nixos.clone())));
}
@@ -90,17 +93,12 @@ impl Agent {
let notification_manager = NotificationManager::new(&config.notifications, &hostname)?;
info!("Notification manager initialized");
-// Initialize service tracker
-let service_tracker = UserStoppedServiceTracker::new();
-info!("Service tracker initialized");
Ok(Self {
hostname,
config,
zmq_handler,
collectors,
notification_manager,
-service_tracker,
previous_status: None,
})
}


@@ -1,13 +1,14 @@
use async_trait::async_trait;
use cm_dashboard_shared::{AgentData, BackupData};
use cm_dashboard_shared::{AgentData, BackupData, BackupDiskData};
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::fs;
use std::path::Path;
use tracing::debug;
use super::{Collector, CollectorError};
-/// Backup collector that reads backup status from JSON files with structured data output
+/// Backup collector that reads backup status from TOML files with structured data output
pub struct BackupCollector {
/// Path to backup status file
status_file_path: String,
@@ -16,12 +17,12 @@ pub struct BackupCollector {
impl BackupCollector {
pub fn new() -> Self {
Self {
status_file_path: "/var/lib/backup/status.json".to_string(),
status_file_path: "/var/lib/backup/backup-status.toml".to_string(),
}
}
-/// Read backup status from JSON file
-async fn read_backup_status(&self) -> Result<Option<BackupStatus>, CollectorError> {
+/// Read backup status from TOML file
+async fn read_backup_status(&self) -> Result<Option<BackupStatusToml>, CollectorError> {
if !Path::new(&self.status_file_path).exists() {
debug!("Backup status file not found: {}", self.status_file_path);
return Ok(None);
@@ -33,24 +34,57 @@ impl BackupCollector {
error: e.to_string(),
})?;
-let status: BackupStatus = serde_json::from_str(&content)
+let status: BackupStatusToml = toml::from_str(&content)
.map_err(|e| CollectorError::Parse {
value: content.clone(),
-error: format!("Failed to parse backup status JSON: {}", e),
+error: format!("Failed to parse backup status TOML: {}", e),
})?;
Ok(Some(status))
}
-/// Convert BackupStatus to BackupData and populate AgentData
+/// Convert BackupStatusToml to BackupData and populate AgentData
async fn populate_backup_data(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
if let Some(backup_status) = self.read_backup_status().await? {
// Use raw start_time string from TOML
// Extract disk information
let repository_disk = if let Some(disk_space) = &backup_status.disk_space {
Some(BackupDiskData {
serial: backup_status.disk_serial_number.clone().unwrap_or_else(|| "Unknown".to_string()),
usage_percent: disk_space.usage_percent as f32,
used_gb: disk_space.used_gb as f32,
total_gb: disk_space.total_gb as f32,
wear_percent: backup_status.disk_wear_percent,
temperature_celsius: None, // Not available in current TOML
})
} else if let Some(serial) = &backup_status.disk_serial_number {
// Fallback: create minimal disk info if we have serial but no disk_space
Some(BackupDiskData {
serial: serial.clone(),
usage_percent: 0.0,
used_gb: 0.0,
total_gb: 0.0,
wear_percent: backup_status.disk_wear_percent,
temperature_celsius: None,
})
} else {
None
};
// Calculate total repository size from services
let total_size_gb = backup_status.services
.values()
.map(|service| service.repo_size_bytes as f32 / (1024.0 * 1024.0 * 1024.0))
.sum::<f32>();
let backup_data = BackupData {
status: backup_status.status,
-last_run: Some(backup_status.last_run),
-next_scheduled: Some(backup_status.next_scheduled),
-total_size_gb: Some(backup_status.total_size_gb),
-repository_health: Some(backup_status.repository_health),
+total_size_gb: Some(total_size_gb),
+repository_health: Some("ok".to_string()), // Derive from status if needed
+repository_disk,
+last_backup_size_gb: None, // Not available in current TOML format
+start_time_raw: Some(backup_status.start_time),
};
agent_data.backup = backup_data;
@@ -58,10 +92,11 @@ impl BackupCollector {
// No backup status available - set default values
agent_data.backup = BackupData {
status: "unavailable".to_string(),
-last_run: None,
-next_scheduled: None,
total_size_gb: None,
repository_health: None,
+repository_disk: None,
+last_backup_size_gb: None,
+start_time_raw: None,
};
}
@@ -77,12 +112,38 @@ impl Collector for BackupCollector {
}
}
-/// Backup status structure from JSON file
+/// TOML structure for backup status file
#[derive(Debug, Clone, Serialize, Deserialize)]
-struct BackupStatus {
-pub status: String, // "completed", "running", "failed", etc.
-pub last_run: u64, // Unix timestamp
-pub next_scheduled: u64, // Unix timestamp
-pub total_size_gb: f32, // Total backup size in GB
-pub repository_health: String, // "ok", "warning", "error"
+struct BackupStatusToml {
+pub backup_name: String,
+pub start_time: String,
+pub current_time: String,
+pub duration_seconds: i64,
+pub status: String,
+pub last_updated: String,
+pub disk_space: Option<DiskSpace>,
+pub disk_product_name: Option<String>,
+pub disk_serial_number: Option<String>,
+pub disk_wear_percent: Option<f32>,
+pub services: HashMap<String, ServiceStatus>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
struct DiskSpace {
pub total_bytes: u64,
pub used_bytes: u64,
pub available_bytes: u64,
pub total_gb: f64,
pub used_gb: f64,
pub available_gb: f64,
pub usage_percent: f64,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
struct ServiceStatus {
pub status: String,
pub exit_code: i64,
pub repo_path: String,
pub archive_count: i64,
pub repo_size_bytes: u64,
}


@@ -19,10 +19,8 @@ pub struct DiskCollector {
/// A physical drive with its filesystems
#[derive(Debug, Clone)]
struct PhysicalDrive {
-name: String, // e.g., "nvme0n1", "sda"
+name: String,  // e.g., "nvme0n1", "sda"
health: String, // SMART health status
-temperature_celsius: Option<f32>, // Drive temperature
-wear_percent: Option<f32>, // SSD wear level
filesystems: Vec<Filesystem>, // mounted filesystems on this drive
}
@@ -50,6 +48,7 @@ struct MergerfsPool {
#[derive(Debug, Clone)]
struct PoolDrive {
name: String, // Drive name
mount_point: String, // e.g., "/mnt/disk1"
temperature_celsius: Option<f32>, // Drive temperature
}
@@ -75,11 +74,17 @@ impl DiskCollector {
let mount_devices = self.get_mount_devices().await?;
// Step 2: Get filesystem usage for each mount point using df
-let filesystem_usage = self.get_filesystem_usage(&mount_devices).map_err(|e| CollectorError::Parse {
+let mut filesystem_usage = self.get_filesystem_usage(&mount_devices).map_err(|e| CollectorError::Parse {
value: "filesystem usage".to_string(),
error: format!("Failed to get filesystem usage: {}", e),
})?;
// Step 2.5: Add MergerFS mount points that weren't in lsblk output
self.add_mergerfs_filesystem_usage(&mut filesystem_usage).map_err(|e| CollectorError::Parse {
value: "mergerfs filesystem usage".to_string(),
error: format!("Failed to get mergerfs filesystem usage: {}", e),
})?;
// Step 3: Detect MergerFS pools
let mergerfs_pools = self.detect_mergerfs_pools(&filesystem_usage).map_err(|e| CollectorError::Parse {
value: "mergerfs pools".to_string(),
@@ -155,6 +160,30 @@ impl DiskCollector {
Ok(filesystem_usage)
}
/// Add filesystem usage for MergerFS mount points that aren't in lsblk
fn add_mergerfs_filesystem_usage(&self, filesystem_usage: &mut HashMap<String, (u64, u64)>) -> anyhow::Result<()> {
let mounts_content = std::fs::read_to_string("/proc/mounts")
.map_err(|e| anyhow::anyhow!("Failed to read /proc/mounts: {}", e))?;
for line in mounts_content.lines() {
let parts: Vec<&str> = line.split_whitespace().collect();
if parts.len() >= 3 && parts[2] == "fuse.mergerfs" {
let mount_point = parts[1].to_string();
// Only add if we don't already have usage data for this mount point
if !filesystem_usage.contains_key(&mount_point) {
if let Ok((total, used)) = self.get_filesystem_info(&mount_point) {
debug!("Added MergerFS filesystem usage for {}: {}GB total, {}GB used",
mount_point, total as f32 / (1024.0 * 1024.0 * 1024.0), used as f32 / (1024.0 * 1024.0 * 1024.0));
filesystem_usage.insert(mount_point, (total, used));
}
}
}
}
Ok(())
}
/// Get filesystem info for a single mount point
fn get_filesystem_info(&self, mount_point: &str) -> Result<(u64, u64), CollectorError> {
let output = Command::new("df")
@@ -198,16 +227,80 @@ impl DiskCollector {
}
/// Detect MergerFS pools from mount data
-fn detect_mergerfs_pools(&self, _filesystem_usage: &HashMap<String, (u64, u64)>) -> anyhow::Result<Vec<MergerfsPool>> {
-let pools = Vec::new();
+fn detect_mergerfs_pools(&self, filesystem_usage: &HashMap<String, (u64, u64)>) -> anyhow::Result<Vec<MergerfsPool>> {
+let mounts_content = std::fs::read_to_string("/proc/mounts")
+.map_err(|e| anyhow::anyhow!("Failed to read /proc/mounts: {}", e))?;
+let mut pools = Vec::new();
-// For now, return empty pools - full mergerfs detection would require parsing /proc/mounts for fuse.mergerfs
-// This ensures we don't break existing functionality
for line in mounts_content.lines() {
let parts: Vec<&str> = line.split_whitespace().collect();
if parts.len() >= 3 && parts[2] == "fuse.mergerfs" {
let mount_point = parts[1].to_string();
let device_sources = parts[0]; // e.g., "/mnt/disk1:/mnt/disk2"
// Get pool usage
let (total_bytes, used_bytes) = filesystem_usage.get(&mount_point)
.copied()
.unwrap_or((0, 0));
// Extract pool name from mount point (e.g., "/srv/media" -> "srv_media")
let pool_name = if mount_point == "/" {
"root".to_string()
} else {
mount_point.trim_start_matches('/').replace('/', "_")
};
if pool_name.is_empty() {
debug!("Skipping mergerfs pool with empty name: {}", mount_point);
continue;
}
// Parse member paths - handle both full paths and numeric references
let raw_paths: Vec<String> = device_sources
.split(':')
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty())
.collect();
// Convert numeric references to actual mount points if needed
let member_paths = if raw_paths.iter().any(|path| !path.starts_with('/')) {
// Handle numeric format like "1:2" by finding corresponding /mnt/disk* paths
self.resolve_numeric_mergerfs_paths(&raw_paths)?
} else {
// Already full paths
raw_paths
};
// For SnapRAID setups, include parity drives that are related to this pool's data drives
let mut all_member_paths = member_paths.clone();
let related_parity_paths = self.discover_related_parity_drives(&member_paths)?;
all_member_paths.extend(related_parity_paths);
// Categorize as data vs parity drives
let (data_drives, parity_drives) = match self.categorize_pool_drives(&all_member_paths) {
Ok(drives) => drives,
Err(e) => {
debug!("Failed to categorize drives for pool {}: {}. Skipping.", mount_point, e);
continue;
}
};
pools.push(MergerfsPool {
name: pool_name,
mount_point,
total_bytes,
used_bytes,
data_drives,
parity_drives,
});
}
}
debug!("Found {} mergerfs pools", pools.len());
Ok(pools)
}
-/// Group filesystems by physical drive (excluding mergerfs members)
+/// Group filesystems by physical drive (excluding mergerfs members) - exact old logic
fn group_by_physical_drive(
&self,
mount_devices: &HashMap<String, String>,
@@ -216,14 +309,14 @@ impl DiskCollector {
) -> anyhow::Result<Vec<PhysicalDrive>> {
let mut drive_groups: HashMap<String, Vec<Filesystem>> = HashMap::new();
-// Get all mergerfs member paths to exclude them
+// Get all mergerfs member paths to exclude them - exactly like old code
let mut mergerfs_members = std::collections::HashSet::new();
for pool in mergerfs_pools {
for drive in &pool.data_drives {
-mergerfs_members.insert(drive.name.clone());
+mergerfs_members.insert(drive.mount_point.clone());
}
for drive in &pool.parity_drives {
-mergerfs_members.insert(drive.name.clone());
+mergerfs_members.insert(drive.mount_point.clone());
}
}
@@ -256,8 +349,6 @@ impl DiskCollector {
let physical_drive = PhysicalDrive {
name: drive_name,
health: "UNKNOWN".to_string(), // Will be updated with SMART data
-temperature_celsius: None,
-wear_percent: None,
filesystems,
};
physical_drives.push(physical_drive);
@@ -294,8 +385,9 @@ impl DiskCollector {
/// Get SMART data for drives
async fn get_smart_data_for_drives(&self, physical_drives: &[PhysicalDrive], mergerfs_pools: &[MergerfsPool]) -> HashMap<String, SmartData> {
use tracing::info;
let mut smart_data = HashMap::new();
// Collect all drive names
let mut all_drives = std::collections::HashSet::new();
for drive in physical_drives {
@@ -310,20 +402,33 @@ impl DiskCollector {
}
}
info!("Collecting SMART data for {} drives", all_drives.len());
// Get SMART data for each drive
-for drive_name in all_drives {
-if let Ok(data) = self.get_smart_data(&drive_name).await {
-smart_data.insert(drive_name, data);
+for drive_name in &all_drives {
+match self.get_smart_data(drive_name).await {
+Ok(data) => {
+info!("SMART data collected for {}: serial={:?}, temp={:?}, health={}",
+drive_name, data.serial_number, data.temperature_celsius, data.health);
+smart_data.insert(drive_name.clone(), data);
+}
+Err(e) => {
+info!("Failed to get SMART data for {}: {:?}", drive_name, e);
+}
}
}
info!("SMART data collection complete: {}/{} drives successful", smart_data.len(), all_drives.len());
smart_data
}
/// Get SMART data for a single drive
async fn get_smart_data(&self, drive_name: &str) -> Result<SmartData, CollectorError> {
let output = Command::new("sudo")
.args(&["smartctl", "-a", &format!("/dev/{}", drive_name)])
use tracing::info;
// Use direct smartctl (no sudo) - service has CAP_SYS_RAWIO capability
let output = Command::new("smartctl")
.args(&["-a", &format!("/dev/{}", drive_name)])
.output()
.map_err(|e| CollectorError::SystemRead {
path: format!("SMART data for {}", drive_name),
@@ -332,22 +437,24 @@ impl DiskCollector {
let output_str = String::from_utf8_lossy(&output.stdout);
let error_str = String::from_utf8_lossy(&output.stderr);
// Debug logging for SMART command results
debug!("SMART output for {}: status={}, stdout_len={}, stderr={}",
debug!("SMART output for {}: status={}, stdout_len={}, stderr={}",
drive_name, output.status, output_str.len(), error_str);
if !output.status.success() {
debug!("SMART command failed for {}: {}", drive_name, error_str);
info!("SMART command failed for {}, status={}, stderr={}", drive_name, output.status, error_str);
// Return unknown data rather than failing completely
return Ok(SmartData {
health: "UNKNOWN".to_string(),
serial_number: None,
temperature_celsius: None,
wear_percent: None,
});
}
let mut health = "UNKNOWN".to_string();
let mut serial_number = None;
let mut temperature = None;
let mut wear_percent = None;
@@ -360,8 +467,21 @@ impl DiskCollector {
}
}
// Serial number parsing (both SATA and NVMe)
if line.contains("Serial Number:") {
if let Some(serial_part) = line.split("Serial Number:").nth(1) {
let serial_str = serial_part.trim();
if !serial_str.is_empty() {
// Take first whitespace-separated token
if let Some(serial) = serial_str.split_whitespace().next() {
serial_number = Some(serial.to_string());
}
}
}
}
// Temperature parsing for different drive types
if line.contains("Temperature_Celsius") || line.contains("Airflow_Temperature_Cel") {
if line.contains("Temperature_Celsius") || line.contains("Airflow_Temperature_Cel") || line.contains("Temperature_Case") {
// Traditional SATA drives: attribute table format
if let Some(temp_str) = line.split_whitespace().nth(9) {
if let Ok(temp) = temp_str.parse::<f32>() {
@@ -379,7 +499,15 @@ impl DiskCollector {
}
// Wear level parsing for SSDs
if line.contains("Wear_Leveling_Count") || line.contains("SSD_Life_Left") {
if line.contains("Media_Wearout_Indicator") {
// Media_Wearout_Indicator stores remaining life % in column 3 (VALUE)
if let Some(wear_str) = line.split_whitespace().nth(3) {
if let Ok(remaining) = wear_str.parse::<f32>() {
wear_percent = Some(100.0 - remaining); // Convert remaining life to wear
}
}
} else if line.contains("Wear_Leveling_Count") || line.contains("SSD_Life_Left") {
// Other wear attributes store value in column 9 (RAW_VALUE)
if let Some(wear_str) = line.split_whitespace().nth(9) {
if let Ok(wear) = wear_str.parse::<f32>() {
wear_percent = Some(100.0 - wear); // Convert remaining life to wear
@@ -402,6 +530,7 @@ impl DiskCollector {
Ok(SmartData {
health,
serial_number,
temperature_celsius: temperature,
wear_percent,
})
@@ -427,6 +556,7 @@ impl DiskCollector {
agent_data.system.storage.drives.push(DriveData {
name: drive.name.clone(),
serial_number: smart.and_then(|s| s.serial_number.clone()),
health: smart.map(|s| s.health.clone()).unwrap_or_else(|| drive.health.clone()),
temperature_celsius: smart.and_then(|s| s.temperature_celsius),
wear_percent: smart.and_then(|s| s.wear_percent),
@@ -444,28 +574,25 @@ impl DiskCollector {
}
/// Populate pools data into AgentData
fn populate_pools_data(&self, mergerfs_pools: &[MergerfsPool], _smart_data: &HashMap<String, SmartData>, agent_data: &mut AgentData) -> Result<(), CollectorError> {
fn populate_pools_data(&self, mergerfs_pools: &[MergerfsPool], smart_data: &HashMap<String, SmartData>, agent_data: &mut AgentData) -> Result<(), CollectorError> {
for pool in mergerfs_pools {
// Calculate pool health and statuses based on member drive health
let (pool_health, health_status, usage_status, data_drive_data, parity_drive_data) = self.calculate_pool_health(pool, smart_data);
let pool_data = PoolData {
name: pool.name.clone(),
mount: pool.mount_point.clone(),
pool_type: "mergerfs".to_string(),
health: "healthy".to_string(), // TODO: Calculate based on member drives
usage_percent: (pool.used_bytes as f32 / pool.total_bytes as f32) * 100.0,
pool_type: format!("mergerfs ({}+{})", pool.data_drives.len(), pool.parity_drives.len()),
health: pool_health,
usage_percent: if pool.total_bytes > 0 {
(pool.used_bytes as f32 / pool.total_bytes as f32) * 100.0
} else { 0.0 },
used_gb: pool.used_bytes as f32 / (1024.0 * 1024.0 * 1024.0),
total_gb: pool.total_bytes as f32 / (1024.0 * 1024.0 * 1024.0),
-data_drives: pool.data_drives.iter().map(|d| cm_dashboard_shared::PoolDriveData {
-name: d.name.clone(),
-temperature_celsius: d.temperature_celsius,
-health: "unknown".to_string(),
-wear_percent: None,
-}).collect(),
-parity_drives: pool.parity_drives.iter().map(|d| cm_dashboard_shared::PoolDriveData {
-name: d.name.clone(),
-temperature_celsius: d.temperature_celsius,
-health: "unknown".to_string(),
-wear_percent: None,
-}).collect(),
+data_drives: data_drive_data,
+parity_drives: parity_drive_data,
health_status,
usage_status,
};
agent_data.system.storage.pools.push(pool_data);
@@ -474,6 +601,93 @@ impl DiskCollector {
Ok(())
}
/// Calculate pool health based on member drive status
fn calculate_pool_health(&self, pool: &MergerfsPool, smart_data: &HashMap<String, SmartData>) -> (String, cm_dashboard_shared::Status, cm_dashboard_shared::Status, Vec<cm_dashboard_shared::PoolDriveData>, Vec<cm_dashboard_shared::PoolDriveData>) {
let mut failed_data = 0;
let mut failed_parity = 0;
// Process data drives
let data_drive_data: Vec<cm_dashboard_shared::PoolDriveData> = pool.data_drives.iter().map(|d| {
let smart = smart_data.get(&d.name);
let health = smart.map(|s| s.health.clone()).unwrap_or_else(|| "UNKNOWN".to_string());
let temperature = smart.and_then(|s| s.temperature_celsius).or(d.temperature_celsius);
if health == "FAILED" {
failed_data += 1;
}
// Calculate drive statuses using config thresholds
let health_status = self.calculate_health_status(&health);
let temperature_status = temperature.map(|t| self.temperature_thresholds.evaluate(t)).unwrap_or(cm_dashboard_shared::Status::Unknown);
cm_dashboard_shared::PoolDriveData {
name: d.name.clone(),
serial_number: smart.and_then(|s| s.serial_number.clone()),
temperature_celsius: temperature,
health,
wear_percent: smart.and_then(|s| s.wear_percent),
health_status,
temperature_status,
}
}).collect();
// Process parity drives
let parity_drive_data: Vec<cm_dashboard_shared::PoolDriveData> = pool.parity_drives.iter().map(|d| {
let smart = smart_data.get(&d.name);
let health = smart.map(|s| s.health.clone()).unwrap_or_else(|| "UNKNOWN".to_string());
let temperature = smart.and_then(|s| s.temperature_celsius).or(d.temperature_celsius);
if health == "FAILED" {
failed_parity += 1;
}
// Calculate drive statuses using config thresholds
let health_status = self.calculate_health_status(&health);
let temperature_status = temperature.map(|t| self.temperature_thresholds.evaluate(t)).unwrap_or(cm_dashboard_shared::Status::Unknown);
cm_dashboard_shared::PoolDriveData {
name: d.name.clone(),
serial_number: smart.and_then(|s| s.serial_number.clone()),
temperature_celsius: temperature,
health,
wear_percent: smart.and_then(|s| s.wear_percent),
health_status,
temperature_status,
}
}).collect();
// Calculate overall pool health string and status
// SnapRAID logic: can tolerate up to N parity drive failures (where N = number of parity drives)
// If data drives fail AND we've lost parity protection, that's critical
let (pool_health, health_status) = if failed_data == 0 && failed_parity == 0 {
("healthy".to_string(), cm_dashboard_shared::Status::Ok)
} else if failed_data == 0 && failed_parity > 0 {
// Parity failed but no data loss - degraded (reduced protection)
("degraded".to_string(), cm_dashboard_shared::Status::Warning)
} else if failed_data == 1 && failed_parity == 0 {
// One data drive failed, parity intact - degraded (recoverable)
("degraded".to_string(), cm_dashboard_shared::Status::Warning)
} else {
// Multiple data drives failed OR data+parity failed = data loss risk
("critical".to_string(), cm_dashboard_shared::Status::Critical)
};
// Calculate pool usage status using config thresholds
let usage_percent = if pool.total_bytes > 0 {
(pool.used_bytes as f32 / pool.total_bytes as f32) * 100.0
} else { 0.0 };
let usage_status = if usage_percent >= self.config.usage_critical_percent {
cm_dashboard_shared::Status::Critical
} else if usage_percent >= self.config.usage_warning_percent {
cm_dashboard_shared::Status::Warning
} else {
cm_dashboard_shared::Status::Ok
};
(pool_health, health_status, usage_status, data_drive_data, parity_drive_data)
}
/// Calculate filesystem usage status
fn calculate_filesystem_usage_status(&self, usage_percent: f32) -> Status {
// Use standard filesystem warning/critical thresholds
@@ -499,6 +713,134 @@ impl DiskCollector {
_ => Status::Unknown,
}
}
/// Discover parity drives that are related to the given data drives
fn discover_related_parity_drives(&self, data_drives: &[String]) -> anyhow::Result<Vec<String>> {
let mount_devices = tokio::task::block_in_place(|| {
tokio::runtime::Handle::current().block_on(self.get_mount_devices())
}).map_err(|e| anyhow::anyhow!("Failed to get mount devices: {}", e))?;
let mut related_parity = Vec::new();
// Find parity drives that share the same parent directory as the data drives
for data_path in data_drives {
if let Some(parent_dir) = self.get_parent_directory(data_path) {
// Look for parity drives in the same parent directory
for (mount_point, _device) in &mount_devices {
if mount_point.contains("parity") && mount_point.starts_with(&parent_dir) {
if !related_parity.contains(mount_point) {
related_parity.push(mount_point.clone());
}
}
}
}
}
Ok(related_parity)
}
/// Get parent directory of a mount path (e.g., "/mnt/disk1" -> "/mnt")
fn get_parent_directory(&self, path: &str) -> Option<String> {
if let Some(last_slash) = path.rfind('/') {
if last_slash > 0 {
return Some(path[..last_slash].to_string());
}
}
None
}
/// Categorize pool member drives as data vs parity
fn categorize_pool_drives(&self, member_paths: &[String]) -> anyhow::Result<(Vec<PoolDrive>, Vec<PoolDrive>)> {
let mut data_drives = Vec::new();
let mut parity_drives = Vec::new();
for path in member_paths {
let drive_info = self.get_drive_info_for_path(path)?;
// Heuristic: if path contains "parity", it's parity
if path.to_lowercase().contains("parity") {
parity_drives.push(drive_info);
} else {
data_drives.push(drive_info);
}
}
Ok((data_drives, parity_drives))
}
/// Get drive information for a mount path
fn get_drive_info_for_path(&self, path: &str) -> anyhow::Result<PoolDrive> {
// Use lsblk to find the backing device
let output = Command::new("lsblk")
.args(&["-rn", "-o", "NAME,MOUNTPOINT"])
.output()
.map_err(|e| anyhow::anyhow!("Failed to run lsblk: {}", e))?;
let output_str = String::from_utf8_lossy(&output.stdout);
let mut device = String::new();
for line in output_str.lines() {
let parts: Vec<&str> = line.split_whitespace().collect();
if parts.len() >= 2 && parts[1] == path {
device = parts[0].to_string();
break;
}
}
if device.is_empty() {
return Err(anyhow::anyhow!("Could not find device for path {}", path));
}
// Extract base device name (e.g., "sda1" -> "sda")
let base_device = self.extract_base_device(&format!("/dev/{}", device));
// Get temperature from SMART data if available
let temperature = if let Ok(smart_data) = tokio::task::block_in_place(|| {
tokio::runtime::Handle::current().block_on(self.get_smart_data(&base_device))
}) {
smart_data.temperature_celsius
} else {
None
};
Ok(PoolDrive {
name: base_device,
mount_point: path.to_string(),
temperature_celsius: temperature,
})
}
/// Resolve numeric mergerfs references like "1:2" to actual mount paths
fn resolve_numeric_mergerfs_paths(&self, numeric_refs: &[String]) -> anyhow::Result<Vec<String>> {
let mut resolved_paths = Vec::new();
// Get all mount points that look like /mnt/disk* or /mnt/parity*
let mount_devices = tokio::task::block_in_place(|| {
tokio::runtime::Handle::current().block_on(self.get_mount_devices())
}).map_err(|e| anyhow::anyhow!("Failed to get mount devices: {}", e))?;
let mut disk_mounts: Vec<String> = mount_devices.keys()
.filter(|path| path.starts_with("/mnt/disk") || path.starts_with("/mnt/parity"))
.cloned()
.collect();
disk_mounts.sort(); // Ensure consistent ordering
for num_ref in numeric_refs {
if let Ok(index) = num_ref.parse::<usize>() {
// Convert 1-based index to 0-based
if index > 0 && index <= disk_mounts.len() {
resolved_paths.push(disk_mounts[index - 1].clone());
}
}
}
// Fallback: if we couldn't resolve, return the original paths
if resolved_paths.is_empty() {
resolved_paths = numeric_refs.to_vec();
}
Ok(resolved_paths)
}
}
#[async_trait]
@@ -512,6 +854,7 @@ impl Collector for DiskCollector {
#[derive(Debug, Clone)]
struct SmartData {
health: String,
serial_number: Option<String>,
temperature_celsius: Option<f32>,
wear_percent: Option<f32>,
}


@@ -7,6 +7,7 @@ pub mod cpu;
pub mod disk;
pub mod error;
pub mod memory;
pub mod network;
pub mod nixos;
pub mod systemd;


@@ -0,0 +1,224 @@
use async_trait::async_trait;
use cm_dashboard_shared::{AgentData, NetworkInterfaceData, Status};
use std::process::Command;
use tracing::debug;
use super::{Collector, CollectorError};
use crate::config::NetworkConfig;
/// Network interface collector with physical/virtual classification and link status
pub struct NetworkCollector {
_config: NetworkConfig,
}
impl NetworkCollector {
pub fn new(config: NetworkConfig) -> Self {
Self { _config: config }
}
/// Check if interface is physical (not virtual)
fn is_physical_interface(name: &str) -> bool {
// Physical interface patterns
matches!(
&name[..],
s if s.starts_with("eth")
|| s.starts_with("ens")
|| s.starts_with("enp")
|| s.starts_with("wlan")
|| s.starts_with("wlp")
|| s.starts_with("eno")
|| s.starts_with("enx")
)
}
/// Get link status for an interface
fn get_link_status(interface: &str) -> Status {
let operstate_path = format!("/sys/class/net/{}/operstate", interface);
match std::fs::read_to_string(&operstate_path) {
Ok(state) => {
let state = state.trim();
match state {
"up" => Status::Ok,
"down" => Status::Inactive,
"unknown" => Status::Warning,
_ => Status::Unknown,
}
}
Err(_) => Status::Unknown,
}
}
/// Get the primary physical interface (the one with default route)
fn get_primary_physical_interface() -> Option<String> {
match Command::new("ip").args(["route", "show", "default"]).output() {
Ok(output) if output.status.success() => {
let output_str = String::from_utf8_lossy(&output.stdout);
// Parse: "default via 192.168.1.1 dev eno1 ..."
for line in output_str.lines() {
if line.starts_with("default") {
if let Some(dev_pos) = line.find(" dev ") {
let after_dev = &line[dev_pos + 5..];
if let Some(space_pos) = after_dev.find(' ') {
let interface = &after_dev[..space_pos];
// Only return if it's a physical interface
if Self::is_physical_interface(interface) {
return Some(interface.to_string());
}
} else {
// No space after interface name (end of line)
let interface = after_dev.trim();
if Self::is_physical_interface(interface) {
return Some(interface.to_string());
}
}
}
}
}
None
}
_ => None,
}
}
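// Hypothetical `ip route show default` line this parser handles:
//
//   default via 192.168.1.1 dev eno1 proto dhcp src 192.168.1.10 metric 100
//
// -> returns Some("eno1"); a default route via a bridge (e.g. "dev br0")
//    yields None, since br0 does not match a physical prefix.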
/// Parse VLAN configuration from /proc/net/vlan/config
/// Returns a map of interface name -> VLAN ID
fn parse_vlan_config() -> std::collections::HashMap<String, u16> {
let mut vlan_map = std::collections::HashMap::new();
if let Ok(contents) = std::fs::read_to_string("/proc/net/vlan/config") {
for line in contents.lines().skip(2) { // Skip header lines
let parts: Vec<&str> = line.split('|').collect();
if parts.len() >= 2 {
let interface_name = parts[0].trim();
let vlan_id_str = parts[1].trim();
if let Ok(vlan_id) = vlan_id_str.parse::<u16>() {
vlan_map.insert(interface_name.to_string(), vlan_id);
}
}
}
}
vlan_map
}
/// Collect network interfaces using ip command
async fn collect_interfaces(&self) -> Vec<NetworkInterfaceData> {
let mut interfaces = Vec::new();
// Parse VLAN configuration
let vlan_map = Self::parse_vlan_config();
match Command::new("ip").args(["-j", "addr"]).output() {
Ok(output) if output.status.success() => {
let json_str = String::from_utf8_lossy(&output.stdout);
if let Ok(json_data) = serde_json::from_str::<serde_json::Value>(&json_str) {
if let Some(ifaces) = json_data.as_array() {
for iface in ifaces {
let name = iface["ifname"].as_str().unwrap_or("").to_string();
// Skip loopback, empty names, and ifb* interfaces
if name.is_empty() || name == "lo" || name.starts_with("ifb") {
continue;
}
// Parse parent interface from @parent notation (e.g., lan@enp0s31f6)
let (interface_name, parent_interface) = if let Some(at_pos) = name.find('@') {
let (child, parent) = name.split_at(at_pos);
(child.to_string(), Some(parent[1..].to_string()))
} else {
(name.clone(), None)
};
let mut ipv4_addresses = Vec::new();
let mut ipv6_addresses = Vec::new();
// Extract IP addresses
if let Some(addr_info) = iface["addr_info"].as_array() {
for addr in addr_info {
if let Some(family) = addr["family"].as_str() {
if let Some(local) = addr["local"].as_str() {
match family {
"inet" => ipv4_addresses.push(local.to_string()),
"inet6" => {
// Skip link-local IPv6 addresses (fe80::)
if !local.starts_with("fe80:") {
ipv6_addresses.push(local.to_string());
}
}
_ => {}
}
}
}
}
}
// Determine if physical and get status
let is_physical = Self::is_physical_interface(&interface_name);
// Only filter out virtual interfaces without IPs
// Physical interfaces should always be shown even if down/no IPs
if !is_physical && ipv4_addresses.is_empty() && ipv6_addresses.is_empty() {
continue;
}
let link_status = if is_physical {
Self::get_link_status(&name)
} else {
Status::Unknown // Virtual interfaces don't have meaningful link status
};
// Look up VLAN ID from the map (use original name before @ parsing)
let vlan_id = vlan_map.get(&name).copied();
interfaces.push(NetworkInterfaceData {
name: interface_name,
ipv4_addresses,
ipv6_addresses,
is_physical,
link_status,
parent_interface,
vlan_id,
});
}
}
}
}
Err(e) => {
debug!("Failed to execute ip command: {}", e);
}
Ok(output) => {
debug!("ip command failed with status: {}", output.status);
}
}
// Assign primary physical interface as parent to virtual interfaces without explicit parent
let primary_interface = Self::get_primary_physical_interface();
if let Some(primary) = primary_interface {
for interface in interfaces.iter_mut() {
// Only assign parent to virtual interfaces that don't already have one
if !interface.is_physical && interface.parent_interface.is_none() {
interface.parent_interface = Some(primary.clone());
}
}
}
interfaces
}
}
#[async_trait]
impl Collector for NetworkCollector {
async fn collect_structured(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
debug!("Collecting network interface data");
// Collect all network interfaces
let interfaces = self.collect_interfaces().await;
agent_data.system.network.interfaces = interfaces;
Ok(())
}
}
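As a rough usage sketch (assuming NetworkConfig and AgentData both derive Default, which is not shown in this diff), the collector plugs into the agent loop like this:

    let collector = NetworkCollector::new(NetworkConfig::default());
    let mut agent_data = AgentData::default();
    collector.collect_structured(&mut agent_data).await?;
    // agent_data.system.network.interfaces now holds the classified interfaces.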

View File

@@ -5,21 +5,18 @@ use std::process::Command;
use tracing::debug;
use super::{Collector, CollectorError};
use crate::config::NixOSConfig;
/// NixOS system information collector with structured data output
///
/// This collector gathers NixOS-specific information like:
/// - System generation/build information
/// - Version information
/// - Agent version from Nix store path
-pub struct NixOSCollector {
-config: NixOSConfig,
-}
pub struct NixOSCollector;
impl NixOSCollector {
-pub fn new(config: NixOSConfig) -> Self {
-Self { config }
pub fn new(_config: crate::config::NixOSConfig) -> Self {
Self
}
/// Collect NixOS system information and populate AgentData
@@ -83,14 +80,25 @@ impl NixOSCollector {
std::env::var("CM_DASHBOARD_VERSION").unwrap_or_else(|_| "unknown".to_string())
}
-/// Get NixOS system generation (build) information
/// Get NixOS system generation (build) information from git commit
async fn get_nixos_generation(&self) -> Option<String> {
-match Command::new("nixos-version").output() {
-Ok(output) => {
-let version_str = String::from_utf8_lossy(&output.stdout);
-Some(version_str.trim().to_string())
// Try to read git commit hash from file written during rebuild
let commit_file = "/var/lib/cm-dashboard/git-commit";
match fs::read_to_string(commit_file) {
Ok(content) => {
let commit_hash = content.trim();
if commit_hash.len() >= 7 {
debug!("Found git commit hash: {}", commit_hash);
Some(commit_hash.to_string())
} else {
debug!("Git commit hash too short: {}", commit_hash);
None
}
}
Err(e) => {
debug!("Failed to read git commit file {}: {}", commit_file, e);
None
}
-Err(_) => None,
}
}
}
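// Note: the commit file above is expected to hold a bare git hash written
// during the rebuild (illustrative content: "abc1234def56..."); anything
// shorter than 7 characters is rejected as truncated.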

View File

@@ -1,6 +1,6 @@
use anyhow::Result;
use async_trait::async_trait;
use cm_dashboard_shared::{AgentData, ServiceData};
use cm_dashboard_shared::{AgentData, ServiceData, SubServiceData, SubServiceMetric, Status};
use std::process::Command;
use std::sync::RwLock;
use std::time::Instant;
@@ -22,26 +22,46 @@ pub struct SystemdCollector {
struct ServiceCacheState {
/// Last collection time for performance tracking
last_collection: Option<Instant>,
-/// Cached service data
-services: Vec<ServiceInfo>,
/// Cached complete service data with sub-services
cached_service_data: Vec<ServiceData>,
/// Interesting services to monitor (cached after discovery)
monitored_services: Vec<String>,
/// Cached service status information from discovery
service_status_cache: std::collections::HashMap<String, ServiceStatusInfo>,
/// Last time services were discovered
last_discovery_time: Option<Instant>,
/// How often to rediscover services (from config)
discovery_interval_seconds: u64,
/// Cached nginx site latency metrics
nginx_site_metrics: Vec<(String, f32)>,
/// Last time nginx sites were checked
last_nginx_check_time: Option<Instant>,
/// How often to check nginx site latency (configurable)
nginx_check_interval_seconds: u64,
}
-/// Internal service information
/// Cached service status information from systemctl list-units
#[derive(Debug, Clone)]
-struct ServiceInfo {
-name: String,
-status: String, // "active", "inactive", "failed", etc.
-memory_mb: f32, // Memory usage in MB
-disk_gb: f32, // Disk usage in GB (usually 0 for services)
struct ServiceStatusInfo {
load_state: String,
active_state: String,
sub_state: String,
}
impl SystemdCollector {
pub fn new(config: SystemdConfig) -> Self {
let state = ServiceCacheState {
last_collection: None,
-services: Vec::new(),
cached_service_data: Vec::new(),
monitored_services: Vec::new(),
service_status_cache: std::collections::HashMap::new(),
last_discovery_time: None,
discovery_interval_seconds: config.interval_seconds,
nginx_site_metrics: Vec::new(),
last_nginx_check_time: None,
nginx_check_interval_seconds: config.nginx_check_interval_seconds,
};
Self {
state: RwLock::new(state),
config,
@@ -53,25 +73,107 @@ impl SystemdCollector {
let start_time = Instant::now();
debug!("Collecting systemd services metrics");
-// Get systemd services status
-let services = self.get_systemd_services().await?;
// Get cached services (discovery only happens when needed)
let monitored_services = match self.get_monitored_services() {
Ok(services) => services,
Err(e) => {
debug!("Failed to get monitored services: {}", e);
return Ok(());
}
};
// Collect service data for each monitored service
let mut complete_service_data = Vec::new();
for service_name in &monitored_services {
match self.get_service_status(service_name) {
Ok((active_status, _detailed_info)) => {
let memory_mb = self.get_service_memory_usage(service_name).await.unwrap_or(0.0);
let disk_gb = self.get_service_disk_usage(service_name).await.unwrap_or(0.0);
let mut sub_services = Vec::new();
// Sub-service metrics for specific services (always include cached results)
if service_name.contains("nginx") && active_status == "active" {
let nginx_sites = self.get_nginx_site_metrics();
for (site_name, latency_ms) in nginx_sites {
let site_status = if latency_ms >= 0.0 && latency_ms < self.config.nginx_latency_critical_ms {
"active"
} else {
"failed"
};
let mut metrics = Vec::new();
metrics.push(SubServiceMetric {
label: "latency_ms".to_string(),
value: latency_ms,
unit: Some("ms".to_string()),
});
sub_services.push(SubServiceData {
name: site_name.clone(),
service_status: self.calculate_service_status(&site_name, &site_status),
metrics,
});
}
}
if service_name.contains("docker") && active_status == "active" {
let docker_containers = self.get_docker_containers();
for (container_name, container_status) in docker_containers {
// For now, docker containers have no additional metrics
// Future: could add memory_mb, cpu_percent, restart_count, etc.
let metrics = Vec::new();
sub_services.push(SubServiceData {
name: container_name.clone(),
service_status: self.calculate_service_status(&container_name, &container_status),
metrics,
});
}
// Add Docker images
let docker_images = self.get_docker_images();
for (image_name, image_status, image_size) in docker_images {
let mut metrics = Vec::new();
metrics.push(SubServiceMetric {
label: "size".to_string(),
value: 0.0, // placeholder; the human-readable size string is appended to the name below
unit: None,
});
sub_services.push(SubServiceData {
name: format!("{} ({})", image_name, image_size),
service_status: self.calculate_service_status(&image_name, &image_status),
metrics,
});
}
}
// Create complete service data
let service_data = ServiceData {
name: service_name.clone(),
memory_mb,
disk_gb,
user_stopped: false, // TODO: Integrate with service tracker
service_status: self.calculate_service_status(service_name, &active_status),
sub_services,
};
// Add to AgentData and cache
agent_data.services.push(service_data.clone());
complete_service_data.push(service_data);
}
Err(e) => {
debug!("Failed to get status for service {}: {}", service_name, e);
}
}
}
// Update cached state
{
let mut state = self.state.write().unwrap();
state.last_collection = Some(start_time);
-state.services = services.clone();
-}
-// Populate AgentData with service information
-for service in services {
-agent_data.services.push(ServiceData {
-name: service.name,
-status: service.status,
-memory_mb: service.memory_mb,
-disk_gb: service.disk_gb,
-user_stopped: false, // TODO: Integrate with service tracker
-});
state.cached_service_data = complete_service_data;
}
let elapsed = start_time.elapsed();
@@ -80,57 +182,317 @@ impl SystemdCollector {
Ok(())
}
-/// Get systemd services information
-async fn get_systemd_services(&self) -> Result<Vec<ServiceInfo>, CollectorError> {
-let mut services = Vec::new();
-// Get basic service status from systemctl
-let status_output = Command::new("systemctl")
-.args(&["list-units", "--type=service", "--no-pager", "--plain"])
-.output()
-.map_err(|e| CollectorError::SystemRead {
-path: "systemctl list-units".to_string(),
-error: e.to_string(),
-})?;
-let status_str = String::from_utf8_lossy(&status_output.stdout);
-// Parse service status
-for line in status_str.lines() {
-if line.trim().is_empty() || line.contains("UNIT") {
-continue;
-}
-let parts: Vec<&str> = line.split_whitespace().collect();
-if parts.len() >= 4 {
-let service_name = parts[0].trim_end_matches(".service");
-let load_state = parts[1];
-let active_state = parts[2];
-let sub_state = parts[3];
-// Skip if not loaded
-if load_state != "loaded" {
-continue;
/// Get monitored services, discovering them if needed or cache is expired
fn get_monitored_services(&self) -> Result<Vec<String>> {
// Check if we need discovery without holding the lock
let needs_discovery = {
let state = self.state.read().unwrap();
match state.last_discovery_time {
None => true, // First time
Some(last_time) => {
let elapsed = last_time.elapsed().as_secs();
elapsed >= state.discovery_interval_seconds
}
}
};
-// Filter services based on configuration
-if self.config.service_name_filters.is_empty() || self.config.service_name_filters.contains(&service_name.to_string()) {
-// Get memory usage for this service
-let memory_mb = self.get_service_memory_usage(service_name).await.unwrap_or(0.0);
-let service_info = ServiceInfo {
-name: service_name.to_string(),
-status: self.normalize_service_status(active_state, sub_state),
-memory_mb,
-disk_gb: 0.0, // Services typically don't have disk usage
-};
-services.push(service_info);
if needs_discovery {
debug!("Discovering systemd services (cache expired or first run)");
match self.discover_services_internal() {
Ok((services, status_cache)) => {
if let Ok(mut state) = self.state.write() {
state.monitored_services = services.clone();
state.service_status_cache = status_cache;
state.last_discovery_time = Some(Instant::now());
debug!("Auto-discovered {} services to monitor: {:?}",
state.monitored_services.len(), state.monitored_services);
return Ok(services);
}
}
Err(e) => {
debug!("Failed to discover services, using cached list: {}", e);
}
}
}
-Ok(services)
// Return cached services
let state = self.state.read().unwrap();
Ok(state.monitored_services.clone())
}
/// Get nginx site metrics, checking them if cache is expired (like old working version)
fn get_nginx_site_metrics(&self) -> Vec<(String, f32)> {
let mut state = self.state.write().unwrap();
// Check if we need to refresh nginx site metrics
let needs_refresh = match state.last_nginx_check_time {
None => true, // First time
Some(last_time) => {
let elapsed = last_time.elapsed().as_secs();
elapsed >= state.nginx_check_interval_seconds
}
};
if needs_refresh {
// Only check nginx sites if nginx service is active
if state.monitored_services.iter().any(|s| s.contains("nginx")) {
let fresh_metrics = self.get_nginx_sites_internal();
state.nginx_site_metrics = fresh_metrics;
state.last_nginx_check_time = Some(Instant::now());
}
}
state.nginx_site_metrics.clone()
}
/// Auto-discover interesting services to monitor
fn discover_services_internal(&self) -> Result<(Vec<String>, std::collections::HashMap<String, ServiceStatusInfo>)> {
// First: Get all service unit files
let unit_files_output = Command::new("systemctl")
.args(&["list-unit-files", "--type=service", "--no-pager", "--plain"])
.output()?;
if !unit_files_output.status.success() {
return Err(anyhow::anyhow!("systemctl list-unit-files command failed"));
}
// Second: Get runtime status of all units
let units_status_output = Command::new("systemctl")
.args(&["list-units", "--type=service", "--all", "--no-pager", "--plain"])
.output()?;
if !units_status_output.status.success() {
return Err(anyhow::anyhow!("systemctl list-units command failed"));
}
let unit_files_str = String::from_utf8(unit_files_output.stdout)?;
let units_status_str = String::from_utf8(units_status_output.stdout)?;
let mut services = Vec::new();
let excluded_services = &self.config.excluded_services;
let service_name_filters = &self.config.service_name_filters;
// Parse all service unit files
let mut all_service_names = std::collections::HashSet::new();
for line in unit_files_str.lines() {
let fields: Vec<&str> = line.split_whitespace().collect();
if fields.len() >= 2 && fields[0].ends_with(".service") {
let service_name = fields[0].trim_end_matches(".service");
all_service_names.insert(service_name.to_string());
}
}
// Parse runtime status for all units
let mut status_cache = std::collections::HashMap::new();
for line in units_status_str.lines() {
let fields: Vec<&str> = line.split_whitespace().collect();
if fields.len() >= 4 && fields[0].ends_with(".service") {
let service_name = fields[0].trim_end_matches(".service");
let load_state = fields.get(1).unwrap_or(&"unknown").to_string();
let active_state = fields.get(2).unwrap_or(&"unknown").to_string();
let sub_state = fields.get(3).unwrap_or(&"unknown").to_string();
status_cache.insert(service_name.to_string(), ServiceStatusInfo {
load_state,
active_state,
sub_state,
});
}
}
// For services found in unit files but not in runtime status, set default inactive status
for service_name in &all_service_names {
if !status_cache.contains_key(service_name) {
status_cache.insert(service_name.to_string(), ServiceStatusInfo {
load_state: "not-loaded".to_string(),
active_state: "inactive".to_string(),
sub_state: "dead".to_string(),
});
}
}
// Process all discovered services and apply filters
for service_name in &all_service_names {
// Skip excluded services first
let mut is_excluded = false;
for excluded in excluded_services {
if service_name.contains(excluded) {
is_excluded = true;
break;
}
}
if is_excluded {
continue;
}
// Check if this service matches our filter patterns (supports wildcards)
for pattern in service_name_filters {
if self.matches_pattern(service_name, pattern) {
services.push(service_name.to_string());
break;
}
}
}
Ok((services, status_cache))
}
/// Get service status from cache (if available) or fallback to systemctl
fn get_service_status(&self, service: &str) -> Result<(String, String)> {
// Try to get status from cache first
if let Ok(state) = self.state.read() {
if let Some(cached_info) = state.service_status_cache.get(service) {
let active_status = cached_info.active_state.clone();
let detailed_info = format!(
"LoadState={}\nActiveState={}\nSubState={}",
cached_info.load_state,
cached_info.active_state,
cached_info.sub_state
);
return Ok((active_status, detailed_info));
}
}
// Fallback to systemctl if not in cache
let output = Command::new("systemctl")
.args(&["is-active", &format!("{}.service", service)])
.output()?;
let active_status = String::from_utf8(output.stdout)?.trim().to_string();
// Get more detailed info
let output = Command::new("systemctl")
.args(&["show", &format!("{}.service", service), "--property=LoadState,ActiveState,SubState"])
.output()?;
let detailed_info = String::from_utf8(output.stdout)?;
Ok((active_status, detailed_info))
}
/// Check if service name matches pattern (supports wildcards like nginx*)
fn matches_pattern(&self, service_name: &str, pattern: &str) -> bool {
if pattern.contains('*') {
if pattern.ends_with('*') {
// Pattern like "nginx*" - match if service starts with "nginx"
let prefix = &pattern[..pattern.len() - 1];
service_name.starts_with(prefix)
} else if pattern.starts_with('*') {
// Pattern like "*backup" - match if service ends with "backup"
let suffix = &pattern[1..];
service_name.ends_with(suffix)
} else {
// Pattern like "nginx*backup" - simple glob matching
self.simple_glob_match(service_name, pattern)
}
} else {
// Exact match
service_name == pattern
}
}
/// Simple glob matching for patterns with * in the middle
fn simple_glob_match(&self, text: &str, pattern: &str) -> bool {
let parts: Vec<&str> = pattern.split('*').collect();
let mut pos = 0;
for part in parts {
if part.is_empty() {
continue;
}
if let Some(found_pos) = text[pos..].find(part) {
pos += found_pos + part.len();
} else {
return false;
}
}
true
}
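// Pattern semantics above, sketched with hypothetical service names:
//
//   matches_pattern("nginx-proxy", "nginx*")              -> true  (prefix)
//   matches_pattern("restic-backup", "*backup")           -> true  (suffix)
//   matches_pattern("nginx-site-backup", "nginx*backup")  -> true  (glob)
//   matches_pattern("sshd", "nginx*")                     -> false (no match)
//
// Note the glob form is unanchored at the end: "nginx*backup" also accepts
// "nginx-backup-old", since simple_glob_match only scans left to right.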
/// Get disk usage for a specific service
async fn get_service_disk_usage(&self, service_name: &str) -> Result<f32, CollectorError> {
// Check if this service has configured directory paths
if let Some(dirs) = self.config.service_directories.get(service_name) {
// Service has configured paths - use the first accessible one
for dir in dirs {
if let Some(size) = self.get_directory_size(dir) {
return Ok(size);
}
}
// If configured paths failed, return 0
return Ok(0.0);
}
// No configured path - try to get WorkingDirectory from systemctl
let output = Command::new("systemctl")
.args(&["show", &format!("{}.service", service_name), "--property=WorkingDirectory"])
.output()
.map_err(|e| CollectorError::SystemRead {
path: format!("WorkingDirectory for {}", service_name),
error: e.to_string(),
})?;
let output_str = String::from_utf8_lossy(&output.stdout);
for line in output_str.lines() {
if line.starts_with("WorkingDirectory=") && !line.contains("[not set]") {
let dir = line.strip_prefix("WorkingDirectory=").unwrap_or("");
if !dir.is_empty() && dir != "/" {
return Ok(self.get_directory_size(dir).unwrap_or(0.0));
}
}
}
Ok(0.0)
}
/// Get size of a directory in GB
fn get_directory_size(&self, path: &str) -> Option<f32> {
let output = Command::new("sudo")
.args(&["du", "-sb", path])
.output()
.ok()?;
if !output.status.success() {
// Log permission errors for debugging but don't spam logs
let stderr = String::from_utf8_lossy(&output.stderr);
if stderr.contains("Permission denied") {
debug!("Permission denied accessing directory: {}", path);
} else {
debug!("Failed to get size for directory {}: {}", path, stderr);
}
return None;
}
let output_str = String::from_utf8(output.stdout).ok()?;
let size_str = output_str.split_whitespace().next()?;
if let Ok(size_bytes) = size_str.parse::<u64>() {
let size_gb = size_bytes as f32 / (1024.0 * 1024.0 * 1024.0);
// Return size even if very small (minimum 0.001 GB = 1MB for visibility)
if size_gb > 0.0 {
Some(size_gb.max(0.001))
} else {
None
}
} else {
None
}
}
/// Calculate service status, taking user-stopped services into account
fn calculate_service_status(&self, service_name: &str, active_status: &str) -> Status {
match active_status.to_lowercase().as_str() {
"active" => Status::Ok,
"inactive" | "dead" => {
debug!("Service '{}' is inactive - treating as Inactive status", service_name);
Status::Inactive
},
"failed" | "error" => Status::Critical,
"activating" | "deactivating" | "reloading" | "starting" | "stopping" => {
debug!("Service '{}' is transitioning - treating as Pending", service_name);
Status::Pending
},
_ => Status::Unknown,
}
}
/// Get memory usage for a specific service
@@ -160,24 +522,10 @@ impl SystemdCollector {
Ok(0.0)
}
-/// Normalize service status to standard values
-fn normalize_service_status(&self, active_state: &str, sub_state: &str) -> String {
-match (active_state, sub_state) {
-("active", "running") => "active".to_string(),
-("active", _) => "active".to_string(),
-("inactive", "dead") => "inactive".to_string(),
-("inactive", _) => "inactive".to_string(),
-("failed", _) => "failed".to_string(),
-("activating", _) => "starting".to_string(),
-("deactivating", _) => "stopping".to_string(),
-_ => format!("{}:{}", active_state, sub_state),
-}
-}
/// Check if service collection cache should be updated
fn should_update_cache(&self) -> bool {
let state = self.state.read().unwrap();
match state.last_collection {
None => true,
Some(last) => {
@@ -187,31 +535,342 @@ impl SystemdCollector {
}
}
-/// Get cached service data if available and fresh
-fn get_cached_services(&self) -> Option<Vec<ServiceInfo>> {
/// Get cached complete service data with sub-services if available and fresh
fn get_cached_complete_services(&self) -> Option<Vec<ServiceData>> {
if !self.should_update_cache() {
let state = self.state.read().unwrap();
-Some(state.services.clone())
Some(state.cached_service_data.clone())
} else {
None
}
}
/// Get nginx sites with latency checks (internal - no caching)
fn get_nginx_sites_internal(&self) -> Vec<(String, f32)> {
let mut sites = Vec::new();
// Discover nginx sites from configuration
let discovered_sites = self.discover_nginx_sites();
// Always add all discovered sites, even if checks fail (like old version)
for (site_name, url) in &discovered_sites {
match self.check_site_latency(url) {
Ok(latency_ms) => {
sites.push((site_name.clone(), latency_ms));
}
Err(_) => {
// Site is unreachable - use -1.0 to indicate error (like old version)
sites.push((site_name.clone(), -1.0));
}
}
}
sites
}
/// Discover nginx sites from configuration
fn discover_nginx_sites(&self) -> Vec<(String, String)> {
// Use the same approach as the old working agent: get nginx config from systemd
let config_content = match self.get_nginx_config_from_systemd() {
Some(content) => content,
None => {
debug!("Could not get nginx config from systemd, trying nginx -T fallback");
match self.get_nginx_config_via_command() {
Some(content) => content,
None => {
debug!("Could not get nginx config via any method");
return Vec::new();
}
}
}
};
// Parse the config content to extract sites
self.parse_nginx_config_for_sites(&config_content)
}
/// Fallback: get nginx config via nginx -T command
fn get_nginx_config_via_command(&self) -> Option<String> {
let output = Command::new("nginx")
.args(&["-T"])
.output()
.ok()?;
if !output.status.success() {
debug!("nginx -T failed");
return None;
}
Some(String::from_utf8_lossy(&output.stdout).to_string())
}
/// Get nginx config from systemd service definition (NixOS compatible)
fn get_nginx_config_from_systemd(&self) -> Option<String> {
let output = Command::new("systemctl")
.args(&["show", "nginx", "--property=ExecStart", "--no-pager"])
.output()
.ok()?;
if !output.status.success() {
debug!("Failed to get nginx ExecStart from systemd");
return None;
}
let stdout = String::from_utf8_lossy(&output.stdout);
debug!("systemctl show nginx output: {}", stdout);
// Parse ExecStart to extract -c config path
for line in stdout.lines() {
if line.starts_with("ExecStart=") {
debug!("Found ExecStart line: {}", line);
if let Some(config_path) = self.extract_config_path_from_exec_start(line) {
debug!("Extracted config path: {}", config_path);
return std::fs::read_to_string(&config_path).ok();
}
}
}
None
}
/// Extract config path from ExecStart line
fn extract_config_path_from_exec_start(&self, exec_start: &str) -> Option<String> {
// Remove ExecStart= prefix
let exec_part = exec_start.strip_prefix("ExecStart=")?;
debug!("Parsing exec part: {}", exec_part);
// Handle NixOS format: ExecStart={ path=...; argv[]=...nginx -c /config; ... }
if exec_part.contains("argv[]=") {
// Extract the part after argv[]=
let argv_start = exec_part.find("argv[]=")?;
let argv_part = &exec_part[argv_start + 7..]; // Skip "argv[]="
debug!("Found NixOS argv part: {}", argv_part);
// Look for -c flag followed by config path
if let Some(c_pos) = argv_part.find(" -c ") {
let after_c = &argv_part[c_pos + 4..];
// Find the config path (until next space or semicolon)
let config_path = after_c.split([' ', ';']).next()?;
return Some(config_path.to_string());
}
} else {
// Handle traditional format: ExecStart=/path/nginx -c /config
debug!("Parsing traditional format");
if let Some(c_pos) = exec_part.find(" -c ") {
let after_c = &exec_part[c_pos + 4..];
let config_path = after_c.split_whitespace().next()?;
return Some(config_path.to_string());
}
}
None
}
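// Two ExecStart shapes handled above (paths illustrative only):
//
//   NixOS:       ExecStart={ path=/nix/store/xxx-nginx/bin/nginx ; argv[]=/nix/store/xxx-nginx/bin/nginx -c /nix/store/yyy-nginx.conf ; ignore_errors=no ; ... }
//   Traditional: ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf
//
// Both extract the path that follows " -c ".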
/// Parse nginx config content to extract server names and build site list
fn parse_nginx_config_for_sites(&self, config_content: &str) -> Vec<(String, String)> {
let mut sites = Vec::new();
let lines: Vec<&str> = config_content.lines().collect();
let mut i = 0;
debug!("Parsing nginx config with {} lines", lines.len());
while i < lines.len() {
let line = lines[i].trim();
if line.starts_with("server") && line.contains("{") {
if let Some(server_name) = self.parse_server_block(&lines, &mut i) {
let url = format!("https://{}", server_name);
sites.push((server_name.clone(), url));
}
}
i += 1;
}
debug!("Discovered {} nginx sites total", sites.len());
sites
}
/// Parse a server block to extract the primary server_name
fn parse_server_block(&self, lines: &[&str], start_index: &mut usize) -> Option<String> {
let mut server_names = Vec::new();
let mut has_redirect = false;
let mut i = *start_index + 1;
let mut brace_count = 1;
// Parse until we close the server block
while i < lines.len() && brace_count > 0 {
let trimmed = lines[i].trim();
// Track braces
brace_count += trimmed.matches('{').count();
brace_count -= trimmed.matches('}').count();
// Extract server_name
if trimmed.starts_with("server_name") {
if let Some(names_part) = trimmed.strip_prefix("server_name") {
let names_clean = names_part.trim().trim_end_matches(';');
for name in names_clean.split_whitespace() {
if name != "_"
&& !name.is_empty()
&& name.contains('.')
&& !name.starts_with('$')
{
server_names.push(name.to_string());
debug!("Found server_name in block: {}", name);
}
}
}
}
// Check for redirects (skip redirect-only servers)
if trimmed.contains("return") && (trimmed.contains("301") || trimmed.contains("302")) {
has_redirect = true;
}
i += 1;
}
*start_index = i - 1;
if !server_names.is_empty() && !has_redirect {
return Some(server_names[0].clone());
}
None
}
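// Sketch of what the block parser keeps vs. skips (illustrative config):
//
//   server {
//       server_name example.com www.example.com;   # -> "example.com" kept
//       ...
//   }
//   server {
//       server_name example.com;
//       return 301 https://$host$request_uri;       # -> skipped (redirect-only)
//   }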
/// Check site latency using HTTP GET requests
fn check_site_latency(&self, url: &str) -> Result<f32, Box<dyn std::error::Error>> {
use std::time::Duration;
use std::time::Instant;
let start = Instant::now();
// Create HTTP client with timeouts from configuration
let client = reqwest::blocking::Client::builder()
.timeout(Duration::from_secs(self.config.http_timeout_seconds))
.connect_timeout(Duration::from_secs(self.config.http_connect_timeout_seconds))
.redirect(reqwest::redirect::Policy::limited(10))
.build()?;
// Make GET request and measure latency
let response = client.get(url).send()?;
let latency = start.elapsed().as_millis() as f32;
// Check if response is successful (2xx or 3xx status codes)
if response.status().is_success() || response.status().is_redirection() {
Ok(latency)
} else {
Err(format!(
"HTTP request failed for {} with status: {}",
url,
response.status()
)
.into())
}
}
/// Get docker containers as sub-services
fn get_docker_containers(&self) -> Vec<(String, String)> {
let mut containers = Vec::new();
// Check if docker is available (cm-agent user is in docker group)
// Use -a to show ALL containers (running and stopped)
let output = Command::new("docker")
.args(&["ps", "-a", "--format", "{{.Names}},{{.Status}}"])
.output();
let output = match output {
Ok(out) if out.status.success() => out,
_ => return containers, // Docker not available or failed
};
let output_str = match String::from_utf8(output.stdout) {
Ok(s) => s,
Err(_) => return containers,
};
for line in output_str.lines() {
if line.trim().is_empty() {
continue;
}
let parts: Vec<&str> = line.split(',').collect();
if parts.len() >= 2 {
let container_name = parts[0].trim();
let status_str = parts[1].trim();
let container_status = if status_str.contains("Up") {
"active"
} else if status_str.contains("Exited") || status_str.contains("Created") {
"inactive" // Stopped/created containers are inactive
} else {
"failed" // Other states (restarting, paused, dead) → failed
};
containers.push((format!("docker_{}", container_name), container_status.to_string()));
}
}
containers
}
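// Expected `docker ps -a --format "{{.Names}},{{.Status}}"` lines and the
// mapping applied above (container names are illustrative):
//
//   gitea,Up 3 hours                     -> ("docker_gitea", "active")
//   postgres,Exited (0) 2 days ago       -> ("docker_postgres", "inactive")
//   worker,Restarting (1) 5 seconds ago  -> ("docker_worker", "failed")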
/// Get docker images as sub-services
fn get_docker_images(&self) -> Vec<(String, String, String)> {
let mut images = Vec::new();
// Check if docker is available (cm-agent user is in docker group)
let output = Command::new("docker")
.args(&["images", "--format", "{{.Repository}}:{{.Tag}},{{.Size}}"])
.output();
let output = match output {
Ok(out) if out.status.success() => out,
Ok(_) => {
return images;
}
Err(_) => {
return images;
}
};
let output_str = match String::from_utf8(output.stdout) {
Ok(s) => s,
Err(_) => return images,
};
for line in output_str.lines() {
if line.trim().is_empty() {
continue;
}
let parts: Vec<&str> = line.split(',').collect();
if parts.len() >= 2 {
let image_name = parts[0].trim();
let size = parts[1].trim();
// Skip <none>:<none> images (dangling images)
if image_name.contains("<none>") {
continue;
}
images.push((
format!("image_{}", image_name),
"active".to_string(), // Images are always "active" (present)
size.to_string()
));
}
}
images
}
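// Expected `docker images --format "{{.Repository}}:{{.Tag}},{{.Size}}"`
// lines (illustrative):
//
//   nginx:latest,187MB    -> ("image_nginx:latest", "active", "187MB")
//   <none>:<none>,92.3MB  -> skipped (dangling image)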
}
#[async_trait]
impl Collector for SystemdCollector {
async fn collect_structured(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
// Use cached data if available and fresh
-if let Some(cached_services) = self.get_cached_services() {
-debug!("Using cached systemd services data");
-for service in cached_services {
-agent_data.services.push(ServiceData {
-name: service.name,
-status: service.status,
-memory_mb: service.memory_mb,
-disk_gb: service.disk_gb,
-user_stopped: false, // TODO: Integrate with service tracker
-});
// Use cached complete data if available and fresh
if let Some(cached_complete_services) = self.get_cached_complete_services() {
for service_data in cached_complete_services {
agent_data.services.push(service_data);
}
Ok(())
} else {

View File

@@ -0,0 +1,403 @@
use anyhow::Result;
use async_trait::async_trait;
use cm_dashboard_shared::{AgentData, ServiceData, Status};
use std::process::Command;
use std::sync::RwLock;
use std::time::Instant;
use tracing::debug;
use super::{Collector, CollectorError};
use crate::config::SystemdConfig;
/// Systemd collector for monitoring systemd services with structured data output
pub struct SystemdCollector {
/// Cached state with thread-safe interior mutability
state: RwLock<ServiceCacheState>,
/// Configuration for service monitoring
config: SystemdConfig,
}
/// Internal state for service caching
#[derive(Debug, Clone)]
struct ServiceCacheState {
/// Last collection time for performance tracking
last_collection: Option<Instant>,
/// Cached service data
services: Vec<ServiceInfo>,
/// Interesting services to monitor (cached after discovery)
monitored_services: Vec<String>,
/// Cached service status information from discovery
service_status_cache: std::collections::HashMap<String, ServiceStatusInfo>,
/// Last time services were discovered
last_discovery_time: Option<Instant>,
/// How often to rediscover services (from config)
discovery_interval_seconds: u64,
}
/// Cached service status information from systemctl list-units
#[derive(Debug, Clone)]
struct ServiceStatusInfo {
load_state: String,
active_state: String,
sub_state: String,
}
/// Internal service information
#[derive(Debug, Clone)]
struct ServiceInfo {
name: String,
status: String, // "active", "inactive", "failed", etc.
memory_mb: f32, // Memory usage in MB
disk_gb: f32, // Disk usage in GB (usually 0 for services)
}
impl SystemdCollector {
pub fn new(config: SystemdConfig) -> Self {
let state = ServiceCacheState {
last_collection: None,
services: Vec::new(),
monitored_services: Vec::new(),
service_status_cache: std::collections::HashMap::new(),
last_discovery_time: None,
discovery_interval_seconds: config.interval_seconds,
};
Self {
state: RwLock::new(state),
config,
}
}
/// Collect service data and populate AgentData
async fn collect_service_data(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
let start_time = Instant::now();
debug!("Collecting systemd services metrics");
// Get cached services (discovery only happens when needed)
let monitored_services = match self.get_monitored_services() {
Ok(services) => services,
Err(e) => {
debug!("Failed to get monitored services: {}", e);
return Ok(());
}
};
// Collect service data for each monitored service
let mut services = Vec::new();
for service_name in &monitored_services {
match self.get_service_status(service_name) {
Ok((active_status, _detailed_info)) => {
let memory_mb = self.get_service_memory_usage(service_name).await.unwrap_or(0.0);
let disk_gb = self.get_service_disk_usage(service_name).await.unwrap_or(0.0);
let service_info = ServiceInfo {
name: service_name.clone(),
status: active_status,
memory_mb,
disk_gb,
};
services.push(service_info);
}
Err(e) => {
debug!("Failed to get status for service {}: {}", service_name, e);
}
}
}
// Update cached state
{
let mut state = self.state.write().unwrap();
state.last_collection = Some(start_time);
state.services = services.clone();
}
// Populate AgentData with service information
for service in services {
agent_data.services.push(ServiceData {
name: service.name.clone(),
status: service.status.clone(),
memory_mb: service.memory_mb,
disk_gb: service.disk_gb,
user_stopped: false, // TODO: Integrate with service tracker
service_status: self.calculate_service_status(&service.name, &service.status),
});
}
let elapsed = start_time.elapsed();
debug!("Systemd collection completed in {:?} with {} services", elapsed, agent_data.services.len());
Ok(())
}
/// Get systemd services information
async fn get_systemd_services(&self) -> Result<Vec<ServiceInfo>, CollectorError> {
let mut services = Vec::new();
// Get ALL service unit files (includes inactive services)
let unit_files_output = Command::new("systemctl")
.args(&["list-unit-files", "--type=service", "--no-pager", "--plain"])
.output()
.map_err(|e| CollectorError::SystemRead {
path: "systemctl list-unit-files".to_string(),
error: e.to_string(),
})?;
// Get runtime status of ALL units (including inactive)
let status_output = Command::new("systemctl")
.args(&["list-units", "--type=service", "--all", "--no-pager", "--plain"])
.output()
.map_err(|e| CollectorError::SystemRead {
path: "systemctl list-units --all".to_string(),
error: e.to_string(),
})?;
let unit_files_str = String::from_utf8_lossy(&unit_files_output.stdout);
let status_str = String::from_utf8_lossy(&status_output.stdout);
// Parse all service unit files to get complete service list
let mut all_service_names = std::collections::HashSet::new();
for line in unit_files_str.lines() {
let fields: Vec<&str> = line.split_whitespace().collect();
if fields.len() >= 2 && fields[0].ends_with(".service") {
let service_name = fields[0].trim_end_matches(".service");
all_service_names.insert(service_name.to_string());
}
}
// Parse runtime status for all units
let mut status_cache = std::collections::HashMap::new();
for line in status_str.lines() {
let fields: Vec<&str> = line.split_whitespace().collect();
if fields.len() >= 4 && fields[0].ends_with(".service") {
let service_name = fields[0].trim_end_matches(".service");
let load_state = fields.get(1).unwrap_or(&"unknown").to_string();
let active_state = fields.get(2).unwrap_or(&"unknown").to_string();
let sub_state = fields.get(3).unwrap_or(&"unknown").to_string();
status_cache.insert(service_name.to_string(), (load_state, active_state, sub_state));
}
}
// For services found in unit files but not in runtime status, set default inactive status
for service_name in &all_service_names {
if !status_cache.contains_key(service_name) {
status_cache.insert(service_name.to_string(), (
"not-loaded".to_string(),
"inactive".to_string(),
"dead".to_string()
));
}
}
// Process all discovered services and apply filters
for service_name in &all_service_names {
if self.should_monitor_service(service_name) {
if let Some((load_state, active_state, sub_state)) = status_cache.get(service_name) {
let memory_mb = self.get_service_memory_usage(service_name).await.unwrap_or(0.0);
let disk_gb = self.get_service_disk_usage(service_name).await.unwrap_or(0.0);
let normalized_status = self.normalize_service_status(active_state, sub_state);
let service_info = ServiceInfo {
name: service_name.to_string(),
status: normalized_status,
memory_mb,
disk_gb,
};
services.push(service_info);
}
}
}
Ok(services)
}
/// Check if a service should be monitored based on configuration filters with wildcard support
fn should_monitor_service(&self, service_name: &str) -> bool {
// If no filters configured, monitor nothing (to prevent noise)
if self.config.service_name_filters.is_empty() {
return false;
}
// Check if service matches any of the configured patterns
for pattern in &self.config.service_name_filters {
if self.matches_pattern(service_name, pattern) {
return true;
}
}
false
}
/// Check if service name matches pattern (supports wildcards like nginx*)
fn matches_pattern(&self, service_name: &str, pattern: &str) -> bool {
if pattern.ends_with('*') {
let prefix = &pattern[..pattern.len() - 1];
service_name.starts_with(prefix)
} else {
service_name == pattern
}
}
/// Get disk usage for a specific service
async fn get_service_disk_usage(&self, service_name: &str) -> Result<f32, CollectorError> {
// Check if this service has configured directory paths
if let Some(dirs) = self.config.service_directories.get(service_name) {
// Service has configured paths - use the first accessible one
for dir in dirs {
if let Some(size) = self.get_directory_size(dir) {
return Ok(size);
}
}
// If configured paths failed, return 0
return Ok(0.0);
}
// No configured path - try to get WorkingDirectory from systemctl
let output = Command::new("systemctl")
.args(&["show", &format!("{}.service", service_name), "--property=WorkingDirectory"])
.output()
.map_err(|e| CollectorError::SystemRead {
path: format!("WorkingDirectory for {}", service_name),
error: e.to_string(),
})?;
let output_str = String::from_utf8_lossy(&output.stdout);
for line in output_str.lines() {
if line.starts_with("WorkingDirectory=") && !line.contains("[not set]") {
let dir = line.strip_prefix("WorkingDirectory=").unwrap_or("");
if !dir.is_empty() {
return Ok(self.get_directory_size(dir).unwrap_or(0.0));
}
}
}
Ok(0.0)
}
/// Get size of a directory in GB
fn get_directory_size(&self, path: &str) -> Option<f32> {
let output = Command::new("du")
.args(&["-sb", path])
.output()
.ok()?;
if !output.status.success() {
return None;
}
let output_str = String::from_utf8_lossy(&output.stdout);
let parts: Vec<&str> = output_str.split_whitespace().collect();
if let Some(size_str) = parts.first() {
if let Ok(size_bytes) = size_str.parse::<u64>() {
return Some(size_bytes as f32 / (1024.0 * 1024.0 * 1024.0));
}
}
None
}
/// Calculate service status, taking user-stopped services into account
fn calculate_service_status(&self, service_name: &str, active_status: &str) -> Status {
match active_status.to_lowercase().as_str() {
"active" => Status::Ok,
"inactive" | "dead" => {
debug!("Service '{}' is inactive - treating as Inactive status", service_name);
Status::Inactive
},
"failed" | "error" => Status::Critical,
"activating" | "deactivating" | "reloading" | "starting" | "stopping" => {
debug!("Service '{}' is transitioning - treating as Pending", service_name);
Status::Pending
},
_ => Status::Unknown,
}
}
/// Get memory usage for a specific service
async fn get_service_memory_usage(&self, service_name: &str) -> Result<f32, CollectorError> {
let output = Command::new("systemctl")
.args(&["show", &format!("{}.service", service_name), "--property=MemoryCurrent"])
.output()
.map_err(|e| CollectorError::SystemRead {
path: format!("memory usage for {}", service_name),
error: e.to_string(),
})?;
let output_str = String::from_utf8_lossy(&output.stdout);
for line in output_str.lines() {
if line.starts_with("MemoryCurrent=") {
if let Some(mem_str) = line.strip_prefix("MemoryCurrent=") {
if mem_str != "[not set]" {
if let Ok(memory_bytes) = mem_str.parse::<u64>() {
return Ok(memory_bytes as f32 / (1024.0 * 1024.0)); // Convert to MB
}
}
}
}
}
Ok(0.0)
}
/// Normalize service status to standard values
fn normalize_service_status(&self, active_state: &str, sub_state: &str) -> String {
match (active_state, sub_state) {
("active", "running") => "active".to_string(),
("active", _) => "active".to_string(),
("inactive", "dead") => "inactive".to_string(),
("inactive", _) => "inactive".to_string(),
("failed", _) => "failed".to_string(),
("activating", _) => "starting".to_string(),
("deactivating", _) => "stopping".to_string(),
_ => format!("{}:{}", active_state, sub_state),
}
}
/// Check if service collection cache should be updated
fn should_update_cache(&self) -> bool {
let state = self.state.read().unwrap();
match state.last_collection {
None => true,
Some(last) => {
let cache_duration = std::time::Duration::from_secs(30);
last.elapsed() > cache_duration
}
}
}
/// Get cached service data if available and fresh
fn get_cached_services(&self) -> Option<Vec<ServiceInfo>> {
if !self.should_update_cache() {
let state = self.state.read().unwrap();
Some(state.services.clone())
} else {
None
}
}
}
#[async_trait]
impl Collector for SystemdCollector {
async fn collect_structured(&self, agent_data: &mut AgentData) -> Result<(), CollectorError> {
// Use cached data if available and fresh
if let Some(cached_services) = self.get_cached_services() {
debug!("Using cached systemd services data");
for service in cached_services {
agent_data.services.push(ServiceData {
name: service.name.clone(),
status: service.status.clone(),
memory_mb: service.memory_mb,
disk_gb: service.disk_gb,
user_stopped: false, // TODO: Integrate with service tracker
service_status: self.calculate_service_status(&service.name, &service.status),
});
}
Ok(())
} else {
// Collect fresh data
self.collect_service_data(agent_data).await
}
}
}

View File

@@ -1,2 +0,0 @@
// This file is now empty - all configuration values come from config files
// No hardcoded defaults are used

View File

@@ -8,7 +8,6 @@ mod collectors;
mod communication;
mod config;
mod notifications;
-mod service_tracker;
use agent::Agent;

View File

@@ -1,164 +0,0 @@
use anyhow::Result;
use serde::{Deserialize, Serialize};
use std::collections::HashSet;
use std::fs;
use std::path::Path;
use std::sync::{Arc, Mutex, OnceLock};
use tracing::{debug, info, warn};
/// Shared instance for global access
static GLOBAL_TRACKER: OnceLock<Arc<Mutex<UserStoppedServiceTracker>>> = OnceLock::new();
/// Tracks services that have been stopped by user action
/// These services should be treated as OK status instead of Warning
#[derive(Debug)]
pub struct UserStoppedServiceTracker {
/// Set of services stopped by user action
user_stopped_services: HashSet<String>,
/// Path to persistent storage file
storage_path: String,
}
/// Serializable data structure for persistence
#[derive(Debug, Serialize, Deserialize)]
struct UserStoppedData {
services: Vec<String>,
}
impl UserStoppedServiceTracker {
/// Create new tracker with default storage path
pub fn new() -> Self {
Self::with_storage_path("/var/lib/cm-dashboard/user-stopped-services.json")
}
/// Initialize global instance (called by agent)
pub fn init_global() -> Result<Self> {
let tracker = Self::new();
// Set global instance
let global_instance = Arc::new(Mutex::new(tracker));
if GLOBAL_TRACKER.set(global_instance).is_err() {
warn!("Global service tracker was already initialized");
}
// Return a new instance for the agent to use
Ok(Self::new())
}
/// Check if a service is user-stopped (global access for collectors)
pub fn is_service_user_stopped(service_name: &str) -> bool {
if let Some(global) = GLOBAL_TRACKER.get() {
if let Ok(tracker) = global.lock() {
tracker.is_user_stopped(service_name)
} else {
debug!("Failed to lock global service tracker");
false
}
} else {
debug!("Global service tracker not initialized");
false
}
}
/// Update global tracker (called by agent when tracker state changes)
pub fn update_global(updated_tracker: &UserStoppedServiceTracker) {
if let Some(global) = GLOBAL_TRACKER.get() {
if let Ok(mut tracker) = global.lock() {
tracker.user_stopped_services = updated_tracker.user_stopped_services.clone();
} else {
debug!("Failed to lock global service tracker for update");
}
} else {
debug!("Global service tracker not initialized for update");
}
}
/// Create new tracker with custom storage path
pub fn with_storage_path<P: AsRef<Path>>(storage_path: P) -> Self {
let storage_path = storage_path.as_ref().to_string_lossy().to_string();
let mut tracker = Self {
user_stopped_services: HashSet::new(),
storage_path,
};
// Load existing data from storage
if let Err(e) = tracker.load_from_storage() {
warn!("Failed to load user-stopped services from storage: {}", e);
info!("Starting with empty user-stopped services list");
}
tracker
}
/// Clear user-stopped flag for a service (when user starts it)
pub fn clear_user_stopped(&mut self, service_name: &str) -> Result<()> {
if self.user_stopped_services.remove(service_name) {
info!("Cleared user-stopped flag for service '{}'", service_name);
self.save_to_storage()?;
debug!("Service '{}' user-stopped flag cleared and saved to storage", service_name);
} else {
debug!("Service '{}' was not marked as user-stopped", service_name);
}
Ok(())
}
/// Check if a service is marked as user-stopped
pub fn is_user_stopped(&self, service_name: &str) -> bool {
let is_stopped = self.user_stopped_services.contains(service_name);
debug!("Service '{}' user-stopped status: {}", service_name, is_stopped);
is_stopped
}
/// Save current state to persistent storage
fn save_to_storage(&self) -> Result<()> {
// Create parent directory if it doesn't exist
if let Some(parent_dir) = Path::new(&self.storage_path).parent() {
if !parent_dir.exists() {
fs::create_dir_all(parent_dir)?;
debug!("Created parent directory: {}", parent_dir.display());
}
}
let data = UserStoppedData {
services: self.user_stopped_services.iter().cloned().collect(),
};
let json_data = serde_json::to_string_pretty(&data)?;
fs::write(&self.storage_path, json_data)?;
debug!(
"Saved {} user-stopped services to {}",
data.services.len(),
self.storage_path
);
Ok(())
}
/// Load state from persistent storage
fn load_from_storage(&mut self) -> Result<()> {
if !Path::new(&self.storage_path).exists() {
debug!("Storage file {} does not exist, starting fresh", self.storage_path);
return Ok(());
}
let json_data = fs::read_to_string(&self.storage_path)?;
let data: UserStoppedData = serde_json::from_str(&json_data)?;
self.user_stopped_services = data.services.into_iter().collect();
info!(
"Loaded {} user-stopped services from {}",
self.user_stopped_services.len(),
self.storage_path
);
if !self.user_stopped_services.is_empty() {
debug!("User-stopped services: {:?}", self.user_stopped_services);
}
Ok(())
}
}

View File

@@ -1,1001 +0,0 @@
warning: fields `total_services`, `backup_disk_filesystem_label`, `services_completed_count`, `services_failed_count`, and `services_disabled_count` are never read
--> dashboard/src/ui/widgets/backup.rs:22:5
|
14 | pub struct BackupWidget {
| ------------ fields in this struct
...
22 | total_services: Option<i64>,
| ^^^^^^^^^^^^^^
...
36 | backup_disk_filesystem_label: Option<String>,
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
37 | /// Number of completed services
38 | services_completed_count: Option<i64>,
| ^^^^^^^^^^^^^^^^^^^^^^^^
39 | /// Number of failed services
40 | services_failed_count: Option<i64>,
| ^^^^^^^^^^^^^^^^^^^^^
41 | /// Number of disabled services
42 | services_disabled_count: Option<i64>,
| ^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `BackupWidget` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
= note: `#[warn(dead_code)]` on by default
warning: field `exit_code` is never read
--> dashboard/src/ui/widgets/backup.rs:53:5
|
50 | struct ServiceMetricData {
| ----------------- field in this struct
...
53 | exit_code: Option<i64>,
| ^^^^^^^^^
|
= note: `ServiceMetricData` has derived impls for the traits `Clone` and `Debug`, but these are intentionally ignored during dead code analysis
warning: associated function `extract_service_name` is never used
--> dashboard/src/ui/widgets/backup.rs:115:8
|
58 | impl BackupWidget {
| ----------------- associated function in this implementation
...
115 | fn extract_service_name(metric_name: &str) -> Option<String> {
| ^^^^^^^^^^^^^^^^^^^^
warning: method `update_from_metrics` is never used
--> dashboard/src/ui/widgets/backup.rs:157:8
|
156 | impl BackupWidget {
| ----------------- method in this implementation
157 | fn update_from_metrics(&mut self, metrics: &[&Metric]) {
| ^^^^^^^^^^^^^^^^^^^
warning: associated function `extract_service_info` is never used
--> dashboard/src/ui/widgets/services.rs:50:8
|
38 | impl ServicesWidget {
| ------------------- associated function in this implementation
...
50 | fn extract_service_info(metric_name: &str) -> Option<(String, Option<String>)> {
| ^^^^^^^^^^^^^^^^^^^^
warning: method `update_from_metrics` is never used
--> dashboard/src/ui/widgets/services.rs:285:8
|
284 | impl ServicesWidget {
| ------------------- method in this implementation
285 | fn update_from_metrics(&mut self, metrics: &[&Metric]) {
| ^^^^^^^^^^^^^^^^^^^
warning: field `health_status` is never read
--> dashboard/src/ui/widgets/system.rs:53:5
|
43 | struct StoragePool {
| ----------- field in this struct
...
53 | health_status: Status, // Separate status for pool health vs usage
| ^^^^^^^^^^^^^
|
= note: `StoragePool` has a derived impl for the trait `Clone`, but this is intentionally ignored during dead code analysis
warning: `cm-dashboard` (bin "cm-dashboard") generated 7 warnings
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/cm-dashboard --headless --raw-data`
RAW AGENT DATA FROM cmbox:
{
"hostname": "cmbox",
"agent_version": "v0.1.133",
"timestamp": 1763936501,
"system": {
"cpu": {
"load_1min": 1.82,
"load_5min": 2.1,
"load_15min": 2.1,
"frequency_mhz": 3743.09,
"temperature_celsius": 55.0
},
"memory": {
"usage_percent": 27.183601,
"total_gb": 23.339516,
"used_gb": 6.3445206,
"available_gb": 16.994995,
"swap_total_gb": 14.634708,
"swap_used_gb": 0.17599106,
"tmpfs": [
{
"mount": "/tmp",
"usage_percent": 15.094376,
"used_gb": 0.3018875,
"total_gb": 2.0
}
]
},
"storage": {
"drives": [
{
"name": "nvme0n1",
"health": "PASSED",
"temperature_celsius": 28.0,
"wear_percent": 1.0,
"filesystems": [
{
"mount": "root",
"usage_percent": 24.404377,
"used_gb": 226.51398,
"total_gb": 928.1695
},
{
"mount": "boot",
"usage_percent": 10.666672,
"used_gb": 0.10645676,
"total_gb": 0.9980316
}
]
}
],
"pools": []
}
},
"services": [
{
"name": "tailscaled",
"status": "active",
"memory_mb": 25.582031,
"disk_gb": 0.0,
"user_stopped": false
},
{
"name": "sshd",
"status": "active",
"memory_mb": 4.3085938,
"disk_gb": 0.0,
"user_stopped": false
}
],
"backup": {
"status": "unknown",
"last_run": null,
"next_scheduled": null,
"total_size_gb": null,
"repository_health": null
}
}
────────────────────────────────────────────────────────────────────────────────
[… 7 further snapshots omitted (timestamps 1763936502–1763936509); identical to the dump above apart from small load, frequency, and memory drift …]
────────────────────────────────────────────────────────────────────────────────
RAW AGENT DATA FROM cmbox:
{
"hostname": "cmbox",
"agent_version": "v0.1.133",
"timestamp": 1763936509,
"system": {
"cpu": {
"load_1min": 0.0,
"load_5min": 0.0,
"load_15min": 0.0,
"frequency_mhz": 0.0,
"temperature_celsius": null
},
"memory": {
"usage_percent": 0.0,
"total_gb": 0.0,
"used_gb": 0.0,
"available_gb": 0.0,
"swap_total_gb": 0.0,
"swap_used_gb": 0.0,
"tmpfs": []
},
"storage": {
"drives": [],
"pools": []
}
},
"services": [],
"backup": {
"status": "unknown",
"last_run": null,
"next_scheduled": null,
"total_size_gb": null,
"repository_health": null
}
}
────────────────────────────────────────────────────────────────────────────────
[… 3 further snapshots omitted (timestamps 1763936510–1763936512); identical to the last non-empty dump apart from the timestamp …]
────────────────────────────────────────────────────────────────────────────────
Terminated
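
Each dump above is one serde-serialized AgentData packet. A minimal sketch of parsing a snapshot back into the shared struct (assuming the cm_dashboard_shared types shown in the diffs below and a serde_json dependency):

use cm_dashboard_shared::AgentData;

// Deserialize one raw snapshot, exactly as printed in the transcript above.
fn parse_snapshot(raw: &str) -> Result<AgentData, serde_json::Error> {
    serde_json::from_str(raw)
}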

View File

@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard"
version = "0.1.141"
version = "0.1.181"
edition = "2021"
[dependencies]

View File

@@ -20,13 +20,12 @@ pub struct Dashboard {
tui_app: Option<TuiApp>,
terminal: Option<Terminal<CrosstermBackend<io::Stdout>>>,
headless: bool,
raw_data: bool,
initial_commands_sent: std::collections::HashSet<String>,
config: DashboardConfig,
}
impl Dashboard {
pub async fn new(config_path: Option<String>, headless: bool, raw_data: bool) -> Result<Self> {
pub async fn new(config_path: Option<String>, headless: bool) -> Result<Self> {
info!("Initializing dashboard");
// Load configuration - try default path if not specified
@@ -120,7 +119,6 @@ impl Dashboard {
tui_app,
terminal,
headless,
raw_data,
initial_commands_sent: std::collections::HashSet::new(),
config,
})
@@ -205,13 +203,6 @@ impl Dashboard {
.insert(agent_data.hostname.clone());
}
// Show raw data if requested (before processing)
if self.raw_data {
println!("RAW AGENT DATA FROM {}:", agent_data.hostname);
println!("{}", serde_json::to_string_pretty(&agent_data).unwrap_or_else(|e| format!("Serialization error: {}", e)));
println!("{}", "".repeat(80));
}
// Store structured data directly
self.metric_store.store_agent_data(agent_data);

View File

@@ -51,10 +51,6 @@ struct Cli {
/// Run in headless mode (no TUI, just logging)
#[arg(long)]
headless: bool,
/// Show raw agent data in headless mode
#[arg(long)]
raw_data: bool,
}
#[tokio::main]
@@ -90,7 +86,7 @@ async fn main() -> Result<()> {
}
// Create and run dashboard
let mut dashboard = Dashboard::new(cli.config, cli.headless, cli.raw_data).await?;
let mut dashboard = Dashboard::new(cli.config, cli.headless).await?;
// Setup graceful shutdown
let ctrl_c = async {

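After this change the CLI surface shrinks to two flags. A condensed sketch of the remaining struct (assuming clap's derive API, which the #[arg] attributes above suggest; the config field is inferred from the Dashboard::new call):

use clap::Parser;

#[derive(Parser)]
struct Cli {
    /// Optional path to the dashboard configuration
    #[arg(long)]
    config: Option<String>,
    /// Run in headless mode (no TUI, just logging)
    #[arg(long)]
    headless: bool,
}
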
View File

@@ -18,7 +18,7 @@ use crate::config::DashboardConfig;
use crate::metrics::MetricStore;
use cm_dashboard_shared::Status;
use theme::{Components, Layout as ThemeLayout, Theme, Typography};
use widgets::{BackupWidget, ServicesWidget, SystemWidget, Widget};
use widgets::{ServicesWidget, SystemWidget, Widget};
@@ -32,8 +32,6 @@ pub struct HostWidgets {
pub system_widget: SystemWidget,
/// Services widget state
pub services_widget: ServicesWidget,
/// Backup widget state
pub backup_widget: BackupWidget,
/// Last update time for this host
pub last_update: Option<Instant>,
}
@@ -43,7 +41,6 @@ impl HostWidgets {
Self {
system_widget: SystemWidget::new(),
services_widget: ServicesWidget::new(),
backup_widget: BackupWidget::new(),
last_update: None,
}
}
@@ -112,7 +109,6 @@ impl TuiApp {
// Update all widgets with structured data directly
host_widgets.system_widget.update_from_agent_data(agent_data);
host_widgets.services_widget.update_from_agent_data(agent_data);
host_widgets.backup_widget.update_from_agent_data(agent_data);
host_widgets.last_update = Some(Instant::now());
}
@@ -469,40 +465,17 @@ impl TuiApp {
return;
}
// Check if backup panel should be shown
let show_backup = if let Some(hostname) = self.current_host.clone() {
let host_widgets = self.get_or_create_host_widgets(&hostname);
host_widgets.backup_widget.has_data()
} else {
false
};
// Left side: dynamic layout based on backup data availability
let left_chunks = if show_backup {
// Show both system and backup panels
ratatui::layout::Layout::default()
.direction(Direction::Vertical)
.constraints([
Constraint::Percentage(ThemeLayout::SYSTEM_PANEL_HEIGHT), // System section
Constraint::Percentage(ThemeLayout::BACKUP_PANEL_HEIGHT), // Backup section
])
.split(content_chunks[0])
} else {
// Show only system panel (full height)
ratatui::layout::Layout::default()
.direction(Direction::Vertical)
.constraints([Constraint::Percentage(100)]) // System section takes full height
.split(content_chunks[0])
};
// Left side: system panel only (full height)
let left_chunks = ratatui::layout::Layout::default()
.direction(Direction::Vertical)
.constraints([Constraint::Percentage(100)]) // System section takes full height
.split(content_chunks[0]);
// Render title bar
self.render_btop_title(frame, main_chunks[0], metric_store);
// Render new panel layout
// Render system panel
self.render_system_panel(frame, left_chunks[0], metric_store);
if show_backup && left_chunks.len() > 1 {
self.render_backup_panel(frame, left_chunks[1]);
}
// Render services widget for current host
if let Some(hostname) = self.current_host.clone() {
@@ -669,17 +642,6 @@ impl TuiApp {
}
}
fn render_backup_panel(&mut self, frame: &mut Frame, area: Rect) {
let backup_block = Components::widget_block("backup");
let inner_area = backup_block.inner(area);
frame.render_widget(backup_block, area);
// Get current host widgets for backup widget
if let Some(hostname) = self.current_host.clone() {
let host_widgets = self.get_or_create_host_widgets(&hostname);
host_widgets.backup_widget.render(frame, inner_area);
}
}
/// Render offline host message with wake-up option
fn render_offline_host_message(&self, frame: &mut Frame, area: Rect) {

View File

@@ -225,9 +225,6 @@ impl Layout {
pub const LEFT_PANEL_WIDTH: u16 = 45;
/// Right panel percentage (services)
pub const RIGHT_PANEL_WIDTH: u16 = 55;
/// System vs backup split (equal)
pub const SYSTEM_PANEL_HEIGHT: u16 = 50;
pub const BACKUP_PANEL_HEIGHT: u16 = 50;
}
/// Typography system

View File

@@ -1,418 +0,0 @@
use cm_dashboard_shared::{Metric, Status};
use super::Widget;
use ratatui::{
layout::Rect,
widgets::Paragraph,
Frame,
};
use tracing::debug;
use crate::ui::theme::{StatusIcons, Typography};
/// Backup widget displaying backup status, services, and repository information
#[derive(Clone)]
pub struct BackupWidget {
/// Overall backup status
overall_status: Status,
/// Last backup duration in seconds
duration_seconds: Option<i64>,
/// Last backup timestamp
last_run_timestamp: Option<i64>,
/// Total repository size in GB
total_repo_size_gb: Option<f32>,
/// Total disk space for backups in GB
backup_disk_total_gb: Option<f32>,
/// Used disk space for backups in GB
backup_disk_used_gb: Option<f32>,
/// Backup disk product name from SMART data
backup_disk_product_name: Option<String>,
/// Backup disk serial number from SMART data
backup_disk_serial_number: Option<String>,
/// Backup disk wear percentage from SMART data
backup_disk_wear_percent: Option<f32>,
/// All individual service metrics for detailed display
service_metrics: Vec<ServiceMetricData>,
/// Last update indicator
has_data: bool,
}
#[derive(Debug, Clone)]
struct ServiceMetricData {
name: String,
status: Status,
archive_count: Option<i64>,
repo_size_gb: Option<f32>,
}
impl BackupWidget {
pub fn new() -> Self {
Self {
overall_status: Status::Unknown,
duration_seconds: None,
last_run_timestamp: None,
total_repo_size_gb: None,
backup_disk_total_gb: None,
backup_disk_used_gb: None,
backup_disk_product_name: None,
backup_disk_serial_number: None,
backup_disk_wear_percent: None,
service_metrics: Vec::new(),
has_data: false,
}
}
/// Check if the backup widget has any data to display
pub fn has_data(&self) -> bool {
self.has_data
}
/// Format size with proper units (xxxkB/MB/GB/TB)
fn format_size_with_proper_units(size_gb: f32) -> String {
if size_gb >= 1000.0 {
// TB range
format!("{:.1}TB", size_gb / 1000.0)
} else if size_gb >= 1.0 {
// GB range
format!("{:.1}GB", size_gb)
} else if size_gb >= 0.001 {
// MB range (size_gb * 1024 = MB)
let size_mb = size_gb * 1024.0;
format!("{:.1}MB", size_mb)
} else if size_gb >= 0.000001 {
// kB range (size_gb * 1024 * 1024 = kB)
let size_kb = size_gb * 1024.0 * 1024.0;
format!("{:.0}kB", size_kb)
} else {
// B range (size_gb * 1024^3 = bytes)
let size_bytes = size_gb * 1024.0 * 1024.0 * 1024.0;
format!("{:.0}B", size_bytes)
}
}
/// Extract service name from metric name (e.g., "backup_service_gitea_status" -> "gitea")
#[allow(dead_code)]
fn extract_service_name(metric_name: &str) -> Option<String> {
if metric_name.starts_with("backup_service_") {
let name_part = &metric_name[15..]; // Remove "backup_service_" prefix
// Try to extract service name by removing known suffixes
if let Some(service_name) = name_part.strip_suffix("_status") {
Some(service_name.to_string())
} else if let Some(service_name) = name_part.strip_suffix("_archive_count") {
Some(service_name.to_string())
} else if let Some(service_name) = name_part.strip_suffix("_repo_size_gb") {
Some(service_name.to_string())
} else if let Some(service_name) = name_part.strip_suffix("_repo_path") {
Some(service_name.to_string())
} else {
None
}
} else {
None
}
}
}
impl Widget for BackupWidget {
fn update_from_agent_data(&mut self, agent_data: &cm_dashboard_shared::AgentData) {
self.has_data = true;
let backup = &agent_data.backup;
self.overall_status = Status::Ok;
if let Some(size) = backup.total_size_gb {
self.total_repo_size_gb = Some(size);
}
if let Some(last_run) = backup.last_run {
self.last_run_timestamp = Some(last_run as i64);
}
}
}
impl BackupWidget {
#[allow(dead_code)]
fn update_from_metrics(&mut self, metrics: &[&Metric]) {
debug!("Backup widget updating with {} metrics", metrics.len());
for metric in metrics {
debug!(
"Backup metric: {} = {:?} (status: {:?})",
metric.name, metric.value, metric.status
);
}
// Also debug the service_data after processing
debug!("Processing individual service metrics...");
// Log how many metrics are backup service metrics
let service_metric_count = metrics
.iter()
.filter(|m| m.name.starts_with("backup_service_"))
.count();
debug!(
"Found {} backup_service_ metrics out of {} total backup metrics",
service_metric_count,
metrics.len()
);
// Reset service metrics
self.service_metrics.clear();
let mut service_data: std::collections::HashMap<String, ServiceMetricData> =
std::collections::HashMap::new();
for metric in metrics {
match metric.name.as_str() {
"backup_overall_status" => {
let status_str = metric.value.as_string();
self.overall_status = match status_str.as_str() {
"ok" => Status::Ok,
"warning" => Status::Warning,
"critical" => Status::Critical,
_ => Status::Unknown,
};
}
"backup_duration_seconds" => {
self.duration_seconds = metric.value.as_i64();
}
"backup_last_run_timestamp" => {
self.last_run_timestamp = metric.value.as_i64();
}
"backup_total_repo_size_gb" => {
self.total_repo_size_gb = metric.value.as_f32();
}
"backup_disk_total_gb" => {
self.backup_disk_total_gb = metric.value.as_f32();
}
"backup_disk_used_gb" => {
self.backup_disk_used_gb = metric.value.as_f32();
}
"backup_disk_product_name" => {
self.backup_disk_product_name = Some(metric.value.as_string());
}
"backup_disk_serial_number" => {
self.backup_disk_serial_number = Some(metric.value.as_string());
}
"backup_disk_wear_percent" => {
self.backup_disk_wear_percent = metric.value.as_f32();
}
_ => {
// Handle individual service metrics
if let Some(service_name) = Self::extract_service_name(&metric.name) {
debug!(
"Extracted service name '{}' from metric '{}'",
service_name, metric.name
);
let entry = service_data.entry(service_name.clone()).or_insert_with(|| {
ServiceMetricData {
name: service_name,
status: Status::Unknown,
archive_count: None,
repo_size_gb: None,
}
});
if metric.name.ends_with("_status") {
entry.status = metric.status;
debug!("Set status for {}: {:?}", entry.name, entry.status);
} else if metric.name.ends_with("_archive_count") {
entry.archive_count = metric.value.as_i64();
debug!(
"Set archive_count for {}: {:?}",
entry.name, entry.archive_count
);
} else if metric.name.ends_with("_repo_size_gb") {
entry.repo_size_gb = metric.value.as_f32();
debug!(
"Set repo_size_gb for {}: {:?}",
entry.name, entry.repo_size_gb
);
}
} else {
debug!(
"Could not extract service name from metric: {}",
metric.name
);
}
}
}
}
// Convert service data to sorted vector
let mut services: Vec<ServiceMetricData> = service_data.into_values().collect();
services.sort_by(|a, b| a.name.cmp(&b.name));
self.service_metrics = services;
// Only show backup panel if we have meaningful backup data
self.has_data = !metrics.is_empty() && (
self.last_run_timestamp.is_some() ||
self.total_repo_size_gb.is_some() ||
!self.service_metrics.is_empty()
);
debug!(
"Backup widget updated: status={:?}, services={}, total_size={:?}GB",
self.overall_status,
self.service_metrics.len(),
self.total_repo_size_gb
);
// Debug individual service data
for service in &self.service_metrics {
debug!(
"Service {}: status={:?}, archives={:?}, size={:?}GB",
service.name, service.status, service.archive_count, service.repo_size_gb
);
}
}
}
impl BackupWidget {
/// Render backup widget
pub fn render(&mut self, frame: &mut Frame, area: Rect) {
let mut lines = Vec::new();
// Latest backup section
lines.push(ratatui::text::Line::from(vec![
ratatui::text::Span::styled("Latest backup:", Typography::widget_title())
]));
// Timestamp with status icon
let timestamp_text = if let Some(timestamp) = self.last_run_timestamp {
self.format_timestamp(timestamp)
} else {
"Unknown".to_string()
};
let timestamp_spans = StatusIcons::create_status_spans(
self.overall_status,
&timestamp_text
);
lines.push(ratatui::text::Line::from(timestamp_spans));
// Duration as sub-item
if let Some(duration) = self.duration_seconds {
let duration_text = self.format_duration(duration);
lines.push(ratatui::text::Line::from(vec![
ratatui::text::Span::styled(" └─ ", Typography::tree()),
ratatui::text::Span::styled(format!("Duration: {}", duration_text), Typography::secondary())
]));
}
// Disk section
lines.push(ratatui::text::Line::from(vec![
ratatui::text::Span::styled("Disk:", Typography::widget_title())
]));
// Disk product name with status
if let Some(product) = &self.backup_disk_product_name {
let disk_spans = StatusIcons::create_status_spans(
Status::Ok, // Assuming disk is OK if we have data
product
);
lines.push(ratatui::text::Line::from(disk_spans));
// Collect sub-items to determine tree structure
let mut sub_items = Vec::new();
if let Some(serial) = &self.backup_disk_serial_number {
sub_items.push(format!("S/N: {}", serial));
}
if let Some(wear) = self.backup_disk_wear_percent {
sub_items.push(format!("Wear: {:.0}%", wear));
}
if let (Some(used), Some(total)) = (self.backup_disk_used_gb, self.backup_disk_total_gb) {
let used_str = Self::format_size_with_proper_units(used);
let total_str = Self::format_size_with_proper_units(total);
sub_items.push(format!("Usage: {}/{}", used_str, total_str));
}
// Render sub-items with proper tree structure
let num_items = sub_items.len();
for (i, item) in sub_items.into_iter().enumerate() {
let is_last = i == num_items - 1;
let tree_char = if is_last { " └─ " } else { " ├─ " };
lines.push(ratatui::text::Line::from(vec![
ratatui::text::Span::styled(tree_char, Typography::tree()),
ratatui::text::Span::styled(item, Typography::secondary())
]));
}
}
// Repos section
lines.push(ratatui::text::Line::from(vec![
ratatui::text::Span::styled("Repos:", Typography::widget_title())
]));
// Add all repository lines (no truncation here - scroll will handle display)
for service in &self.service_metrics {
if let (Some(archives), Some(size_gb)) = (service.archive_count, service.repo_size_gb) {
let size_str = Self::format_size_with_proper_units(size_gb);
let repo_text = format!("{} ({}) {}", service.name, archives, size_str);
let repo_spans = StatusIcons::create_status_spans(service.status, &repo_text);
lines.push(ratatui::text::Line::from(repo_spans));
}
}
// Apply scroll offset
let total_lines = lines.len();
let available_height = area.height as usize;
// Show only what fits, with "X more below" if needed
if total_lines > available_height {
let lines_for_content = available_height.saturating_sub(1); // Reserve one line for "more below"
let mut visible_lines: Vec<_> = lines
.into_iter()
.take(lines_for_content)
.collect();
let hidden_below = total_lines.saturating_sub(lines_for_content);
if hidden_below > 0 {
let more_line = ratatui::text::Line::from(vec![
ratatui::text::Span::styled(format!("... {} more below", hidden_below), Typography::muted())
]);
visible_lines.push(more_line);
}
let paragraph = Paragraph::new(ratatui::text::Text::from(visible_lines));
frame.render_widget(paragraph, area);
} else {
let paragraph = Paragraph::new(ratatui::text::Text::from(lines));
frame.render_widget(paragraph, area);
}
}
}
impl BackupWidget {
/// Format timestamp for display
fn format_timestamp(&self, timestamp: i64) -> String {
let datetime = chrono::DateTime::from_timestamp(timestamp, 0)
.unwrap_or_else(|| chrono::Utc::now());
datetime.format("%Y-%m-%d %H:%M:%S").to_string()
}
/// Format duration in seconds to human readable format
fn format_duration(&self, duration_seconds: i64) -> String {
let minutes = duration_seconds / 60;
let seconds = duration_seconds % 60;
if minutes > 0 {
format!("{}.{}m", minutes, seconds / 6) // Show 1 decimal for minutes
} else {
format!("{}s", seconds)
}
}
}
impl Default for BackupWidget {
fn default() -> Self {
Self::new()
}
}

View File

@@ -1 +0,0 @@
// This file is intentionally left minimal - CPU functionality is handled by the SystemWidget

View File

@@ -1 +0,0 @@
// This file is intentionally left minimal - Memory functionality is handled by the SystemWidget

View File

@@ -1,12 +1,8 @@
use cm_dashboard_shared::AgentData;
pub mod backup;
pub mod cpu;
pub mod memory;
pub mod services;
pub mod system;
pub use backup::BackupWidget;
pub use services::ServicesWidget;
pub use system::SystemWidget;

View File

@@ -28,10 +28,9 @@ pub struct ServicesWidget {
#[derive(Clone)]
struct ServiceInfo {
status: String,
memory_mb: Option<f32>,
disk_gb: Option<f32>,
latency_ms: Option<f32>,
metrics: Vec<(String, f32, Option<String>)>, // (label, value, unit)
widget_status: Status,
}
@@ -113,10 +112,15 @@ impl ServicesWidget {
name.to_string()
};
// Parent services always show actual systemctl status
// Convert Status enum to display text
let status_str = match info.widget_status {
Status::Pending => "pending".to_string(),
_ => info.status.clone(), // Use actual status from agent (active/inactive/failed)
Status::Ok => "active",
Status::Inactive => "inactive",
Status::Critical => "failed",
Status::Pending => "pending",
Status::Warning => "warning",
Status::Unknown => "unknown",
Status::Offline => "offline",
};
format!(
@@ -153,15 +157,25 @@ impl ServicesWidget {
Status::Offline => Theme::muted_text(),
};
// For sub-services, prefer latency if available
let status_str = if let Some(latency) = info.latency_ms {
if latency < 0.0 {
"timeout".to_string()
} else {
format!("{:.0}ms", latency)
// Display metrics or status for sub-services
let status_str = if !info.metrics.is_empty() {
// Show first metric with label and unit
let (label, value, unit) = &info.metrics[0];
match unit {
Some(u) => format!("{}: {:.1} {}", label, value, u),
None => format!("{}: {:.1}", label, value),
}
} else {
info.status.clone()
// Convert Status enum to display text for sub-services
match info.widget_status {
Status::Ok => "active",
Status::Inactive => "inactive",
Status::Critical => "failed",
Status::Pending => "pending",
Status::Warning => "warning",
Status::Unknown => "unknown",
Status::Offline => "offline",
}.to_string()
};
let tree_symbol = if is_last { "└─" } else { "├─" };
@@ -262,18 +276,48 @@ impl Widget for ServicesWidget {
self.sub_services.clear();
for service in &agent_data.services {
let service_info = ServiceInfo {
status: service.status.clone(),
// Store parent service
let parent_info = ServiceInfo {
memory_mb: Some(service.memory_mb),
disk_gb: Some(service.disk_gb),
latency_ms: None,
widget_status: Status::Ok,
metrics: Vec::new(), // Parent services don't have custom metrics
widget_status: service.service_status,
};
self.parent_services.insert(service.name.clone(), parent_info);
self.parent_services.insert(service.name.clone(), service_info);
// Process sub-services if any
if !service.sub_services.is_empty() {
let mut sub_list = Vec::new();
for sub_service in &service.sub_services {
// Convert metrics to display format
let metrics: Vec<(String, f32, Option<String>)> = sub_service.metrics.iter()
.map(|m| (m.label.clone(), m.value, m.unit.clone()))
.collect();
let sub_info = ServiceInfo {
memory_mb: None, // Not used for sub-services
disk_gb: None, // Not used for sub-services
metrics,
widget_status: sub_service.service_status,
};
sub_list.push((sub_service.name.clone(), sub_info));
}
self.sub_services.insert(service.name.clone(), sub_list);
}
}
self.status = Status::Ok;
// Aggregate status from all services
let mut all_statuses = Vec::new();
all_statuses.extend(self.parent_services.values().map(|info| info.widget_status));
for sub_list in self.sub_services.values() {
all_statuses.extend(sub_list.iter().map(|(_, info)| info.widget_status));
}
self.status = if all_statuses.is_empty() {
Status::Unknown
} else {
Status::aggregate(&all_statuses)
};
}
}
@@ -294,15 +338,13 @@ impl ServicesWidget {
self.parent_services
.entry(parent_service)
.or_insert(ServiceInfo {
status: "unknown".to_string(),
memory_mb: None,
disk_gb: None,
latency_ms: None,
metrics: Vec::new(),
widget_status: Status::Unknown,
});
if metric.name.ends_with("_status") {
service_info.status = metric.value.as_string();
service_info.widget_status = metric.status;
} else if metric.name.ends_with("_memory_mb") {
if let Some(memory) = metric.value.as_f32() {
@@ -331,10 +373,9 @@ impl ServicesWidget {
sub_service_list.push((
sub_name.clone(),
ServiceInfo {
status: "unknown".to_string(),
memory_mb: None,
disk_gb: None,
latency_ms: None,
metrics: Vec::new(),
widget_status: Status::Unknown,
},
));
@@ -342,7 +383,6 @@ impl ServicesWidget {
};
if metric.name.ends_with("_status") {
sub_service_info.status = metric.value.as_string();
sub_service_info.widget_status = metric.status;
} else if metric.name.ends_with("_memory_mb") {
if let Some(memory) = metric.value.as_f32() {
@@ -352,11 +392,6 @@ impl ServicesWidget {
if let Some(disk) = metric.value.as_f32() {
sub_service_info.disk_gb = Some(disk);
}
} else if metric.name.ends_with("_latency_ms") {
if let Some(latency) = metric.value.as_f32() {
sub_service_info.latency_ms = Some(latency);
sub_service_info.widget_status = metric.status;
}
}
}
}

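For illustration, a minimal sketch (hypothetical values) of how the first sub-service metric renders under the new metrics branch above:

fn main() {
    // Hypothetical metric in the (label, value, unit) display format
    let (label, value, unit) = ("Size", 139.0_f32, Some("MB"));
    let status_str = match unit {
        Some(u) => format!("{}: {:.1} {}", label, value, u),
        None => format!("{}: {:.1}", label, value),
    };
    assert_eq!(status_str, "Size: 139.0 MB");
}
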
View File

@@ -8,13 +8,16 @@ use ratatui::{
use crate::ui::theme::{StatusIcons, Typography};
/// System widget displaying NixOS info, CPU, RAM, and Storage in unified layout
/// System widget displaying NixOS info, Network, CPU, RAM, and Storage in unified layout
#[derive(Clone)]
pub struct SystemWidget {
// NixOS information
nixos_build: Option<String>,
agent_hash: Option<String>,
// Network interfaces
network_interfaces: Vec<cm_dashboard_shared::NetworkInterfaceData>,
// CPU metrics
cpu_load_1min: Option<f32>,
cpu_load_5min: Option<f32>,
@@ -37,6 +40,17 @@ pub struct SystemWidget {
// Storage metrics (collected from disk metrics)
storage_pools: Vec<StoragePool>,
// Backup metrics
backup_status: String,
backup_start_time_raw: Option<String>,
backup_disk_serial: Option<String>,
backup_disk_usage_percent: Option<f32>,
backup_disk_used_gb: Option<f32>,
backup_disk_total_gb: Option<f32>,
backup_disk_wear_percent: Option<f32>,
backup_disk_temperature: Option<f32>,
backup_last_size_gb: Option<f32>,
// Overall status
has_data: bool,
}
@@ -46,7 +60,9 @@ struct StoragePool {
name: String,
mount_point: String,
pool_type: String, // "single", "mergerfs (2+1)", "RAID5 (3+1)", etc.
drives: Vec<StorageDrive>,
drives: Vec<StorageDrive>, // For physical drives
data_drives: Vec<StorageDrive>, // For MergerFS pools
parity_drives: Vec<StorageDrive>, // For MergerFS pools
filesystems: Vec<FileSystem>, // For physical drive pools: individual filesystem children
usage_percent: Option<f32>,
used_gb: Option<f32>,
@@ -76,6 +92,7 @@ impl SystemWidget {
Self {
nixos_build: None,
agent_hash: None,
network_interfaces: Vec::new(),
cpu_load_1min: None,
cpu_load_5min: None,
cpu_load_15min: None,
@@ -91,6 +108,15 @@ impl SystemWidget {
tmp_status: Status::Unknown,
tmpfs_mounts: Vec::new(),
storage_pools: Vec::new(),
backup_status: "unknown".to_string(),
backup_start_time_raw: None,
backup_disk_serial: None,
backup_disk_usage_percent: None,
backup_disk_used_gb: None,
backup_disk_total_gb: None,
backup_disk_wear_percent: None,
backup_disk_temperature: None,
backup_last_size_gb: None,
has_data: false,
}
}
@@ -142,6 +168,9 @@ impl Widget for SystemWidget {
// Extract build version
self.nixos_build = agent_data.build_version.clone();
// Extract network interfaces
self.network_interfaces = agent_data.system.network.interfaces.clone();
// Extract CPU data directly
let cpu = &agent_data.system.cpu;
self.cpu_load_1min = Some(cpu.load_1min);
@@ -170,6 +199,28 @@ impl Widget for SystemWidget {
// Convert storage data to internal format
self.update_storage_from_agent_data(agent_data);
// Extract backup data
let backup = &agent_data.backup;
self.backup_status = backup.status.clone();
self.backup_start_time_raw = backup.start_time_raw.clone();
self.backup_last_size_gb = backup.last_backup_size_gb;
if let Some(disk) = &backup.repository_disk {
self.backup_disk_serial = Some(disk.serial.clone());
self.backup_disk_usage_percent = Some(disk.usage_percent);
self.backup_disk_used_gb = Some(disk.used_gb);
self.backup_disk_total_gb = Some(disk.total_gb);
self.backup_disk_wear_percent = disk.wear_percent;
self.backup_disk_temperature = disk.temperature_celsius;
} else {
self.backup_disk_serial = None;
self.backup_disk_usage_percent = None;
self.backup_disk_used_gb = None;
self.backup_disk_total_gb = None;
self.backup_disk_wear_percent = None;
self.backup_disk_temperature = None;
}
}
}
@@ -185,6 +236,8 @@ impl SystemWidget {
mount_point: drive.name.clone(),
pool_type: "drive".to_string(),
drives: Vec::new(),
data_drives: Vec::new(),
parity_drives: Vec::new(),
filesystems: Vec::new(),
usage_percent: None,
used_gb: None,
@@ -193,8 +246,11 @@ impl SystemWidget {
};
// Add drive info
let display_name = drive.serial_number.as_ref()
.map(|s| truncate_serial(s))
.unwrap_or(drive.name.clone());
let storage_drive = StorageDrive {
name: drive.name.clone(),
name: display_name,
temperature: drive.temperature_celsius,
wear_percent: drive.wear_percent,
status: Status::Ok,
@@ -225,7 +281,85 @@ impl SystemWidget {
pools.insert(drive.name.clone(), pool);
}
// Convert pools
// Convert pools (MergerFS, RAID, etc.)
for pool in &agent_data.system.storage.pools {
// Use agent-calculated status (combined health and usage status)
let pool_status = if pool.health_status == Status::Critical || pool.usage_status == Status::Critical {
Status::Critical
} else if pool.health_status == Status::Warning || pool.usage_status == Status::Warning {
Status::Warning
} else if pool.health_status == Status::Ok && pool.usage_status == Status::Ok {
Status::Ok
} else {
Status::Unknown
};
let mut storage_pool = StoragePool {
name: pool.name.clone(),
mount_point: pool.mount.clone(),
pool_type: pool.pool_type.clone(),
drives: Vec::new(),
data_drives: Vec::new(),
parity_drives: Vec::new(),
filesystems: Vec::new(),
usage_percent: Some(pool.usage_percent),
used_gb: Some(pool.used_gb),
total_gb: Some(pool.total_gb),
status: pool_status,
};
// Add data drives - use agent-calculated status
for drive in &pool.data_drives {
// Use combined health and temperature status
let drive_status = if drive.health_status == Status::Critical || drive.temperature_status == Status::Critical {
Status::Critical
} else if drive.health_status == Status::Warning || drive.temperature_status == Status::Warning {
Status::Warning
} else if drive.health_status == Status::Ok && drive.temperature_status == Status::Ok {
Status::Ok
} else {
Status::Unknown
};
let display_name = drive.serial_number.as_ref()
.map(|s| truncate_serial(s))
.unwrap_or(drive.name.clone());
let storage_drive = StorageDrive {
name: display_name,
temperature: drive.temperature_celsius,
wear_percent: drive.wear_percent,
status: drive_status,
};
storage_pool.data_drives.push(storage_drive);
}
// Add parity drives - use agent-calculated status
for drive in &pool.parity_drives {
// Use combined health and temperature status
let drive_status = if drive.health_status == Status::Critical || drive.temperature_status == Status::Critical {
Status::Critical
} else if drive.health_status == Status::Warning || drive.temperature_status == Status::Warning {
Status::Warning
} else if drive.health_status == Status::Ok && drive.temperature_status == Status::Ok {
Status::Ok
} else {
Status::Unknown
};
let display_name = drive.serial_number.as_ref()
.map(|s| truncate_serial(s))
.unwrap_or(drive.name.clone());
let storage_drive = StorageDrive {
name: display_name,
temperature: drive.temperature_celsius,
wear_percent: drive.wear_percent,
status: drive_status,
};
storage_pool.parity_drives.push(storage_drive);
}
pools.insert(pool.name.clone(), storage_pool);
}
// Store pools
let mut pool_list: Vec<StoragePool> = pools.into_values().collect();
@@ -241,12 +375,8 @@ impl SystemWidget {
// Pool header line with type and health
let pool_label = if pool.pool_type == "drive" {
// For physical drives, show the drive name with temperature and wear percentage if available
// Look for any drive with temp/wear data (physical drives may have drives named after the pool)
let drive_info = pool.drives.iter()
.find(|d| d.name == pool.name)
.or_else(|| pool.drives.first());
if let Some(drive) = drive_info {
// Physical drives only have one drive entry
if let Some(drive) = pool.drives.first() {
let mut drive_details = Vec::new();
if let Some(temp) = drive.temperature {
drive_details.push(format!("T: {}°C", temp as i32));
@@ -254,18 +384,18 @@ impl SystemWidget {
if let Some(wear) = drive.wear_percent {
drive_details.push(format!("W: {}%", wear as i32));
}
if !drive_details.is_empty() {
format!("{} {}", pool.name, drive_details.join(" "))
format!("{} {}", drive.name, drive_details.join(" "))
} else {
pool.name.clone()
drive.name.clone()
}
} else {
pool.name.clone()
}
} else {
// For mergerfs pools, show pool name with format
format!("{} ({})", pool.mount_point, pool.pool_type)
// For mergerfs pools, show pool type with mount point
format!("mergerfs {}:", pool.mount_point)
};
let pool_spans = StatusIcons::create_status_spans(pool.status.clone(), &pool_label);
@@ -294,28 +424,78 @@ impl SystemWidget {
lines.push(Line::from(fs_spans));
}
} else {
// For mergerfs pools, show data drives and parity drives in tree structure
if !pool.drives.is_empty() {
// Group drives by type based on naming conventions or show all as data drives
let (data_drives, parity_drives): (Vec<_>, Vec<_>) = pool.drives.iter()
.partition(|d| !d.name.contains("parity") && !d.name.starts_with("sdc"));
// For mergerfs pools, show structure matching CLAUDE.md format:
// ● mergerfs (2+1):
// ├─ Total: ● 63% 2355.2GB/3686.4GB
// ├─ Data Disks:
// ├─ ● sdb T: 24°C W: 5%
// │ └─ ● sdd T: 27°C W: 5%
// ├─ Parity: ● sdc T: 24°C W: 5%
// └─ Mount: /srv/media
// Pool total usage
let total_text = format!("{:.0}% {:.1}GB/{:.1}GB",
pool.usage_percent.unwrap_or(0.0),
pool.used_gb.unwrap_or(0.0),
pool.total_gb.unwrap_or(0.0)
);
let mut total_spans = vec![
Span::styled(" ├─ ", Typography::tree()),
];
total_spans.extend(StatusIcons::create_status_spans(Status::Ok, &total_text));
lines.push(Line::from(total_spans));
if !data_drives.is_empty() {
lines.push(Line::from(vec![
Span::styled(" ├─ Data Disks:", Typography::secondary())
]));
for (i, drive) in data_drives.iter().enumerate() {
render_pool_drive(drive, i == data_drives.len() - 1 && parity_drives.is_empty(), &mut lines);
}
// Data drives - at same level as parity
let has_parity = !pool.parity_drives.is_empty();
for (i, drive) in pool.data_drives.iter().enumerate() {
let is_last_data = i == pool.data_drives.len() - 1;
let mut drive_details = Vec::new();
if let Some(temp) = drive.temperature {
drive_details.push(format!("T: {}°C", temp as i32));
}
if let Some(wear) = drive.wear_percent {
drive_details.push(format!("W: {}%", wear as i32));
}
if !parity_drives.is_empty() {
lines.push(Line::from(vec![
Span::styled(" └─ Parity:", Typography::secondary())
]));
for (i, drive) in parity_drives.iter().enumerate() {
render_pool_drive(drive, i == parity_drives.len() - 1, &mut lines);
let drive_text = if !drive_details.is_empty() {
format!("Data_{}: {} {}", i + 1, drive.name, drive_details.join(" "))
} else {
format!("Data_{}: {}", i + 1, drive.name)
};
// Last data drive uses └─ if there's no parity, otherwise ├─
let tree_symbol = if is_last_data && !has_parity { " └─ " } else { " ├─ " };
let mut data_spans = vec![
Span::styled(tree_symbol, Typography::tree()),
];
data_spans.extend(StatusIcons::create_status_spans(drive.status.clone(), &drive_text));
lines.push(Line::from(data_spans));
}
// Parity drives - last item(s)
if !pool.parity_drives.is_empty() {
for (i, drive) in pool.parity_drives.iter().enumerate() {
let is_last = i == pool.parity_drives.len() - 1;
let mut drive_details = Vec::new();
if let Some(temp) = drive.temperature {
drive_details.push(format!("T: {}°C", temp as i32));
}
if let Some(wear) = drive.wear_percent {
drive_details.push(format!("W: {}%", wear as i32));
}
let drive_text = if !drive_details.is_empty() {
format!("Parity: {} {}", drive.name, drive_details.join(" "))
} else {
format!("Parity: {}", drive.name)
};
let tree_symbol = if is_last { " └─ " } else { " ├─ " };
let mut parity_spans = vec![
Span::styled(tree_symbol, Typography::tree()),
];
parity_spans.extend(StatusIcons::create_status_spans(drive.status.clone(), &drive_text));
lines.push(Line::from(parity_spans));
}
}
}
@@ -325,35 +505,280 @@ impl SystemWidget {
}
}
/// Helper function to render a drive in a storage pool
fn render_pool_drive(drive: &StorageDrive, is_last: bool, lines: &mut Vec<Line<'_>>) {
let tree_symbol = if is_last { " └─" } else { " ├─" };
let mut drive_details = Vec::new();
if let Some(temp) = drive.temperature {
drive_details.push(format!("T: {}°C", temp as i32));
}
if let Some(wear) = drive.wear_percent {
drive_details.push(format!("W: {}%", wear as i32));
}
let drive_text = if !drive_details.is_empty() {
format!("{} {}", drive.name, drive_details.join(" "))
/// Truncate serial number to last 8 characters
fn truncate_serial(serial: &str) -> String {
let len = serial.len();
if len > 8 {
serial[len - 8..].to_string()
} else {
format!("{}", drive.name)
};
let mut drive_spans = vec![
Span::styled(tree_symbol, Typography::tree()),
Span::raw(" "),
];
drive_spans.extend(StatusIcons::create_status_spans(drive.status.clone(), &drive_text));
lines.push(Line::from(drive_spans));
serial.to_string()
}
}
impl SystemWidget {
/// Render system widget
pub fn render(&mut self, frame: &mut Frame, area: Rect, hostname: &str, config: Option<&crate::config::DashboardConfig>) {
/// Render backup section for display
fn render_backup(&self) -> Vec<Line<'_>> {
let mut lines = Vec::new();
// First line: serial number with temperature and wear
if let Some(serial) = &self.backup_disk_serial {
let truncated_serial = truncate_serial(serial);
let mut details = Vec::new();
if let Some(temp) = self.backup_disk_temperature {
details.push(format!("T: {}°C", temp as i32));
}
if let Some(wear) = self.backup_disk_wear_percent {
details.push(format!("W: {}%", wear as i32));
}
let disk_text = if !details.is_empty() {
format!("{} {}", truncated_serial, details.join(" "))
} else {
truncated_serial
};
let backup_status = match self.backup_status.as_str() {
"completed" | "success" => Status::Ok,
"running" => Status::Pending,
"failed" => Status::Critical,
_ => Status::Unknown,
};
let disk_spans = StatusIcons::create_status_spans(backup_status, &disk_text);
lines.push(Line::from(disk_spans));
// Show backup time from TOML if available
if let Some(start_time) = &self.backup_start_time_raw {
let time_text = if let Some(size) = self.backup_last_size_gb {
format!("Time: {} ({:.1}GB)", start_time, size)
} else {
format!("Time: {}", start_time)
};
lines.push(Line::from(vec![
Span::styled(" ├─ ", Typography::tree()),
Span::styled(time_text, Typography::secondary())
]));
}
// Usage information
if let (Some(used), Some(total), Some(usage_percent)) = (
self.backup_disk_used_gb,
self.backup_disk_total_gb,
self.backup_disk_usage_percent
) {
let usage_text = format!("Usage: {:.0}% {:.0}GB/{:.0}GB", usage_percent, used, total);
let usage_spans = StatusIcons::create_status_spans(Status::Ok, &usage_text);
let mut full_spans = vec![
Span::styled(" └─ ", Typography::tree()),
];
full_spans.extend(usage_spans);
lines.push(Line::from(full_spans));
}
}
lines
}
/// Compress IPv4 addresses from same subnet
/// Example: "192.168.30.1, 192.168.30.100" -> "192.168.30.1, 100"
fn compress_ipv4_addresses(addresses: &[String]) -> String {
if addresses.is_empty() {
return String::new();
}
if addresses.len() == 1 {
return addresses[0].clone();
}
let mut result = Vec::new();
let mut last_prefix = String::new();
for addr in addresses {
let parts: Vec<&str> = addr.split('.').collect();
if parts.len() == 4 {
let prefix = format!("{}.{}.{}", parts[0], parts[1], parts[2]);
if prefix == last_prefix {
// Same subnet, show only last octet
result.push(parts[3].to_string());
} else {
// Different subnet, show full IP
result.push(addr.clone());
last_prefix = prefix;
}
} else {
// Invalid IP format, show as-is
result.push(addr.clone());
}
}
result.join(", ")
}
/// Render network section for display with physical/virtual grouping
fn render_network(&self) -> Vec<Line<'_>> {
let mut lines = Vec::new();
if self.network_interfaces.is_empty() {
return lines;
}
// Separate physical and virtual interfaces
let physical: Vec<_> = self.network_interfaces.iter().filter(|i| i.is_physical).collect();
let virtual_interfaces: Vec<_> = self.network_interfaces.iter().filter(|i| !i.is_physical).collect();
// Find standalone virtual interfaces (those without a parent)
let mut standalone_virtual: Vec<_> = virtual_interfaces.iter()
.filter(|i| i.parent_interface.is_none())
.collect();
// Sort standalone virtual: VLANs first (by VLAN ID), then others alphabetically
standalone_virtual.sort_by(|a, b| {
match (a.vlan_id, b.vlan_id) {
(Some(vlan_a), Some(vlan_b)) => vlan_a.cmp(&vlan_b),
(Some(_), None) => std::cmp::Ordering::Less,
(None, Some(_)) => std::cmp::Ordering::Greater,
(None, None) => a.name.cmp(&b.name),
}
});
// Render physical interfaces with their children
for (phy_idx, interface) in physical.iter().enumerate() {
let is_last_physical = phy_idx == physical.len() - 1 && standalone_virtual.is_empty();
// Physical interface header with status icon
let mut header_spans = vec![];
header_spans.extend(StatusIcons::create_status_spans(
interface.link_status.clone(),
&format!("{}:", interface.name)
));
lines.push(Line::from(header_spans));
// Find child interfaces for this physical interface
let mut children: Vec<_> = virtual_interfaces.iter()
.filter(|vi| {
if let Some(parent) = &vi.parent_interface {
parent == &interface.name
} else {
false
}
})
.collect();
// Sort children: VLANs first (by VLAN ID), then others alphabetically
children.sort_by(|a, b| {
match (a.vlan_id, b.vlan_id) {
(Some(vlan_a), Some(vlan_b)) => vlan_a.cmp(&vlan_b),
(Some(_), None) => std::cmp::Ordering::Less,
(None, Some(_)) => std::cmp::Ordering::Greater,
(None, None) => a.name.cmp(&b.name),
}
});
// Count total items under this physical interface (IPs + children)
let ip_count = interface.ipv4_addresses.len() + interface.ipv6_addresses.len();
let total_children = ip_count + children.len();
let mut child_index = 0;
// IPv4 addresses on the physical interface itself
for ipv4 in &interface.ipv4_addresses {
child_index += 1;
let is_last = child_index == total_children && is_last_physical;
let tree_symbol = if is_last { " └─ " } else { " ├─ " };
lines.push(Line::from(vec![
Span::styled(tree_symbol, Typography::tree()),
Span::styled(format!("ip: {}", ipv4), Typography::secondary()),
]));
}
// IPv6 addresses on the physical interface itself
for ipv6 in &interface.ipv6_addresses {
child_index += 1;
let is_last = child_index == total_children && is_last_physical;
let tree_symbol = if is_last { " └─ " } else { " ├─ " };
lines.push(Line::from(vec![
Span::styled(tree_symbol, Typography::tree()),
Span::styled(format!("ip: {}", ipv6), Typography::secondary()),
]));
}
// Child virtual interfaces (VLANs, etc.)
for child in children {
child_index += 1;
let is_last = child_index == total_children && is_last_physical;
let tree_symbol = if is_last { " └─ " } else { " ├─ " };
let ip_text = if !child.ipv4_addresses.is_empty() {
Self::compress_ipv4_addresses(&child.ipv4_addresses)
} else if !child.ipv6_addresses.is_empty() {
child.ipv6_addresses.join(", ")
} else {
String::new()
};
// Format: "name (vlan X): IP" or "name: IP"
let child_text = if let Some(vlan_id) = child.vlan_id {
if !ip_text.is_empty() {
format!("{} (vlan {}): {}", child.name, vlan_id, ip_text)
} else {
format!("{} (vlan {}):", child.name, vlan_id)
}
} else {
if !ip_text.is_empty() {
format!("{}: {}", child.name, ip_text)
} else {
format!("{}:", child.name)
}
};
lines.push(Line::from(vec![
Span::styled(tree_symbol, Typography::tree()),
Span::styled(child_text, Typography::secondary()),
]));
}
}
// Render standalone virtual interfaces (those without a parent)
for (virt_idx, interface) in standalone_virtual.iter().enumerate() {
let is_last = virt_idx == standalone_virtual.len() - 1;
let tree_symbol = if is_last { " └─ " } else { " ├─ " };
// Virtual interface with IPs
let ip_text = if !interface.ipv4_addresses.is_empty() {
Self::compress_ipv4_addresses(&interface.ipv4_addresses)
} else if !interface.ipv6_addresses.is_empty() {
interface.ipv6_addresses.join(", ")
} else {
String::new()
};
// Format: "name (vlan X): IP" or "name: IP"
let interface_text = if let Some(vlan_id) = interface.vlan_id {
if !ip_text.is_empty() {
format!("{} (vlan {}): {}", interface.name, vlan_id, ip_text)
} else {
format!("{} (vlan {}):", interface.name, vlan_id)
}
} else {
if !ip_text.is_empty() {
format!("{}: {}", interface.name, ip_text)
} else {
format!("{}:", interface.name)
}
};
lines.push(Line::from(vec![
Span::styled(tree_symbol, Typography::tree()),
Span::styled(interface_text, Typography::secondary()),
]));
}
lines
}
/// Render system widget
pub fn render(&mut self, frame: &mut Frame, area: Rect, hostname: &str, _config: Option<&crate::config::DashboardConfig>) {
let mut lines = Vec::new();
// NixOS section
@@ -370,30 +795,19 @@ impl SystemWidget {
lines.push(Line::from(vec![
Span::styled(format!("Agent: {}", agent_version_text), Typography::secondary())
]));
// Display detected connection IP
if let Some(config) = config {
if let Some(host_details) = config.hosts.get(hostname) {
let detected_ip = host_details.get_connection_ip(hostname);
lines.push(Line::from(vec![
Span::styled(format!("IP: {}", detected_ip), Typography::secondary())
]));
}
}
// CPU section
lines.push(Line::from(vec![
Span::styled("CPU:", Typography::widget_title())
]));
let load_text = self.format_cpu_load();
let cpu_spans = StatusIcons::create_status_spans(
self.cpu_status.clone(),
&format!("Load: {}", load_text)
);
lines.push(Line::from(cpu_spans));
let freq_text = self.format_cpu_frequency();
lines.push(Line::from(vec![
Span::styled(" └─ ", Typography::tree()),
@@ -404,7 +818,7 @@ impl SystemWidget {
lines.push(Line::from(vec![
Span::styled("RAM:", Typography::widget_title())
]));
let memory_text = self.format_memory_usage();
let memory_spans = StatusIcons::create_status_spans(
self.memory_status.clone(),
@@ -416,16 +830,16 @@ impl SystemWidget {
for (i, tmpfs) in self.tmpfs_mounts.iter().enumerate() {
let is_last = i == self.tmpfs_mounts.len() - 1;
let tree_symbol = if is_last { " └─ " } else { " ├─ " };
let usage_text = if tmpfs.total_gb > 0.0 {
format!("{:.0}% {:.1}GB/{:.1}GB",
tmpfs.usage_percent,
tmpfs.used_gb,
format!("{:.0}% {:.1}GB/{:.1}GB",
tmpfs.usage_percent,
tmpfs.used_gb,
tmpfs.total_gb)
} else {
"— —/—".to_string()
};
let mut tmpfs_spans = vec![
Span::styled(tree_symbol, Typography::tree()),
];
@@ -436,6 +850,16 @@ impl SystemWidget {
lines.push(Line::from(tmpfs_spans));
}
// Network section
if !self.network_interfaces.is_empty() {
lines.push(Line::from(vec![
Span::styled("Network:", Typography::widget_title())
]));
let network_lines = self.render_network();
lines.extend(network_lines);
}
// Storage section
lines.push(Line::from(vec![
Span::styled("Storage:", Typography::widget_title())
@@ -445,6 +869,16 @@ impl SystemWidget {
let storage_lines = self.render_storage();
lines.extend(storage_lines);
// Backup section (if available)
if self.backup_status != "unavailable" && self.backup_status != "unknown" {
lines.push(Line::from(vec![
Span::styled("Backup:", Typography::widget_title())
]));
let backup_lines = self.render_backup();
lines.extend(backup_lines);
}
// Apply scroll offset
let total_lines = lines.len();
let available_height = area.height as usize;

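A quick sketch (hypothetical in-module test; the serial number is made up) exercising the two helpers added above, mirroring the compress_ipv4_addresses doc comment:

#[test]
fn helper_examples() {
    // Serials longer than 8 characters keep only their tail
    assert_eq!(truncate_serial("WD-WCC7K1234567"), "K1234567");
    // Addresses in the same /24 collapse to the last octet
    let addrs = vec!["192.168.30.1".to_string(), "192.168.30.100".to_string()];
    assert_eq!(SystemWidget::compress_ipv4_addresses(&addrs), "192.168.30.1, 100");
}
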
View File

@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard-shared"
version = "0.1.141"
version = "0.1.181"
edition = "2021"
[dependencies]

View File

@@ -16,11 +16,30 @@ pub struct AgentData {
/// System-level monitoring data
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SystemData {
pub network: NetworkData,
pub cpu: CpuData,
pub memory: MemoryData,
pub storage: StorageData,
}
/// Network interface monitoring data
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NetworkData {
pub interfaces: Vec<NetworkInterfaceData>,
}
/// Individual network interface data
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NetworkInterfaceData {
pub name: String,
pub ipv4_addresses: Vec<String>,
pub ipv6_addresses: Vec<String>,
pub is_physical: bool,
pub link_status: Status,
pub parent_interface: Option<String>,
pub vlan_id: Option<u16>,
}
/// CPU monitoring data
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CpuData {
@@ -66,6 +85,7 @@ pub struct StorageData {
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DriveData {
pub name: String,
pub serial_number: Option<String>,
pub health: String,
pub temperature_celsius: Option<f32>,
pub wear_percent: Option<f32>,
@@ -96,35 +116,69 @@ pub struct PoolData {
pub total_gb: f32,
pub data_drives: Vec<PoolDriveData>,
pub parity_drives: Vec<PoolDriveData>,
pub health_status: Status,
pub usage_status: Status,
}
/// Drive in a storage pool
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PoolDriveData {
pub name: String,
pub serial_number: Option<String>,
pub temperature_celsius: Option<f32>,
pub wear_percent: Option<f32>,
pub health: String,
pub health_status: Status,
pub temperature_status: Status,
}
/// Service monitoring data
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ServiceData {
pub name: String,
pub status: String, // "active", "inactive", "failed"
pub memory_mb: f32,
pub disk_gb: f32,
pub user_stopped: bool,
pub service_status: Status,
pub sub_services: Vec<SubServiceData>,
}
/// Sub-service data (nginx sites, docker containers, etc.)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SubServiceData {
pub name: String,
pub service_status: Status,
pub metrics: Vec<SubServiceMetric>,
}
/// Individual metric for a sub-service
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SubServiceMetric {
pub label: String,
pub value: f32,
pub unit: Option<String>,
}
/// Backup system data
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct BackupData {
pub status: String,
pub last_run: Option<u64>,
pub next_scheduled: Option<u64>,
pub total_size_gb: Option<f32>,
pub repository_health: Option<String>,
pub repository_disk: Option<BackupDiskData>,
pub last_backup_size_gb: Option<f32>,
pub start_time_raw: Option<String>,
}
/// Backup repository disk information
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct BackupDiskData {
pub serial: String,
pub usage_percent: f32,
pub used_gb: f32,
pub total_gb: f32,
pub wear_percent: Option<f32>,
pub temperature_celsius: Option<f32>,
}
impl AgentData {
@@ -136,6 +190,9 @@ impl AgentData {
build_version: None,
timestamp: chrono::Utc::now().timestamp() as u64,
system: SystemData {
network: NetworkData {
interfaces: Vec::new(),
},
cpu: CpuData {
load_1min: 0.0,
load_5min: 0.0,
@@ -163,10 +220,11 @@ impl AgentData {
services: Vec::new(),
backup: BackupData {
status: "unknown".to_string(),
last_run: None,
next_scheduled: None,
total_size_gb: None,
repository_health: None,
repository_disk: None,
last_backup_size_gb: None,
start_time_raw: None,
},
}
}
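
To close the loop with the Docker sub-service work, a minimal sketch (hypothetical image name and size) of constructing one of the new SubServiceData entries:

use cm_dashboard_shared::{Status, SubServiceData, SubServiceMetric};

fn main() {
    // Hypothetical entry for a Docker image displayed as a sub-service
    let image = SubServiceData {
        name: "docker_nginx".to_string(),   // hypothetical image name
        service_status: Status::Ok,         // listed images are always active
        metrics: vec![SubServiceMetric {
            label: "Size".to_string(),
            value: 139.0,                   // hypothetical size
            unit: Some("MB".to_string()),
        }],
    };
    println!("{} -> {:?}", image.name, image.service_status);
}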