Add real-time monitoring of torrent copy operations when completed
downloads are copied from SSD to HDD storage.
Changes:
- Add marker file tracking during rsync operations
- Monitor active copy operations via /tmp/torrent-copy-active
- Display copy status as sub-service under openvpn-vpn-download
- Show currently copying torrent name in dashboard
The copy status appears as an informational sub-service while rsync
is actively copying completed torrents to permanent storage, providing
visibility into potentially long-running file transfer operations.
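
A minimal sketch of how the collector side of the marker-file check might look. It assumes the marker file holds the torrent name as plain text and is removed once rsync finishes; the function name `active_copy_name` is hypothetical.

```rust
use std::fs;
use std::path::Path;

/// Returns the name of the torrent currently being copied, if any.
/// Assumption: the marker file contains the torrent name and is deleted
/// when the copy completes, so a missing or empty file means "no copy".
fn active_copy_name(marker: &Path) -> Option<String> {
    let contents = fs::read_to_string(marker).ok()?;
    let name = contents.trim();
    if name.is_empty() {
        None
    } else {
        Some(name.to_string())
    }
}
```

The caller would pass the marker path (/tmp/torrent-copy-active in this commit) and add an informational sub-service only when a name is returned.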
Run sudo nft directly without timeout wrapper to preserve capabilities.
The timeout -> sudo chain was preventing nft from accessing netlink
with proper permissions.
- Change from 'timeout 3 sudo nft' to 'sudo nft'
- Allows CAP_NET_ADMIN to pass through correctly
- Update version to v0.1.256
Use explicit path /run/current-system/sw/bin/nft to match sudoers
configuration. Previously, calling 'nft' without a path resolved to the
wrong location and failed the sudoers permission check.
- Change from 'sudo nft' to 'sudo /run/current-system/sw/bin/nft'
- Matches sudoers entry for passwordless execution
- Update version to v0.1.255
- Log exit status code when nft command fails
- Log stderr output to diagnose issues
- Distinguish between command execution failure and non-zero exit
- Update version to v0.1.253
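
The distinction between "the command could not be started" and "the command ran but exited non-zero" could be expressed roughly as below. This is a sketch using only std::process; the enum and function names are hypothetical and not taken from the agent.

```rust
use std::process::Command;

/// Outcome of running an external command (hypothetical type).
#[derive(Debug, PartialEq)]
enum CommandOutcome {
    /// Exit status 0; carries stdout.
    Success(String),
    /// The process ran but returned non-zero. `code` is None if it was
    /// terminated by a signal.
    NonZeroExit { code: Option<i32>, stderr: String },
    /// The process could not be started (binary missing, not executable, ...).
    ExecFailed(String),
}

fn run_and_classify(program: &str, args: &[&str]) -> CommandOutcome {
    match Command::new(program).args(args).output() {
        Err(e) => CommandOutcome::ExecFailed(e.to_string()),
        Ok(out) if out.status.success() => {
            CommandOutcome::Success(String::from_utf8_lossy(&out.stdout).into_owned())
        }
        Ok(out) => CommandOutcome::NonZeroExit {
            code: out.status.code(),
            stderr: String::from_utf8_lossy(&out.stderr).trim().to_string(),
        },
    }
}
```

Logging the `code` and `stderr` fields of the non-zero case is what makes a sudoers or capability problem visible in the journal.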
Consolidate VPN-related information under openvpn-vpn-download service.
Now shows both VPN route and torrent statistics as sub-services.
- Remove route from openvpn-vpn-connection
- Add route to openvpn-vpn-download (displayed first)
- Torrent stats displayed second
- Update version to v0.1.251
Change nftables port parser to specifically look for 'chain input_wan'
instead of any chain with 'input' in the name. This ensures we only
collect WAN/external ports, not LAN or other internal chains.
- Look for 'chain input_wan' specifically
- Remove internal network filters (no longer needed)
- Update version to v0.1.249
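
One way to restrict parsing to a single chain is to match the chain header exactly and track brace depth until the block closes. The sketch below assumes the ruleset text has one statement per line, as printed by `nft list ruleset`; `chain_body` is a hypothetical helper name.

```rust
/// Returns the trimmed rule lines inside `chain <name> { ... }`.
/// The header is compared exactly, so "input" does not match "input_wan"
/// or "input_lan".
fn chain_body<'a>(ruleset: &'a str, chain: &str) -> Vec<&'a str> {
    let header = format!("chain {}", chain);
    let mut body = Vec::new();
    let mut depth = 0usize;
    let mut inside = false;
    for line in ruleset.lines() {
        let trimmed = line.trim();
        if !inside {
            if trimmed.strip_suffix('{').map(str::trim) == Some(header.as_str()) {
                inside = true;
                depth = 1;
            }
            continue;
        }
        // Port sets such as `{ 80, 443 }` open and close on the same line,
        // so they leave the depth unchanged.
        depth += trimmed.matches('{').count();
        depth = depth.saturating_sub(trimmed.matches('}').count());
        if depth == 0 {
            break; // closing brace of the chain itself
        }
        body.push(trimmed);
    }
    body
}
```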
Update nftables port collector to use sudo when querying ruleset.
Requires corresponding sudoers configuration in NixOS.
- Change nft command to use sudo
- Update version to v0.1.248
Display open external ports from nftables firewall rules as sub-services
grouped by protocol. Only shows WAN incoming ports by filtering input chain
rules and excluding private network sources.
- Parse nftables ruleset for accept rules with dport in input chain
- Filter out internal network traffic (192.168.x, 10.x, 172.16.x, loopback)
- Extract single ports and port sets from rules
- Group and display as "TCP: 22, 80, 443" and "UDP: 53, 123"
- Update version to v0.1.247
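
The port extraction and grouping described above might be sketched as follows. It handles only numeric single ports and `{ a, b }` sets; named ports and ranges are skipped. The function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// Collects destination ports for `proto` ("tcp" or "udp") from accept rules.
fn ports_for(rules: &[&str], proto: &str) -> BTreeSet<u16> {
    let needle = format!("{} dport ", proto);
    let mut ports = BTreeSet::new();
    for rule in rules {
        if !rule.contains("accept") {
            continue;
        }
        let idx = match rule.find(needle.as_str()) {
            Some(i) => i,
            None => continue,
        };
        let rest = rule[idx + needle.len()..].trim_start();
        // A port set looks like `{ 80, 443 }`; a single port is a bare number.
        let spec = match rest.strip_prefix('{') {
            Some(inner) => inner.split('}').next().unwrap_or(""),
            None => rest.split_whitespace().next().unwrap_or(""),
        };
        ports.extend(spec.split(',').filter_map(|p| p.trim().parse::<u16>().ok()));
    }
    ports
}

/// Formats a sorted port set as e.g. "TCP: 22, 80, 443".
fn format_ports(label: &str, ports: &BTreeSet<u16>) -> String {
    let list: Vec<String> = ports.iter().map(|p| p.to_string()).collect();
    format!("{}: {}", label, list.join(", "))
}
```

A BTreeSet keeps the ports sorted and removes duplicates when the same port is accepted by more than one rule.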
Update collector to use qBittorrent Web API instead of Transmission RPC.
Query qBittorrent through VPN namespace using existing passwordless sudo
permissions for ip netns exec commands.
- Change service name from transmission-vpn to openvpn-vpn-download
- Replace get_transmission_stats() with get_qbittorrent_stats()
- Use curl through VPN namespace to access qBittorrent API at localhost:8080
- Parse qBittorrent JSON response for state, dlspeed, upspeed
- Count active torrents (downloading, uploading, stalledDL, stalledUP)
- Update version to v0.1.246
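
The aggregation step could look roughly like this once the JSON response has been decoded. The `Torrent` struct is a hypothetical stand-in for the decoded entries, speeds are assumed to be bytes per second, and 1 MB is taken as 1024 * 1024 bytes.

```rust
/// Minimal view of one torrent entry (hypothetical struct; field names
/// follow the state, dlspeed and upspeed fields named above).
struct Torrent {
    state: String,
    dlspeed: u64, // assumed bytes per second
    upspeed: u64, // assumed bytes per second
}

/// Builds the summary line, e.g. "2 active, ↓ 2.5 MB/s, ↑ 1.0 MB/s".
fn summarize(torrents: &[Torrent]) -> String {
    const ACTIVE: [&str; 4] = ["downloading", "uploading", "stalledDL", "stalledUP"];
    let active = torrents
        .iter()
        .filter(|t| ACTIVE.contains(&t.state.as_str()))
        .count();
    let down: u64 = torrents.iter().map(|t| t.dlspeed).sum();
    let up: u64 = torrents.iter().map(|t| t.upspeed).sum();
    let mb = |bytes: u64| bytes as f64 / (1024.0 * 1024.0);
    format!("{} active, ↓ {:.1} MB/s, ↑ {:.1} MB/s", active, mb(down), mb(up))
}
```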
Change docker images to use name field for all data instead of metrics,
matching the pattern used by torrent stats and VPN routes. Increase display
width for Status::Info sub-services from 18 to 50 characters to accommodate
longer informational text without truncation.
- Docker images now show: "image-name size: 994.0 MB" in name field
- Torrent stats show: "17 active, ↓ 2.5 MB/s, ↑ 1.2 MB/s" in name field
- Remove fixed-width padding for Info status sub-services
- Update version to v0.1.245
Implement aggregate torrent statistics display for transmission-vpn service
via Transmission RPC API. Shows active torrent count and total download/upload
speeds. Change VPN route label from "ip:" to "route:" for clarity.
- Add get_transmission_stats() method to query Transmission RPC
- Display format: "X active, ↓ MB/s, ↑ MB/s"
- Update version to v0.1.244
Docker images now use Status::Info like VPN IP.
No "D" prefix, no status icon - just name and metrics.
All informational sub-services handled consistently.
Version: v0.1.239
Agent uses Status enum to control display:
- Status::Info: no icon, no status text (VPN IP)
- Other statuses: icon + text (containers, nginx sites)
Dashboard checks status, no hardcoded service_type exceptions.
Version: v0.1.237
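
A rough sketch of status-driven rendering. Only Status::Info and Status::Inactive are named in these commits; the `Ok` and `Failed` variants and the exact output strings are assumptions for illustration.

```rust
/// Hypothetical subset of the shared Status enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Ok,
    Inactive,
    Failed,
    Info,
}

/// Renders a sub-service line purely from its status, with no
/// service_type special cases.
fn render_sub_service(name: &str, status: Status) -> String {
    match status {
        // Informational entries: no icon, no status text.
        Status::Info => name.to_string(),
        Status::Ok => format!("● {} active", name),
        Status::Inactive => format!("○ {} inactive", name),
        Status::Failed => format!("● {} failed", name),
    }
}
```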
Display VPN external IP as sub-service under openvpn-vpn-connection.
Query external IP through openvpn-namespace using curl ifconfig.me.
Version: v0.1.232
Collectors now clear their target vectors (tmpfs, drives, pools, services)
before populating to prevent duplicates when updating cached AgentData.
- Clear tmpfs list in memory collector
- Clear drives and pools in disk collector
- Clear services in systemd collector
- Bump version to v0.1.231
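
The clear-before-populate pattern, reduced to a sketch. The `AgentData` shown here is a hypothetical one-field stand-in for the real struct.

```rust
/// Hypothetical, reduced version of the cached agent data.
#[derive(Default)]
struct AgentData {
    services: Vec<String>,
}

/// Repopulates the services list. Clearing first means a reused
/// AgentData does not accumulate entries from earlier cycles.
fn collect_services(data: &mut AgentData, discovered: &[&str]) {
    data.services.clear();
    data.services.extend(discovered.iter().map(|s| s.to_string()));
}
```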
The service_status_cache from discovery only has active_state with
all detailed metrics set to None. During collection, get_service_status()
was returning cached data instead of fetching fresh systemctl show data.
Now always fetch fresh data to populate memory_bytes, restart_count,
and uptime_seconds properly.
Shared:
- Add memory_bytes, restart_count, uptime_seconds to ServiceData
Agent:
- Add new fields to ServiceStatusInfo struct
- Fetch MemoryCurrent, NRestarts, ExecMainStartTimestamp from systemctl show
- Calculate uptime from start timestamp
- Parse and populate new fields in ServiceData
- Remove unused load_state and sub_state fields
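
Parsing the relevant properties from `systemctl show` output might look like the sketch below. The struct and function names are hypothetical, and the start timestamp is kept as a raw string here rather than converted to an uptime.

```rust
/// Hypothetical holder for the properties read from `systemctl show`.
#[derive(Debug, Default, PartialEq)]
struct ServiceMetrics {
    memory_bytes: Option<u64>,
    restart_count: Option<u32>,
    start_timestamp: Option<String>,
}

/// Parses `Key=Value` lines. Values that are not numeric (systemd prints
/// "[not set]" when memory accounting is unavailable) become None.
fn parse_systemctl_show(output: &str) -> ServiceMetrics {
    let mut metrics = ServiceMetrics::default();
    for line in output.lines() {
        if let Some((key, value)) = line.split_once('=') {
            match key {
                "MemoryCurrent" => metrics.memory_bytes = value.parse().ok(),
                "NRestarts" => metrics.restart_count = value.parse().ok(),
                "ExecMainStartTimestamp" if !value.is_empty() => {
                    metrics.start_timestamp = Some(value.to_string())
                }
                _ => {}
            }
        }
    }
    metrics
}
```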
Dashboard:
- Add memory_bytes, restart_count, uptime_seconds to ServiceInfo
- Update header: Service, Status, RAM, Uptime, ↻ (restarts)
- Format memory as MB/GB
- Format uptime as Xd Xh, Xh Xm, or Xm
- Show restart count with ! prefix if > 0 to indicate instability
All metrics are obtained from the single systemctl show call that was
already being made, so no additional commands are run per service.
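
The dashboard formatting rules above could be sketched as follows. The exact strings (no space before the unit, one decimal for GB, "0" when there are no restarts) are assumptions; only the overall shapes come from the list above.

```rust
/// Formats uptime as "Xd Xh", "Xh Xm", or "Xm".
fn format_uptime(secs: u64) -> String {
    let days = secs / 86_400;
    let hours = (secs % 86_400) / 3_600;
    let minutes = (secs % 3_600) / 60;
    if days > 0 {
        format!("{}d {}h", days, hours)
    } else if hours > 0 {
        format!("{}h {}m", hours, minutes)
    } else {
        format!("{}m", minutes)
    }
}

/// Formats memory as whole MB below 1 GB, otherwise GB with one decimal.
fn format_memory(bytes: u64) -> String {
    let mb = bytes as f64 / (1024.0 * 1024.0);
    if mb >= 1024.0 {
        format!("{:.1}GB", mb / 1024.0)
    } else {
        format!("{:.0}MB", mb)
    }
}

/// Prefixes a non-zero restart count with "!" to flag instability.
fn format_restarts(count: u32) -> String {
    if count > 0 {
        format!("!{}", count)
    } else {
        "0".to_string()
    }
}
```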
Complete removal of service resource metrics:
Agent:
- Remove memory_mb and disk_gb fields from ServiceData struct
- Remove get_service_memory_usage() method
- Remove get_service_disk_usage() method
- Remove get_directory_size() method
- Remove unused warn import
Dashboard:
- Remove memory_mb and disk_gb from ServiceInfo struct
- Remove memory/disk display from format_parent_service_line
- Remove memory/disk parsing in legacy metric path
- Remove unused format_disk_size() function
Service resource metrics were slow and unreliable, and have not worked
properly since the structured data migration. They will be handled
differently in the future.
Changes:
- Rename docker images from 'image_node:18...' to 'I node:18...' for conciseness
- Change image status from 'active' to 'inactive' for neutral informational display
- Images now show with gray empty circle ○ instead of green filled circle ●
Docker images are static artifacts without meaningful operational status, so using inactive status provides neutral gray display that won't trigger alerts or affect service status aggregation.
Fixes random host disconnections caused by blocking operations preventing timely ZMQ packet transmission.
Changes:
- Add run_command_with_timeout() wrapper using tokio for async command execution
- Apply 10s timeout to smartctl (prevents 30+ second hangs on failing drives)
- Apply 5s timeout to du, lsblk, systemctl list commands
- Apply 3s timeout to systemctl show/is-active, df, ip commands
- Apply 2s timeout to hostname command
- Use system 'timeout' command for sync operations where async is not needed
Critical fixes:
- smartctl: Failing drives could block for 30+ seconds per drive
- du: Large directories (Docker, PostgreSQL) could block 10-30+ seconds
- systemctl/docker: Commands could block indefinitely during system issues
With a 1-second collection interval and a 10-second heartbeat timeout, any blocking operation longer than 10s causes a false "host offline" alert. These timeouts bound how long each command can block, so collection completes even during system degradation.
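
The commit uses a tokio-based run_command_with_timeout() wrapper. The sketch below is not that implementation; it is a std-only stand-in that illustrates the same idea (spawn, poll until a deadline, kill on expiry) with a hypothetical name and a simplified signature that discards output.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Runs a command and returns whether it exited successfully.
/// If it is still running at the deadline it is killed and an
/// io::ErrorKind::TimedOut error is returned.
fn run_with_timeout(program: &str, args: &[&str], timeout: Duration) -> io::Result<bool> {
    let mut child = Command::new(program)
        .args(args)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(status.success());
        }
        if Instant::now() >= deadline {
            let _ = child.kill(); // ignore the error if it exited just now
            child.wait()?; // reap the process so it does not linger as a zombie
            return Err(io::Error::new(io::ErrorKind::TimedOut, "command timed out"));
        }
        thread::sleep(Duration::from_millis(20));
    }
}
```

With this shape, a smartctl call on a failing drive would return a TimedOut error after the configured limit instead of stalling the collection loop.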
Agent changes:
- Changed docker ps and docker images commands to run without sudo
- cm-agent user is already in docker group, so sudo is not needed
- Fixes "unable to change to root gid: Operation not permitted" error
- Systemd security restrictions were blocking sudo gid changes
This fixes Docker container and image collection on systems with
systemd security hardening enabled.
Updated to version 0.1.178
Agent changes:
- Log stderr output when docker images command fails
- This will show the actual error message (e.g., permission denied, docker not found)
- Helps diagnose why docker images collection is failing
Updated to version 0.1.177
Agent changes:
- Changed debug!() to info!() for Docker collection logs
- This allows the logs to show with the default RUST_LOG=info setting
- Added info import to tracing use statement
Now logs will be visible in journalctl without needing to change log level:
- "Collecting Docker sub-services for service: docker"
- "Found X Docker containers"
- "Found X Docker images"
- "Total Docker sub-services added: X"
Updated to version 0.1.176
Agent changes:
- Added debug logging to Docker images collection function
- Log when Docker sub-services are being collected for a service
- Log count of containers and images found
- Log total sub-services added
- Show command failure details instead of silently returning empty vec
This will help diagnose why Docker images aren't showing up as sub-services
on some hosts. The logs will show if the docker commands are failing or if
the collection is working but data isn't being transmitted properly.
Updated to version 0.1.175
Agent changes:
- Added get_docker_images() function to list all Docker images
- Use docker images to show stored images with repository:tag and size
- Display images as sub-services under docker service with size in parentheses
- Skip dangling images (<none>:<none>)
- Images shown with active status (always present when listed)
Example display:
● docker active 139M 1MB
├─ ● docker_gitea active
├─ ○ docker_old-app inactive
├─ ● image_nginx:latest (142MB)
├─ ● image_postgres:15 (379MB)
└─ ● image_gitea:latest (256MB)
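
A sketch of the image list parsing, assuming the command is invoked with a comma-separated template such as `docker images --format "{{.Repository}}:{{.Tag}},{{.Size}}"`. The actual format string used by get_docker_images() is not shown in this commit, so treat it and the helper name as assumptions.

```rust
/// Parses "repository:tag,size" lines into display names such as
/// "image_nginx:latest (142MB)", skipping dangling images.
fn parse_docker_images(output: &str) -> Vec<String> {
    output
        .lines()
        .filter_map(|line| {
            // Split on the last comma so the size is always the final field.
            let (name, size) = line.trim().rsplit_once(',')?;
            if name.contains("<none>") {
                return None; // dangling image: <none>:<none>
            }
            Some(format!("image_{} ({})", name, size))
        })
        .collect()
}
```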
Updated to version 0.1.174
Agent changes:
- Use docker ps -a to show ALL containers (running and stopped)
- Map container status: Up -> active, Exited/Created -> inactive, other -> failed
- Display Docker containers as sub-services under the docker service
- Each container shown with proper status indicator
Example display:
● docker active 139M 1MB
├─ ● docker_gitea active
├─ ○ docker_old-app inactive
└─ ● docker_immich active
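
The status mapping above can be sketched as a prefix match on the Status column of `docker ps -a`. The enum and function names are hypothetical.

```rust
/// Hypothetical display state for a container.
#[derive(Debug, PartialEq)]
enum ContainerState {
    Active,
    Inactive,
    Failed,
}

/// Maps a docker status string ("Up 3 hours", "Exited (0) 2 days ago",
/// "Created", ...) to a display state. Anything unrecognised is Failed.
fn map_container_status(status: &str) -> ContainerState {
    let status = status.trim();
    if status.starts_with("Up") {
        ContainerState::Active
    } else if status.starts_with("Exited") || status.starts_with("Created") {
        ContainerState::Inactive
    } else {
        ContainerState::Failed
    }
}
```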
Updated to version 0.1.173
Agent changes:
- Changed docker ps to docker ps -a to show ALL containers (running and stopped)
- Map container status: Up -> active, Exited/Created -> inactive, other -> failed
- Display Docker containers as individual top-level services instead of sub-services
- Each container shown as "docker_{container_name}" in service list
This provides better visibility of all containers and their status directly in the
services panel, making it easier to see stopped containers at a glance.
Updated to version 0.1.172
Update systemd collector to use sudo for docker ps command to resolve
permission issues when cm-agent user lacks docker group membership.
This ensures Docker containers are properly discovered and displayed
as sub-services under the docker service.
Version: 0.1.160
Resolves nginx sites appearing only briefly during collection cycles by implementing proper caching of complete service data including sub-services.
Changes:
- Add cached_service_data field to store complete ServiceData with sub-services
- Modify collection logic to cache full service objects instead of basic ServiceInfo
- Update cache retrieval to use complete cached data preserving nginx site metrics
- Eliminate flickering of nginx sites between collection cycles
Version bump to v0.1.148
- Remove nginx_ prefix from site names in hierarchical structure
- Fix get_nginx_site_metrics to call correct internal method
- Implement same caching functionality as old working version
- Sites now stay visible continuously with 30s latency updates
- Preserve cached results between refresh cycles
- Remove duplicate status string fields from ServiceData and SubServiceData
- Use only Status enum as single source of truth for service status
- Agent calculates Status enum using calculate_service_status()
- Dashboard converts Status enum to display text for UI
- Implement flexible metrics system for sub-services with label/value/unit
- Fix status icon/text mismatches (inactive services now show gray circles)
- Ensure perfect alignment between service icons and status text
- Add nginx site metrics caching with configurable intervals matching original
- Implement complex nginx config parsing with brace counting and redirect detection
- Replace curl with reqwest HTTP client for proper timeout and redirect handling
- Fix docker container parsing to use comma format with proper status mapping
- Add sudo to directory size command for permission handling
- Change nginx URLs to use https protocol matching original
- Add advanced NixOS ExecStart parsing for argv[] format support
- Add nginx -T fallback functionality for config discovery
- Implement proper server block parsing with domain validation and brace tracking
- Add get_service_memory function matching original signature
All functionality now matches pre-refactor implementation architecture.
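
A simplified sketch of server block parsing with brace tracking. It collects server_name values only while inside a `server { ... }` block and applies a very loose domain check (contains a dot, is not the catch-all `_`). Redirect detection and the nginx -T fallback are left out, and the function name is hypothetical.

```rust
/// Extracts domain names from `server_name` directives inside server blocks.
fn server_names(config: &str) -> Vec<String> {
    let mut names = Vec::new();
    let mut depth = 0usize;
    // Depth at which the current server block was opened, if we are in one.
    let mut server_depth: Option<usize> = None;
    for raw in config.lines() {
        let line = raw.trim();
        if line.starts_with('#') {
            continue;
        }
        if server_depth.is_none() && line == "server {" {
            server_depth = Some(depth);
        }
        if server_depth.is_some() {
            if let Some(rest) = line.strip_prefix("server_name") {
                // Require whitespace so other directives sharing the prefix
                // are not mistaken for server_name.
                if rest.starts_with(char::is_whitespace) {
                    for name in rest.trim_end_matches(';').split_whitespace() {
                        if name != "_" && name.contains('.') {
                            names.push(name.to_string());
                        }
                    }
                }
            }
        }
        depth += line.matches('{').count();
        depth = depth.saturating_sub(line.matches('}').count());
        // Back at (or below) the opening depth: the server block has closed.
        if server_depth.map_or(false, |d| depth <= d) {
            server_depth = None;
        }
    }
    names
}
```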
- Enhanced directory size logic with minimum 0.001GB visibility and permission error logging
- Added nginx site monitoring with latency checks and NixOS config discovery
- Added docker container monitoring as sub-services
- Integrated sub-service collection for active nginx and docker services
- All missing features from original implementation now restored
Fixes missing services and 0B disk usage issues by restoring:
- Wildcard pattern matching for service filters (gitea*, redis*)
- Service disk usage calculation from directories and WorkingDirectory
- Proper Status::Inactive for inactive services
Services now properly discovered and show actual disk usage.
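
The wildcard filter could be as small as the sketch below. It supports only a trailing `*`, which covers the gitea* and redis* examples; whether the agent supports wildcards in other positions is not stated here, and the function name is hypothetical.

```rust
/// Returns true if `service` matches `pattern`. A trailing '*' matches
/// any suffix; patterns without '*' must match exactly.
fn matches_filter(service: &str, pattern: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => service.starts_with(prefix),
        None => service == pattern,
    }
}
```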
Implements clean structured data collection eliminating all string metric
parsing bugs. Collectors now populate AgentData directly with type-safe
field access.
Key improvements:
- Mount points preserved correctly (/ and /boot instead of root/boot)
- Tmpfs discovery added to memory collector
- Temperature data flows as typed f32 fields
- Zero string parsing overhead
- Complete removal of MetricCollectionManager bridge
- Direct ZMQ transmission of structured JSON
All functionality maintained: service tracking, notifications, status
evaluation, and multi-host monitoring.
- Remove scroll offset fields from HostWidgets struct
- Replace scrolling with simple "X more below" indicators in all widgets
- Remove user-stopped service tracking from agent (now uses SSH control)
- Inactive services now consistently show Status::Inactive with empty circles
- Simplify widget render methods by removing scroll parameters
- Clean up unused imports and legacy scrolling infrastructure
- Fix journalctl command to use -fu for proper log following