Root cause: sda's temperature exceeded threshold in the past, causing
smartctl to return exit code 32 (warning: "Attributes have been <= threshold
in the past"). The agent checked output.status.success() and rejected the
entire output as failed, even though the data (serial, temperature, health)
was perfectly valid.
Smartctl exit codes are bit flags for informational warnings:
- Exit 0: No warnings
- Exit 32 (bit 5): Attributes were at/below threshold in past
- Exit 64 (bit 6): Error log has entries
- etc.
The output data is valid regardless of these warning flags.
Solution: Parse output as long as it's not empty, ignore exit code.
Only return UNKNOWN if output is actually empty (command truly failed).
Result: Data_3 will now show "ZDZ4VE0B T: 31°C" instead of "? Data_3: sda"
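A minimal sketch of the relaxed check (helper and field names are illustrative,
not the actual agent code):

    use tokio::process::Command;

    // Treat smartctl output as usable whenever stdout is non-empty,
    // regardless of the informational warning bits in the exit status.
    async fn smartctl_json(device: &str) -> Option<String> {
        let output = Command::new("smartctl")
            .args(["-a", "-j", device])
            .output()
            .await
            .ok()?;
        let stdout = String::from_utf8_lossy(&output.stdout);
        if stdout.trim().is_empty() {
            None                        // command truly failed -> UNKNOWN upstream
        } else {
            Some(stdout.into_owned())   // warning exit codes like 32 are ignored
        }
    }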
Bump version to v0.1.225
Root cause: SMART data was collected TWICE:
1. Sequential collection during pool detection in get_drive_info_for_path()
using problematic tokio::task::block_in_place() nesting
2. Parallel collection in get_smart_data_for_drives() (v0.1.223)
The sequential collection happened FIRST during pool detection, causing
sda (Data_3) to timeout due to:
- Bad async nesting: block_in_place() wrapping block_on()
- Sequential execution degrading the async runtime
- sda being third in the sequence, running after the runtime had already degraded
Solution: Remove SMART collection from get_drive_info_for_path().
Pool drive temperatures are populated later from the parallel SMART
collection which properly uses futures::join_all.
Benefits:
- Eliminates problematic async nesting
- All SMART queries happen once in parallel only
- sda/Data_3 should now show serial (ZDZ4VE0B) and temperature
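For context, the removed nesting looked roughly like this (a hypothetical
reconstruction; query_smart and SmartData are placeholder names):

    // Anti-pattern removed here: re-entering the async runtime from inside
    // pool detection, one drive at a time.
    fn smart_data_blocking(device: &str) -> Option<SmartData> {
        tokio::task::block_in_place(|| {
            tokio::runtime::Handle::current().block_on(async {
                query_smart(device).await
            })
        })
    }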
Bump version to v0.1.224
Root cause: SMART data was collected sequentially, one drive at a time.
With 5 drives taking ~500ms each, total collection time was 2.5+ seconds.
Since the disk collector runs every 1 second, collections overlapped and
created resource contention. The last drive (sda/Data_3) would time out
because it was still being accessed by the previous, unfinished collection.
Solution: Query all drives in parallel using futures::join_all. Now all
drives get their SMART data collected simultaneously with independent
3-second timeouts, eliminating contention and reducing total collection
time from 2.5+ seconds to ~500ms (the slowest single drive).
Benefits:
- All drives complete in ~500ms instead of 2.5+ seconds
- No overlapping collections causing resource contention
- Each drive gets full 3-second timeout window
- sda/Data_3 should now show temperature and serial number
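A sketch of the parallel pattern (query_smart and SmartData are placeholder names):

    use std::time::Duration;
    use futures::future::join_all;
    use tokio::time::timeout;

    // Query every drive concurrently; each gets its own 3-second budget,
    // so one slow disk no longer delays the others.
    async fn collect_all(drives: &[String]) -> Vec<Option<SmartData>> {
        let tasks = drives.iter().map(|dev| async move {
            timeout(Duration::from_secs(3), query_smart(dev)).await.ok().flatten()
        });
        join_all(tasks).await
    }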
Bump version to v0.1.223
Root cause: run_command_with_timeout() was calling cmd.spawn() without
configuring stdout/stderr pipes. This caused command output to go to
journald instead of being captured by wait_with_output(). The disk
collector received empty output and failed silently.
Solution: Configure stdout(Stdio::piped()) and stderr(Stdio::piped())
before spawning commands. This ensures wait_with_output() can properly
capture command output.
Fixes: Empty Storage section, lsblk output appearing in journald
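A sketch of the fix (the helper name follows the commit; kill-on-timeout
details from v0.1.220 are omitted here):

    use std::process::Stdio;
    use std::time::Duration;
    use tokio::process::Command;
    use tokio::time::timeout;

    async fn run_command_with_timeout(mut cmd: Command, secs: u64) -> Option<std::process::Output> {
        let child = cmd
            .stdout(Stdio::piped())   // without these two lines the output went
            .stderr(Stdio::piped())   // to journald and wait_with_output() saw nothing
            .spawn()
            .ok()?;
        timeout(Duration::from_secs(secs), child.wait_with_output())
            .await
            .ok()?
            .ok()
    }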
Bump version to v0.1.222
v0.1.220 broke the disk collector by changing the import from
std::process::Command to tokio::process::Command, but lines 193 and
767 still explicitly used std::process::Command::new(), which silently failed.
Solution: Import both as aliases (TokioCommand/StdCommand) and use the
appropriate type for each operation - async commands use TokioCommand
with run_command_with_timeout, sync commands use StdCommand with the
system timeout wrapper.
Fixes: Empty Storage section after v0.1.220 deployment
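The aliasing looks roughly like this (call sites are illustrative):

    use std::process::Command as StdCommand;
    use tokio::process::Command as TokioCommand;

    // Async call sites build a TokioCommand and go through run_command_with_timeout.
    let lsblk = TokioCommand::new("lsblk");

    // Sync call sites keep std's Command but wrap the call in the external
    // `timeout` binary so a hung tool cannot block the thread forever.
    let smart = StdCommand::new("timeout")
        .args(["5", "smartctl", "-a", "/dev/sda"])
        .output();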
Bump version to v0.1.221
- Changed disk collector to use tokio::process::Command instead of std::process::Command
- Updated run_command_with_timeout to properly kill processes on timeout
- Fixes issue where smartctl hangs on problematic drives (/dev/sda), freezing the entire agent
- Timeout now force-kills hung processes using kill -9, preventing orphaned smartctl processes
This resolves the issue where Data_3 showed unknown status because smartctl was hanging
indefinitely trying to read from a problematic drive, blocking the entire collector.
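One way to get that behaviour with tokio (a sketch; the helper name is illustrative):

    use std::time::Duration;
    use tokio::process::Child;
    use tokio::time::timeout;

    // Wait for a spawned child, but never longer than `secs`; on timeout,
    // force-kill it (SIGKILL) so no orphaned smartctl process keeps the
    // device busy or blocks the collector.
    async fn wait_or_kill(child: &mut Child, secs: u64) -> bool {
        match timeout(Duration::from_secs(secs), child.wait()).await {
            Ok(_) => true,                    // exited on its own
            Err(_) => {
                let _ = child.kill().await;   // tokio's kill() sends SIGKILL and reaps
                false
            }
        }
    }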
Bump version to v0.1.220
Co-Authored-By: Claude <noreply@anthropic.com>
- Sort repositories alphabetically before rendering
- Sort backup disks by serial number
- Prevents display jumping between different orderings on updates
- Consistent display order across refreshes
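The stabilisation is just a deterministic sort before rendering, roughly
(field names are assumed):

    repositories.sort_by(|a, b| a.name.cmp(&b.name));     // alphabetical
    backup_disks.sort_by(|a, b| a.serial.cmp(&b.serial)); // by serial number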
Bump version to v0.1.214
Co-Authored-By: Claude <noreply@anthropic.com>
- Update BackupData structure to support multiple backup disks
- Scan /var/lib/backup/status/ directory for all status files
- Calculate status icons for backup and disk usage
- Aggregate repository status from all disks
- Update dashboard to display all backup disks with per-disk status
- Display repository list with count and aggregated status
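A sketch of the status-directory scan described above (BackupDisk and
parse_backup_status are placeholder names):

    use std::fs;

    // One status file per backup disk under /var/lib/backup/status/;
    // unreadable or unparseable files are simply skipped.
    fn scan_backup_status() -> Vec<BackupDisk> {
        let mut disks = Vec::new();
        if let Ok(entries) = fs::read_dir("/var/lib/backup/status/") {
            for entry in entries.flatten() {
                if let Ok(contents) = fs::read_to_string(entry.path()) {
                    if let Some(disk) = parse_backup_status(&contents) {
                        disks.push(disk);
                    }
                }
            }
        }
        disks
    }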
- Agent now extracts "C" + digits pattern (C3, C10) using char parsing
- Removes suffixes like "_ACPI", "_MWAIT" at source
- Reduces JSON payload size over ZMQ
- No regex dependency - uses fast char iteration (~1μs overhead)
- Robust fallback to original name if pattern not found
- Dashboard simplified to use clean names directly
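A sketch of the char-based extraction (the function name is illustrative):

    // "C3_ACPI" -> "C3", "C10" -> "C10"; anything that doesn't match the
    // "C" + digits pattern falls back to the original name.
    fn clean_cstate_name(raw: &str) -> &str {
        if !raw.starts_with('C') {
            return raw;
        }
        let digits = raw[1..].chars().take_while(|c| c.is_ascii_digit()).count();
        if digits > 0 { &raw[..1 + digits] } else { raw }
    }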
Bump version to v0.1.212
Co-Authored-By: Claude <noreply@anthropic.com>
- Strip suffixes like "_ACPI" from C-state names
- Display changes from "C3_ACPI:51%" to "C3:51%"
- Cleaner, more concise presentation
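On the dashboard side this was a plain suffix strip, roughly (the variable
name is illustrative):

    // "C3_ACPI" -> "C3"; names without an underscore pass through unchanged.
    let clean = name.split('_').next().unwrap_or(name);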
Bump version to v0.1.211
Co-Authored-By: Claude <noreply@anthropic.com>
- Changed code to use zmq.transmission_interval_seconds instead of top-level collection_interval_seconds
- Removed collection_interval_seconds from AgentConfig
- Updated validation to check zmq.transmission_interval_seconds
- Improves config organization by grouping all ZMQ settings together
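Assuming serde-style config structs, the shape becomes roughly:

    use serde::Deserialize;

    #[derive(Deserialize)]
    struct AgentConfig {
        // collection_interval_seconds removed from the top level
        zmq: ZmqConfig,
    }

    #[derive(Deserialize)]
    struct ZmqConfig {
        endpoint: String,                    // assumed field
        transmission_interval_seconds: u64,  // grouped with the other ZMQ settings
    }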
Bump version to v0.1.210
Co-Authored-By: Claude <noreply@anthropic.com>
- Changed CpuData.cstate from String to Vec<CStateInfo>
- Added CStateInfo struct with name and percent fields
- Collector calculates percentage for each C-state based on accumulated time
- Sorts and returns top 3 C-states by usage
- Dashboard displays: "C10:79% C8:10% C6:8%"
Provides better visibility into CPU idle state distribution.
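A sketch of the struct and the top-3 selection (the serialization derive and
helper name are assumptions):

    use serde::Serialize;

    #[derive(Clone, Serialize)]
    struct CStateInfo {
        name: String,
        percent: f64,
    }

    // Given accumulated residency time per C-state, compute each state's
    // share of the total and keep the three largest.
    fn top_cstates(mut states: Vec<(String, u64)>) -> Vec<CStateInfo> {
        let total: u64 = states.iter().map(|(_, t)| t).sum();
        if total == 0 {
            return Vec::new();
        }
        states.sort_by(|a, b| b.1.cmp(&a.1));
        states
            .into_iter()
            .take(3)
            .map(|(name, t)| CStateInfo {
                name,
                percent: t as f64 * 100.0 / total as f64,
            })
            .collect()
    }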
Bump version to v0.1.209
Co-Authored-By: Claude <noreply@anthropic.com>
- Changed CpuData.frequency_mhz to CpuData.cstate (String)
- Implemented collect_cstate() to read CPU idle depth from sysfs
- Finds deepest C-state with most accumulated time (C0-C10)
- Updated dashboard to display C-state instead of frequency
- More accurate indicator of CPU activity vs power management
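A sketch of the sysfs read (paths follow the Linux cpuidle interface; the
real collector may weigh state depth differently):

    use std::fs;

    // Read /sys/devices/system/cpu/cpu0/cpuidle/state*/{name,time} and pick
    // the state with the largest accumulated residency time.
    fn collect_cstate() -> Option<String> {
        let mut best: Option<(u64, String)> = None;
        for entry in fs::read_dir("/sys/devices/system/cpu/cpu0/cpuidle").ok()?.flatten() {
            let dir = entry.path();
            let name = fs::read_to_string(dir.join("name")).ok()?.trim().to_string();
            let time: u64 = fs::read_to_string(dir.join("time")).ok()?.trim().parse().ok()?;
            if best.as_ref().map_or(true, |(t, _)| time > *t) {
                best = Some((time, name));
            }
        }
        best.map(|(_, name)| name)
    }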
Bump version to v0.1.208
Co-Authored-By: Claude <noreply@anthropic.com>
Shared:
- Add memory_bytes, restart_count, uptime_seconds to ServiceData
Agent:
- Add new fields to ServiceStatusInfo struct
- Fetch MemoryCurrent, NRestarts, ExecMainStartTimestamp from systemctl show
- Calculate uptime from start timestamp
- Parse and populate new fields in ServiceData
- Remove unused load_state and sub_state fields
Dashboard:
- Add memory_bytes, restart_count, uptime_seconds to ServiceInfo
- Update header: Service, Status, RAM, Uptime, ↻ (restarts)
- Format memory as MB/GB
- Format uptime as Xd Xh, Xh Xm, or Xm
- Show restart count with ! prefix if > 0 to indicate instability
All metrics are obtained from a single systemctl show call - no extra overhead.
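A sketch of pulling the new fields out of one systemctl show call (parsing
details are illustrative):

    use std::collections::HashMap;

    // `systemctl show <unit> --property=MemoryCurrent,NRestarts,ExecMainStartTimestamp`
    // prints one KEY=VALUE pair per line.
    fn parse_show_output(output: &str) -> HashMap<&str, &str> {
        output.lines().filter_map(|line| line.split_once('=')).collect()
    }

    // e.g. memory_bytes from MemoryCurrent; uptime_seconds is derived from
    // ExecMainStartTimestamp vs. the current time:
    //   let props = parse_show_output(&stdout);
    //   let memory_bytes: Option<u64> = props.get("MemoryCurrent").and_then(|v| v.parse().ok());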
Changed header from 4 columns to 2 columns:
- Before: Service, Status, RAM, Disk
- After: Service, Status
Matches the removal of memory_mb and disk_gb fields.
Complete removal of service resource metrics:
Agent:
- Remove memory_mb and disk_gb fields from ServiceData struct
- Remove get_service_memory_usage() method
- Remove get_service_disk_usage() method
- Remove get_directory_size() method
- Remove unused warn import
Dashboard:
- Remove memory_mb and disk_gb from ServiceInfo struct
- Remove memory/disk display from format_parent_service_line
- Remove memory/disk parsing in legacy metric path
- Remove unused format_disk_size() function
Service resource metrics were slow, unreliable, and never worked properly
since the structured data migration. They will be handled differently
in the future.
Emoji rendering in terminals can be very slow, especially in the hot path (every frame for every docker image). The whale emoji 🐋 was causing significant rendering delays.
Temporary change to ASCII 'D' to test whether the emoji was the performance issue.
Docker images now display with a distinctive 🐋 whale icon in blue (highlight color) instead of status icons. This clearly identifies them as docker images without implying operational status.
Agent changes:
- Changed docker ps and docker images commands to run without sudo
- The cm-agent user is already in the docker group, so sudo is not needed
- Fixes "unable to change to root gid: Operation not permitted" error
- Systemd security restrictions were blocking sudo gid changes
This fixes Docker container and image collection on systems with
systemd security hardening enabled.
Updated to version 0.1.178
Agent changes:
- Log stderr output when docker images command fails
- This will show the actual error message (e.g., permission denied, docker not found)
- Helps diagnose why docker images collection is failing
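The added diagnostic is roughly (assuming the crate's existing warn! macro):

    if !output.status.success() {
        warn!(
            "docker images failed: {}",
            String::from_utf8_lossy(&output.stderr).trim()
        );
    }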
Updated to version 0.1.177