- Add SnapRAID parity drive detection to mergerfs discovery
- Remove Pool Status health line as discussed
- Update drive display to always show wear data when available
- Include /mnt/parity drives as part of mergerfs pool structure
- Improve pool name extraction in dashboard parsing
- Use consistent mergerfs pool naming in agent
- Add mount_point metric parsing to use actual mount paths
- Fix pool consolidation to prevent duplicate entries
Add support for numeric mergerfs references like "1:2" by mapping them
to actual mount points (/mnt/disk1, /mnt/disk2). This enables proper
mergerfs pool detection and hides individual member drives as intended.
Skip mergerfs pools with numeric device references (e.g., "1:2")
instead of crashing. This allows regular drive detection to work
even when mergerfs uses non-standard mount formats.
Preserves existing functionality for standard mergerfs setups.
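A minimal sketch of the approach (the helper name and the /mnt/diskN layout are assumptions, not the agent's actual code; returning None is the skip-instead-of-crash fallback for formats the mapper doesn't recognize):
// Hypothetical helper: expand numeric mergerfs branch references like
// "1:2" into mount points. Returning None makes the caller skip the
// pool and fall back to regular drive detection instead of crashing.
fn expand_numeric_branches(device: &str) -> Option<Vec<String>> {
    let all_numeric = device
        .split(':')
        .all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()));
    if !all_numeric {
        return None; // non-standard format: skip, don't crash
    }
    Some(device.split(':').map(|n| format!("/mnt/disk{n}")).collect())
}

fn main() {
    assert_eq!(
        expand_numeric_branches("1:2"),
        Some(vec!["/mnt/disk1".into(), "/mnt/disk2".into()])
    );
    assert_eq!(expand_numeric_branches("/mnt/disk1:/mnt/disk2"), None);
}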
1. Add missing _fs_ filter to usage_percent parsing in dashboard
2. Fix agent to use calculated fs_status instead of hardcoded Status::Ok
This completes the disk collector auto-discovery by ensuring filesystem
usage percentages and status indicators display correctly.
Remove unused debug code and fix device name parsing to properly
handle lsblk tree characters. This resolves the issue where only the
/boot filesystem was discovered instead of both /boot and /.
Add debug logging to filesystem usage collection to identify why
some mount points are being dropped during discovery. This should
resolve the issue where total capacity shows incorrect values.
Replaced complex disk collector with simple lsblk → df → group workflow.
Supports both physical drives and mergerfs pools with unified metrics.
Eliminates configuration complexity through pure auto-discovery.
- Clean discovery pipeline using lsblk and df commands
- Physical drive grouping with filesystem children
- MergerFS pool detection with parity heuristics
- Unified metric generation for consistent dashboard display
- SMART data collection for temperature, wear, and health
Updated filesystem grouping to use extract_base_device method for proper
partition-to-drive mapping. This ensures nvme0n1p1 and nvme0n1p2 are
correctly grouped under nvme0n1 drive pool instead of separate pools.
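An illustrative sketch of the partition-to-drive mapping (the real extract_base_device body may differ):
// Handles NVMe/MMC ("nvme0n1p2" -> "nvme0n1") and traditional
// ("sda1" -> "sda") device naming.
fn extract_base_device(name: &str) -> String {
    if name.starts_with("nvme") || name.starts_with("mmcblk") {
        // Partitions end in "p<digits>"; whole disks have no such suffix.
        if let Some(idx) = name.rfind('p') {
            let suffix = &name[idx + 1..];
            if !suffix.is_empty() && suffix.chars().all(|c| c.is_ascii_digit()) {
                return name[..idx].to_string();
            }
        }
        name.to_string()
    } else {
        // SATA/SCSI partitions end in digits ("sda1" -> "sda").
        name.trim_end_matches(|c: char| c.is_ascii_digit()).to_string()
    }
}

fn main() {
    assert_eq!(extract_base_device("nvme0n1p1"), "nvme0n1");
    assert_eq!(extract_base_device("nvme0n1p2"), "nvme0n1");
    assert_eq!(extract_base_device("sda1"), "sda");
    assert_eq!(extract_base_device("nvme0n1"), "nvme0n1");
}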
- Implement filesystem children display under physical drive pools
- Agent generates individual filesystem metrics for each mount point
- Dashboard parses filesystem metrics and displays as tree children
- Add filesystem usage, total, and available space metrics
- Support target format: drive info + filesystem children hierarchy
- Fix compilation warnings by properly using available_bytes calculation
- Group single disk filesystems by physical drive during auto-discovery
- Create physical drive pools with filesystem children
- Display temperature, wear, and health at drive level
- Provide consistent hierarchical storage visualization
- Fix borrow checker issues in create_physical_drive_pool method
- Add PhysicalDrive case to all StoragePoolType match statements
- Add automatic detection of mergerfs pools by parsing /proc/mounts
- Implement smart heuristics for parity disk identification
- Store discovered topology at agent startup for efficient monitoring
- Eliminate need for manual storage pool configuration
- Support zero-config storage visualization with backward compatibility
- Clean up mount parsing and remove unused fields
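A sketch of the /proc/mounts scan, assuming the standard fuse.mergerfs line format (types and names are illustrative, not the agent's real structures):
use std::fs;

struct MergerfsPool {
    mount_point: String,
    branches: Vec<String>,
}

fn discover_mergerfs_pools() -> std::io::Result<Vec<MergerfsPool>> {
    // /proc/mounts lines: <device> <mountpoint> <fstype> <options> <dump> <pass>
    // For mergerfs the device field is a colon-separated branch list.
    let mounts = fs::read_to_string("/proc/mounts")?;
    Ok(mounts
        .lines()
        .filter_map(|line| {
            let fields: Vec<&str> = line.split_whitespace().collect();
            (fields.len() >= 3 && fields[2] == "fuse.mergerfs").then(|| MergerfsPool {
                mount_point: fields[1].to_string(),
                branches: fields[0].split(':').map(str::to_string).collect(),
            })
        })
        .collect())
}

fn main() -> std::io::Result<()> {
    for pool in discover_mergerfs_pools()? {
        // Parity heuristic: branches under /mnt/parity* count as parity disks.
        println!("{} <- {:?}", pool.mount_point, pool.branches);
    }
    Ok(())
}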
- Add support for mergerfs pool grouping with data and parity disk separation
- Implement pool health monitoring (healthy/degraded/critical status)
- Create hierarchical tree view for multi-disk storage arrays
- Add automatic pool type detection and member disk association
- Maintain backward compatibility for single disk configurations
- Support future extension for RAID and ZFS pool types
- Add disk wear percentage collection from SMART data in backup script
- Add backup_disk_wear_percent metric to backup collector with thresholds
- Display wear percentage in backup widget disk section
- Fix storage section overflow handling to use consistent "X more below" logic
- Update maintenance mode to return pending status instead of unknown
- Remove scroll offset fields from HostWidgets struct
- Replace scrolling with simple "X more below" indicators in all widgets
- Remove user-stopped service tracking from agent (now uses SSH control)
- Inactive services now consistently show Status::Inactive with empty circles
- Simplify widget render methods by removing scroll parameters
- Clean up unused imports and legacy scrolling infrastructure
- Fix journalctl command to use -fu for proper log following
- Add new Status::Inactive variant to enum for better service state representation
- Agent now assigns Status::Inactive instead of Status::Warning for inactive services
- Dashboard displays inactive services with empty circle (○) icon in gray color
- User-stopped services still show as Status::Ok with green filled circle
- Inactive services treated as OK for host status aggregation
- Improves visual clarity between active (●), inactive (○), and warning (◐) states
- Add Status::Offline enum variant for disconnected hosts
- All configured hosts now remain visible, showing offline status when disconnected
- Add WakeOnLAN support using wake-on-lan Rust crate
- Implement w key binding to wake offline hosts with MAC addresses
- Simplify configuration to single [hosts] section with MAC addresses only
- Change critical status icon from ◯ to ! for better visibility
- Add proper MAC address parsing and error handling
- Silent WakeOnLAN operation with logging for success/failure
Configuration format:
[hosts]
hostname = { mac_address = "AA:BB:CC:DD:EE:FF" }
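For illustration, a std-only sketch of what the wake-on-lan crate sends: a magic packet is 6 bytes of 0xFF followed by the target MAC repeated 16 times, broadcast over UDP (port 9 used here), with MAC parsing as described above:
use std::net::UdpSocket;

fn parse_mac(s: &str) -> Option<[u8; 6]> {
    let parts: Vec<&str> = s.split(':').collect();
    if parts.len() != 6 {
        return None;
    }
    let mut mac = [0u8; 6];
    for (byte, part) in mac.iter_mut().zip(parts) {
        *byte = u8::from_str_radix(part, 16).ok()?;
    }
    Some(mac)
}

fn wake(mac: [u8; 6]) -> std::io::Result<()> {
    let mut packet = [0xFFu8; 102]; // 6 sync bytes + 16 * 6 MAC bytes
    for chunk in packet[6..].chunks_mut(6) {
        chunk.copy_from_slice(&mac);
    }
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.set_broadcast(true)?;
    socket.send_to(&packet, "255.255.255.255:9")?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    let mac = parse_mac("AA:BB:CC:DD:EE:FF").expect("invalid MAC address");
    wake(mac)
}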
Correct user-stopped service behavior during startup transitions:
User-Stopped Flag Timing Fix:
- Clear user-stopped flag only when service actually becomes active, not when start command succeeds
- Remove premature flag clearing from service control handler
- Add automatic flag clearing when service status metrics show active state
- Services retain user-stopped status during activating/transitioning states
Service Transition Handling:
- User-stopped services in activating state now report Status::Ok instead of Status::Pending
- Prevents host warnings during legitimate service startup transitions
- Maintains accurate status reporting throughout service lifecycle
- Failed service starts preserve user-stopped flags correctly
Journalctl Popup Fix:
- Fix terminal corruption when using J key for service logs
- Correct command quoting to prevent tmux popup interference
- Stable popup display without dashboard interface corruption
Result: Clean service startup experience with no false warnings and proper
user-stopped tracking throughout the entire service lifecycle.
Bump version to v0.1.47
Add comprehensive tracking for services stopped via dashboard to prevent
false alerts when users intentionally stop services.
Features:
- User-stopped services report Status::Ok instead of Warning
- Persistent storage survives agent restarts
- Dashboard sends UserStart/UserStop commands
- Agent tracks and syncs user-stopped state globally
- Systemd collector respects user-stopped flags
Implementation:
- New service_tracker module with persistent JSON storage
- Enhanced ServiceAction enum with UserStart/UserStop variants
- Global singleton tracker accessible by collectors
- Service status logic updated to check user-stopped flag
- Dashboard version now uses CARGO_PKG_VERSION automatically
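A hedged sketch of the persistent tracker (the path, function names, and serde_json usage are assumptions, not the module's actual code):
use std::collections::HashSet;
use std::fs;
use std::sync::{Mutex, OnceLock};

const STATE_PATH: &str = "/var/lib/cm-agent/user-stopped.json"; // hypothetical path

fn tracker() -> &'static Mutex<HashSet<String>> {
    static TRACKER: OnceLock<Mutex<HashSet<String>>> = OnceLock::new();
    TRACKER.get_or_init(|| {
        // Load persisted state so user-stopped flags survive agent restarts.
        let set = fs::read_to_string(STATE_PATH)
            .ok()
            .and_then(|s| serde_json::from_str(&s).ok())
            .unwrap_or_default();
        Mutex::new(set)
    })
}

fn mark_user_stopped(unit: &str, stopped: bool) {
    let mut set = tracker().lock().unwrap();
    if stopped {
        set.insert(unit.to_string());
    } else {
        set.remove(unit); // cleared once the service is active again
    }
    if let Ok(json) = serde_json::to_string(&*set) {
        let _ = fs::write(STATE_PATH, json); // best-effort persistence
    }
}

fn is_user_stopped(unit: &str) -> bool {
    tracker().lock().unwrap().contains(unit)
}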
Bump version to v0.1.43
- Add get_git_commit() method to read /var/lib/cm-dashboard/git-commit
- Replace NixOS build version with actual git commit hash
- Show deployed commit hash as 'Build:' value for accurate tracking
- Enable verification of which exact commit is deployed per host
- Update version to 0.1.42
- Replace hardcoded 500ms/2000ms thresholds with configurable nginx_latency_critical_ms
- Simplify status logic to only OK or Critical (no Warning status)
- Add validation for nginx latency threshold configuration
- Re-enable nginx site collection with configurable thresholds
- Resolves issue where sites showed critical at 2000ms despite 30s timeout setting
- Bump version to v0.1.38
Enhanced service discovery to properly show status for all services:
Changes:
- Use systemctl list-unit-files for complete service discovery (finds all services)
- Use systemctl list-units --all for batch runtime status fetching
- Combine both datasets to get comprehensive service list with correct status
- Services found in unit-files but not in the runtime list are marked as inactive (Warning status)
- Eliminates 'unknown' status issue while maintaining complete service visibility
Now inactive services show as Warning (yellow ◐) and active services show as Ok (green ●)
instead of all services showing as unknown (? icon).
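A sketch of the combine step described above (column handling is illustrative): list-unit-files yields every known service, list-units --all yields runtime status for the loaded subset, and anything absent from the runtime set defaults to inactive:
use std::collections::HashMap;

fn combine(unit_files: &str, units_all: &str) -> HashMap<String, String> {
    // list-units --all: "UNIT LOAD ACTIVE SUB DESCRIPTION" -> unit -> ACTIVE
    let runtime: HashMap<&str, &str> = units_all
        .lines()
        .filter_map(|l| {
            let f: Vec<&str> = l.split_whitespace().collect();
            (f.len() >= 4 && f[0].ends_with(".service")).then(|| (f[0], f[2]))
        })
        .collect();

    // list-unit-files: "UNIT FILE STATE" -> take the unit name only
    unit_files
        .lines()
        .filter_map(|l| l.split_whitespace().next())
        .filter(|u| u.ends_with(".service"))
        .map(|u| {
            let status = runtime.get(u).copied().unwrap_or("inactive");
            (u.to_string(), status.to_string())
        })
        .collect()
}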
Changed service discovery from 'systemctl list-units --all' to 'systemctl list-unit-files'
to ensure ALL service unit files are discovered, including services that have never been started.
Changes:
- Updated systemctl command to use list-unit-files instead of list-units --all
- Modified parsing logic to handle unit file format (2 fields vs 4 fields)
- Set placeholder values in the discovery cache; actual runtime status is fetched during collection
- This ensures all configured services (like inactive ARK servers) appear in dashboard
The issue was that list-units --all only shows services systemd has loaded/attempted to load,
but list-unit-files shows ALL service unit files regardless of their runtime state.
- Fix /tmp usage status to use proper thresholds instead of hardcoded Ok status
- Fix wear level status to use configurable thresholds instead of hardcoded values
- Add dedicated tmp_status field to SystemWidget for proper /tmp status display
- Remove host-level hourglass icon during service operations
- Implement immediate service status updates after start/stop/restart commands
- Remove active users display and collection from NixOS section
- Fix immediate host status aggregation transmission to dashboard
Major architectural improvements:
CORE CHANGES:
- Remove notification_interval_seconds - status aggregation now immediate
- Status calculation moved to collection phase instead of transmission
- Event-driven transmission triggers immediately on status changes
- Dual transmission strategy: immediate on change + periodic backup
- Real-time notifications without batching delays
TECHNICAL IMPROVEMENTS:
- process_metric() now returns bool indicating status change
- Immediate ZMQ broadcast when status changes detected
- Status aggregation happens during metric collection, not later
- Legacy get_nixos_build_info() method removed (unused)
- All compilation warnings fixed
BEHAVIOR CHANGES:
- Critical alerts sent instantly instead of waiting for intervals
- Dashboard receives real-time status updates
- Notifications triggered immediately on status transitions
- Periodic backup transmission every 1s ensures a heartbeat
This provides much more responsive monitoring with instant alerting
while maintaining the reliability of periodic transmission as backup.
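A sketch of the process_metric() contract; type and method names are assumptions drawn from the description above, not the real agent:
#[derive(Clone, Copy, PartialEq)]
enum Status { Ok, Warning, Critical }

struct Agent { host_status: Status }

impl Agent {
    // Returns true when the aggregated status changed, so the caller can
    // broadcast over ZMQ immediately instead of waiting for the periodic cycle.
    fn process_metric(&mut self, metric_status: Status) -> bool {
        let new_status = self.aggregate(metric_status);
        let changed = new_status != self.host_status;
        self.host_status = new_status;
        changed
    }

    // Worst-status-wins aggregation, done at collection time (illustrative).
    fn aggregate(&self, incoming: Status) -> Status {
        match (self.host_status, incoming) {
            (Status::Critical, _) | (_, Status::Critical) => Status::Critical,
            (Status::Warning, _) | (_, Status::Warning) => Status::Warning,
            _ => Status::Ok,
        }
    }
}

fn main() {
    let mut agent = Agent { host_status: Status::Ok };
    assert!(agent.process_metric(Status::Critical)); // change: send immediately
    assert!(!agent.process_metric(Status::Critical)); // no change: periodic only
}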
- Implement per-collector interval timing respecting NixOS config
- Remove all hardcoded timeout/interval values and make configurable
- Add tmux session requirement check for TUI mode (bypassed for headless)
- Update agent to send config hash in Build field instead of nixos version
- Add nginx check interval, HTTP timeouts, and ZMQ transmission interval configs
- Update NixOS configuration with new configurable values
Breaking changes:
- Build field now shows nix store config hash (8 chars) instead of nixos version
- All intervals now follow individual collector configuration instead of global
New configuration fields:
- systemd.nginx_check_interval_seconds
- systemd.http_timeout_seconds
- systemd.http_connect_timeout_seconds
- zmq.transmission_interval_seconds
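The same fields mirrored as Rust config structs, a sketch assuming serde derives (the agent's actual config types may differ):
#[derive(serde::Deserialize)]
struct SystemdConfig {
    nginx_check_interval_seconds: u64,
    http_timeout_seconds: u64,
    http_connect_timeout_seconds: u64,
}

#[derive(serde::Deserialize)]
struct ZmqConfig {
    transmission_interval_seconds: u64,
}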
- Support multiple SATA SSD wear attributes (SSD_Life_Left, Media_Wearout_Indicator, etc.)
- Handle manufacturer differences in wear reporting
- Proper parsing of SMART table format with VALUE column
- Covers Samsung, Intel, Crucial and other common SSD types
- NVMe Percentage Used support maintained
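A sketch of the table parsing; the attribute list and the 100-minus-VALUE interpretation are assumptions that vary by vendor, and NVMe "Percentage Used" comes from a different output format parsed separately (not shown):
const WEAR_ATTRIBUTES: &[&str] = &[
    "SSD_Life_Left",
    "Media_Wearout_Indicator",
    "Wear_Leveling_Count",
    "Percent_Lifetime_Remain",
];

fn parse_wear_percent(smartctl_output: &str) -> Option<u8> {
    for line in smartctl_output.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        // SMART table columns: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH ...
        if fields.len() >= 4 && WEAR_ATTRIBUTES.contains(&fields[1]) {
            // Normalized VALUE counts down from 100 as the drive wears.
            let value: u8 = fields[3].parse().ok()?;
            return Some(100u8.saturating_sub(value));
        }
    }
    None
}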
- Consolidate SMART thresholds into DiskConfig structure
- Remove separate SmartConfig - disk collector handles all drive data
- Update NixOS configuration to use disk.temperature_* settings
- Remove hardcoded temperature thresholds in disk collector
- Logical grouping: disk collector owns all disk/drive configuration
- Remove separate smart collector implementation
- Disk collector already handles SMART data for drives
- Eliminates duplicate smartctl calls causing performance issues
- SMART functionality remains in logical place with disk monitoring
- Fixes infinite smartctl loop issue
- Update to match current Metric structure
- Use correct Status enum and collector interface
- Fix MetricValue types and constructor usage
- Builds successfully with warnings only
- Rewrite smart collector to match current architecture
- Add back to mod.rs exports
- Fixes infinite smartctl loop issue
- Uses simple health and temperature monitoring
- Replace git commit hash with nix store hash extraction
- Read from /run/current-system symlink target
- Extract first 8 characters of nix store hash: d8ivwiar
- Shows actual deployed configuration, not just source
- Enables proper rebuild completion detection
- Accurate deployment verification
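A sketch of the extraction: /run/current-system resolves to /nix/store/<hash>-nixos-system-<host>-<version>, and the first 8 characters of the hash become the Build value (names here are illustrative):
use std::fs;

fn current_system_hash() -> Option<String> {
    let target = fs::read_link("/run/current-system").ok()?;
    let name = target.file_name()?.to_str()?.to_string();
    let hash = name.split('-').next()?; // "<hash>-nixos-system-..."
    Some(hash.chars().take(8).collect()) // e.g. "d8ivwiar"
}

fn main() {
    println!("build: {}", current_system_hash().unwrap_or_else(|| "unknown".into()));
}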
- Add sudo permissions for systemctl and nixos-rebuild commands
- Use sudo in agent command execution for proper privileges
- Fix backup collector to handle missing status files gracefully
- Eliminate backup error spam when no backup system is configured
- Handle lsblk tree symbols (├─, └─) in device parsing
- Extract base device names from partitions (nvme0n1p2 -> nvme0n1)
- Support both NVMe and traditional device naming schemes
- Fixes missing device lines in storage display
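A sketch of the prefix stripping before device-name parsing (the real parser may differ):
// "├─nvme0n1p1" -> "nvme0n1p1"; base-device extraction runs afterwards.
fn strip_tree_symbols(lsblk_name: &str) -> &str {
    lsblk_name.trim_start_matches(|c: char| "├─└│ ".contains(c))
}

fn main() {
    assert_eq!(strip_tree_symbols("├─nvme0n1p1"), "nvme0n1p1");
    assert_eq!(strip_tree_symbols("└─sda2"), "sda2");
    assert_eq!(strip_tree_symbols("nvme0n1"), "nvme0n1");
}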
- Replace findmnt with lsblk for efficient device name detection
- Fix tree indentation to align consistently with status icon text
- Hide '(Single)' label for single disk storage pools
- Device detection returns actual names (nvme0n1, sda), not UUID paths
- Remove underlying_devices field from FilesystemConfig
- Add device detection at startup using findmnt command
- Store detected devices in HashMap for reuse during collection
- Keep all existing functionality (StoragePool, DriveInfo, SMART data)
- Detect devices only once at initialization, not every collection cycle
- Fixes agent startup failure due to missing underlying_devices config
Implement agent version tracking to diagnose deployment issues:
- Add get_agent_hash() method to extract Nix store hash from executable path
- Collect system_agent_hash metric in NixOS collector
- Display "Agent Hash" in system panel under NixOS section
- Update metric filtering to include agent hash
This helps identify which version of the agent is actually running
when troubleshooting deployment or metric collection issues.
- Strip codename part (e.g., '(Warbler)') from nixos-version output
- Display clean version format: '25.05.20251004.3bcc93c'
- Simplify parsing to use raw nixos-version output as requested
- Change from showing version to build format: 'hash dd/mm/yy H:M:S'
- Parse nixos-version output to extract short hash and format date
- Update system widget to display 'Build:' instead of 'Version:'
- Remove version/build_date fields in favor of single build string
- Follow TODO.md specification for NixOS section layout
- Remove /tmp autodetection from disk collector (57 lines removed)
- Add tmpfs monitoring to memory collector with get_tmpfs_metrics() method
- Generate memory_tmp_* metrics for proper RAM-based tmpfs monitoring
- Fix type annotations in tmpfs parsing for compilation
- System widget now correctly displays tmpfs usage in RAM section
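A sketch of the usage read behind get_tmpfs_metrics() (the df flags and helper shape are assumptions):
use std::process::Command;

fn tmpfs_usage_percent(mount: &str) -> Option<f64> {
    // df with byte units; --output picks just the columns we need.
    let out = Command::new("df")
        .args(["-B1", "--output=used,size", mount])
        .output()
        .ok()?;
    let stdout = String::from_utf8_lossy(&out.stdout);
    let line = stdout.lines().nth(1)?; // skip the header row
    let mut fields = line.split_whitespace();
    let used: f64 = fields.next()?.parse().ok()?;
    let size: f64 = fields.next()?.parse().ok()?;
    (size > 0.0).then(|| 100.0 * used / size)
}

fn main() {
    println!("/tmp usage: {:?}", tmpfs_usage_percent("/tmp"));
}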
- Create NixOS collector for version and active users detection
- Add SystemWidget combining all system information in TODO.md layout
- Replace separate CPU/Memory widgets with unified system display
- Add tree structure for storage with drive temperature/wear info
- Support NixOS version, active users, load averages, memory usage
- Follow exact decimal formatting from specification
- Use config.interval_seconds instead of hardcoded 300 seconds
- Discovery now happens every 10 seconds (configurable) instead of 5 minutes
- Follows configuration-driven architecture requirements
- Restructure get_monitored_services to avoid nested write locks
- Split discover_services into discover_services_internal that returns data
- Update state in separate scope to prevent deadlock
- Fix borrow checker errors with clone() for status cache
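The lock-scoping pattern, sketched with assumed types: build the data without holding the lock, then take the write lock in its own scope so it is dropped before anything can re-acquire it:
use std::collections::HashMap;
use std::sync::RwLock;

struct SystemdCollector {
    services: RwLock<HashMap<String, String>>,
}

impl SystemdCollector {
    fn get_monitored_services(&self) -> Vec<String> {
        // No locks held while running systemctl and parsing.
        let discovered = Self::discover_services_internal();

        // Separate scope: the write guard is dropped at the closing brace,
        // so no nested lock acquisition can deadlock.
        {
            let mut state = self.services.write().unwrap();
            *state = discovered.clone();
        }

        discovered.keys().cloned().collect()
    }

    fn discover_services_internal() -> HashMap<String, String> {
        // ... run systemctl and parse; stubbed for the sketch.
        HashMap::new()
    }
}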
Add detailed debug logging to track:
- Service discovery start
- Individual service parsing
- Final service count and list
- Empty results indication
This will help identify why cmbox disappeared from the dashboard.
Major performance optimization:
- Parse and cache service status during discovery from systemctl list-units
- Eliminate per-service systemctl is-active and show calls
- Reduce systemctl calls from 1+2N to just 1 call total
- For 10 services: 21 calls → 1 call (95% reduction)
- Add fallback to systemctl for cache misses
This completes the major systemctl call reduction goal from TODO.md.
Implement glob pattern matching for service filters:
- nginx* matches nginx, nginx-config-reload, etc.
- *backup matches any service ending with 'backup'
- docker*prune matches docker-weekly-prune, etc.
- Exact matches still work as before (backward compatible)
Addresses TODO.md requirement for '*' filtering support.
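A minimal sketch of such a matcher (only '*' is supported; ASCII service names assumed, so byte slicing is safe):
fn glob_match(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard: exact match keeps working as before.
        None => pattern == name,
        Some((prefix, rest)) => {
            if !name.starts_with(prefix) {
                return false;
            }
            let remainder = &name[prefix.len()..];
            // Let '*' consume zero or more characters, then match the rest.
            (0..=remainder.len()).any(|i| glob_match(rest, &remainder[i..]))
        }
    }
}

fn main() {
    assert!(glob_match("nginx*", "nginx-config-reload"));
    assert!(glob_match("*backup", "nightly-backup"));
    assert!(glob_match("docker*prune", "docker-weekly-prune"));
    assert!(glob_match("sshd", "sshd")); // exact match still works
    assert!(!glob_match("nginx*", "postgresql"));
}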