cm-dashboard

Author	SHA1	Message	Date
Christoffer Martinsson	14aae90954	Fix storage display and improve UI formatting - Fix duplicate storage pool issue by clearing cache on agent startup - Change storage pool header text to normal color for better readability - Improve services panel tree icons with proper └─ symbols for last items - Ensure fresh metrics data on each agent restart	2025-10-22 23:02:16 +02:00
Christoffer Martinsson	52d630a2e5	Remove legacy indexed disk metrics parsing Eliminate duplicate storage entries by removing old disk_count dependency. Dashboard now uses pure auto-discovery of disk_{pool}_usage_percent metrics. Fixes multiple storage instances (Storage 0, Storage 1, Storage root) showing only proper tree structure format.	2025-10-22 21:27:11 +02:00
Christoffer Martinsson	b1f294cf2f	Implement storage widget tree structure with themed status icons Add proper hierarchical tree display for storage pools and drives: - Pool headers with status icons and type indication (Single/multi-drive) - Individual drive lines with ├─ tree symbols and health status - Usage summary with └─ end symbol and capacity status - T: and W: prefixes for temperature and wear level metrics - Themed status icons using StatusIcons::get_icon() with proper colors - 2-space indentation for clean tree structure appearance Replace flat storage display with beautiful tree format: ● Storage steampool (multi-drive): ├─ ● sdb T:35°C W:12% ├─ ● sdc T:38°C W:8% └─ ● 78.1% 1250.3GB/1600.0GB Uses agent-calculated status from NixOS-configured thresholds. Update CLAUDE.md with complete implementation specification.	2025-10-22 21:17:33 +02:00
Christoffer Martinsson	1591565b1b	Update storage widget for enhanced disk collector metrics Restructure storage display to handle new individual metrics architecture: - Parse disk_{pool}_* metrics instead of indexed disk_{index}_* format - Support individual drive metrics disk_{pool}_{drive}_health/temperature/wear - Display tree structure: "Storage {pool} ({type}): drive details" - Show pool usage summary with individual drive health/temp/wear status - Auto-discover storage pools and drives from metric patterns - Maintain proper status aggregation from individual metrics The dashboard now correctly displays the new enhanced disk collector output with storage pools containing multiple drives and their individual metrics.	2025-10-22 20:40:24 +02:00
Christoffer Martinsson	a6c2983f65	Add automatic config file detection for dashboard TUI - Dashboard now automatically looks for /etc/cm-dashboard/dashboard.toml - No need to specify --config flag when using standard NixOS deployment - Fallback to manual config path if default not found - Update help text to reflect optional config parameter - Simplifies dashboard usage - just run 'cm-dashboard' without arguments	2025-10-21 22:11:35 +02:00
Christoffer Martinsson	3d2b37b26c	Remove hardcoded defaults and migrate dashboard config to NixOS - Remove all unused configuration options from dashboard config module - Eliminate hardcoded defaults - dashboard now requires config file like agent - Keep only actually used config: zmq.subscriber_ports and hosts.predefined_hosts - Remove unused get_host_metrics function from metric store - Clean up missing module imports (hosts, utils) - Make dashboard fail fast if no configuration provided - Align dashboard config approach with agent configuration pattern	2025-10-21 21:54:23 +02:00
Christoffer Martinsson	a6d2a2f086	Code cleanup	2025-10-21 21:19:21 +02:00
Christoffer Martinsson	338c4457a5	Remove legacy notification code and fix all warnings	2025-10-21 19:48:55 +02:00
Christoffer Martinsson	f4b5bb814d	Fix dashboard UI: correct pending color (blue) and use host_status_summary metric	2025-10-21 19:32:37 +02:00
Christoffer Martinsson	98e3ecb0ea	Clean up warnings and add Status::Pending support to dashboard UI	2025-10-21 18:27:11 +02:00
Christoffer Martinsson	eb268922bd	Remove all unused code and fix build warnings - Remove unused struct fields: tier, config_name, last_collection_time - Remove unused structs: PerformanceMetrics, PerfMonitor - Remove unused methods: get_performance_metrics, get_collector_names, get_stats - Remove unused utility functions and system helpers - Remove unused config fields from CPU and Memory collectors - Keep config fields that are actually used (DiskCollector, etc.) - Remove unused proxy_pass_url variable and assignments - Fix duplicate hostname variable declaration - Achieve zero build warnings without functionality changes	2025-10-20 20:20:47 +02:00
Christoffer Martinsson	00a8ed3da2	Implement hysteresis for metric status changes to prevent flapping Add comprehensive hysteresis support to prevent status oscillation near threshold boundaries while maintaining responsive alerting. Key Features: - HysteresisThresholds with configurable upper/lower limits - StatusTracker for per-metric status history - Default gaps: CPU load 10%, memory 5%, disk temp 5°C Updated Components: - CPU load collector (5-minute average with hysteresis) - Memory usage collector (percentage-based thresholds) - Disk temperature collector (SMART data monitoring) - All collectors updated to support StatusTracker interface Cache Interval Adjustments: - Service status: 60s → 10s (faster response) - Disk usage: 300s → 60s (more frequent checks) - Backup status: 900s → 60s (quicker updates) - SMART data: moved to 600s tier (10 minutes) Architecture: - Individual metric status calculation in collectors - Centralized StatusTracker in MetricCollectionManager - Status aggregation preserved in dashboard widgets	2025-10-20 18:45:41 +02:00
Christoffer Martinsson	2ccfc4256a	Fix nginx monitoring and services panel alignment - Add support for both proxied and static nginx sites - Proxied sites show 'P' prefix and check backend URLs - Static sites check external HTTPS URLs - Fix services panel column alignment for main services - Keep 10-second timeout for all site checks	2025-10-20 14:56:26 +02:00
Christoffer Martinsson	ecaf3aedb5	Add space between archive count and 'archives' in backup panel	2025-10-20 13:24:23 +02:00
Christoffer Martinsson	959745b51b	Fix host navigation to work with alphabetical host ordering - Fix host_index calculation for localhost to use actual position in sorted list - Remove incorrect assumption that localhost is always at index 0 - Host navigation (Tab key) now works correctly with all hosts in alphabetical order Fixes issue where only 3 of 5 hosts were accessible via Tab navigation.	2025-10-20 13:12:39 +02:00
Christoffer Martinsson	d349e2742d	Fix dashboard title host ordering to use alphabetical sort - Remove predefined host order that was causing random display order - Sort hosts alphabetically for consistent title display - Localhost is still auto-selected at startup but doesn't affect display order - Title will now show: cmbox ● labbox ● simonbox ● srv01 ● srv02 ● steambox Eliminates confusing random host order in dashboard title bar.	2025-10-20 13:07:10 +02:00
Christoffer Martinsson	d4531ef2e8	Hide backup panel when no backup data is present - Add has_data() method to BackupWidget to check if backup metrics exist - Modify dashboard layout to conditionally show backup panel only when data exists - When no backup data: system panel takes full left side height - When backup data exists: system and backup panels share left side equally Prevents empty backup panel from taking up screen space unnecessarily.	2025-10-20 13:01:42 +02:00
Christoffer Martinsson	8023da2c1e	Fix dashboard disk widget flickering by sorting disks consistently - Sort physical devices by name to prevent random HashMap iteration order - Sort partitions within each device by disk index for consistency - Eliminates flickering caused by disks changing positions randomly The dashboard storage section now maintains stable disk order across updates.	2025-10-20 11:25:45 +02:00
Christoffer Martinsson	ca160c9627	Fix tab navigation to respect user choice and prevent jumping back to localhost - Add user_navigated_away flag to track manual navigation - Only auto-switch to localhost if user hasn't manually navigated away - Reset flag when host disconnects to allow auto-selection - Preserves user's tab navigation choices while still prioritizing localhost initially	2025-10-19 11:21:59 +02:00
Christoffer Martinsson	bf2f066029	Fix localhost prioritization to always switch when localhost connects - Dashboard now switches to localhost even if another host is already selected - Ensures localhost is always preferred regardless of connection order - Resolves issue where srv01 connecting first would prevent localhost selection	2025-10-19 11:12:05 +02:00
Christoffer Martinsson	07633e4e0e	Implement localhost prioritization and status display in dashboard - Always select localhost as default host at startup - Order hosts with localhost first, then predefined sequence - Display hostname status colors in title bar based on metric aggregation - Add gethostname dependency for localhost detection	2025-10-19 10:56:42 +02:00
Christoffer Martinsson	0141a6e111	Remove unused code and eliminate build warnings Removed unused widget subscription system, cache utilities, error variants, theme functions, and struct fields. Replaced subscription-based widgets with direct metric filtering. Build now completes with zero warnings.	2025-10-18 23:50:15 +02:00
Christoffer Martinsson	7f85a6436e	Clean up unused imports and fix build warnings - Remove unused imports (Duration, HashMap, SharedError, DateTime, etc.) - Fix unused variables by prefixing with underscore - Remove redundant dashboard.toml config file - Update theme imports to use only needed components - Maintain all functionality while reducing warnings - Add srv02 to predefined hosts configuration - Remove unused broadcast_command methods	2025-10-18 23:12:07 +02:00
Christoffer Martinsson	8cf8d37556	Add srv02 to predefined host list	2025-10-18 20:43:25 +02:00
Christoffer Martinsson	792ad066c9	Fix per-host widget cache to prevent overwriting cached data Only update widgets when metrics are available for the current host, preventing immediate overwrite of cached widget states when switching hosts.	2025-10-18 20:20:58 +02:00
Christoffer Martinsson	4b7d08153c	Implement per-host widget cache for instant host switching Resolves widget data persistence issue where switching hosts left stale data from the previous host displayed in widgets. Key improvements: - Add Clone derives to all widget structs (CpuWidget, MemoryWidget, ServicesWidget, BackupWidget) - Create HostWidgets struct to cache widget states per hostname - Update TuiApp with HashMap<String, HostWidgets> for per-host storage - Fix borrowing issues by cloning hostname before mutable self borrow - Implement instant widget state restoration when switching hosts Tab key host switching now displays cached widget data for each host without stale information persistence between switches.	2025-10-18 19:54:08 +02:00
Christoffer Martinsson	46cc813a68	Implement Tab key host switching functionality - Add KeyCode::Tab support to main dashboard event loop - Add Tab key handling to TuiApp handle_input method - Tab key now cycles to next host using existing navigate_host logic - Host switching infrastructure was already implemented, just needed Tab key support - Current host displayed in bold in title bar, other hosts shown normally - Metrics filtered by selected host, full navigation working	2025-10-18 19:26:58 +02:00
Christoffer Martinsson	125111ee99	Implement comprehensive backup monitoring and fix timestamp issues - Add BackupCollector for reading TOML status files with disk space metrics - Implement BackupWidget with disk usage display and service status details - Fix backup script disk space parsing by adding missing capture_output=True - Update backup widget to show actual disk usage instead of repository size - Fix timestamp parsing to use backup completion time instead of start time - Resolve timezone issues by using UTC timestamps in backup script - Add disk identification metrics (product name, serial number) to backup status - Enhance UI layout with proper backup monitoring integration	2025-10-18 18:33:41 +02:00
Christoffer Martinsson	8a36472a3d	Implement real-time process monitoring and fix UI hardcoded data This commit addresses several key issues identified during development: Major Changes: - Replace hardcoded top CPU/RAM process display with real system data - Add intelligent process monitoring to CpuCollector using ps command - Fix disk metrics permission issues in systemd collector - Optimize service collection to focus on status, memory, and disk only - Update dashboard widgets to display live process information Process Monitoring Implementation: - Added collect_top_cpu_process() and collect_top_ram_process() methods - Implemented ps-based monitoring with accurate CPU percentages - Added filtering to prevent self-monitoring artifacts (ps commands) - Enhanced error handling and validation for process data - Dashboard now shows realistic values like "claude (PID 2974) 11.0%" Service Collection Optimization: - Removed CPU monitoring from systemd collector for efficiency - Enhanced service directory permission error logging - Simplified services widget to show essential metrics only - Fixed service-to-directory mapping accuracy UI and Dashboard Improvements: - Reorganized dashboard layout with btop-inspired multi-panel design - Updated system panel to include real top CPU/RAM process display - Enhanced widget formatting and data presentation - Removed placeholder/hardcoded data throughout the interface Technical Details: - Updated agent/src/collectors/cpu.rs with process monitoring - Modified dashboard/src/ui/mod.rs for real-time process display - Enhanced systemd collector error handling and disk metrics - Updated CLAUDE.md documentation with implementation details	2025-10-16 23:55:05 +02:00
Christoffer Martinsson	7a664ef0fb	Remove refresh functionality that causes dashboard to hang - Remove 'r' key handler that was causing hang on refresh - Remove RefreshRequested event and check_refresh_request method - Remove send_refresh_commands function and ZMQ command protocol - Remove refresh_requested field from App struct - Clean up status line text (refresh -> tick) The refresh functionality was causing the dashboard to become unresponsive when pressing 'r' key. This removes all refresh-related code to fix the issue.	2025-10-16 01:00:39 +02:00
Christoffer Martinsson	6bc7f97375	Add refresh shortkey 'r' for on-demand metrics refresh Implements ZMQ command protocol for dashboard-to-agent communication: - Agents listen on port 6131 for REQ/REP commands - Dashboard sends "refresh" command when 'r' key is pressed - Agents force immediate collection of all metrics via force_refresh_all() - Fresh data is broadcast immediately to dashboard - Updated help text to show "r: Refresh all metrics" Also includes metric-level caching architecture foundation for future granular control over individual metric update frequencies.	2025-10-15 22:30:04 +02:00
Christoffer Martinsson	efdd713f62	Improve dashboard display and fix service issues - Remove unreachable descriptions from failed nginx sites - Show complete site URLs instead of truncating at first dot - Implement service-specific disk quotas (docker: 4GB, immich: 4GB, others: 1-2GB) - Truncate process names to show only executable name without full path - Display only highest C-state instead of all C-states for cleaner output - Format system RAM as xxxMB/GB (totalGB) to match services format	2025-10-15 09:36:03 +02:00
Christoffer Martinsson	a64464142c	Remove nginx site accessibility filtering to monitor all sites - Remove check_site_accessibility function and filtering logic - Monitor ALL nginx sites from config regardless of current status - Site status determined by measure_site_latency, not accessibility filter - Fixes missing git.cmtec.se when backend is down (502 errors) - Sites with errors now show as failed instead of being filtered out	2025-10-14 22:46:06 +02:00
Christoffer Martinsson	0cb69ea8fa	Consolidate HTTP checking and improve display formatting - Change site latency timeout from 5s to 2s for faster error detection - Replace curl with reqwest for external connectivity checks (consistent timeouts) - Remove unused gitea-specific monitoring functionality - Update dashboard: show 'unreachable' for latency > 2000ms, add arrows (→) between site and latency - Add percentage signs to CPU metrics display - All HTTP requests now use reqwest with 2-second timeouts	2025-10-14 22:24:22 +02:00
Christoffer Martinsson	f3b6d12f68	Add top CPU and RAM process monitoring to System widget - Implement get_top_cpu_process() and get_top_ram_process() functions in SystemCollector - Add top_cpu_process and top_ram_process fields to SystemSummary data structure - Update System widget to display top processes as description rows - Show process name and percentage usage for highest CPU and RAM consumers - Skip kernel threads and filter out processes with minimal usage (<0.1%)	2025-10-14 21:47:52 +02:00
Christoffer Martinsson	77795c44d3	Implement nginx site status monitoring with unreachable detection - Show 'unreachable' status for nginx sites that fail connection tests - Set service status to error (red) for unreachable sites - Display latency in milliseconds for responsive sites - Properly count failed sites in service summary statistics - Improve nginx site monitoring reliability and visibility	2025-10-14 20:19:39 +02:00
Christoffer Martinsson	fd8aa0678e	Implement nginx site latency monitoring and improve disk usage display Agent improvements: - Add reqwest dependency for HTTP latency testing - Implement measure_site_latency() function for nginx sites - Add latency_ms field to ServiceData structure - Measure response times for nginx sites using HEAD requests - Handle connection failures gracefully with 5-second timeout - Use HTTPS for external sites, HTTP for localhost Dashboard improvements: - Add latency_ms field to ServiceInfo structure - Display latency for nginx sites: "docker.cmtec.se 134ms" - Only show latency for nginx sub-services, not other services - Change disk usage "0" to "<1MB" for better readability The Services widget now shows: - Nginx sites with response times when measurable - Cleaner disk usage formatting for small values - Improved user experience with meaningful latency data	2025-10-14 19:38:36 +02:00
Christoffer Martinsson	c6e8749ddd	Implement logged-in users monitoring and improve widget formatting Agent improvements: - Add get_logged_in_users() function to SystemCollector using 'who' command - Collect unique, sorted list of currently logged-in users - Include logged_in_users field in system metrics JSON output - Change C-state formatting to show 2 states per row instead of 4 Dashboard improvements: - Update Backups widget to show "Archives: XX, ..." format - System widget ready to display logged-in users with proper formatting The System widget will now show: - C-states formatted as 2 per row for better readability - Logged-in users displayed as "Logged in: user" or "Logged in: X users (user1, user2)"	2025-10-14 19:23:26 +02:00
Christoffer Martinsson	1ee398e648	Improve widget formatting and add logged-in users support Services widget: - Fix disk quota formatting with proper rounding instead of truncation - Remove decimals from RAM quotas and use GB instead of G - Change quota display to use GB consistently Backups widget: - Change GiB to GB for consistency - Remove spaces between numbers and units - Update disk usage format to match other widgets: used (totalGB) - Remove percentage display for cleaner format System widget: - Add support for logged-in users in description lines - Format C-states with "C-State:" prefix on first line, indent subsequent lines - Add logged_in_users field to SystemSummary data structure Documentation: - Add example hash error output to NixOS update instructions	2025-10-14 18:59:31 +02:00
Christoffer Martinsson	3e5e91f078	Remove SB column and improve widget formatting Services widget: - Remove SB (sandbox) column and related formatting function - Fix quota formatting to show decimals when needed (1.5G not 1G) - Remove spaces in unit display (128MB not 128 MB) Storage widget: - Change usage format to 23GB (932GB) for better readability Documentation: - Add NixOS configuration update process to CLAUDE.md	2025-10-14 18:40:12 +02:00
Christoffer Martinsson	b0d3d85fb9	Improve services widget column headers and value formatting - Update column headers to be more concise: RAM (GB) → RAM, CPU (%) → CPU, Disk (GB) → Disk - Change sandbox column "no(ok)" to "-" for excluded services - Implement smart unit formatting for memory and disk values (kB/MB/GB) - Display quotas as (XG) format without decimals when limits exist - Add format_bytes() helper for consistent unit display across metrics	2025-10-14 18:21:45 +02:00
Christoffer Martinsson	c6d5a3f2a5	Add sandbox exclusion list for system services Implement exclusion list for services that don't require sandboxing due to their nature (SSH, Docker, system services). These services now show "no(ok)" in SB column and maintain green status instead of warning. Changes: - Add is_sandbox_excluded field to ServiceData and ServiceInfo structs - Add is_sandbox_excluded() method with system service exclusions: - sshd/ssh (needs system access for auth/shell) - docker (needs broad system access) - systemd services, dbus, NetworkManager, etc. - Update status determination to accept excluded services as ok - Update format_sandbox_value to show "no(ok)" for excluded services - Update all ServiceData constructors with exclusion field Service status logic: - Sandboxed: Status=Running, SB="yes" - Excluded: Status=Running, SB="no(ok)" - Should be sandboxed but isn't: Status=Degraded, SB="no" This provides clear distinction between services that legitimately don't need sandboxing vs. those requiring security attention.	2025-10-14 11:35:42 +02:00
Christoffer Martinsson	4fa2b079f1	Add sandbox column and security-based service status Add new "SB" column to services widget showing systemd sandboxing status. Service status now reflects security posture with unsandboxed services showing as degraded/warning status. Changes: - Add is_sandboxed field to ServiceData and ServiceInfo structs - Add check_service_sandbox method detecting systemd hardening features - Add format_sandbox_value function showing "yes"/"no" for sandboxing - Update service status determination to consider sandbox status: - Sandboxed + Running = "Running" (green/ok) - Unsandboxed + Running = "Degraded" (yellow/warning) - Failed services = "Stopped" (red/critical) - Add "SB" column header to services widget Services without proper NixOS hardening (PrivateTmp, ProtectSystem, etc.) now show warning status to highlight security concerns.	2025-10-14 11:18:07 +02:00
Christoffer Martinsson	17dda1ae67	Implement proper disk and memory quota detection Replace misleading system total quotas with actual service-specific quota detection. Services now only show quotas when real limits exist. Changes: - Add get_service_disk_quota method with filesystem quota detection - Add check_filesystem_quota and docker storage quota helpers - Remove automatic assignment of system totals as fake quotas - Update dashboard formatting to show usage only when no quota exists Display behavior: - Services with real limits: "2.1/8.0" (usage/quota) - Services without limits: "2.1" (usage only) This provides accurate monitoring instead of misleading system capacity values that suggested all services had massive quotas.	2025-10-14 11:01:04 +02:00
Christoffer Martinsson	8de3d2ba79	Clean up services widget column headers and units Standardize all services widget columns to show units in headers and remove units from metric values for cleaner display. Changes: - Update column headers: "RAM (GB)", "CPU (%)", "Disk (GB)" - Remove units from metric values: - RAM: "5.2/32.0" (no GB) - CPU: "2.5" (no %) - Disk: "1.5/500.0" (no GB) - Simplify disk formatting to always show GB format All columns now consistently display units in headers with clean, uncluttered metric values.	2025-10-14 10:36:38 +02:00
Christoffer Martinsson	4be1223a8d	Update services widget memory display and formatting Improve services widget to show consistent usage/total format for both RAM and Disk columns, using system totals when no service quotas exist. Changes: - Change column header from "Memory (GB)" to "RAM (GB)" - Remove "GB" units from memory values (units now in header) - Add system memory total detection from /proc/meminfo - Use system memory total as default quota for services without limits - Services now show "5.2/32.0" format for both RAM and disk Both RAM and Disk columns now consistently display usage/quota format where quota is either service-specific limit or system total capacity.	2025-10-14 10:23:16 +02:00
Christoffer Martinsson	630d2ff674	Add disk quota display to services widget Implement disk quota/total display in services widget showing usage/quota format. When services don't have specific disk quotas configured, use system total disk capacity as the quota value. Changes: - Add disk_quota_gb field to ServiceData struct in agent - Add disk_quota_gb field to ServiceInfo struct in dashboard - Update format_disk_value to show usage/quota format - Use system disk total capacity as default quota for services - Rename DiskUsage.total_gb to total_capacity_gb for clarity Services will now display disk usage as "5.2/500.0 GB" format where 500.0 GB is either the service's specific quota or system total capacity.	2025-10-14 10:14:24 +02:00
Christoffer Martinsson	dca3642e46	Implement multi-host autoconnect with consolidated host configuration - Add DEFAULT_HOSTS constant in config.rs for centralized host management - Update ZMQ endpoint generation to connect to all configured hosts - Implement graceful connection handling for unreachable endpoints - Dashboard now auto-discovers and connects to available agents on cmbox, labbox, simonbox, steambox, srv01	2025-10-14 00:44:38 +02:00
Christoffer Martinsson	c8655bf852	Update dashboard ZMQ endpoints to use hostnames for all CMTEC hosts	2025-10-13 23:39:46 +02:00
Christoffer Martinsson	cd4764596f	Implement comprehensive dashboard improvements and maintenance mode - Storage widget: Restructure with Name/Temp/Wear/Usage columns, SMART details as descriptions - Host navigation: Only cycle through connected hosts, no disconnected hosts - Auto-discovery: Skip config files, use predefined CMTEC host list - Maintenance mode: Suppress notifications during backup via /tmp/cm-maintenance file - CPU thresholds: Update to warning ≥9.0, critical ≥10.0 for production use - Agent-dashboard separation: Agent provides descriptions, dashboard displays only	2025-10-13 11:18:23 +02:00
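
Note on commits 52d630a2e5 and 8023da2c1e: the log describes auto-discovering storage pools from `disk_{pool}_usage_percent` metric names and sorting entries so they do not change position between updates. The following is a minimal sketch of that idea, not the repository's code. The function name `discover_pools` and the sample metric `cpu_load_5min` are assumptions made for illustration; only the `disk_{pool}_usage_percent` pattern comes from the commit messages.

```rust
use std::collections::BTreeSet;

/// Collects pool names from metric names shaped like `disk_{pool}_usage_percent`.
/// `discover_pools` is a hypothetical helper, not a function from the repository.
fn discover_pools<'a>(metric_names: impl IntoIterator<Item = &'a str>) -> Vec<String> {
    // A BTreeSet removes duplicates and keeps names sorted, so the display
    // order does not depend on HashMap iteration order.
    let pools: BTreeSet<String> = metric_names
        .into_iter()
        .filter_map(|name| name.strip_prefix("disk_")?.strip_suffix("_usage_percent"))
        .filter(|pool| !pool.is_empty())
        .map(str::to_string)
        .collect();
    pools.into_iter().collect()
}

fn main() {
    // Sample metric names; `cpu_load_5min` is made up for the example.
    let names = [
        "disk_steampool_usage_percent",
        "cpu_load_5min",
        "disk_root_usage_percent",
        "disk_steampool_sdb_temperature",
    ];
    println!("{:?}", discover_pools(names));
}
```

Per-drive metrics such as `disk_steampool_sdb_temperature` do not end in `_usage_percent`, so they are skipped here and would be matched to their pool in a separate pass.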

121 Commits
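
Note on commit 00a8ed3da2: the message names `HysteresisThresholds` and `StatusTracker` but does not show their definitions. The sketch below illustrates the general technique under assumed field and method names (`upper`, `lower`, `evaluate`, `update`) and a simplified two-level status. It should not be read as the actual implementation. The 5-point gap between the limits mirrors the default memory gap mentioned in the commit, while the 80/75 values themselves are made up.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Ok,
    Warning,
}

/// Assumed layout: enter Warning at `upper`, return to Ok only below `lower`.
struct HysteresisThresholds {
    upper: f32,
    lower: f32,
}

impl HysteresisThresholds {
    /// The new status depends on the previous one, which is what stops
    /// flapping when a value hovers between the two limits.
    fn evaluate(&self, value: f32, previous: Status) -> Status {
        match previous {
            Status::Ok if value >= self.upper => Status::Warning,
            Status::Warning if value < self.lower => Status::Ok,
            unchanged => unchanged,
        }
    }
}

/// Remembers the last status per metric name (hypothetical structure).
#[derive(Default)]
struct StatusTracker {
    last: HashMap<String, Status>,
}

impl StatusTracker {
    fn update(&mut self, metric: &str, value: f32, thresholds: &HysteresisThresholds) -> Status {
        // Metrics seen for the first time start from Ok.
        let previous = self.last.get(metric).copied().unwrap_or(Status::Ok);
        let next = thresholds.evaluate(value, previous);
        self.last.insert(metric.to_string(), next);
        next
    }
}

fn main() {
    let thresholds = HysteresisThresholds { upper: 80.0, lower: 75.0 };
    let mut tracker = StatusTracker::default();
    for value in [70.0, 81.0, 78.0, 74.0] {
        let status = tracker.update("memory_usage_percent", value, &thresholds);
        println!("{value:>5.1} -> {status:?}");
    }
}
```

With these limits, a reading of 78.0 that follows 81.0 stays at Warning, and the status returns to Ok only once the value drops below 75.0.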