16 Commits

b85bd6b153 Fix agent collector timing to prevent intermittent data gaps
Update the last_collection timestamp even when collectors fail, to prevent
immediate retry loops that cause data transmission gaps every 5 seconds.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-24 13:42:29 +01:00
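
A minimal sketch of the timing change above, assuming each collector tracks its own last_collection instant; CollectorSlot and tick are illustrative names rather than the agent's actual types:

```rust
use std::time::{Duration, Instant};

/// Illustrative stand-in for one scheduled collector.
struct CollectorSlot {
    interval: Duration,
    last_collection: Instant,
}

impl CollectorSlot {
    /// The timestamp is advanced whether or not the collector succeeds,
    /// so a failing collector waits out its full interval instead of
    /// being retried on every pass of the collection loop.
    fn tick(&mut self, collect: impl FnOnce() -> Result<(), String>) {
        if self.last_collection.elapsed() < self.interval {
            return; // not due yet
        }
        self.last_collection = Instant::now(); // update even on failure
        if let Err(e) = collect() {
            eprintln!("collector failed, retrying after the full interval: {e}");
        }
    }
}

fn main() {
    let mut slot = CollectorSlot {
        interval: Duration::from_secs(5),
        last_collection: Instant::now() - Duration::from_secs(10),
    };
    slot.tick(|| Err("smartctl timed out".to_string())); // counts as a run
    slot.tick(|| Ok(())); // skipped: not due again for another 5 seconds
}
```
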
43242debce Update version to 0.1.21 and fix dashboard data caching
- Separate dashboard updates from email notifications for immediate status aggregation
- Add metric caching to MetricCollectionManager for instant dashboard updates
- Dashboard now receives cached data every 1 second instead of waiting for collection intervals
- Fix transmission to use cached metrics rather than triggering fresh collection
- Email notifications maintain separate 60-second batching interval
- Update configurable email notification aggregation interval
2025-10-28 12:16:31 +01:00
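
A rough sketch of the caching idea: collectors write into a shared cache on their own intervals while the transmitter snapshots it once a second. MetricCache and its methods are illustrative, not the real MetricCollectionManager API:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Illustrative metric value; the agent's real metric type is richer.
#[derive(Clone, Debug)]
struct Metric {
    name: String,
    value: f64,
}

/// Collectors store into the cache on their own schedules; the dashboard
/// transmitter reads a snapshot instead of triggering a fresh collection.
#[derive(Default, Clone)]
struct MetricCache {
    inner: Arc<Mutex<HashMap<String, Metric>>>,
}

impl MetricCache {
    fn store(&self, metric: Metric) {
        self.inner.lock().unwrap().insert(metric.name.clone(), metric);
    }

    fn snapshot(&self) -> Vec<Metric> {
        self.inner.lock().unwrap().values().cloned().collect()
    }
}

fn main() {
    let cache = MetricCache::default();
    cache.store(Metric { name: "cpu_load_5min".into(), value: 0.42 });
    // Transmitter side: send cached values without waiting for collectors.
    for m in cache.snapshot() {
        println!("{} = {}", m.name, m.value);
    }
}
```
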
627c533724 Update to v0.1.18 with per-collector intervals and tmux check
- Implement per-collector interval timing respecting NixOS config
- Remove all hardcoded timeout/interval values and make configurable
- Add tmux session requirement check for TUI mode (bypassed for headless)
- Update agent to send config hash in Build field instead of NixOS version
- Add nginx check interval, HTTP timeouts, and ZMQ transmission interval configs
- Update NixOS configuration with new configurable values

Breaking changes:
- Build field now shows Nix store config hash (8 chars) instead of NixOS version
- All intervals now follow individual collector configuration instead of global

New configuration fields:
- systemd.nginx_check_interval_seconds
- systemd.http_timeout_seconds
- systemd.http_connect_timeout_seconds
- zmq.transmission_interval_seconds
2025-10-28 10:08:25 +01:00
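
A sketch of how the new fields could be modelled on the agent side, assuming serde derive and the toml crate are available; the default values shown are illustrative, not the project's actual defaults:

```rust
use serde::Deserialize;

/// Field names follow the commit message; defaults are placeholders.
#[derive(Deserialize, Debug)]
struct SystemdConfig {
    #[serde(default = "default_nginx_check_interval")]
    nginx_check_interval_seconds: u64,
    #[serde(default = "default_http_timeout")]
    http_timeout_seconds: u64,
    #[serde(default = "default_http_timeout")]
    http_connect_timeout_seconds: u64,
}

#[derive(Deserialize, Debug)]
struct ZmqConfig {
    #[serde(default = "default_transmission_interval")]
    transmission_interval_seconds: u64,
}

fn default_nginx_check_interval() -> u64 { 30 }
fn default_http_timeout() -> u64 { 10 }
fn default_transmission_interval() -> u64 { 5 }

fn main() {
    // Defaults kick in when the fields are omitted from the agent config.
    let systemd: SystemdConfig = toml::from_str("").unwrap();
    let zmq: ZmqConfig = toml::from_str("").unwrap();
    println!("{systemd:?}\n{zmq:?}");
}
```
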
c8e26b9bac Remove redundant smart collector - consolidate SMART into disk collector
- Remove separate smart collector implementation
- Disk collector already handles SMART data for drives
- Eliminates duplicate smartctl calls causing performance issues
- SMART functionality remains in logical place with disk monitoring
- Fixes infinite smartctl loop issue
2025-10-25 22:25:22 +02:00
59d260680e Integrate smart collector into metrics manager
- Add SmartCollector import and initialization
- Enable in both normal and benchmark modes
- Fixes infinite smartctl loop issue by properly managing the collector
- Smart collector now active when config.smart.enabled = true
2025-10-25 17:14:54 +02:00
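
A hypothetical sketch of the conditional registration described above; the trait and manager shown here are placeholders for the agent's real types:

```rust
/// Illustrative config and collector types; the real ones live in the agent.
struct SmartConfig {
    enabled: bool,
}

trait Collector {
    fn name(&self) -> &'static str;
}

struct SmartCollector;
impl Collector for SmartCollector {
    fn name(&self) -> &'static str {
        "smart"
    }
}

struct MetricCollectionManager {
    collectors: Vec<Box<dyn Collector>>,
}

impl MetricCollectionManager {
    /// Register the SMART collector only when enabled in config, so the
    /// manager owns its scheduling and smartctl is not re-invoked in an
    /// uncontrolled loop.
    fn register_smart(&mut self, cfg: &SmartConfig) {
        if cfg.enabled {
            self.collectors.push(Box::new(SmartCollector));
        }
    }
}

fn main() {
    let mut manager = MetricCollectionManager { collectors: Vec::new() };
    manager.register_smart(&SmartConfig { enabled: true });
    for c in &manager.collectors {
        println!("registered collector: {}", c.name());
    }
}
```
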
4b54a59e35 Remove unused code and eliminate compiler warnings
- Remove unused fields from CommandStatus variants
- Clean up unused methods and unused collector fields
- Fix lifetime syntax warning in SystemWidget
- Delete unused cache module completely
- Remove redundant render methods from widgets

All agent and dashboard warnings eliminated while preserving
panel switching and scrolling functionality.
2025-10-25 14:15:52 +02:00
39fc9cd22f Implement unified system widget with NixOS info, CPU, RAM, and Storage
- Create NixOS collector for version and active users detection
- Add SystemWidget combining all system information in TODO.md layout
- Replace separate CPU/Memory widgets with unified system display
- Add tree structure for storage with drive temperature/wear info
- Support NixOS version, active users, load averages, memory usage
- Follow exact decimal formatting from specification
2025-10-23 14:01:14 +02:00
08d3454683 Enhance disk collector with individual drive health monitoring
- Add StoragePool and DriveInfo structures for grouping drives by mount point
- Implement SMART data collection for individual drives (health, temperature, wear)
- Support for ext4, zfs, xfs, mergerfs, btrfs filesystem types
- Generate individual drive metrics: disk_[pool]_[drive]_health/temperature/wear
- Add storage_type and underlying_devices to filesystem configuration
- Move hardcoded service directory mappings to NixOS configuration
- Move hardcoded host-to-user mapping to NixOS configuration
- Remove all unused code and fix compilation warnings
- Clean implementation with zero warnings and no dead code

Individual drives now show health status per storage pool:
Storage root (ext4): nvme0n1 PASSED 42°C 5% wear
Storage steampool (mergerfs): sda/sdb/sdc with individual health data
2025-10-22 19:59:25 +02:00
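
An illustrative sketch of the pool/drive grouping and the disk_[pool]_[drive]_* naming; the agent's real structs carry more SMART fields than this:

```rust
struct DriveInfo {
    device: String,      // e.g. "nvme0n1"
    health: String,      // e.g. "PASSED"
    temperature_c: f64,
    wear_percent: f64,
}

struct StoragePool {
    name: String,        // derived from the mount point, e.g. "root"
    filesystem: String,  // ext4, zfs, xfs, mergerfs, btrfs
    drives: Vec<DriveInfo>,
}

/// Metric names follow the disk_[pool]_[drive]_<field> pattern from the
/// commit message.
fn drive_metric_names(pool: &StoragePool, drive: &DriveInfo) -> Vec<String> {
    ["health", "temperature", "wear"]
        .iter()
        .map(|field| format!("disk_{}_{}_{}", pool.name, drive.device, field))
        .collect()
}

fn main() {
    let pool = StoragePool {
        name: "root".into(),
        filesystem: "ext4".into(),
        drives: vec![DriveInfo {
            device: "nvme0n1".into(),
            health: "PASSED".into(),
            temperature_c: 42.0,
            wear_percent: 5.0,
        }],
    };
    let d = &pool.drives[0];
    println!(
        "Storage {} ({}): {} {} {}°C {}% wear",
        pool.name, pool.filesystem, d.device, d.health, d.temperature_c, d.wear_percent
    );
    for name in drive_metric_names(&pool, d) {
        println!("{name}");
    }
}
```
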
a08670071c Implement simple persistent cache with automatic saving on status changes 2025-10-21 20:12:19 +02:00
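
A minimal sketch of a save-on-change cache, assuming serde_json; the path and JSON layout are illustrative, not necessarily what the agent writes:

```rust
use std::collections::HashMap;
use std::fs;
use std::path::PathBuf;

/// The file is rewritten only when a metric's status actually changes,
/// not on every collection pass.
struct StatusCache {
    path: PathBuf,
    statuses: HashMap<String, String>,
}

impl StatusCache {
    fn update(&mut self, metric: &str, status: &str) -> std::io::Result<()> {
        let previous = self.statuses.insert(metric.to_string(), status.to_string());
        if previous.as_deref() != Some(status) {
            let json = serde_json::to_string_pretty(&self.statuses)?;
            fs::write(&self.path, json)?;
        }
        Ok(())
    }
}

fn main() -> std::io::Result<()> {
    let mut cache = StatusCache {
        path: PathBuf::from("/tmp/cm-dashboard-status.json"), // illustrative path
        statuses: HashMap::new(),
    };
    cache.update("cpu_load_5min", "ok")?;      // changed -> file written
    cache.update("cpu_load_5min", "ok")?;      // unchanged -> no write
    cache.update("cpu_load_5min", "warning")?; // changed -> file rewritten
    Ok(())
}
```
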
eb268922bd Remove all unused code and fix build warnings
- Remove unused struct fields: tier, config_name, last_collection_time
- Remove unused structs: PerformanceMetrics, PerfMonitor
- Remove unused methods: get_performance_metrics, get_collector_names, get_stats
- Remove unused utility functions and system helpers
- Remove unused config fields from CPU and Memory collectors
- Keep config fields that are actually used (DiskCollector, etc.)
- Remove unused proxy_pass_url variable and assignments
- Fix duplicate hostname variable declaration
- Achieve zero build warnings without functionality changes
2025-10-20 20:20:47 +02:00
00a8ed3da2 Implement hysteresis for metric status changes to prevent flapping
Add comprehensive hysteresis support to prevent status oscillation near
threshold boundaries while maintaining responsive alerting.

Key Features:
- HysteresisThresholds with configurable upper/lower limits
- StatusTracker for per-metric status history
- Default gaps: CPU load 10%, memory 5%, disk temp 5°C

Updated Components:
- CPU load collector (5-minute average with hysteresis)
- Memory usage collector (percentage-based thresholds)
- Disk temperature collector (SMART data monitoring)
- All collectors updated to support StatusTracker interface

Cache Interval Adjustments:
- Service status: 60s → 10s (faster response)
- Disk usage: 300s → 60s (more frequent checks)
- Backup status: 900s → 60s (quicker updates)
- SMART data: moved to 600s tier (10 minutes)

Architecture:
- Individual metric status calculation in collectors
- Centralized StatusTracker in MetricCollectionManager
- Status aggregation preserved in dashboard widgets
2025-10-20 18:45:41 +02:00
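
A compact sketch of the hysteresis rule: a metric must cross the upper limit to enter a worse status and fall below the lower limit to leave it. Names and threshold values here are illustrative:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Status {
    Ok,
    Warning,
}

/// Values hovering around a single threshold no longer flap, because the
/// transition up and the transition down use different limits.
struct HysteresisThresholds {
    upper: f64,
    lower: f64,
}

impl HysteresisThresholds {
    fn next_status(&self, previous: Status, value: f64) -> Status {
        match previous {
            Status::Ok if value >= self.upper => Status::Warning,
            Status::Warning if value <= self.lower => Status::Ok,
            other => other,
        }
    }
}

fn main() {
    // A 10-point gap, matching the commit's default gap for CPU load.
    let t = HysteresisThresholds { upper: 80.0, lower: 70.0 };
    let mut status = Status::Ok;
    for value in [79.0, 81.0, 75.0, 69.0] {
        status = t.next_status(status, value);
        println!("{value:>5} -> {status:?}");
    }
}
```
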
66a79574e0 Implement comprehensive monitoring improvements
- Add full email notifications with lettre and Stockholm timezone
- Add status persistence to prevent notification spam on restart
- Change nginx monitoring to check backend proxy_pass URLs instead of frontend domains
- Increase nginx site timeout to 10 seconds for backend health checks
- Fix cache intervals: disk (5min), backup (10min), systemd (30s), cpu/memory (5s)
- Remove rate limiting for immediate notifications on all status changes
- Store metric status in /var/lib/cm-dashboard/last-status.json
2025-10-20 14:32:44 +02:00
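
A hedged sketch of the backend check: hit the proxy_pass target directly with the 10-second timeout. It assumes the reqwest blocking client and a hypothetical local backend URL; the agent's actual HTTP stack may differ:

```rust
use std::time::Duration;

/// Check the backend behind nginx instead of the public frontend domain.
fn backend_is_up(proxy_pass_url: &str) -> bool {
    let client = match reqwest::blocking::Client::builder()
        .timeout(Duration::from_secs(10))
        .build()
    {
        Ok(c) => c,
        Err(_) => return false,
    };
    // Any HTTP response means the backend answered; only transport errors
    // and timeouts count as failures here.
    client.get(proxy_pass_url).send().is_ok()
}

fn main() {
    // Hypothetical local backend behind an nginx proxy_pass directive.
    let url = "http://127.0.0.1:8080/";
    println!("{url} up: {}", backend_is_up(url));
}
```
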
e7200fb1b0 Implement UUID-based disk detection for CMTEC infrastructure
Replace df-based auto-discovery with UUID-based detection using NixOS
hardware configuration data. Each host now has predefined filesystem
configurations with predictable metric names.

- Add FilesystemConfig struct with UUID, mount point, and filesystem type
- Remove auto_discover and devices fields from DiskConfig
- Add host-specific UUID defaults for cmbox, srv01, srv02, simonbox, steambox
- Remove legacy get_mounted_disks() df-based detection method
- Update DiskCollector to use UUID resolution via /dev/disk/by-uuid/
- Generate predictable metric names: disk_root_*, disk_boot_*, etc.
- Maintain fallback for labbox/wslbox (no UUIDs configured yet)

Provides consistent metric names across reboots and reliable detection
aligned with NixOS deployments without dependency on mount order.
2025-10-20 09:50:10 +02:00
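
An illustrative sketch of UUID resolution through /dev/disk/by-uuid/; field names and the placeholder UUID are not taken from the real host configs:

```rust
use std::path::PathBuf;

/// Per-filesystem entry as described above: keyed by UUID rather than
/// discovered from df output.
struct FilesystemConfig {
    name: String,        // metric prefix, e.g. "root" -> disk_root_*
    uuid: String,
    mount_point: PathBuf,
    fs_type: String,
}

/// Resolve the configured UUID to its block device via the stable
/// /dev/disk/by-uuid/ symlinks (error handling trimmed for the sketch).
fn resolve_device(cfg: &FilesystemConfig) -> std::io::Result<PathBuf> {
    std::fs::canonicalize(format!("/dev/disk/by-uuid/{}", cfg.uuid))
}

fn main() {
    let root = FilesystemConfig {
        name: "root".into(),
        uuid: "0000-0000".into(), // placeholder, not a real host UUID
        mount_point: PathBuf::from("/"),
        fs_type: "ext4".into(),
    };
    match resolve_device(&root) {
        Ok(dev) => println!(
            "disk_{}_* metrics: {} ({}) resolved to {}",
            root.name,
            root.mount_point.display(),
            root.fs_type,
            dev.display()
        ),
        Err(e) => println!("UUID not present on this host: {e}"),
    }
}
```
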
7f85a6436e Clean up unused imports and fix build warnings
- Remove unused imports (Duration, HashMap, SharedError, DateTime, etc.)
- Fix unused variables by prefixing with underscore
- Remove redundant dashboard.toml config file
- Update theme imports to use only needed components
- Maintain all functionality while reducing warnings
- Add srv02 to predefined hosts configuration
- Remove unused broadcast_command methods
2025-10-18 23:12:07 +02:00
125111ee99 Implement comprehensive backup monitoring and fix timestamp issues
- Add BackupCollector for reading TOML status files with disk space metrics
- Implement BackupWidget with disk usage display and service status details
- Fix backup script disk space parsing by adding missing capture_output=True
- Update backup widget to show actual disk usage instead of repository size
- Fix timestamp parsing to use backup completion time instead of start time
- Resolve timezone issues by using UTC timestamps in backup script
- Add disk identification metrics (product name, serial number) to backup status
- Enhance UI layout with proper backup monitoring integration
2025-10-18 18:33:41 +02:00
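
A sketch of parsing such a status file, assuming the toml and serde crates; the field names are illustrative, since the real layout is defined by the backup script:

```rust
use serde::Deserialize;

/// Illustrative shape of a backup status file written by the script.
#[derive(Deserialize, Debug)]
struct BackupStatus {
    status: String,          // e.g. "success"
    completed_at: String,    // UTC completion timestamp, not the start time
    disk_used_bytes: u64,
    disk_total_bytes: u64,
    disk_product: Option<String>,
    disk_serial: Option<String>,
}

fn main() {
    // In the agent this text would come from reading the status file on disk.
    let sample = r#"
        status = "success"
        completed_at = "2025-10-18T16:00:00Z"
        disk_used_bytes = 120000000000
        disk_total_bytes = 500000000000
    "#;
    let parsed: BackupStatus = toml::from_str(sample).expect("valid status TOML");
    println!(
        "{} at {}: {:.1}% of backup disk used",
        parsed.status,
        parsed.completed_at,
        100.0 * parsed.disk_used_bytes as f64 / parsed.disk_total_bytes as f64
    );
    if let (Some(p), Some(s)) = (&parsed.disk_product, &parsed.disk_serial) {
        println!("backup disk: {p} ({s})");
    }
}
```
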
8a36472a3d Implement real-time process monitoring and fix UI hardcoded data
This commit addresses several key issues identified during development:

Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add intelligent process monitoring to CpuCollector using ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information

Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering to prevent self-monitoring artifacts (ps commands)
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"

Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy

UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface

Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details
2025-10-16 23:55:05 +02:00
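
A sketch of the ps-based approach, assuming procps ps on Linux; function and field names are illustrative, not the actual collect_top_cpu_process() implementation:

```rust
use std::process::Command;

/// List processes sorted by CPU and take the first entry that is not the
/// ps invocation itself (the commit's self-monitoring filter).
fn top_cpu_process() -> Option<(u32, String, f64)> {
    let output = Command::new("ps")
        .args(["-eo", "pid,comm,%cpu", "--sort=-%cpu"])
        .output()
        .ok()?;
    let text = String::from_utf8_lossy(&output.stdout);
    for line in text.lines() {
        let mut parts = line.split_whitespace();
        let (Some(pid), Some(name), Some(cpu)) = (parts.next(), parts.next(), parts.next()) else {
            continue;
        };
        let (Ok(pid), Ok(cpu)) = (pid.parse::<u32>(), cpu.parse::<f64>()) else {
            continue; // skips the header row and malformed lines
        };
        if name != "ps" {
            return Some((pid, name.to_string(), cpu));
        }
    }
    None
}

fn main() {
    if let Some((pid, name, cpu)) = top_cpu_process() {
        // Mirrors the dashboard display, e.g. "claude (PID 2974) 11.0%"
        println!("{name} (PID {pid}) {cpu:.1}%");
    }
}
```
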