2025-10-13 00:16:24 +02:00
parent 9e344fb66d
commit 57b676ad25
17 changed files with 729 additions and 411 deletions

@@ -232,12 +232,12 @@ Agent (calculations + thresholds) → Status → Dashboard (display only) → Ta
- Rate limiting: configurable (set to 0 for testing, 30 minutes for production)
**Monitored Components:**
-- system.cpu (load status)
-- system.cpu_temp (temperature status)
-- system.memory (usage status)
-- system.services (service health status)
-- storage.smart (drive health)
-- backup.overall (backup status)
+- system.cpu (load status) - SystemCollector
+- system.memory (usage status) - SystemCollector
+- system.cpu_temp (temperature status) - SystemCollector (disabled)
+- system.services (service health status) - ServiceCollector
+- storage.smart (drive health) - SmartCollector
+- backup.overall (backup status) - BackupCollector
### Pure Auto-Discovery Implementation
@@ -262,10 +262,24 @@ Agent (calculations + thresholds) → Status → Dashboard (display only) → Ta
- [x] CPU temperature monitoring and notifications
- [x] ZMQ message format standardization
- [x] Removed all hardcoded dashboard thresholds
+- [x] CPU thresholds restored to production values (5.0/8.0)
+- [x] All collectors output standardized status strings (ok/warning/critical/unknown)
+- [x] Dashboard connection loss detection with 5-second keep-alive
+- [x] Removed excessive logging from agent
+- [x] Fixed all compiler warnings in both agent and dashboard
+- [x] **SystemCollector architecture refactoring completed (2025-10-12)**
+- [x] Created SystemCollector for CPU load, memory, temperature, C-states
+- [x] Moved system metrics from ServiceCollector to SystemCollector
+- [x] Updated dashboard to parse and display SystemCollector data
+- [x] Enhanced service notifications to include specific failure details
+- [x] CPU temperature thresholds set to 100°C (effectively disabled)
-**Testing Configuration (REVERT FOR PRODUCTION):**
-- CPU thresholds lowered to 2.0/4.0 for easy testing
-- Email rate limiting disabled (0 minutes)
+**Production Configuration:**
+- CPU load thresholds: Warning ≥ 5.0, Critical ≥ 8.0
+- CPU temperature thresholds: Warning ≥ 100°C, Critical ≥ 100°C (effectively disabled)
+- Memory usage thresholds: Warning ≥ 80%, Critical ≥ 95%
+- Connection timeout: 15 seconds (agents send data every 5 seconds)
+- Email rate limiting: 30 minutes (set to 0 for testing)
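
Reviewer note: the production thresholds listed above map onto the standardized status strings (ok/warning/critical/unknown) roughly as in the sketch below. It is an illustration only, not the agent's actual code. The names `Threshold`, `classify`, and `connection_lost` are hypothetical, the language is chosen for brevity, and the strict `>` comparison for the timeout is an assumption. Only the numeric values come from the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Threshold:
    """Warning/critical pair; a reading at or above a level triggers it."""
    warning: float
    critical: float


# Values taken from the Production Configuration list; names are hypothetical.
CPU_LOAD = Threshold(warning=5.0, critical=8.0)
CPU_TEMP_C = Threshold(warning=100.0, critical=100.0)
MEMORY_PERCENT = Threshold(warning=80.0, critical=95.0)


def classify(value: Optional[float], threshold: Threshold) -> str:
    """Return one of the standardized status strings for a reading."""
    if value is None:
        return "unknown"  # no reading available
    if value >= threshold.critical:
        return "critical"
    if value >= threshold.warning:
        return "warning"
    return "ok"


def connection_lost(seconds_since_last_message: float,
                    timeout_seconds: float = 15.0) -> bool:
    """Assumed rule: the link is lost once silence exceeds the timeout."""
    return seconds_since_last_message > timeout_seconds
```

With a 5-second send interval, the 15-second timeout tolerates two missed messages before a host is flagged. Because the temperature warning and critical levels are both 100°C, this sketch never returns `warning` for `CPU_TEMP_C`: a reading goes straight from `ok` to `critical`.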
### Development Guidelines