Implement hysteresis for metric status changes to prevent flapping

Add comprehensive hysteresis support to prevent status oscillation near
threshold boundaries while maintaining responsive alerting.

Key Features:
- HysteresisThresholds with configurable upper/lower limits
- StatusTracker for per-metric status history
- Default gaps: CPU load 10%, memory 5%, disk temp 5°C

Updated Components:
- CPU load collector (5-minute average with hysteresis)
- Memory usage collector (percentage-based thresholds)
- Disk temperature collector (SMART data monitoring)
- All collectors updated to support StatusTracker interface

Cache Interval Adjustments:
- Service status: 60s → 10s (faster response)
- Disk usage: 300s → 60s (more frequent checks)
- Backup status: 900s → 60s (quicker updates)
- SMART data: moved to 600s tier (10 minutes)

Architecture:
- Individual metric status calculation in collectors
- Centralized StatusTracker in MetricCollectionManager
- Status aggregation preserved in dashboard widgets
This commit is contained in:
2025-10-20 18:45:41 +02:00
parent e998679901
commit 00a8ed3da2
34 changed files with 1037 additions and 770 deletions

View File

@@ -1,18 +1,19 @@
use anyhow::{Context, Result};
use std::path::Path;
use std::fs;
use crate::config::AgentConfig;
use anyhow::{Context, Result};
use std::fs;
use std::path::Path;
pub fn load_config<P: AsRef<Path>>(path: P) -> Result<AgentConfig> {
let path = path.as_ref();
let content = fs::read_to_string(path)
.with_context(|| format!("Failed to read config file: {}", path.display()))?;
let config: AgentConfig = toml::from_str(&content)
.with_context(|| format!("Failed to parse config file: {}", path.display()))?;
config.validate()
config
.validate()
.with_context(|| format!("Invalid configuration in file: {}", path.display()))?;
Ok(config)
}
}