Implement hysteresis for metric status changes to prevent flapping

Add comprehensive hysteresis support to prevent status oscillation near
threshold boundaries while maintaining responsive alerting.

Key Features:
- HysteresisThresholds with configurable upper/lower limits
- StatusTracker for per-metric status history
- Default gaps: CPU load 10%, memory 5%, disk temp 5°C

Updated Components:
- CPU load collector (5-minute average with hysteresis)
- Memory usage collector (percentage-based thresholds)
- Disk temperature collector (SMART data monitoring)
- All collectors updated to support StatusTracker interface

Cache Interval Adjustments:
- Service status: 60s → 10s (faster response)
- Disk usage: 300s → 60s (more frequent checks)
- Backup status: 900s → 60s (quicker updates)
- SMART data: moved to 600s tier (10 minutes)

Architecture:
- Individual metric status calculation in collectors
- Centralized StatusTracker in MetricCollectionManager
- Status aggregation preserved in dashboard widgets
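
Illustration (not part of this commit): a minimal sketch of how upper/lower thresholds plus a per-metric tracker can suppress flapping. Only the names HysteresisThresholds and StatusTracker come from the message above. The Status enum, the fields, the methods with_gap, evaluate and update, the 80% trigger level, and the reading of "gap" as upper minus lower are all assumptions made for the example, and the real code may differ.

```rust
use std::collections::HashMap;

/// Hypothetical status levels; the real enum is not shown in this commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Ok,
    Warning,
}

/// Crossing `upper` raises the status; dropping to `lower` clears it.
#[derive(Debug, Clone, Copy)]
struct HysteresisThresholds {
    upper: f64,
    lower: f64,
}

impl HysteresisThresholds {
    /// Assumes "gap" means the distance between the upper and lower limit.
    fn with_gap(upper: f64, gap: f64) -> Self {
        Self { upper, lower: upper - gap }
    }

    /// The new status depends on the previous one, which is what stops
    /// values hovering near a single threshold from toggling the status.
    fn evaluate(&self, previous: Status, value: f64) -> Status {
        match previous {
            Status::Ok if value >= self.upper => Status::Warning,
            Status::Warning if value <= self.lower => Status::Ok,
            unchanged => unchanged,
        }
    }
}

/// Remembers the last status per metric name (assumed shape).
#[derive(Default)]
struct StatusTracker {
    last: HashMap<String, Status>,
}

impl StatusTracker {
    fn update(&mut self, metric: &str, value: f64, t: &HysteresisThresholds) -> Status {
        // Metrics seen for the first time start from Ok.
        let previous = self.last.get(metric).copied().unwrap_or(Status::Ok);
        let next = t.evaluate(previous, value);
        self.last.insert(metric.to_string(), next);
        next
    }
}

fn main() {
    // Hypothetical 80% trigger with the 10% CPU load gap from the message.
    let cpu = HysteresisThresholds::with_gap(80.0, 10.0);
    let mut tracker = StatusTracker::default();
    for value in [75.0, 81.0, 78.0, 72.0, 69.0] {
        let status = tracker.update("cpu_load_5min", value, &cpu);
        println!("{value:>5.1} -> {status:?}");
    }
}
```

With these assumed numbers, the status rises at 81, stays raised at 78 and 72, and clears only at 69, once the value has dropped to the lower limit of 70.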
@@ -4,10 +4,10 @@ use thiserror::Error;
pub enum SharedError {
    #[error("Serialization error: {message}")]
    Serialization { message: String },

    #[error("Invalid metric value: {message}")]
    InvalidMetric { message: String },

    #[error("Protocol error: {message}")]
    Protocol { message: String },
}
@@ -18,4 +18,4 @@ impl From<serde_json::Error> for SharedError {
            message: err.to_string(),
        }
    }
}
}