Christoffer Martinsson 8a36472a3d Implement real-time process monitoring and fix UI hardcoded data
This commit addresses several key issues identified during development:

Major Changes:
- Replace hardcoded top CPU/RAM process display with real system data
- Add process monitoring to CpuCollector using the ps command
- Fix disk metrics permission issues in systemd collector
- Optimize service collection to focus on status, memory, and disk only
- Update dashboard widgets to display live process information

Process Monitoring Implementation:
- Added collect_top_cpu_process() and collect_top_ram_process() methods
- Implemented ps-based monitoring with accurate CPU percentages
- Added filtering so the ps command itself is excluded from the results
- Enhanced error handling and validation for process data
- Dashboard now shows realistic values like "claude (PID 2974) 11.0%"
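
The ps-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from `agent/src/collectors/cpu.rs`: the `ProcessSample` type and the `parse_top_cpu`, `collect_top_cpu`, and `format_sample` functions are hypothetical names, and the `ps` column selection is an assumption. Parsing is kept separate from the command invocation so it can be checked against fixed input.

```rust
use std::process::Command;

/// One parsed row of `ps` output (hypothetical type, not from the commit).
#[derive(Debug, Clone, PartialEq)]
pub struct ProcessSample {
    pub pid: u32,
    pub cpu_percent: f32,
    pub name: String,
}

/// Parse rows of the form "<pid> <%cpu> <command>" and return the row with
/// the highest CPU percentage. Header rows, malformed rows, non-finite
/// values, and the `ps` process itself are skipped.
pub fn parse_top_cpu(ps_output: &str) -> Option<ProcessSample> {
    ps_output
        .lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            let pid = fields.next()?.parse::<u32>().ok()?;
            let cpu_percent = fields.next()?.parse::<f32>().ok()?;
            // The command name may contain spaces; rejoin the remaining fields.
            let name = fields.collect::<Vec<_>>().join(" ");
            if name.is_empty() || name == "ps" || !cpu_percent.is_finite() {
                return None;
            }
            Some(ProcessSample { pid, cpu_percent, name })
        })
        .max_by(|a, b| a.cpu_percent.total_cmp(&b.cpu_percent))
}

/// Run `ps` and return the busiest process, or None if `ps` is missing,
/// fails, or prints nothing usable. The column list is an assumption.
pub fn collect_top_cpu() -> Option<ProcessSample> {
    let output = Command::new("ps")
        // Empty headers ("pid=") suppress the header line.
        .args(["-e", "-o", "pid=", "-o", "pcpu=", "-o", "comm="])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    parse_top_cpu(&String::from_utf8_lossy(&output.stdout))
}

/// Format a sample in the style quoted above, e.g. "claude (PID 2974) 11.0%".
pub fn format_sample(sample: &ProcessSample) -> String {
    format!("{} (PID {}) {:.1}%", sample.name, sample.pid, sample.cpu_percent)
}
```

A RAM variant would follow the same shape with a memory column in place of `pcpu`.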

Service Collection Optimization:
- Removed CPU monitoring from systemd collector for efficiency
- Enhanced service directory permission error logging
- Simplified services widget to show essential metrics only
- Fixed service-to-directory mapping accuracy

UI and Dashboard Improvements:
- Reorganized dashboard layout with btop-inspired multi-panel design
- Updated system panel to include real top CPU/RAM process display
- Enhanced widget formatting and data presentation
- Removed placeholder/hardcoded data throughout the interface

Technical Details:
- Updated agent/src/collectors/cpu.rs with process monitoring
- Modified dashboard/src/ui/mod.rs for real-time process display
- Enhanced systemd collector error handling and disk metrics
- Updated CLAUDE.md documentation with implementation details
2025-10-16 23:55:05 +02:00


use super::ConfigurableCache;
use cm_dashboard_shared::{CacheConfig, Metric};
use std::sync::Arc;
use tracing::info;

/// Manages metric caching with background tasks
pub struct MetricCacheManager {
    cache: Arc<ConfigurableCache>,
    config: CacheConfig,
}

impl MetricCacheManager {
    pub fn new(config: CacheConfig) -> Self {
        let cache = Arc::new(ConfigurableCache::new(config.clone()));
        Self { cache, config }
    }

    /// Start background cache management tasks
    pub async fn start_background_tasks(&self) {
        // Temporarily disabled to isolate CPU usage issue
        info!("Cache manager background tasks disabled for debugging");
    }

    /// Check if metric should be collected
    pub async fn should_collect_metric(&self, metric_name: &str) -> bool {
        self.cache.should_collect(metric_name).await
    }

    /// Store metric in cache
    pub async fn cache_metric(&self, metric: Metric) {
        self.cache.store_metric(metric).await;
    }

    /// Get cached metric if valid
    pub async fn get_cached_metric(&self, metric_name: &str) -> Option<Metric> {
        self.cache.get_cached_metric(metric_name).await
    }

    /// Get all valid cached metrics
    pub async fn get_all_valid_metrics(&self) -> Vec<Metric> {
        self.cache.get_all_valid_metrics().await
    }

    /// Cache warm-up: collect and cache high-priority metrics
    pub async fn warm_cache<F>(&self, collector_fn: F)
    where
        F: Fn(&str) -> Option<Metric>,
    {
        if !self.config.enabled {
            return;
        }

        let high_priority_patterns = ["cpu_load_*", "memory_usage_*"];
        let mut warmed_count = 0;

        for pattern in &high_priority_patterns {
            // This is a simplified warm-up - in practice, you'd iterate through
            // known metric names or use a registry.
            // Only the cpu_load_ pattern has known metric names here;
            // other patterns are currently skipped.
            if pattern.starts_with("cpu_load_") {
                for suffix in &["1min", "5min", "15min"] {
                    let metric_name = format!("cpu_load_{}", suffix);
                    if let Some(metric) = collector_fn(&metric_name) {
                        self.cache_metric(metric).await;
                        warmed_count += 1;
                    }
                }
            }
        }

        if warmed_count > 0 {
            info!("Cache warmed with {} metrics", warmed_count);
        }
    }

    /// Get cache configuration
    pub fn get_config(&self) -> &CacheConfig {
        &self.config
    }

    /// Get cache tier interval for a metric
    pub fn get_cache_interval(&self, metric_name: &str) -> u64 {
        self.config.get_cache_interval(metric_name)
    }
}
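
The manager above delegates `should_collect_metric` and `get_cache_interval` to `ConfigurableCache` and `CacheConfig`, neither of which is shown in this file. As a rough illustration of what an interval-based "should collect" check could look like, here is a synchronous, standard-library-only sketch. `IntervalCache`, its prefix-based tiers, and `mark_stored` are hypothetical and are not the project's actual types. The current time is passed in explicitly so the logic can be exercised without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for an interval-tiered cache (not ConfigurableCache).
pub struct IntervalCache {
    /// Interval used when no tier prefix matches.
    default_interval: Duration,
    /// (metric name prefix, interval) pairs; the first matching prefix wins.
    tiers: Vec<(String, Duration)>,
    /// When each metric was last stored.
    last_stored: HashMap<String, Instant>,
}

impl IntervalCache {
    pub fn new(default_interval: Duration, tiers: Vec<(String, Duration)>) -> Self {
        Self {
            default_interval,
            tiers,
            last_stored: HashMap::new(),
        }
    }

    /// Look up the collection interval for a metric by prefix.
    pub fn interval_for(&self, metric_name: &str) -> Duration {
        self.tiers
            .iter()
            .find(|(prefix, _)| metric_name.starts_with(prefix.as_str()))
            .map(|(_, interval)| *interval)
            .unwrap_or(self.default_interval)
    }

    /// A metric should be collected if it was never stored, or if its
    /// interval has elapsed since the last store.
    pub fn should_collect(&self, metric_name: &str, now: Instant) -> bool {
        match self.last_stored.get(metric_name) {
            Some(&stored) => {
                now.saturating_duration_since(stored) >= self.interval_for(metric_name)
            }
            None => true,
        }
    }

    /// Record that a metric was stored at `now`.
    pub fn mark_stored(&mut self, metric_name: &str, now: Instant) {
        self.last_stored.insert(metric_name.to_string(), now);
    }
}
```

In an async setting like the manager above, the map would typically sit behind a lock, which this sketch leaves out.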