Compare commits

..

7 Commits

Author SHA1 Message Date
bd20f0cae1 Fix user-stopped flag timing and service transition handling
All checks were successful
Build and Release / build-and-release (push) Successful in 2m9s
Correct user-stopped service behavior during startup transitions:

User-Stopped Flag Timing Fix:
- Clear user-stopped flag only when service actually becomes active, not when start command succeeds
- Remove premature flag clearing from service control handler
- Add automatic flag clearing when service status metrics show active state
- Services retain user-stopped status during activating/transitioning states

Service Transition Handling:
- User-stopped services in activating state now report Status::OK instead of Status::Pending
- Prevents host warnings during legitimate service startup transitions
- Maintains accurate status reporting throughout service lifecycle
- Failed service starts preserve user-stopped flags correctly

Journalctl Popup Fix:
- Fix terminal corruption when using J key for service logs
- Correct command quoting to prevent tmux popup interference
- Stable popup display without dashboard interface corruption

Result: Clean service startup experience with no false warnings and proper
user-stopped tracking throughout the entire service lifecycle.

Bump version to v0.1.47
2025-10-30 12:05:54 +01:00
11c9a5f9d2 Add service logs feature and improve tmux popup sizing
All checks were successful
Build and Release / build-and-release (push) Successful in 2m9s
New Features:
- Add journalctl service logs viewer via Shift+J key
- Opens tmux popup with real-time log streaming using journalctl -f
- Shows last 50 lines and follows new log entries for selected service
- Popup titled 'Logs: service.service' for clear context

Improvements:
- Increase tmux popup size to 80% width and height for better readability
- Applies to both rebuild (R) and logs (J) popups
- Compact status line text to fit new J: Logs shortcut
- Updated documentation with new key binding

Navigation Updates:
- J: Show service logs (journalctl in tmux popup)
- Status line: Tab: Host • ↑↓/jk: Select • r: Rebuild • s/S: Start/Stop • J: Logs • q: Quit

Bump version to v0.1.46
2025-10-30 11:21:14 +01:00
aeae60146d Fix user-stopped service display and flag timing issues
All checks were successful
Build and Release / build-and-release (push) Successful in 2m10s
Improve user-stopped service tracking behavior:

Service Display Fix:
- Services widget now shows actual systemctl status (active/inactive)
- Use info.status instead of hardcoded text based on widget_status
- User-stopped services correctly display 'inactive' with green OK icon
- Prevents misleading 'active' display for stopped services

User-Stopped Flag Timing Fix:
- Clear user-stopped flag AFTER successful service start, not when command sent
- Prevents warnings during service startup transition period
- Service remains Status::OK during 'activating' state for user-stopped services
- Flag only cleared when systemctl start command actually succeeds
- Failed start attempts preserve user-stopped flag

Result: Clean service state tracking with accurate display and no false alerts
during intentional user operations.

Bump version to v0.1.45
2025-10-30 11:11:39 +01:00
a82c81e8e3 Fix service control by adding .service suffix to systemctl commands
All checks were successful
Build and Release / build-and-release (push) Successful in 2m8s
Service stop/start operations were failing because systemctl commands
were missing the .service suffix. This caused the new user-stopped
tracking feature to mark services but not actually control them.

Changes:
- Add .service suffix to systemctl commands in service control handler
- Matches pattern used throughout systemd collector
- Fixes service start/stop functionality via dashboard

Clean up legacy documentation:
- Remove outdated TODO.md, AGENTS.md, and test files
- Update CLAUDE.md with current architecture and rules only
- Comprehensive README.md rewrite with technical documentation
- Document user-stopped service tracking feature

Bump version to v0.1.44
2025-10-30 11:00:36 +01:00
c56e9d7be2 Implement user-stopped service tracking system
All checks were successful
Build and Release / build-and-release (push) Successful in 2m34s
Add comprehensive tracking for services stopped via dashboard to prevent
false alerts when users intentionally stop services.

Features:
- User-stopped services report Status::Ok instead of Warning
- Persistent storage survives agent restarts
- Dashboard sends UserStart/UserStop commands
- Agent tracks and syncs user-stopped state globally
- Systemd collector respects user-stopped flags

Implementation:
- New service_tracker module with persistent JSON storage
- Enhanced ServiceAction enum with UserStart/UserStop variants
- Global singleton tracker accessible by collectors
- Service status logic updated to check user-stopped flag
- Dashboard version now uses CARGO_PKG_VERSION automatically

Bump version to v0.1.43
2025-10-30 10:42:56 +01:00
c8f800a1e5 Implement git commit hash tracking for build display
All checks were successful
Build and Release / build-and-release (push) Successful in 1m24s
- Add get_git_commit() method to read /var/lib/cm-dashboard/git-commit
- Replace NixOS build version with actual git commit hash
- Show deployed commit hash as 'Build:' value for accurate tracking
- Enable verification of which exact commit is deployed per host
- Update version to 0.1.42
2025-10-29 15:29:02 +01:00
fc6b3424cf Add hostname to NixOS title and make dashboard title bold
All checks were successful
Build and Release / build-and-release (push) Successful in 2m46s
- Change system panel title from 'NixOS:' to 'NixOS hostname:'
- Make main dashboard title 'cm-dashboard' bold in top bar
- Remove unused Typography::title() function to fix warnings
- Update SystemWidget::render_with_scroll to accept hostname parameter
- Update version to 0.1.41 in all Cargo.toml files and dashboard code
2025-10-29 14:24:17 +01:00
25 changed files with 676 additions and 977 deletions

View File

@@ -1,3 +0,0 @@
# Agent Guide
Agents working in this repo must follow the instructions in `CLAUDE.md`.

434
CLAUDE.md
View File

@@ -2,277 +2,59 @@
## Overview
A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure. Built to replace Glance with a custom solution tailored for our specific monitoring needs and ZMQ-based metric collection.
A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure. Built with ZMQ-based metric collection and individual metrics architecture.
## Implementation Strategy
## Current Features
### Current Implementation Status
### Core Functionality
- **Real-time Monitoring**: CPU, RAM, Storage, and Service status
- **Service Management**: Start/stop services with user-stopped tracking
- **Multi-host Support**: Monitor multiple servers from single dashboard
- **NixOS Integration**: System rebuild via SSH + tmux popup
- **Backup Monitoring**: Borgbackup status and scheduling
**System Panel Enhancement - COMPLETED**
### User-Stopped Service Tracking
- Services stopped via dashboard are marked as "user-stopped"
- User-stopped services report Status::OK instead of Warning
- Prevents false alerts during intentional maintenance
- Persistent storage survives agent restarts
- Automatic flag clearing when services are restarted via dashboard
All system panel features successfully implemented:
- **NixOS Collector**: Created collector for version and active users
- **System Widget**: Unified widget combining NixOS, CPU, RAM, and Storage
- **Build Display**: Shows NixOS build information without codename
- **Active Users**: Displays currently logged in users
- **Tmpfs Monitoring**: Added /tmp usage to RAM section
- **Agent Deployment**: NixOS collector working in production
### Service Management
- **Direct Control**: Arrow keys (↑↓) or vim keys (j/k) navigate services
- **Service Actions**:
- `s` - Start service (sends UserStart command)
- `S` - Stop service (sends UserStop command)
- `J` - Show service logs (journalctl in tmux popup)
- `R` - Rebuild current host
- **Visual Status**: Green ● (active), Yellow ◐ (inactive), Red ◯ (failed)
- **Transitional Icons**: Blue arrows during operations
**Simplified Navigation and Service Management - COMPLETED**
All navigation and service management features successfully implemented:
- **Direct Service Control**: Up/Down (or j/k) arrows directly control service selection
-**Always Visible Selection**: Service selection highlighting always visible (no panel focus needed)
-**Complete Service Discovery**: All configured services visible regardless of state
-**Transitional Visual Feedback**: Service operations show directional arrows (↑ ↓ ↻)
-**Simplified Interface**: Removed panel switching complexity, uniform appearance
-**Vi-style Navigation**: Added j/k keys for vim users alongside arrow keys
**Current Status - October 28, 2025:**
- All service discovery and display features working correctly ✅
- Simplified navigation system implemented ✅
- Service selection always visible with direct control ✅
- Complete service visibility (all configured services show regardless of state) ✅
- Transitional service icons working with proper color handling ✅
- Build display working: "Build: 25.05.20251004.3bcc93c" ✅
- Agent version display working: "Agent: v0.1.33" ✅
- Cross-host version comparison implemented ✅
- Automated binary release system working ✅
- SMART data consolidated into disk collector ✅
**RESOLVED - Remote Rebuild Functionality:**
-**System Rebuild**: Now uses simple SSH + tmux popup approach
-**Process Isolation**: Rebuild runs independently via SSH, survives agent/dashboard restarts
-**Configuration**: SSH user and rebuild alias configurable in dashboard config
-**Service Control**: Works correctly for start/stop/restart of services
**Solution Implemented:**
- Replaced complex SystemRebuild command infrastructure with direct tmux popup
- Uses `tmux display-popup "ssh -tt {user}@{hostname} 'bash -ic {alias}'"`
- Configurable SSH user and rebuild alias in dashboard config
- Eliminates all agent crashes during rebuilds
- Simple, reliable, and follows standard tmux interface patterns
**Current Layout:**
```
NixOS:
Build: 25.05.20251004.3bcc93c
Agent: v0.1.17 # Shows agent version from Cargo.toml
Active users: cm, simon
CPU:
● Load: 0.02 0.31 0.86 • 3000MHz
RAM:
● Usage: 33% 2.6GB/7.6GB
● /tmp: 0% 0B/2.0GB
Storage:
● root (Single):
├─ ● nvme0n1 W: 1%
└─ ● 18% 167.4GB/928.2GB
```
**System panel layout fully implemented with blue tree symbols ✅**
**Tree symbols now use consistent blue theming across all panels ✅**
**Overflow handling restored for all widgets ("... and X more") ✅**
**Agent version display working correctly ✅**
**Cross-host version comparison logging warnings ✅**
**Backup panel visibility fixed - only shows when meaningful data exists ✅**
**SSH-based rebuild system fully implemented and working ✅**
### Current Simplified Navigation Implementation
**Navigation Controls:**
- **Tab**: Switch between hosts (cmbox, srv01, srv02, steambox, etc.)
- **↑↓ or j/k**: Move service selection cursor (always works)
### Navigation
- **Tab**: Switch between hosts
- **↑↓ or j/k**: Select services
- **J**: Show service logs (journalctl)
- **q**: Quit dashboard
**Service Control:**
- **s**: Start selected service
- **S**: Stop selected service
- **R**: Rebuild current host (works from any context)
**Visual Features:**
- **Service Selection**: Always visible blue background highlighting current service
- **Status Icons**: Green ● (active), Yellow ◐ (inactive), Red ◯ (failed), ? (unknown)
- **Transitional Icons**: Blue ↑ (starting), ↓ (stopping), ↻ (restarting) when not selected
- **Transitional Icons**: Dark gray arrows when service is selected (for visibility)
- **Uniform Interface**: All panels have consistent appearance (no focus borders)
### Service Discovery and Display - WORKING ✅
**All Issues Resolved (as of 2025-10-28):**
-**Complete Service Discovery**: Uses `systemctl list-unit-files` + `list-units --all` for comprehensive service detection
-**All Services Visible**: Shows all configured services regardless of current state (active/inactive)
-**Proper Status Display**: Active services show green ●, inactive show yellow ◐, failed show red ◯
-**Transitional Icons**: Visual feedback during service operations with proper color handling
-**Simplified Navigation**: Removed panel complexity, direct service control always available
-**Service Control**: Start (s) and Stop (S) commands work from anywhere
-**System Rebuild**: SSH + tmux popup approach for reliable remote rebuilds
### Terminal Popup for Real-time Output - IMPLEMENTED ✅
**Status (as of 2025-10-26):**
-**Terminal Popup UI**: 80% screen coverage with terminal styling and color-coded output
-**ZMQ Streaming Protocol**: CommandOutputMessage for real-time output transmission
-**Keyboard Controls**: ESC/Q to close, ↑↓ to scroll, manual close (no auto-close)
-**Real-time Display**: Live streaming of command output as it happens
-**Version-based Agent Reporting**: Shows "Agent: v0.1.13" instead of nix store hash
**Current Implementation Issues:**
-**Agent Process Crashes**: Agent dies during nixos-rebuild execution
-**Inconsistent Output**: Different outputs each time 'R' is pressed
-**Limited Output Visibility**: Not capturing all nixos-rebuild progress
**PLANNED SOLUTION - Systemd Service Approach:**
**Problem**: Direct nixos-rebuild execution in agent causes process crashes and inconsistent output.
**Solution**: Create dedicated systemd service for rebuild operations.
**Implementation Plan:**
1. **NixOS Systemd Service**:
```nix
systemd.services.cm-rebuild = {
description = "CM Dashboard NixOS Rebuild";
serviceConfig = {
Type = "oneshot";
ExecStart = "${pkgs.nixos-rebuild}/bin/nixos-rebuild switch --flake . --option sandbox false";
WorkingDirectory = "/var/lib/cm-dashboard/nixos-config";
User = "root";
StandardOutput = "journal";
StandardError = "journal";
};
};
```
2. **Agent Modification**:
- Replace direct nixos-rebuild execution with: `systemctl start cm-rebuild`
- Stream output via: `journalctl -u cm-rebuild -f --no-pager`
- Monitor service status for completion detection
3. **Benefits**:
- **Process Isolation**: Service runs independently, won't crash agent
- **Consistent Output**: Always same deterministic rebuild process
- **Proper Logging**: systemd journal handles all output management
- **Resource Management**: systemd manages cleanup and resource limits
- **Status Tracking**: Can query service status (running/failed/success)
**Next Priority**: Implement systemd service approach for reliable rebuild operations.
**Keyboard Controls Status:**
- **Services Panel**:
- R (restart) ✅ Working
- s (start) ✅ Working
- S (stop) ✅ Working
- **System Panel**: R (nixos-rebuild) ✅ Working with --option sandbox false
- **Backup Panel**: B (trigger backup) ❓ Not implemented
**Visual Feedback Implementation - IN PROGRESS:**
Context-appropriate progress indicators for each panel:
**Services Panel** (Service status transitions):
```
● nginx active → ⏳ nginx restarting → ● nginx active
● docker active → ⏳ docker stopping → ● docker inactive
```
**System Panel** (Build progress in NixOS section):
```
NixOS:
Build: 25.05.20251004.3bcc93c → Build: [████████████ ] 65%
Active users: cm, simon Active users: cm, simon
```
**Backup Panel** (OnGoing status with progress):
```
Latest backup: → Latest backup:
● 2024-10-23 14:32:15 ● OnGoing
└─ Duration: 1.3m └─ [██████ ] 60%
```
**Critical Configuration Hash Fix - HIGH PRIORITY:**
**Problem:** Configuration hash currently shows git commit hash instead of actual deployed system hash.
**Current (incorrect):**
- Shows git hash: `db11f82` (source repository commit)
- Not accurate - doesn't reflect what's actually deployed
**Target (correct):**
- Show nix store hash: `d8ivwiar` (first 8 chars from deployed system)
- Source: `/nix/store/d8ivwiarhwhgqzskj6q2482r58z46qjf-nixos-system-cmbox-25.05.20251004.3bcc93c`
- Pattern: Extract hash from `/nix/store/HASH-nixos-system-HOSTNAME-VERSION`
**Benefits:**
1. **Deployment Verification:** Confirms rebuild actually succeeded
2. **Accurate Status:** Shows what's truly running, not just source
3. **Rebuild Completion Detection:** Hash change = rebuild completed
4. **Rollback Tracking:** Each deployment has unique identifier
**Implementation Required:**
1. Agent extracts nix store hash from `ls -la /run/current-system`
2. Reports this as `system_config_hash` metric instead of git hash
3. Dashboard displays first 8 characters: `Config: d8ivwiar`
**Next Session Priority Tasks:**
**Remaining Features:**
1. **Fix Configuration Hash Display (CRITICAL)**:
- Use nix store hash instead of git commit hash
- Extract from `/run/current-system` -> `/nix/store/HASH-nixos-system-*`
- Enables proper rebuild completion detection
2. **Command Response Protocol**:
- Agent sends command completion/failure back to dashboard via ZMQ
- Dashboard updates UI status from ⏳ to ● when commands complete
- Clear success/failure status after timeout
3. **Backup Panel Features**:
- Implement backup trigger functionality (B key)
- Complete visual feedback for backup operations
- Add backup progress indicators
**Enhancement Tasks:**
- Add confirmation dialogs for destructive actions (stop/restart/rebuild)
- Implement command history/logging
- Add keyboard shortcuts help overlay
**Future Enhanced Navigation:**
- Add Page Up/Down for faster scrolling through long service lists
- Implement search/filter functionality for services
- Add jump-to-service shortcuts (first letter navigation)
**Future Advanced Features:**
- Service dependency visualization
- Historical service status tracking
- Real-time log viewing integration
## Core Architecture Principles - CRITICAL
## Core Architecture Principles
### Individual Metrics Philosophy
**NEW ARCHITECTURE**: Agent collects individual metrics, dashboard composes widgets from those metrics.
- Agent collects individual metrics, dashboard composes widgets
- Each metric collected, transmitted, and stored individually
- Agent calculates status for each metric using thresholds
- Dashboard aggregates individual metric statuses for widget status
### Maintenance Mode
**Purpose:**
- Suppress email notifications during planned maintenance or backups
- Prevents false alerts when services are intentionally stopped
**Implementation:**
- Agent checks for `/tmp/cm-maintenance` file before sending notifications
- File presence suppresses all email notifications while continuing monitoring
- Dashboard continues to show real status, only notifications are blocked
**Usage:**
Usage:
```bash
# Enable maintenance mode
touch /tmp/cm-maintenance
# Run maintenance tasks (backups, service restarts, etc.)
# Run maintenance tasks
systemctl stop service
# ... maintenance work ...
systemctl start service
@@ -281,102 +63,29 @@ systemctl start service
rm /tmp/cm-maintenance
```
**NixOS Integration:**
- Borgbackup script automatically creates/removes maintenance file
- Automatic cleanup via trap ensures maintenance mode doesn't stick
- All cinfiguration are shall be done from nixos config
**ARCHITECTURE ENFORCEMENT**:
- **ZERO legacy code reuse** - Fresh implementation following ARCHITECT.md exactly
- **Individual metrics only** - NO grouped metric structures
- **Reference-only legacy** - Study old functionality, implement new architecture
- **Clean slate mindset** - Build as if legacy codebase never existed
**Implementation Rules**:
1. **Individual Metrics**: Each metric is collected, transmitted, and stored individually
2. **Agent Status Authority**: Agent calculates status for each metric using thresholds
3. **Dashboard Composition**: Dashboard widgets subscribe to specific metrics by name
4. **Status Aggregation**: Dashboard aggregates individual metric statuses for widget status
**Testing & Building**:
- **Workspace builds**: `cargo build --workspace` for all testing
- **Clean compilation**: Remove `target/` between architecture changes
- **ZMQ testing**: Test agent-dashboard communication independently
- **Widget testing**: Verify UI layout matches legacy appearance exactly
**NEVER in New Implementation**:
- Copy/paste ANY code from legacy backup
- Calculate status in dashboard widgets
- Hardcode metric names in widgets (use const arrays)
# Important Communication Guidelines
NEVER write that you have "successfully implemented" something or generate extensive summary text without first verifying with the user that the implementation is correct. This wastes tokens. Keep responses concise.
NEVER implement code without first getting explicit user agreement on the approach. Always ask for confirmation before proceeding with implementation.
## Commit Message Guidelines
**NEVER mention:**
- Claude or any AI assistant names
- Automation or AI-generated content
- Any reference to automated code generation
**ALWAYS:**
- Focus purely on technical changes and their purpose
- Use standard software development commit message format
- Describe what was changed and why, not how it was created
- Write from the perspective of a human developer
**Examples:**
- ❌ "Generated with Claude Code"
- ❌ "AI-assisted implementation"
- ❌ "Automated refactoring"
- ✅ "Implement maintenance mode for backup operations"
- ✅ "Restructure storage widget with improved layout"
- ✅ "Update CPU thresholds to production values"
## Development and Deployment Architecture
**CRITICAL:** Development and deployment paths are completely separate:
### Development Path
- **Location:** `~/projects/nixosbox`
- **Purpose:** Development workflow only - for committing new cm-dashboard code
- **Location:** `~/projects/cm-dashboard`
- **Purpose:** Development workflow only - for committing new code
- **Access:** Only for developers to commit changes
- **Code Access:** Running cm-dashboard code shall NEVER access this path
### Deployment Path
- **Location:** `/var/lib/cm-dashboard/nixos-config`
- **Purpose:** Production deployment only - agent clones/pulls from git
- **Access:** Only cm-dashboard agent for deployment operations
- **Workflow:** git pull → `/var/lib/cm-dashboard/nixos-config` → nixos-rebuild
### Git Flow
```
Development: ~/projects/nixosbox → git commit → git push
Development: ~/projects/cm-dashboard → git commit → git push
Deployment: git pull → /var/lib/cm-dashboard/nixos-config → rebuild
```
## Automated Binary Release System
**IMPLEMENTED:** cm-dashboard now uses automated binary releases instead of source builds.
CM Dashboard uses automated binary releases instead of source builds.
### Release Workflow
1. **Automated Release Creation**
- Gitea Actions workflow builds static binaries on tag push
- Creates release with `cm-dashboard-linux-x86_64.tar.gz` tarball
- No manual intervention required for binary generation
2. **Creating New Releases**
### Creating New Releases
```bash
cd ~/projects/cm-dashboard
git tag v0.1.X
@@ -388,7 +97,7 @@ Deployment: git pull → /var/lib/cm-dashboard/nixos-config → rebuild
- Creates GitHub-style release with tarball
- Uploads binaries via Gitea API
3. **NixOS Configuration Updates**
### NixOS Configuration Updates
Edit `~/projects/nixosbox/hosts/common/cm-dashboard.nix`:
```nix
@@ -399,7 +108,7 @@ Deployment: git pull → /var/lib/cm-dashboard/nixos-config → rebuild
};
```
4. **Get Release Hash**
### Get Release Hash
```bash
cd ~/projects/nixosbox
nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchurl {
@@ -408,18 +117,53 @@ Deployment: git pull → /var/lib/cm-dashboard/nixos-config → rebuild
}' 2>&1 | grep "got:"
```
5. **Commit and Deploy**
```bash
cd ~/projects/nixosbox
git add hosts/common/cm-dashboard.nix
git commit -m "Update cm-dashboard to v0.1.X with static binaries"
git push
```
### Building
### Benefits
**Testing & Building:**
- **Workspace builds**: `nix-shell -p openssl pkg-config --run "cargo build --workspace"`
- **Clean compilation**: Remove `target/` between major changes
- **No compilation overhead** on each host
- **Consistent static binaries** across all hosts
- **Faster deployments** - download vs compile
- **No library dependency issues** - static linking
- **Automated pipeline** - tag push triggers everything
## Important Communication Guidelines
Keep responses concise and focused. Avoid extensive implementation summaries unless requested.
## Commit Message Guidelines
**NEVER mention:**
- Claude or any AI assistant names
- Automation or AI-generated content
- Any reference to automated code generation
**ALWAYS:**
- Focus purely on technical changes and their purpose
- Use standard software development commit message format
- Describe what was changed and why, not how it was created
- Write from the perspective of a human developer
**Examples:**
- ❌ "Generated with Claude Code"
- ❌ "AI-assisted implementation"
- ❌ "Automated refactoring"
- ✅ "Implement maintenance mode for backup operations"
- ✅ "Restructure storage widget with improved layout"
- ✅ "Update CPU thresholds to production values"
## Implementation Rules
1. **Individual Metrics**: Each metric is collected, transmitted, and stored individually
2. **Agent Status Authority**: Agent calculates status for each metric using thresholds
3. **Dashboard Composition**: Dashboard widgets subscribe to specific metrics by name
4. **Status Aggregation**: Dashboard aggregates individual metric statuses for widget status
**NEVER:**
- Copy/paste ANY code from legacy implementations
- Calculate status in dashboard widgets
- Hardcode metric names in widgets (use const arrays)
- Create files unless absolutely necessary for achieving goals
- Create documentation files unless explicitly requested
**ALWAYS:**
- Prefer editing existing files to creating new ones
- Follow existing code conventions and patterns
- Use existing libraries and utilities
- Follow security best practices

6
Cargo.lock generated
View File

@@ -270,7 +270,7 @@ checksum = "a1d728cc89cf3aee9ff92b05e62b19ee65a02b5702cff7d5a377e32c6ae29d8d"
[[package]]
name = "cm-dashboard"
version = "0.1.39"
version = "0.1.46"
dependencies = [
"anyhow",
"chrono",
@@ -291,7 +291,7 @@ dependencies = [
[[package]]
name = "cm-dashboard-agent"
version = "0.1.39"
version = "0.1.46"
dependencies = [
"anyhow",
"async-trait",
@@ -314,7 +314,7 @@ dependencies = [
[[package]]
name = "cm-dashboard-shared"
version = "0.1.39"
version = "0.1.46"
dependencies = [
"chrono",
"serde",

499
README.md
View File

@@ -1,88 +1,106 @@
# CM Dashboard
A real-time infrastructure monitoring system with intelligent status aggregation and email notifications, built with Rust and ZMQ.
A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure. Built with ZMQ-based metric collection and individual metrics architecture.
## Current Implementation
## Features
This is a complete rewrite implementing an **individual metrics architecture** where:
### Core Monitoring
- **Real-time metrics**: CPU, RAM, Storage, and Service status
- **Multi-host support**: Monitor multiple servers from single dashboard
- **Service management**: Start/stop services with intelligent status tracking
- **NixOS integration**: System rebuild via SSH + tmux popup
- **Backup monitoring**: Borgbackup status and scheduling
- **Email notifications**: Intelligent batching prevents spam
- **Agent** collects individual metrics (e.g., `cpu_load_1min`, `memory_usage_percent`) and calculates status
- **Dashboard** subscribes to specific metrics and composes widgets
- **Status Aggregation** provides intelligent email notifications with batching
- **Persistent Cache** prevents false notifications on restart
### User-Stopped Service Tracking
Services stopped via the dashboard are intelligently tracked to prevent false alerts:
## Dashboard Interface
- **Smart status reporting**: User-stopped services show as Status::OK instead of Warning
- **Persistent storage**: Tracking survives agent restarts via JSON storage
- **Automatic management**: Flags cleared when services restarted via dashboard
- **Maintenance friendly**: No false alerts during intentional service operations
## Architecture
### Individual Metrics Philosophy
- **Agent**: Collects individual metrics, calculates status using thresholds
- **Dashboard**: Subscribes to specific metrics, composes widgets from individual data
- **ZMQ Communication**: Efficient real-time metric transmission
- **Status Aggregation**: Host-level status calculated from all service metrics
### Components
```
┌─────────────────┐ ZMQ ┌─────────────────┐
│ │◄──────────►│ │
│ Agent │ Metrics │ Dashboard │
│ - Collectors │ │ - TUI │
│ - Status │ │ - Widgets │
│ - Tracking │ │ - Commands │
│ │ │ │
└─────────────────┘ └─────────────────┘
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ JSON Storage │ │ SSH + tmux │
│ - User-stopped │ │ - Remote rebuild│
│ - Cache │ │ - Process │
│ - State │ │ isolation │
└─────────────────┘ └─────────────────┘
```
### Service Control Flow
1. **User Action**: Dashboard sends `UserStart`/`UserStop` commands
2. **Agent Processing**:
- Marks service as user-stopped (if stopping)
- Executes `systemctl start/stop service`
- Syncs state to global tracker
3. **Status Calculation**:
- Systemd collector checks user-stopped flag
- Reports Status::OK for user-stopped inactive services
- Normal Warning status for system failures
## Interface
```
cm-dashboard • ● cmbox ● srv01 ● srv02 ● steambox
┌system──────────────────────────────┐┌services─────────────────────────────────────────┐
CPU: ││Service: Status: RAM: Disk: │
● Load: 0.10 0.52 0.88 • 400.0 MHz ││● docker active 27M 496MB │
RAM: ││● docker-registry active 19M 496MB │
● Used: 30% 2.3GB/7.6GB ││● gitea active 579M 2.6GB
● tmp: 0.0% 0B/2.0GB ││● gitea-runner-default active 11M 2.6GB
Disk nvme0n1: ││● haasp-core active 9M 1MB
● Health: PASSED ││● haasp-mqtt active 3M 1MB
│● Usage @root: 8.3% • 75.4/906.2 GB ││● haasp-webgrid active 10M 1MB
│● Usage @boot: 5.9% • 0.1/1.0 GB ││● immich-server active 240M 45.1GB
││● mosquitto active 1M 1MB
││● mysql active 38M 225MB
││● nginx active 28M 24MB
││ ├─ ● gitea.cmtec.se 51ms
│ ││ ├─ ● haasp.cmtec.se 43ms │
│ ││ ├─ ● haasp.net 43ms │
│ ││ ├─ ● pages.cmtec.se 45ms │
└────────────────────────────────────┘│ ├─ ● photos.cmtec.se 41ms │
┌backup──────────────────────────────┐│ ├─ ● unifi.cmtec.se 46ms │
│Latest backup: ││ ├─ ● vault.cmtec.se 47ms │
│● Status: OK ││ ├─ ● www.kryddorten.se 81ms │
│Duration: 54s • Last: 4h ago ││ ├─ ● www.mariehall2.se 86ms │
│Disk usage: 48.2GB/915.8GB ││● postgresql active 112M 357MB │
│P/N: Samsung SSD 870 QVO 1TB ││● redis-immich active 8M 45.1GB │
│S/N: S5RRNF0W800639Y ││● sshd active 2M 0 │
│● gitea 2 archives 2.7GB ││● unifi active 594M 495MB │
│● immich 2 archives 45.0GB ││● vaultwarden active 12M 1MB │
│● kryddorten 2 archives 67.6MB ││ │
│● mariehall2 2 archives 321.8MB ││ │
│● nixosbox 2 archives 4.5MB ││ │
│● unifi 2 archives 2.9MB ││ │
│● vaultwarden 2 archives 305kB ││ │
NixOS: ││Service: Status: RAM: Disk: │
Build: 25.05.20251004.3bcc93c ││● docker active 27M 496MB │
Agent: v0.1.43 ││● gitea active 579M 2.6GB │
Active users: cm, simon ││● nginx active 28M 24MB
CPU: ││ ├─ ● gitea.cmtec.se 51ms
● Load: 0.10 0.52 0.88 • 3000MHz ││ ├─ ● photos.cmtec.se 41ms
RAM: ││● postgresql active 112M 357MB
│● Usage: 33% 2.6GB/7.6GB ││● redis-immich user-stopped
│● /tmp: 0% 0B/2.0GB ││● sshd active 2M 0
Storage: ││● unifi active 594M 495MB
● root (Single): ││
├─ ● nvme0n1 W: 1% ││
└─ ● 18% 167.4GB/928.2GB ││
└────────────────────────────────────┘└─────────────────────────────────────────────────┘
```
**Navigation**: `←→` switch hosts, `r` refresh, `q` quit
### Navigation
- **Tab**: Switch between hosts
- **↑↓ or j/k**: Navigate services
- **s**: Start selected service (UserStart)
- **S**: Stop selected service (UserStop)
- **J**: Show service logs (journalctl in tmux popup)
- **R**: Rebuild current host
- **q**: Quit
## Features
- **Real-time monitoring** - Dashboard updates every 1-2 seconds
- **Individual metric collection** - Granular data for flexible dashboard composition
- **Intelligent status aggregation** - Host-level status calculated from all services
- **Smart email notifications** - Batched, detailed alerts with service groupings
- **Persistent state** - Prevents false notifications on restarts
- **ZMQ communication** - Efficient agent-to-dashboard messaging
- **Clean TUI** - Terminal-based dashboard with color-coded status indicators
## Architecture
### Core Components
- **Agent** (`cm-dashboard-agent`) - Collects metrics and sends via ZMQ
- **Dashboard** (`cm-dashboard`) - Real-time TUI display consuming metrics
- **Shared** (`cm-dashboard-shared`) - Common types and protocol
- **Status Aggregation** - Intelligent batching and notification management
- **Persistent Cache** - Maintains state across restarts
### Status Levels
- **🟢 Ok** - Service running normally
- **🔵 Pending** - Service starting/stopping/reloading
- **🟡 Warning** - Service issues (high load, memory, disk usage)
- **🔴 Critical** - Service failed or critical thresholds exceeded
- **❓ Unknown** - Service state cannot be determined
### Status Indicators
- **Green ●**: Active service
- **Yellow ◐**: Inactive service (system issue)
- **Red ◯**: Failed service
- **Blue arrows**: Service transitioning (↑ starting, ↓ stopping, ↻ restarting)
- **"user-stopped"**: Service stopped via dashboard (Status::OK)
## Quick Start
### Build
### Building
```bash
# With Nix (recommended)
@@ -93,21 +111,20 @@ sudo apt install libssl-dev pkg-config # Ubuntu/Debian
cargo build --workspace
```
### Run
### Running
```bash
# Start agent (requires configuration file)
# Start agent (requires configuration)
./target/debug/cm-dashboard-agent --config /etc/cm-dashboard/agent.toml
# Start dashboard
./target/debug/cm-dashboard --config /path/to/dashboard.toml
# Start dashboard (inside tmux session)
tmux
./target/debug/cm-dashboard --config /etc/cm-dashboard/dashboard.toml
```
## Configuration
### Agent Configuration (`agent.toml`)
The agent requires a comprehensive TOML configuration file:
### Agent Configuration
```toml
collection_interval_seconds = 2
@@ -116,50 +133,27 @@ collection_interval_seconds = 2
publisher_port = 6130
command_port = 6131
bind_address = "0.0.0.0"
timeout_ms = 5000
heartbeat_interval_ms = 30000
transmission_interval_seconds = 2
[collectors.cpu]
enabled = true
interval_seconds = 2
load_warning_threshold = 9.0
load_warning_threshold = 5.0
load_critical_threshold = 10.0
temperature_warning_threshold = 100.0
temperature_critical_threshold = 110.0
[collectors.memory]
enabled = true
interval_seconds = 2
usage_warning_percent = 80.0
usage_critical_percent = 95.0
[collectors.disk]
enabled = true
interval_seconds = 300
usage_warning_percent = 80.0
usage_critical_percent = 90.0
[[collectors.disk.filesystems]]
name = "root"
uuid = "4cade5ce-85a5-4a03-83c8-dfd1d3888d79"
mount_point = "/"
fs_type = "ext4"
monitor = true
[collectors.systemd]
enabled = true
interval_seconds = 10
memory_warning_mb = 1000.0
memory_critical_mb = 2000.0
service_name_filters = [
"nginx*", "postgresql*", "redis*", "docker*", "sshd*",
"gitea*", "immich*", "haasp*", "mosquitto*", "mysql*",
"unifi*", "vaultwarden*"
]
excluded_services = [
"nginx-config-reload", "sshd-keygen", "systemd-",
"getty@", "user@", "dbus-", "NetworkManager-"
]
service_name_filters = ["nginx*", "postgresql*", "docker*", "sshd*"]
excluded_services = ["nginx-config-reload", "systemd-", "getty@"]
nginx_latency_critical_ms = 1000.0
http_timeout_seconds = 10
[notifications]
enabled = true
@@ -167,251 +161,202 @@ smtp_host = "localhost"
smtp_port = 25
from_email = "{hostname}@example.com"
to_email = "admin@example.com"
rate_limit_minutes = 0
trigger_on_warnings = true
trigger_on_failures = true
recovery_requires_all_ok = true
suppress_individual_recoveries = true
[status_aggregation]
enabled = true
aggregation_method = "worst_case"
notification_interval_seconds = 30
[cache]
persist_path = "/var/lib/cm-dashboard/cache.json"
aggregation_interval_seconds = 30
```
### Dashboard Configuration (`dashboard.toml`)
### Dashboard Configuration
```toml
[zmq]
hosts = [
{ name = "server1", address = "192.168.1.100", port = 6130 },
{ name = "server2", address = "192.168.1.101", port = 6130 }
]
connection_timeout_ms = 5000
reconnect_interval_ms = 10000
subscriber_ports = [6130]
[hosts]
predefined_hosts = ["cmbox", "srv01", "srv02"]
[ui]
refresh_interval_ms = 1000
theme = "dark"
ssh_user = "cm"
rebuild_alias = "nixos-rebuild-cmtec"
```
## Collectors
## Technical Implementation
The agent implements several specialized collectors:
### Collectors
### CPU Collector (`cpu.rs`)
#### Systemd Collector
- **Service Discovery**: Uses `systemctl list-unit-files` + `list-units --all`
- **Status Calculation**: Checks user-stopped flag before assigning Warning status
- **Memory Tracking**: Per-service memory usage via `systemctl show`
- **Sub-services**: Nginx site latency, Docker containers
- **User-stopped Integration**: `UserStoppedServiceTracker::is_service_user_stopped()`
- Load average (1, 5, 15 minute)
- CPU temperature monitoring
- Real-time process monitoring (top CPU consumers)
- Status calculation with configurable thresholds
#### User-Stopped Service Tracker
- **Storage**: `/var/lib/cm-dashboard/user-stopped-services.json`
- **Thread Safety**: Global singleton with `Arc<Mutex<>>`
- **Persistence**: Automatic save on state changes
- **Global Access**: Static methods for collector integration
### Memory Collector (`memory.rs`)
#### Other Collectors
- **CPU**: Load average, temperature, frequency monitoring
- **Memory**: RAM/swap usage, tmpfs monitoring
- **Disk**: Filesystem usage, SMART health data
- **NixOS**: Build version, active users, agent version
- **Backup**: Borgbackup repository status and metrics
- RAM usage (total, used, available)
- Swap monitoring
- Real-time process monitoring (top RAM consumers)
- Memory pressure detection
### ZMQ Protocol
### Disk Collector (`disk.rs`)
```rust
// Metric Message
#[derive(Serialize, Deserialize)]
pub struct MetricMessage {
pub hostname: String,
pub timestamp: u64,
pub metrics: Vec<Metric>,
}
- Filesystem usage per mount point
- SMART health monitoring
- Temperature and wear tracking
- Configurable filesystem monitoring
// Service Commands
pub enum AgentCommand {
ServiceControl {
service_name: String,
action: ServiceAction,
},
SystemRebuild { /* SSH config */ },
CollectNow,
}
### Systemd Collector (`systemd.rs`)
pub enum ServiceAction {
Start, // System-initiated
Stop, // System-initiated
UserStart, // User via dashboard (clears user-stopped)
UserStop, // User via dashboard (marks user-stopped)
Status,
}
```
- Service status monitoring (`active`, `inactive`, `failed`)
- Memory usage per service
- Service filtering and exclusions
- Handles transitional states (`Status::Pending`)
### Maintenance Mode
### Backup Collector (`backup.rs`)
Suppress notifications during planned maintenance:
- Reads TOML status files from backup systems
- Archive age verification
- Disk usage tracking
- Repository health monitoring
```bash
# Enable maintenance mode
touch /tmp/cm-maintenance
# Perform maintenance
systemctl stop service
# ... work ...
systemctl start service
# Disable maintenance mode
rm /tmp/cm-maintenance
```
## Email Notifications
### Intelligent Batching
- **Real-time dashboard**: Immediate status updates
- **Batched emails**: Aggregated every 30 seconds
- **Smart grouping**: Services organized by severity
- **Recovery suppression**: Reduces notification spam
The system implements smart notification batching to prevent email spam:
- **Real-time dashboard updates** - Status changes appear immediately
- **Batched email notifications** - Aggregated every 30 seconds
- **Detailed groupings** - Services organized by severity
### Example Alert Email
### Example Alert
```
Subject: Status Alert: 2 critical, 1 warning, 15 started
Subject: Status Alert: 1 critical, 2 warnings, 0 recoveries
Status Summary (30s duration)
Host Status: Ok → Warning
🔴 CRITICAL ISSUES (2):
postgresql: Ok → Critical
nginx: Warning → Critical
🔴 CRITICAL ISSUES (1):
postgresql: Ok → Critical (memory usage 95%)
🟡 WARNINGS (1):
redis: Ok → Warning (memory usage 85%)
🟡 WARNINGS (2):
nginx: Ok → Warning (high load 8.5)
redis: user-stopped → Warning (restarted by system)
✅ RECOVERIES (0):
🟢 SERVICE STARTUPS (15):
docker: Unknown → Ok
sshd: Unknown → Ok
...
--
CM Dashboard Agent
Generated at 2025-10-21 19:42:42 CET
CM Dashboard Agent v0.1.43
```
## Individual Metrics Architecture
The system follows a **metrics-first architecture**:
### Agent Side
```rust
// Agent collects individual metrics
vec![
Metric::new("cpu_load_1min".to_string(), MetricValue::Float(2.5), Status::Ok),
Metric::new("memory_usage_percent".to_string(), MetricValue::Float(78.5), Status::Warning),
Metric::new("service_nginx_status".to_string(), MetricValue::String("active".to_string()), Status::Ok),
]
```
### Dashboard Side
```rust
// Widgets subscribe to specific metrics
impl Widget for CpuWidget {
fn update_from_metrics(&mut self, metrics: &[&Metric]) {
for metric in metrics {
match metric.name.as_str() {
"cpu_load_1min" => self.load_1min = metric.value.as_f32(),
"cpu_load_5min" => self.load_5min = metric.value.as_f32(),
"cpu_temperature_celsius" => self.temperature = metric.value.as_f32(),
_ => {}
}
}
}
}
```
## Persistent Cache
The cache system prevents false notifications:
- **Automatic saving** - Saves when service status changes
- **Persistent storage** - Maintains state across agent restarts
- **Simple design** - No complex TTL or cleanup logic
- **Status preservation** - Prevents duplicate notifications
## Development
### Project Structure
```
cm-dashboard/
├── agent/ # Metrics collection agent
│ ├── src/
│ │ ├── collectors/ # CPU, memory, disk, systemd, backup
│ │ ├── collectors/ # CPU, memory, disk, systemd, backup, nixos
│ │ ├── service_tracker.rs # User-stopped service tracking
│ │ ├── status/ # Status aggregation and notifications
│ │ ├── cache/ # Persistent metric caching
│ │ ├── config/ # TOML configuration loading
│ │ └── notifications/ # Email notification system
│ │ └── communication/ # ZMQ message handling
├── dashboard/ # TUI dashboard application
│ ├── src/
│ │ ├── ui/widgets/ # CPU, memory, services, backup widgets
│ │ ├── metrics/ # Metric storage and filtering
│ │ └── communication/ # ZMQ metric consumption
│ │ ├── ui/widgets/ # CPU, memory, services, backup, system
│ │ ├── communication/ # ZMQ consumption and commands
│ │ └── app.rs # Main application loop
├── shared/ # Shared types and utilities
│ └── src/
│ ├── metrics.rs # Metric, Status, and Value types
│ ├── metrics.rs # Metric, Status, StatusTracker types
│ ├── protocol.rs # ZMQ message format
│ └── cache.rs # Cache configuration
└── README.md # This file
└── CLAUDE.md # Development guidelines and rules
```
### Building
### Testing
```bash
# Debug build
cargo build --workspace
# Build and test
nix-shell -p openssl pkg-config --run "cargo build --workspace"
nix-shell -p openssl pkg-config --run "cargo test --workspace"
# Release build
cargo build --workspace --release
# Run tests
cargo test --workspace
# Check code formatting
cargo fmt --all -- --check
# Run clippy linter
# Code quality
cargo fmt --all
cargo clippy --workspace -- -D warnings
```
### Dependencies
## Deployment
- **tokio** - Async runtime
- **zmq** - Message passing between agent and dashboard
- **ratatui** - Terminal user interface
- **serde** - Serialization for metrics and config
- **anyhow/thiserror** - Error handling
- **tracing** - Structured logging
- **lettre** - SMTP email notifications
- **clap** - Command-line argument parsing
- **toml** - Configuration file parsing
### Automated Binary Releases
```bash
# Create new release
cd ~/projects/cm-dashboard
git tag v0.1.X
git push origin v0.1.X
```
## NixOS Integration
This triggers automated:
- Static binary compilation with `RUSTFLAGS="-C target-feature=+crt-static"`
- GitHub-style release creation
- Tarball upload to Gitea
This project is designed for declarative deployment via NixOS:
### Configuration Generation
The NixOS module automatically generates the agent configuration:
### NixOS Integration
Update `~/projects/nixosbox/hosts/common/cm-dashboard.nix`:
```nix
# hosts/common/cm-dashboard.nix
services.cm-dashboard-agent = {
enable = true;
port = 6130;
version = "v0.1.43";
src = pkgs.fetchurl {
url = "https://gitea.cmtec.se/cm/cm-dashboard/releases/download/${version}/cm-dashboard-linux-x86_64.tar.gz";
sha256 = "sha256-HASH";
};
```
### Deployment
Get hash via:
```bash
# Update NixOS configuration
git add hosts/common/cm-dashboard.nix
git commit -m "Update cm-dashboard configuration"
git push
# Rebuild system (user-performed)
sudo nixos-rebuild switch --flake .
cd ~/projects/nixosbox
nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchurl {
url = "URL_HERE";
sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
}' 2>&1 | grep "got:"
```
## Monitoring Intervals
- **CPU/Memory**: 2 seconds (real-time monitoring)
- **Disk usage**: 300 seconds (5 minutes)
- **Systemd services**: 10 seconds
- **SMART health**: 600 seconds (10 minutes)
- **Backup status**: 60 seconds (1 minute)
- **Email notifications**: 30 seconds (batched)
- **Dashboard updates**: 1 second (real-time display)
- **Metrics Collection**: 2 seconds (CPU, memory, services)
- **Metric Transmission**: 2 seconds (ZMQ publish)
- **Dashboard Updates**: 1 second (UI refresh)
- **Email Notifications**: 30 seconds (batched)
- **Disk Monitoring**: 300 seconds (5 minutes)
- **Service Discovery**: 300 seconds (5 minutes cache)
## License
MIT License - see LICENSE file for details
MIT License - see LICENSE file for details.

63
TODO.md
View File

@@ -1,63 +0,0 @@
# TODO
## Systemd filtering (agent)
- remove user systemd collection
- reduce number of systemctl call
- Cahnge so only services in include list are detected
- Filter on exact name
- Add support for "\*" in filtering
## System panel (agent/dashboard)
use following layout:
'''
NixOS:
Build: xxxxxx
Agen: xxxxxx
CPU:
● Load: 0.02 0.31 0.86
└─ Freq: 3000MHz
RAM:
● Usage: 33% 2.6GB/7.6GB
└─ ● /tmp: 0% 0B/2.0GB
Storage:
● /:
├─ ● nvme0n1 T: 40C • W: 4%
└─ ● 8% 75.0GB/906.2GB
'''
- Add support to show login/active users
- Add support to show timestamp/version for latest nixos rebuild
## Backup panel (dashboard)
use following layout:
'''
Latest backup:
● <timestamp>
└─ Duration: 1.3m
Disk:
● Samsung SSD 870 QVO 1TB
├─ S/N: S5RRNF0W800639Y
└─ Usage: 50.5GB/915.8GB
Repos:
● gitea (4) 5.1GB
● immich (4) 45.0GB
● kryddorten (4) 67.8MB
● mariehall2 (4) 322.7MB
● nixosbox (4) 5.5MB
● unifi (4) 5.7MB
● vaultwarden (4) 508kB
'''
## Keyboard navigation and scrolling (dashboard)
- Add keyboard navigation between panels "Shift-Tab"
- Add lower statusbar with dynamic updated shortcuts when switchng between panels
## Remote execution (agent/dashboard)
- Add support for send command via dashboard to agent to do nixos rebuid
- Add support for navigating services in dashboard and trigger start/stop/restart
- Add support for trigger backup

View File

@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard-agent"
version = "0.1.40"
version = "0.1.47"
edition = "2021"
[dependencies]

View File

@@ -8,6 +8,7 @@ use crate::communication::{AgentCommand, ServiceAction, ZmqHandler};
use crate::config::AgentConfig;
use crate::metrics::MetricCollectionManager;
use crate::notifications::NotificationManager;
use crate::service_tracker::UserStoppedServiceTracker;
use crate::status::HostStatusManager;
use cm_dashboard_shared::{Metric, MetricMessage, MetricValue, Status};
@@ -18,6 +19,7 @@ pub struct Agent {
metric_manager: MetricCollectionManager,
notification_manager: NotificationManager,
host_status_manager: HostStatusManager,
service_tracker: UserStoppedServiceTracker,
}
impl Agent {
@@ -50,6 +52,10 @@ impl Agent {
let host_status_manager = HostStatusManager::new(config.status_aggregation.clone());
info!("Host status manager initialized");
// Initialize user-stopped service tracker
let service_tracker = UserStoppedServiceTracker::init_global()?;
info!("User-stopped service tracker initialized");
Ok(Self {
hostname,
config,
@@ -57,6 +63,7 @@ impl Agent {
metric_manager,
notification_manager,
host_status_manager,
service_tracker,
})
}
@@ -173,6 +180,9 @@ impl Agent {
let version_metric = self.get_agent_version_metric();
metrics.push(version_metric);
// Check for user-stopped services that are now active and clear their flags
self.clear_user_stopped_flags_for_active_services(&metrics);
if metrics.is_empty() {
debug!("No metrics to broadcast");
return Ok(());
@@ -271,18 +281,34 @@ impl Agent {
/// Handle systemd service control commands
async fn handle_service_control(&mut self, service_name: &str, action: &ServiceAction) -> Result<()> {
let action_str = match action {
ServiceAction::Start => "start",
ServiceAction::Stop => "stop",
ServiceAction::Status => "status",
let (action_str, is_user_action) = match action {
ServiceAction::Start => ("start", false),
ServiceAction::Stop => ("stop", false),
ServiceAction::Status => ("status", false),
ServiceAction::UserStart => ("start", true),
ServiceAction::UserStop => ("stop", true),
};
info!("Executing systemctl {} {}", action_str, service_name);
info!("Executing systemctl {} {} (user action: {})", action_str, service_name, is_user_action);
// Handle user-stopped service tracking before systemctl execution (stop only)
match action {
ServiceAction::UserStop => {
info!("Marking service '{}' as user-stopped", service_name);
if let Err(e) = self.service_tracker.mark_user_stopped(service_name) {
error!("Failed to mark service as user-stopped: {}", e);
} else {
// Sync to global tracker
UserStoppedServiceTracker::update_global(&self.service_tracker);
}
}
_ => {}
}
let output = tokio::process::Command::new("sudo")
.arg("systemctl")
.arg(action_str)
.arg(service_name)
.arg(format!("{}.service", service_name))
.output()
.await?;
@@ -291,6 +317,9 @@ impl Agent {
if !output.stdout.is_empty() {
debug!("stdout: {}", String::from_utf8_lossy(&output.stdout));
}
// Note: User-stopped flag will be cleared by systemd collector
// when service actually reaches 'active' state, not here
} else {
let stderr = String::from_utf8_lossy(&output.stderr);
error!("Service {} {} failed: {}", service_name, action_str, stderr);
@@ -298,7 +327,7 @@ impl Agent {
}
// Force refresh metrics after service control to update service status
if matches!(action, ServiceAction::Start | ServiceAction::Stop) {
if matches!(action, ServiceAction::Start | ServiceAction::Stop | ServiceAction::UserStart | ServiceAction::UserStop) {
info!("Triggering immediate metric refresh after service control");
if let Err(e) = self.collect_metrics_only().await {
error!("Failed to refresh metrics after service control: {}", e);
@@ -310,4 +339,33 @@ impl Agent {
Ok(())
}
/// Check metrics for user-stopped services that are now active and clear their flags
fn clear_user_stopped_flags_for_active_services(&mut self, metrics: &[Metric]) {
for metric in metrics {
// Look for service status metrics that are active
if metric.name.starts_with("service_") && metric.name.ends_with("_status") {
if let MetricValue::String(status) = &metric.value {
if status == "active" {
// Extract service name from metric name (service_nginx_status -> nginx)
let service_name = metric.name
.strip_prefix("service_")
.and_then(|s| s.strip_suffix("_status"))
.unwrap_or("");
if !service_name.is_empty() && UserStoppedServiceTracker::is_service_user_stopped(service_name) {
info!("Service '{}' is now active - clearing user-stopped flag", service_name);
if let Err(e) = self.service_tracker.clear_user_stopped(service_name) {
error!("Failed to clear user-stopped flag for '{}': {}", service_name, e);
} else {
// Sync to global tracker
UserStoppedServiceTracker::update_global(&self.service_tracker);
debug!("Cleared user-stopped flag for service '{}'", service_name);
}
}
}
}
}
}
}
}

View File

@@ -37,6 +37,22 @@ impl NixOSCollector {
}
/// Get configuration hash from deployed nix store system
/// Get git commit hash from rebuild process
fn get_git_commit(&self) -> Result<String, Box<dyn std::error::Error>> {
let commit_file = "/var/lib/cm-dashboard/git-commit";
match std::fs::read_to_string(commit_file) {
Ok(content) => {
let commit_hash = content.trim();
if commit_hash.len() >= 7 {
Ok(commit_hash.to_string())
} else {
Err("Git commit hash too short".into())
}
}
Err(e) => Err(format!("Failed to read git commit file: {}", e).into())
}
}
fn get_config_hash(&self) -> Result<String, Box<dyn std::error::Error>> {
// Read the symlink target of /run/current-system to get nix store path
let output = Command::new("readlink")
@@ -74,25 +90,25 @@ impl Collector for NixOSCollector {
let mut metrics = Vec::new();
let timestamp = chrono::Utc::now().timestamp() as u64;
// Collect NixOS build information (config hash)
match self.get_config_hash() {
Ok(config_hash) => {
// Collect git commit information (shows what's actually deployed)
match self.get_git_commit() {
Ok(git_commit) => {
metrics.push(Metric {
name: "system_nixos_build".to_string(),
value: MetricValue::String(config_hash),
value: MetricValue::String(git_commit),
unit: None,
description: Some("NixOS deployed configuration hash".to_string()),
description: Some("Git commit hash of deployed configuration".to_string()),
status: Status::Ok,
timestamp,
});
}
Err(e) => {
debug!("Failed to get config hash: {}", e);
debug!("Failed to get git commit: {}", e);
metrics.push(Metric {
name: "system_nixos_build".to_string(),
value: MetricValue::String("unknown".to_string()),
unit: None,
description: Some("NixOS config hash (failed to detect)".to_string()),
description: Some("Git commit hash (failed to detect)".to_string()),
status: Status::Unknown,
timestamp,
});

View File

@@ -8,6 +8,7 @@ use tracing::debug;
use super::{Collector, CollectorError};
use crate::config::SystemdConfig;
use crate::service_tracker::UserStoppedServiceTracker;
/// Systemd collector for monitoring systemd services
pub struct SystemdCollector {
@@ -353,13 +354,37 @@ impl SystemdCollector {
Ok((active_status, detailed_info))
}
/// Calculate service status
fn calculate_service_status(&self, active_status: &str) -> Status {
/// Calculate service status, taking user-stopped services into account
fn calculate_service_status(&self, service_name: &str, active_status: &str) -> Status {
match active_status.to_lowercase().as_str() {
"active" => Status::Ok,
"inactive" | "dead" => Status::Warning,
"active" => {
// If service is now active and was marked as user-stopped, clear the flag
if UserStoppedServiceTracker::is_service_user_stopped(service_name) {
debug!("Service '{}' is now active - clearing user-stopped flag", service_name);
// Note: We can't directly clear here because this is a read-only context
// The agent will need to handle this differently
}
Status::Ok
},
"inactive" | "dead" => {
// Check if this service was stopped by user action
if UserStoppedServiceTracker::is_service_user_stopped(service_name) {
debug!("Service '{}' is inactive but marked as user-stopped - treating as OK", service_name);
Status::Ok
} else {
Status::Warning
}
},
"failed" | "error" => Status::Critical,
"activating" | "deactivating" | "reloading" | "start" | "stop" | "restart" => Status::Pending,
"activating" | "deactivating" | "reloading" | "start" | "stop" | "restart" => {
// For user-stopped services that are transitioning, keep them as OK during transition
if UserStoppedServiceTracker::is_service_user_stopped(service_name) {
debug!("Service '{}' is transitioning but was user-stopped - treating as OK", service_name);
Status::Ok
} else {
Status::Pending
}
},
_ => Status::Unknown,
}
}
@@ -480,7 +505,7 @@ impl Collector for SystemdCollector {
for service in &monitored_services {
match self.get_service_status(service) {
Ok((active_status, _detailed_info)) => {
let status = self.calculate_service_status(&active_status);
let status = self.calculate_service_status(service, &active_status);
// Individual service status metric
metrics.push(Metric {

View File

@@ -113,4 +113,6 @@ pub enum ServiceAction {
Start,
Stop,
Status,
UserStart, // User-initiated start (clears user-stopped flag)
UserStop, // User-initiated stop (marks as user-stopped)
}

View File

@@ -9,6 +9,7 @@ mod communication;
mod config;
mod metrics;
mod notifications;
mod service_tracker;
mod status;
use agent::Agent;

View File

@@ -0,0 +1,172 @@
use anyhow::Result;
use serde::{Deserialize, Serialize};
use std::collections::HashSet;
use std::fs;
use std::path::Path;
use std::sync::{Arc, Mutex, OnceLock};
use tracing::{debug, info, warn};
/// Shared instance for global access
static GLOBAL_TRACKER: OnceLock<Arc<Mutex<UserStoppedServiceTracker>>> = OnceLock::new();
/// Tracks services that have been stopped by user action
/// These services should be treated as OK status instead of Warning
#[derive(Debug)]
pub struct UserStoppedServiceTracker {
/// Set of services stopped by user action
user_stopped_services: HashSet<String>,
/// Path to persistent storage file
storage_path: String,
}
/// Serializable data structure for persistence
#[derive(Debug, Serialize, Deserialize)]
struct UserStoppedData {
services: Vec<String>,
}
impl UserStoppedServiceTracker {
/// Create new tracker with default storage path
pub fn new() -> Self {
Self::with_storage_path("/var/lib/cm-dashboard/user-stopped-services.json")
}
/// Initialize global instance (called by agent)
pub fn init_global() -> Result<Self> {
let tracker = Self::new();
// Set global instance
let global_instance = Arc::new(Mutex::new(tracker));
if GLOBAL_TRACKER.set(global_instance).is_err() {
warn!("Global service tracker was already initialized");
}
// Return a new instance for the agent to use
Ok(Self::new())
}
/// Check if a service is user-stopped (global access for collectors)
pub fn is_service_user_stopped(service_name: &str) -> bool {
if let Some(global) = GLOBAL_TRACKER.get() {
if let Ok(tracker) = global.lock() {
tracker.is_user_stopped(service_name)
} else {
debug!("Failed to lock global service tracker");
false
}
} else {
debug!("Global service tracker not initialized");
false
}
}
/// Update global tracker (called by agent when tracker state changes)
pub fn update_global(updated_tracker: &UserStoppedServiceTracker) {
if let Some(global) = GLOBAL_TRACKER.get() {
if let Ok(mut tracker) = global.lock() {
tracker.user_stopped_services = updated_tracker.user_stopped_services.clone();
} else {
debug!("Failed to lock global service tracker for update");
}
} else {
debug!("Global service tracker not initialized for update");
}
}
/// Create new tracker with custom storage path
pub fn with_storage_path<P: AsRef<Path>>(storage_path: P) -> Self {
let storage_path = storage_path.as_ref().to_string_lossy().to_string();
let mut tracker = Self {
user_stopped_services: HashSet::new(),
storage_path,
};
// Load existing data from storage
if let Err(e) = tracker.load_from_storage() {
warn!("Failed to load user-stopped services from storage: {}", e);
info!("Starting with empty user-stopped services list");
}
tracker
}
/// Mark a service as user-stopped
pub fn mark_user_stopped(&mut self, service_name: &str) -> Result<()> {
info!("Marking service '{}' as user-stopped", service_name);
self.user_stopped_services.insert(service_name.to_string());
self.save_to_storage()?;
debug!("Service '{}' marked as user-stopped and saved to storage", service_name);
Ok(())
}
/// Clear user-stopped flag for a service (when user starts it)
pub fn clear_user_stopped(&mut self, service_name: &str) -> Result<()> {
if self.user_stopped_services.remove(service_name) {
info!("Cleared user-stopped flag for service '{}'", service_name);
self.save_to_storage()?;
debug!("Service '{}' user-stopped flag cleared and saved to storage", service_name);
} else {
debug!("Service '{}' was not marked as user-stopped", service_name);
}
Ok(())
}
/// Check if a service is marked as user-stopped
pub fn is_user_stopped(&self, service_name: &str) -> bool {
let is_stopped = self.user_stopped_services.contains(service_name);
debug!("Service '{}' user-stopped status: {}", service_name, is_stopped);
is_stopped
}
/// Save current state to persistent storage
fn save_to_storage(&self) -> Result<()> {
// Create parent directory if it doesn't exist
if let Some(parent_dir) = Path::new(&self.storage_path).parent() {
if !parent_dir.exists() {
fs::create_dir_all(parent_dir)?;
debug!("Created parent directory: {}", parent_dir.display());
}
}
let data = UserStoppedData {
services: self.user_stopped_services.iter().cloned().collect(),
};
let json_data = serde_json::to_string_pretty(&data)?;
fs::write(&self.storage_path, json_data)?;
debug!(
"Saved {} user-stopped services to {}",
data.services.len(),
self.storage_path
);
Ok(())
}
/// Load state from persistent storage
fn load_from_storage(&mut self) -> Result<()> {
if !Path::new(&self.storage_path).exists() {
debug!("Storage file {} does not exist, starting fresh", self.storage_path);
return Ok(());
}
let json_data = fs::read_to_string(&self.storage_path)?;
let data: UserStoppedData = serde_json::from_str(&json_data)?;
self.user_stopped_services = data.services.into_iter().collect();
info!(
"Loaded {} user-stopped services from {}",
self.user_stopped_services.len(),
self.storage_path
);
if !self.user_stopped_services.is_empty() {
debug!("User-stopped services: {:?}", self.user_stopped_services);
}
Ok(())
}
}

View File

@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard"
version = "0.1.40"
version = "0.1.47"
edition = "2021"
[dependencies]

View File

@@ -295,18 +295,18 @@ impl Dashboard {
async fn execute_ui_command(&self, command: UiCommand) -> Result<()> {
match command {
UiCommand::ServiceStart { hostname, service_name } => {
info!("Sending start command for service {} on {}", service_name, hostname);
info!("Sending user start command for service {} on {}", service_name, hostname);
let agent_command = AgentCommand::ServiceControl {
service_name: service_name.clone(),
action: ServiceAction::Start,
action: ServiceAction::UserStart,
};
self.zmq_command_sender.send_command(&hostname, agent_command).await?;
}
UiCommand::ServiceStop { hostname, service_name } => {
info!("Sending stop command for service {} on {}", service_name, hostname);
info!("Sending user stop command for service {} on {}", service_name, hostname);
let agent_command = AgentCommand::ServiceControl {
service_name: service_name.clone(),
action: ServiceAction::Stop,
action: ServiceAction::UserStop,
};
self.zmq_command_sender.send_command(&hostname, agent_command).await?;
}

View File

@@ -36,6 +36,8 @@ pub enum ServiceAction {
Start,
Stop,
Status,
UserStart, // User-initiated start (clears user-stopped flag)
UserStop, // User-initiated stop (marks as user-stopped)
}
/// ZMQ consumer for receiving metrics from agents

View File

@@ -12,10 +12,6 @@ mod ui;
use app::Dashboard;
/// Get hardcoded version
fn get_version() -> &'static str {
"v0.1.33"
}
/// Check if running inside tmux session
fn check_tmux_session() {
@@ -42,7 +38,7 @@ fn check_tmux_session() {
#[derive(Parser)]
#[command(name = "cm-dashboard")]
#[command(about = "CM Dashboard TUI with individual metric consumption")]
#[command(version = get_version())]
#[command(version)]
struct Cli {
/// Increase logging verbosity (-v, -vv)
#[arg(short, long, action = clap::ArgAction::Count)]

View File

@@ -260,6 +260,10 @@ ssh -tt {}@{} 'bash -ic {}'",
std::process::Command::new("tmux")
.arg("display-popup")
.arg("-w")
.arg("80%")
.arg("-h")
.arg("80%")
.arg(&logo_and_rebuild)
.spawn()
.ok(); // Ignore errors, tmux will handle them
@@ -281,6 +285,29 @@ ssh -tt {}@{} 'bash -ic {}'",
}
}
}
KeyCode::Char('J') => {
// Show service logs via journalctl in tmux popup
if let (Some(service_name), Some(hostname)) = (self.get_selected_service(), self.current_host.clone()) {
let journalctl_command = format!(
"ssh -tt {}@{} 'journalctl -u {}.service -f --no-pager -n 50'",
self.config.ssh.rebuild_user,
hostname,
service_name
);
std::process::Command::new("tmux")
.arg("display-popup")
.arg("-w")
.arg("80%")
.arg("-h")
.arg("80%")
.arg("-t")
.arg(format!("Logs: {}", service_name))
.arg(&journalctl_command)
.spawn()
.ok(); // Ignore errors, tmux will handle them
}
}
KeyCode::Char('b') => {
// Trigger backup
if let Some(hostname) = self.current_host.clone() {
@@ -568,7 +595,7 @@ ssh -tt {}@{} 'bash -ic {}'",
// Left side: "cm-dashboard" text
let left_span = Span::styled(
" cm-dashboard",
Style::default().fg(Theme::background()).bg(background_color)
Style::default().fg(Theme::background()).bg(background_color).add_modifier(Modifier::BOLD)
);
let left_title = Paragraph::new(Line::from(vec![left_span]))
.style(Style::default().bg(background_color));
@@ -686,10 +713,11 @@ ssh -tt {}@{} 'bash -ic {}'",
let mut shortcuts = Vec::new();
// Global shortcuts
shortcuts.push("Tab: Switch Host".to_string());
shortcuts.push("↑↓/jk: Select Service".to_string());
shortcuts.push("r: Rebuild Host".to_string());
shortcuts.push("s/S: Start/Stop Service".to_string());
shortcuts.push("Tab: Host".to_string());
shortcuts.push("↑↓/jk: Select".to_string());
shortcuts.push("r: Rebuild".to_string());
shortcuts.push("s/S: Start/Stop".to_string());
shortcuts.push("J: Logs".to_string());
// Always show quit
shortcuts.push("q: Quit".to_string());
@@ -708,7 +736,7 @@ ssh -tt {}@{} 'bash -ic {}'",
host_widgets.system_scroll_offset
};
let host_widgets = self.get_or_create_host_widgets(&hostname);
host_widgets.system_widget.render_with_scroll(frame, inner_area, scroll_offset);
host_widgets.system_widget.render_with_scroll(frame, inner_area, scroll_offset, &hostname);
}
}

View File

@@ -292,12 +292,6 @@ impl Components {
}
impl Typography {
/// Main title style (dashboard header)
pub fn title() -> Style {
Style::default()
.fg(Theme::primary_text())
.bg(Theme::background())
}
/// Widget title style (panel headers) - bold bright white
pub fn widget_title() -> Style {

View File

@@ -113,13 +113,10 @@ impl ServicesWidget {
name.to_string()
};
// Parent services always show active/inactive status
// Parent services always show actual systemctl status
let status_str = match info.widget_status {
Status::Ok => "active".to_string(),
Status::Pending => "pending".to_string(),
Status::Warning => "inactive".to_string(),
Status::Critical => "failed".to_string(),
Status::Unknown => "unknown".to_string(),
_ => info.status.clone(), // Use actual status from agent (active/inactive/failed)
};
format!(

View File

@@ -439,12 +439,12 @@ impl Widget for SystemWidget {
impl SystemWidget {
/// Render with scroll offset support
pub fn render_with_scroll(&mut self, frame: &mut Frame, area: Rect, scroll_offset: usize) {
pub fn render_with_scroll(&mut self, frame: &mut Frame, area: Rect, scroll_offset: usize, hostname: &str) {
let mut lines = Vec::new();
// NixOS section
lines.push(Line::from(vec![
Span::styled("NixOS:", Typography::widget_title())
Span::styled(format!("NixOS {}:", hostname), Typography::widget_title())
]));
let build_text = self.nixos_build.as_deref().unwrap_or("unknown");

View File

@@ -1,88 +0,0 @@
# Hardcoded Values Removed - Configuration Summary
## ✅ All Hardcoded Values Converted to Configuration
### **1. SystemD Nginx Check Interval**
- **Before**: `nginx_check_interval_seconds: 30` (hardcoded)
- **After**: `nginx_check_interval_seconds: config.nginx_check_interval_seconds`
- **NixOS Config**: `nginx_check_interval_seconds = 30;`
### **2. ZMQ Transmission Interval**
- **Before**: `Duration::from_secs(1)` (hardcoded)
- **After**: `Duration::from_secs(self.config.zmq.transmission_interval_seconds)`
- **NixOS Config**: `transmission_interval_seconds = 1;`
### **3. HTTP Timeouts in SystemD Collector**
- **Before**:
```rust
.timeout(Duration::from_secs(10))
.connect_timeout(Duration::from_secs(10))
```
- **After**:
```rust
.timeout(Duration::from_secs(self.config.http_timeout_seconds))
.connect_timeout(Duration::from_secs(self.config.http_connect_timeout_seconds))
```
- **NixOS Config**:
```nix
http_timeout_seconds = 10;
http_connect_timeout_seconds = 10;
```
## **Configuration Structure Changes**
### **SystemdConfig** (agent/src/config/mod.rs)
```rust
pub struct SystemdConfig {
// ... existing fields ...
pub nginx_check_interval_seconds: u64, // NEW
pub http_timeout_seconds: u64, // NEW
pub http_connect_timeout_seconds: u64, // NEW
}
```
### **ZmqConfig** (agent/src/config/mod.rs)
```rust
pub struct ZmqConfig {
// ... existing fields ...
pub transmission_interval_seconds: u64, // NEW
}
```
## **NixOS Configuration Updates**
### **ZMQ Section** (hosts/common/cm-dashboard.nix)
```nix
zmq = {
# ... existing fields ...
transmission_interval_seconds = 1; # NEW
};
```
### **SystemD Section** (hosts/common/cm-dashboard.nix)
```nix
systemd = {
# ... existing fields ...
nginx_check_interval_seconds = 30; # NEW
http_timeout_seconds = 10; # NEW
http_connect_timeout_seconds = 10; # NEW
};
```
## **Benefits**
**No hardcoded values** - All timing/timeout values configurable
**Consistent configuration** - Everything follows NixOS config pattern
**Environment-specific tuning** - Can adjust timeouts per deployment
**Maintainability** - No magic numbers scattered in code
**Testing flexibility** - Can configure different values for testing
## **Runtime Behavior**
All previously hardcoded values now respect configuration:
- **Nginx latency checks**: Every 30s (configurable)
- **ZMQ transmission**: Every 1s (configurable)
- **HTTP requests**: 10s timeout (configurable)
- **HTTP connections**: 10s timeout (configurable)
The codebase is now **100% configuration-driven** with no hardcoded timing values.

View File

@@ -1,6 +1,6 @@
[package]
name = "cm-dashboard-shared"
version = "0.1.40"
version = "0.1.47"
edition = "2021"
[dependencies]

View File

@@ -1,42 +0,0 @@
#!/bin/bash
# Test script to verify collector intervals are working correctly
# Expected behavior:
# - CPU/Memory: Every 2 seconds
# - Systemd/Network: Every 10 seconds
# - Backup/NixOS: Every 60 seconds
# - Disk: Every 300 seconds (5 minutes)
echo "=== Testing Collector Interval Implementation ==="
echo "Expected intervals from NixOS config:"
echo " CPU: 2s, Memory: 2s"
echo " Systemd: 10s, Network: 10s"
echo " Backup: 60s, NixOS: 60s"
echo " Disk: 300s (5m)"
echo ""
# Note: Cannot run actual agent without proper config, but we can verify the code logic
echo "✅ Code Implementation Status:"
echo " - TimedCollector struct with interval tracking: IMPLEMENTED"
echo " - Individual collector intervals from config: IMPLEMENTED"
echo " - collect_metrics_timed() respects intervals: IMPLEMENTED"
echo " - Debug logging shows interval compliance: IMPLEMENTED"
echo ""
echo "🔍 Key Implementation Details:"
echo " - MetricCollectionManager now tracks last_collection time per collector"
echo " - Each collector gets Duration::from_secs(config.{collector}.interval_seconds)"
echo " - Only collectors with elapsed >= interval are called"
echo " - Debug logs show actual collection with interval info"
echo ""
echo "📊 Expected Runtime Behavior:"
echo " At 0s: All collectors run (startup)"
echo " At 2s: CPU, Memory run"
echo " At 4s: CPU, Memory run"
echo " At 10s: CPU, Memory, Systemd, Network run"
echo " At 60s: CPU, Memory, Systemd, Network, Backup, NixOS run"
echo " At 300s: All collectors run including Disk"
echo ""
echo "✅ CONCLUSION: Codebase now follows NixOS configuration intervals correctly!"

View File

@@ -1,32 +0,0 @@
#!/usr/bin/env rust-script
use std::process;
/// Check if running inside tmux session
fn check_tmux_session() {
// Check for TMUX environment variable which is set when inside a tmux session
if std::env::var("TMUX").is_err() {
eprintln!("╭─────────────────────────────────────────────────────────────╮");
eprintln!("│ ⚠️ TMUX REQUIRED │");
eprintln!("├─────────────────────────────────────────────────────────────┤");
eprintln!("│ CM Dashboard must be run inside a tmux session for proper │");
eprintln!("│ terminal handling and remote operation functionality. │");
eprintln!("│ │");
eprintln!("│ Please start a tmux session first: │");
eprintln!("│ tmux new-session -d -s dashboard cm-dashboard │");
eprintln!("│ tmux attach-session -t dashboard │");
eprintln!("│ │");
eprintln!("│ Or simply: │");
eprintln!("│ tmux │");
eprintln!("│ cm-dashboard │");
eprintln!("╰─────────────────────────────────────────────────────────────╯");
process::exit(1);
} else {
println!("✅ Running inside tmux session - OK");
}
}
fn main() {
println!("Testing tmux check function...");
check_tmux_session();
}

View File

@@ -1,53 +0,0 @@
#!/bin/bash
echo "=== TMUX Check Implementation Test ==="
echo ""
echo "📋 Testing tmux check logic:"
echo ""
echo "1. Current environment:"
if [ -n "$TMUX" ]; then
echo " ✅ Running inside tmux session"
echo " TMUX variable: $TMUX"
else
echo " ❌ NOT running inside tmux session"
echo " TMUX variable: (not set)"
fi
echo ""
echo "2. Simulating dashboard tmux check logic:"
echo ""
# Simulate the Rust check logic
if [ -z "$TMUX" ]; then
echo " Dashboard would show:"
echo " ╭─────────────────────────────────────────────────────────────╮"
echo " │ ⚠️ TMUX REQUIRED │"
echo " ├─────────────────────────────────────────────────────────────┤"
echo " │ CM Dashboard must be run inside a tmux session for proper │"
echo " │ terminal handling and remote operation functionality. │"
echo " │ │"
echo " │ Please start a tmux session first: │"
echo " │ tmux new-session -d -s dashboard cm-dashboard │"
echo " │ tmux attach-session -t dashboard │"
echo " │ │"
echo " │ Or simply: │"
echo " │ tmux │"
echo " │ cm-dashboard │"
echo " ╰─────────────────────────────────────────────────────────────╯"
echo " Then exit with code 1"
else
echo " ✅ Dashboard tmux check would PASS - continuing normally"
fi
echo ""
echo "3. Implementation status:"
echo " ✅ check_tmux_session() function added to dashboard/src/main.rs"
echo " ✅ Called early in main() but only for TUI mode (not headless)"
echo " ✅ Uses std::env::var(\"TMUX\") to detect tmux session"
echo " ✅ Shows helpful error message with usage instructions"
echo " ✅ Exits with code 1 if not in tmux"
echo ""
echo "✅ TMUX check implementation complete!"