# CM Dashboard - Infrastructure Monitoring TUI

## Overview

A high-performance Rust-based TUI dashboard for monitoring CMTEC infrastructure. Built to replace Glance with a custom solution tailored for our specific monitoring needs and ZMQ-based metric collection.

## Implementation Strategy

### Current Implementation Status

**System Panel Enhancement - COMPLETED** ✅

All system panel features successfully implemented:
- ✅ **NixOS Collector**: Created collector for version and active users
- ✅ **System Widget**: Unified widget combining NixOS, CPU, RAM, and Storage
- ✅ **Build Display**: Shows NixOS build information without codename
- ✅ **Active Users**: Displays currently logged in users
- ✅ **Tmpfs Monitoring**: Added /tmp usage to RAM section
- ✅ **Agent Deployment**: NixOS collector working in production

**Keyboard Navigation and Service Management - COMPLETED** ✅

All keyboard navigation and service selection features successfully implemented:
- ✅ **Panel Navigation**: Shift+Tab cycles through visible panels only (System → Services → Backup)
- ✅ **Service Selection**: Up/Down arrows navigate through parent services with visual cursor
- ✅ **Focus Management**: Selection highlighting only visible when Services panel focused
- ✅ **Status Preservation**: Service health colors maintained during selection (green/red icons)
- ✅ **Smart Panel Switching**: Only cycles through panels with data (backup panel conditional)
- ✅ **Scroll Support**: All panels support content scrolling with proper overflow indicators

**Current Status - October 25, 2025:**
- All keyboard navigation features working correctly ✅
- Service selection cursor implemented with focus-aware highlighting ✅
- Panel scrolling fixed for System, Services, and Backup panels ✅
- Build display working: "Build: 25.05.20251004.3bcc93c" ✅
- Configuration hash display: Currently shows git hash, needs to be fixed ❌

**Target Layout:**
```
NixOS:
Build: 25.05.20251004.3bcc93c
Config: d8ivwiar    # Should show nix store hash (8 chars) from deployed system
Active users: cm, simon
CPU:
● Load: 0.02 0.31 0.86 • 3000MHz
RAM:
● Usage: 33% 2.6GB/7.6GB
● /tmp: 0% 0B/2.0GB
Storage:
● root (Single):
  ├─ ● nvme0n1 W: 1%
  └─ ● 18% 167.4GB/928.2GB
```

**System panel layout fully implemented with blue tree symbols ✅**
**Tree symbols now use consistent blue theming across all panels ✅**
**Overflow handling restored for all widgets ("... and X more") ✅**
**Agent hash display working correctly ✅**

### Current Keyboard Navigation Implementation

**Navigation Controls:**
- **Tab**: Switch between hosts (cmbox, srv01, srv02, steambox, etc.)
- **Shift+Tab**: Cycle through visible panels (System → Services → Backup → System)
- **Up/Down (System/Backup)**: Scroll through panel content
- **Up/Down (Services)**: Move service selection cursor between parent services
- **q**: Quit dashboard
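
The Shift+Tab behaviour described above (cycle in a fixed order, skip panels without data, wrap around) could be sketched roughly as follows. This is only an illustration: the `Panel` enum, the `CYCLE` constant, and the `backup_visible` flag are hypothetical names, not taken from the cm-dashboard source.

```rust
/// Panels in the order Shift+Tab visits them (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Panel {
    System,
    Services,
    Backup,
}

/// Fixed cycle order: System → Services → Backup → System.
const CYCLE: [Panel; 3] = [Panel::System, Panel::Services, Panel::Backup];

/// Returns the next visible panel after `current`, wrapping around.
/// `backup_visible` is an assumed flag: true when the host reports backup data.
pub fn next_panel(current: Panel, backup_visible: bool) -> Panel {
    let start = CYCLE.iter().position(|p| *p == current).unwrap_or(0);
    (1..=CYCLE.len())
        .map(|step| CYCLE[(start + step) % CYCLE.len()])
        // The Backup panel is skipped when it has nothing to show.
        .find(|p| *p != Panel::Backup || backup_visible)
        .unwrap_or(current)
}
```

With `backup_visible` set to `false`, pressing Shift+Tab on the Services panel goes straight back to System, which matches the "visible panels only" rule.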

**Panel-Specific Features:**
- **System Panel**: Scrollable content with CPU, RAM, Storage details
- **Services Panel**: Service selection cursor for parent services only (docker, nginx, postgresql, etc.)
- **Backup Panel**: Scrollable repository list with proper overflow handling

**Visual Feedback:**
- **Focused Panel**: Blue border and title highlighting
- **Service Selection**: Blue background with preserved status icon colors (green ● for active, red ● for failed)
- **Focus-Aware Selection**: Selection highlighting only visible when Services panel focused
- **Dynamic Statusbar**: Context-aware shortcuts based on focused panel

### Remote Command Execution - WORKING ✅

**All Issues Resolved (as of 2025-10-24):**
- ✅ **ZMQ Command Protocol**: Extended with ServiceControl and SystemRebuild variants
- ✅ **Agent Handlers**: systemctl and nixos-rebuild execution with maintenance mode
- ✅ **Dashboard Integration**: Keyboard shortcuts execute commands
- ✅ **Service Control**: Fixed toggle logic - replaced with separate 's' (start) and 'S' (stop)
- ✅ **System Rebuild**: Fixed permission issues and sandboxing problems
- ✅ **Git Clone Approach**: Implemented for nixos-rebuild to avoid directory permissions
- ✅ **Visual Feedback**: Directional arrows for service status (↑ starting, ↓ stopping, ↻ restarting)

**Keyboard Controls Status:**
- **Services Panel**:
  - R (restart) ✅ Working
  - s (start) ✅ Working
  - S (stop) ✅ Working
- **System Panel**: R (nixos-rebuild) ✅ Working with --option sandbox false
- **Backup Panel**: B (trigger backup) ❓ Not implemented
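
As a rough illustration of how the key bindings above might map onto the command protocol, here is a minimal sketch. Only the variant names `ServiceControl` and `SystemRebuild` come from the notes above; the fields, the `ServiceAction` and `Panel` types, and `command_for_key` are assumptions made for the example.

```rust
/// Hypothetical action carried by a service command.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ServiceAction {
    Start,
    Stop,
    Restart,
}

/// Command sent from dashboard to agent. The variant names follow the notes;
/// the field layout is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AgentCommand {
    ServiceControl { service: String, action: ServiceAction },
    SystemRebuild,
}

/// Hypothetical identifier for the focused panel.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Panel {
    System,
    Services,
    Backup,
}

/// Maps a key press to a command, depending on the focused panel.
/// Returns None for unbound keys, or when no service is selected.
pub fn command_for_key(
    panel: Panel,
    key: char,
    selected_service: Option<&str>,
) -> Option<AgentCommand> {
    match (panel, key) {
        (Panel::Services, 's' | 'S' | 'R') => {
            let action = match key {
                's' => ServiceAction::Start,
                'S' => ServiceAction::Stop,
                _ => ServiceAction::Restart,
            };
            selected_service.map(|name| AgentCommand::ServiceControl {
                service: name.to_string(),
                action,
            })
        }
        (Panel::System, 'R') => Some(AgentCommand::SystemRebuild),
        // 'B' on the Backup panel is not implemented yet, so it maps to nothing.
        _ => None,
    }
}
```

Keeping start and stop on separate keys, as the notes describe, means the mapping never has to guess the current service state the way a toggle would.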

**Visual Feedback Implementation - IN PROGRESS:**

Context-appropriate progress indicators for each panel:

**Services Panel** (Service status transitions):
```
● nginx active → ⏳ nginx restarting → ● nginx active
● docker active → ⏳ docker stopping → ● docker inactive
```

**System Panel** (Build progress in NixOS section):
```
NixOS:
Build: 25.05.20251004.3bcc93c  →  Build: [████████████ ] 65%
Active users: cm, simon           Active users: cm, simon
```

**Backup Panel** (OnGoing status with progress):
```
Latest backup:           →  Latest backup:
● 2024-10-23 14:32:15       ● OnGoing
└─ Duration: 1.3m           └─ [██████ ] 60%
```
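
The bracketed progress bars in the mock-ups above could be produced by a small helper along these lines. This is a sketch under assumptions: the function name, the fixed bar width, and rounding down are illustrative choices, not the dashboard's actual rendering code.

```rust
/// Renders a fixed-width text progress bar such as `[██████    ] 60%`.
/// `width` is the number of character cells between the brackets (assumed).
pub fn render_progress(percent: u8, width: usize) -> String {
    // Clamp so that out-of-range input cannot overflow the bar.
    let percent = percent.min(100) as usize;
    // Integer division rounds down, so the bar only fills completely at 100%.
    let filled = percent * width / 100;
    format!(
        "[{}{}] {}%",
        "█".repeat(filled),
        " ".repeat(width - filled),
        percent
    )
}
```

For example, `render_progress(60, 10)` yields six filled cells followed by four spaces.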

**Critical Configuration Hash Fix - HIGH PRIORITY:**

**Problem:** Configuration hash currently shows git commit hash instead of actual deployed system hash.

**Current (incorrect):**
- Shows git hash: `db11f82` (source repository commit)
- Not accurate - doesn't reflect what's actually deployed

**Target (correct):**
- Show nix store hash: `d8ivwiar` (first 8 chars from deployed system)
- Source: `/nix/store/d8ivwiarhwhgqzskj6q2482r58z46qjf-nixos-system-cmbox-25.05.20251004.3bcc93c`
- Pattern: Extract hash from `/nix/store/HASH-nixos-system-HOSTNAME-VERSION`

**Benefits:**
1. **Deployment Verification:** Confirms rebuild actually succeeded
2. **Accurate Status:** Shows what's truly running, not just source
3. **Rebuild Completion Detection:** Hash change = rebuild completed
4. **Rollback Tracking:** Each deployment has unique identifier

**Implementation Required:**
1. Agent extracts nix store hash from `ls -la /run/current-system`
2. Reports this as `system_config_hash` metric instead of git hash
3. Dashboard displays first 8 characters: `Config: d8ivwiar`
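
A minimal sketch of the extraction step, assuming the store path always follows the `/nix/store/HASH-nixos-system-HOSTNAME-VERSION` pattern given above. Note one deliberate substitution: instead of parsing `ls -la` output, this sketch reads the symlink target with `std::fs::read_link`, which avoids depending on `ls` formatting. The function names are hypothetical.

```rust
use std::fs;

/// Extracts HASH from a path shaped like
/// `/nix/store/HASH-nixos-system-HOSTNAME-VERSION`.
/// Returns None for anything that does not match that shape.
pub fn store_hash(store_path: &str) -> Option<&str> {
    let name = store_path.strip_prefix("/nix/store/")?;
    let (hash, rest) = name.split_once('-')?;
    if hash.is_empty() || !rest.starts_with("nixos-system-") {
        return None;
    }
    Some(hash)
}

/// Shortens a hash to its first 8 characters for display.
/// Falls back to the full string if it is shorter than 8 bytes.
pub fn short_hash(hash: &str) -> &str {
    hash.get(..8).unwrap_or(hash)
}

/// Reads the `/run/current-system` symlink and returns the short hash.
/// Returns None on hosts where the link is missing or has another shape.
pub fn current_system_hash() -> Option<String> {
    let target = fs::read_link("/run/current-system").ok()?;
    store_hash(target.to_str()?).map(|h| short_hash(h).to_string())
}
```

The agent could report the result of `current_system_hash()` as the `system_config_hash` metric. Because the value changes only when a new system generation is activated, the dashboard can treat a change as the rebuild-completed signal.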

**Next Session Priority Tasks:**

**Remaining Features:**
1. **Fix Configuration Hash Display (CRITICAL)**:
   - Use nix store hash instead of git commit hash
   - Extract from `/run/current-system` -> `/nix/store/HASH-nixos-system-*`
   - Enables proper rebuild completion detection

2. **Command Response Protocol**:
   - Agent sends command completion/failure back to dashboard via ZMQ
   - Dashboard updates UI status from ⏳ to ● when commands complete
   - Clear success/failure status after timeout

3. **Backup Panel Features**:
   - Implement backup trigger functionality (B key)
   - Complete visual feedback for backup operations
   - Add backup progress indicators

**Enhancement Tasks:**
- Add confirmation dialogs for destructive actions (stop/restart/rebuild)
- Implement command history/logging
- Add keyboard shortcuts help overlay

**Future Enhanced Navigation:**
- Add Page Up/Down for faster scrolling through long service lists
- Implement search/filter functionality for services
- Add jump-to-service shortcuts (first letter navigation)

**Future Advanced Features:**
- Service dependency visualization
- Historical service status tracking
- Real-time log viewing integration

## Core Architecture Principles - CRITICAL

### Individual Metrics Philosophy

**NEW ARCHITECTURE**: Agent collects individual metrics, dashboard composes widgets from those metrics.

### Maintenance Mode

**Purpose:**

- Suppress email notifications during planned maintenance or backups
- Prevents false alerts when services are intentionally stopped

**Implementation:**

- Agent checks for `/tmp/cm-maintenance` file before sending notifications
- File presence suppresses all email notifications while continuing monitoring
- Dashboard continues to show real status, only notifications are blocked
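
The check described above could look roughly like this on the agent side. It is a sketch only: `MAINTENANCE_FILE` mirrors the path from the notes, while `maintenance_active`, `notify_unless_maintenance`, and the `send` callback are hypothetical names standing in for the real notification code.

```rust
use std::path::Path;

/// Marker file path from the notes above.
pub const MAINTENANCE_FILE: &str = "/tmp/cm-maintenance";

/// True while the maintenance marker file exists.
pub fn maintenance_active(marker: &Path) -> bool {
    marker.exists()
}

/// Calls `send` with the message unless maintenance mode is active.
/// Returns whether the notification was sent. Monitoring itself is
/// unaffected; only the notification step is gated.
pub fn notify_unless_maintenance<F: FnOnce(&str)>(
    marker: &Path,
    message: &str,
    send: F,
) -> bool {
    if maintenance_active(marker) {
        return false;
    }
    send(message);
    true
}
```

Checking the file at send time, rather than caching the result, means that removing the marker takes effect on the very next notification.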

**Usage:**

```bash
# Enable maintenance mode
touch /tmp/cm-maintenance

# Run maintenance tasks (backups, service restarts, etc.)
systemctl stop service
# ... maintenance work ...
systemctl start service

# Disable maintenance mode
rm /tmp/cm-maintenance
```

**NixOS Integration:**

- Borgbackup script automatically creates/removes maintenance file
- Automatic cleanup via trap ensures maintenance mode doesn't stick
- All configuration shall be done from the NixOS config

**ARCHITECTURE ENFORCEMENT**:

- **ZERO legacy code reuse** - Fresh implementation following ARCHITECT.md exactly
- **Individual metrics only** - NO grouped metric structures
- **Reference-only legacy** - Study old functionality, implement new architecture
- **Clean slate mindset** - Build as if legacy codebase never existed

**Implementation Rules**:

1. **Individual Metrics**: Each metric is collected, transmitted, and stored individually
2. **Agent Status Authority**: Agent calculates status for each metric using thresholds
3. **Dashboard Composition**: Dashboard widgets subscribe to specific metrics by name
4. **Status Aggregation**: Dashboard aggregates individual metric statuses for widget status
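
The four rules above can be illustrated with a small sketch. Everything named here is an assumption for the example: the `Status` levels, the `Metric` fields, the metric names in `CPU_WIDGET_METRICS`, and the "worst status wins" aggregation policy are not taken from ARCHITECT.md.

```rust
/// Hypothetical status levels, declared in order of increasing severity
/// so that the derived ordering can be used for aggregation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// One individually collected and transmitted metric (rule 1).
pub struct Metric {
    pub name: String,
    pub value: f64,
    pub status: Status,
}

/// Metric names a widget subscribes to, kept in a const array (rule 3).
/// These particular names are made up for the example.
pub const CPU_WIDGET_METRICS: [&str; 2] = ["cpu_load_1min", "cpu_frequency_mhz"];

/// Agent side: derive the status from thresholds (rule 2).
pub fn status_for(value: f64, warn: f64, crit: f64) -> Status {
    if value >= crit {
        Status::Critical
    } else if value >= warn {
        Status::Warning
    } else {
        Status::Ok
    }
}

/// Dashboard side: aggregate the agent-provided statuses of the subscribed
/// metrics (rule 4). The most severe status wins; None if nothing matches.
pub fn widget_status(metrics: &[Metric], subscribed: &[&str]) -> Option<Status> {
    metrics
        .iter()
        .filter(|m| subscribed.contains(&m.name.as_str()))
        .map(|m| m.status)
        .max()
}
```

The dashboard function never looks at `value` or thresholds. It only combines statuses the agent already computed, which keeps status calculation out of the widgets.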

**Testing & Building**:

- **Workspace builds**: `cargo build --workspace` for all testing
- **Clean compilation**: Remove `target/` between architecture changes
- **ZMQ testing**: Test agent-dashboard communication independently
- **Widget testing**: Verify UI layout matches legacy appearance exactly

**NEVER in New Implementation**:

- Copy/paste ANY code from legacy backup
- Calculate status in dashboard widgets
- Hardcode metric names in widgets (use const arrays)

# Important Communication Guidelines

NEVER write that you have "successfully implemented" something or generate extensive summary text without first verifying with the user that the implementation is correct. This wastes tokens. Keep responses concise.

NEVER implement code without first getting explicit user agreement on the approach. Always ask for confirmation before proceeding with implementation.

## Commit Message Guidelines

**NEVER mention:**

- Claude or any AI assistant names
- Automation or AI-generated content
- Any reference to automated code generation

**ALWAYS:**

- Focus purely on technical changes and their purpose
- Use standard software development commit message format
- Describe what was changed and why, not how it was created
- Write from the perspective of a human developer

**Examples:**

- ❌ "Generated with Claude Code"
- ❌ "AI-assisted implementation"
- ❌ "Automated refactoring"
- ✅ "Implement maintenance mode for backup operations"
- ✅ "Restructure storage widget with improved layout"
- ✅ "Update CPU thresholds to production values"

## Development and Deployment Architecture

**CRITICAL:** Development and deployment paths are completely separate:

### Development Path
- **Location:** `~/projects/nixosbox`
- **Purpose:** Development workflow only - for committing new cm-dashboard code
- **Access:** Only for developers to commit changes
- **Code Access:** Running cm-dashboard code shall NEVER access this path

### Deployment Path
- **Location:** `/var/lib/cm-dashboard/nixos-config`
- **Purpose:** Production deployment only - agent clones/pulls from git
- **Access:** Only cm-dashboard agent for deployment operations
- **Workflow:** git pull → `/var/lib/cm-dashboard/nixos-config` → nixos-rebuild

### Git Flow
```
Development: ~/projects/nixosbox → git commit → git push
Deployment:  git pull → /var/lib/cm-dashboard/nixos-config → rebuild
```

## Automated Binary Release System

**IMPLEMENTED:** cm-dashboard now uses automated binary releases instead of source builds.

### Release Workflow

1. **Automated Release Creation**
   - Gitea Actions workflow builds static binaries on tag push
   - Creates release with `cm-dashboard-linux-x86_64.tar.gz` tarball
   - No manual intervention required for binary generation

2. **Creating New Releases**
   ```bash
   cd ~/projects/cm-dashboard
   git tag v0.1.X
   git push origin v0.1.X
   ```

   This automatically:
   - Builds static binaries with `RUSTFLAGS="-C target-feature=+crt-static"`
   - Creates GitHub-style release with tarball
   - Uploads binaries via Gitea API

3. **NixOS Configuration Updates**
   Edit `~/projects/nixosbox/hosts/common/cm-dashboard.nix`:

   ```nix
   version = "v0.1.X";
   src = pkgs.fetchurl {
     url = "https://gitea.cmtec.se/cm/cm-dashboard/releases/download/${version}/cm-dashboard-linux-x86_64.tar.gz";
     sha256 = "sha256-NEW_HASH_HERE";
   };
   ```

4. **Get Release Hash**
   ```bash
   cd ~/projects/nixosbox
   nix-build --no-out-link -E 'with import <nixpkgs> {}; fetchurl {
     url = "https://gitea.cmtec.se/cm/cm-dashboard/releases/download/v0.1.X/cm-dashboard-linux-x86_64.tar.gz";
     sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
   }' 2>&1 | grep "got:"
   ```

5. **Commit and Deploy**
   ```bash
   cd ~/projects/nixosbox
   git add hosts/common/cm-dashboard.nix
   git commit -m "Update cm-dashboard to v0.1.X with static binaries"
   git push
   ```

### Benefits

- **No compilation overhead** on each host
- **Consistent static binaries** across all hosts
- **Faster deployments** - download vs compile
- **No library dependency issues** - static linking
- **Automated pipeline** - tag push triggers everything