320 Commits

Author SHA1 Message Date
4fa2b079f1 Add sandbox column and security-based service status
Add new "SB" column to services widget showing systemd sandboxing status.
Service status now reflects security posture with unsandboxed services
showing as degraded/warning status.

Changes:
- Add is_sandboxed field to ServiceData and ServiceInfo structs
- Add check_service_sandbox method detecting systemd hardening features
- Add format_sandbox_value function showing "yes"/"no" for sandboxing
- Update service status determination to consider sandbox status:
  - Sandboxed + Running = "Running" (green/ok)
  - Unsandboxed + Running = "Degraded" (yellow/warning)
  - Failed services = "Stopped" (red/critical)
- Add "SB" column header to services widget

Services without proper NixOS hardening (PrivateTmp, ProtectSystem, etc.)
now show warning status to highlight security concerns.
2025-10-14 11:18:07 +02:00
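The status mapping listed in 4fa2b079f1 can be sketched roughly as follows. This is a minimal illustration, not the project's code: `UnitState`, `ServiceStatus`, and `determine_status` are hypothetical names, and the signature of `format_sandbox_value` is assumed from the commit text.

```rust
/// Hypothetical unit state as reported by systemd (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnitState {
    Active,
    Failed,
}

/// Hypothetical display status for a row in the services widget.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServiceStatus {
    Running,  // green/ok
    Degraded, // yellow/warning
    Stopped,  // red/critical
}

/// Combine run state and sandbox flag as described in the commit message.
pub fn determine_status(state: UnitState, is_sandboxed: bool) -> ServiceStatus {
    match (state, is_sandboxed) {
        // Failed services are reported as stopped regardless of sandboxing.
        (UnitState::Failed, _) => ServiceStatus::Stopped,
        (UnitState::Active, true) => ServiceStatus::Running,
        (UnitState::Active, false) => ServiceStatus::Degraded,
    }
}

/// Value for the "SB" column (signature is an assumption).
pub fn format_sandbox_value(is_sandboxed: bool) -> &'static str {
    if is_sandboxed { "yes" } else { "no" }
}
```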
17dda1ae67 Implement proper disk and memory quota detection
Replace misleading system total quotas with actual service-specific
quota detection. Services now only show quotas when real limits exist.

Changes:
- Add get_service_disk_quota method with filesystem quota detection
- Add check_filesystem_quota and docker storage quota helpers
- Remove automatic assignment of system totals as fake quotas
- Update dashboard formatting to show usage only when no quota exists

Display behavior:
- Services with real limits: "2.1/8.0" (usage/quota)
- Services without limits: "2.1" (usage only)

This provides accurate monitoring instead of misleading system capacity
values that suggested all services had massive quotas.
2025-10-14 11:01:04 +02:00
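The display behavior described in 17dda1ae67 maps naturally onto an optional quota. The sketch below assumes the quota is carried as an `Option<f32>`; the function name `format_usage` is hypothetical and is not taken from the repository.

```rust
/// Format a usage value with an optional quota.
/// `quota_gb` is `None` when no real limit exists for the service (assumed representation).
pub fn format_usage(usage_gb: f32, quota_gb: Option<f32>) -> String {
    match quota_gb {
        // Real limit present: "usage/quota"
        Some(quota) => format!("{:.1}/{:.1}", usage_gb, quota),
        // No limit: usage only
        None => format!("{:.1}", usage_gb),
    }
}
```

With this shape, `format_usage(2.1, Some(8.0))` yields `"2.1/8.0"` and `format_usage(2.1, None)` yields `"2.1"`, matching the two cases in the commit message.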
8de3d2ba79 Clean up services widget column headers and units
Standardize all services widget columns to show units in headers
and remove units from metric values for cleaner display.

Changes:
- Update column headers: "RAM (GB)", "CPU (%)", "Disk (GB)"
- Remove units from metric values:
  - RAM: "5.2/32.0" (no GB)
  - CPU: "2.5" (no %)
  - Disk: "1.5/500.0" (no GB)
- Simplify disk formatting to always show GB format

All columns now consistently display units in headers with
clean, uncluttered metric values.
2025-10-14 10:36:38 +02:00
4be1223a8d Update services widget memory display and formatting
Improve services widget to show consistent usage/total format for both
RAM and Disk columns, using system totals when no service quotas exist.

Changes:
- Change column header from "Memory (GB)" to "RAM (GB)"
- Remove "GB" units from memory values (units now in header)
- Add system memory total detection from /proc/meminfo
- Use system memory total as default quota for services without limits
- Services now show "5.2/32.0" format for both RAM and disk

Both RAM and Disk columns now consistently display usage/quota format
where quota is either service-specific limit or system total capacity.
2025-10-14 10:23:16 +02:00
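Commit 4be1223a8d mentions reading the system memory total from /proc/meminfo. One way to do that is sketched below, assuming the usual `MemTotal: <n> kB` line; the function name `parse_mem_total_gb` is hypothetical, and the input is passed as a string so the parsing can be exercised without touching the filesystem.

```rust
/// Extract MemTotal from the contents of /proc/meminfo and convert kB to GB.
/// Returns `None` if the line is missing or cannot be parsed.
pub fn parse_mem_total_gb(meminfo: &str) -> Option<f32> {
    meminfo.lines().find_map(|line| {
        // Expected shape: "MemTotal:       33554432 kB"
        let rest = line.strip_prefix("MemTotal:")?;
        let kb: f64 = rest.split_whitespace().next()?.parse().ok()?;
        Some((kb / 1024.0 / 1024.0) as f32)
    })
}
```

In an agent this would typically be fed with `std::fs::read_to_string("/proc/meminfo")`.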
630d2ff674 Add disk quota display to services widget
Implement disk quota/total display in services widget showing usage/quota
format. When services don't have specific disk quotas configured, use
system total disk capacity as the quota value.

Changes:
- Add disk_quota_gb field to ServiceData struct in agent
- Add disk_quota_gb field to ServiceInfo struct in dashboard
- Update format_disk_value to show usage/quota format
- Use system disk total capacity as default quota for services
- Rename DiskUsage.total_gb to total_capacity_gb for clarity

Services will now display disk usage as "5.2/500.0 GB" format where
500.0 GB is either the service's specific quota or system total capacity.
2025-10-14 10:14:24 +02:00
6265b1afb3 Disable service connection monitoring for CPU/C-state testing
Temporarily disable excessive connection monitoring in ServiceCollector
to test impact on CPU load and C-states. Keep nginx sites and docker
containers as they are needed for sub-service display functionality.

Disabled monitoring:
- SSH connections (ss commands)
- Database connections (PostgreSQL, MySQL, Redis)
- Web service connections (Apache, Gitea, Immich, etc.)
- Network service connections (Mosquitto, UniFi, etc.)

This eliminates most external command calls while preserving essential
nginx and docker sub-service enumeration.
2025-10-14 08:39:38 +02:00
dca3642e46 Implement multi-host autoconnect with consolidated host configuration
- Add DEFAULT_HOSTS constant in config.rs for centralized host management
- Update ZMQ endpoint generation to connect to all configured hosts
- Implement graceful connection handling for unreachable endpoints
- Dashboard now auto-discovers and connects to available agents on cmbox, labbox, simonbox, steambox, srv01
2025-10-14 00:44:38 +02:00
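The endpoint generation described in dca3642e46 might look something like the sketch below. The host list is the one named in the commit; the port `5555`, the `tcp://` scheme, and the function name `zmq_endpoints` are assumptions made for illustration, since the log does not state them.

```rust
/// Hosts named in the commit message.
pub const DEFAULT_HOSTS: &[&str] = &["cmbox", "labbox", "simonbox", "steambox", "srv01"];

/// Hypothetical agent port; the real value is not given in the log.
pub const EXAMPLE_PORT: u16 = 5555;

/// Build one ZMQ-style endpoint string per configured host.
pub fn zmq_endpoints(hosts: &[&str], port: u16) -> Vec<String> {
    hosts
        .iter()
        .map(|host| format!("tcp://{}:{}", host, port))
        .collect()
}
```

Graceful handling of unreachable hosts would then happen when connecting, for example by logging a failed endpoint and continuing with the rest.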
c8655bf852 Update dashboard ZMQ endpoints to use hostnames for all CMTEC hosts 2025-10-13 23:39:46 +02:00
8ab749ed04 Add gitea-runner to service discovery patterns 2025-10-13 23:18:41 +02:00
d052c4084d Fix Docker containers service description type mismatch 2025-10-13 22:07:08 +02:00
f5acf44e3b Show individual Docker containers as sub-services similar to nginx sites 2025-10-13 21:20:53 +02:00
6b22d23a2e Try multiple ports for vaultwarden connection detection 2025-10-13 20:57:28 +02:00
c5795e3add Remove storage info from immich, focus on connections only 2025-10-13 19:47:45 +02:00
322997932e Standardize connection descriptions and focus on connections for gitea/vaultwarden 2025-10-13 19:35:31 +02:00
07886ec317 Add MongoDB/mongod service monitoring and description 2025-10-13 19:01:21 +02:00
a33b019d83 Fix service descriptions with better fallbacks and correct paths 2025-10-13 18:49:39 +02:00
6da23019e5 Add comprehensive service descriptions with useful metrics 2025-10-13 18:25:53 +02:00
d11d8a74f3 Enhance service discovery to include MQTT, WordPress, HAASP, and backup services 2025-10-13 17:57:12 +02:00
617da088b1 Fix all remaining commands to use full paths
- Fix systemctl, du, df, uptime, ss, journalctl commands
- Add sudo for du command (needed for directory access)
- This should resolve all remaining command path issues in the service
- Storage, backup, and system monitoring should now work properly
2025-10-13 17:44:13 +02:00
2e67f17d6c Fix SSH connection count and re-enable nginx site accessibility check
- Use full path /run/current-system/sw/bin/ss for SSH connection counting
- Re-enable nginx site accessibility checking with full curl path
- This should show SSH connection counts and verify which nginx sites are accessible
2025-10-13 16:23:21 +02:00
5a215fc259 Disable nginx site accessibility check temporarily
- Shows all parsed nginx sites instead of filtering by accessibility
- This ensures nginx sites are displayed in dashboard immediately
- Accessibility check was filtering out sites due to curl issues or timeouts
2025-10-13 16:16:16 +02:00
7af4f09ca2 Fix sudo nginx and psql commands to use full paths
- Change nginx command from 'nginx' to '/run/current-system/sw/bin/nginx'
- Change psql command from 'psql' to '/run/current-system/sw/bin/psql'
- This ensures sudo rules can properly match the commands with full paths
2025-10-13 14:52:36 +02:00
92d6b42837 Fix nginx detection when running as root - skip sudo 2025-10-13 12:47:28 +02:00
9b6a504e48 Add NixOS integration documentation for updating cm-dashboard
Include step-by-step instructions for updating commit hash and
rebuilding NixOS configuration when new cm-dashboard code is available.
2025-10-13 12:20:36 +02:00
f786d054f2 Fix nginx config parsing for NixOS systemd format
Improve parsing of nginx config path from systemd ExecStart to handle
both traditional format and NixOS argv[] format. This should fix nginx
sites not being detected when running as a systemd service.
2025-10-13 12:12:36 +02:00
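A parser that tolerates both ExecStart shapes mentioned in f786d054f2 could simply look for the `-c` flag and take the token after it. This is a sketch under that assumption, not the project's actual parser; `nginx_config_path` is a hypothetical name and the store paths in the usage note are made up.

```rust
/// Find the nginx config path in an ExecStart value by locating "-c <path>".
/// Works on a plain command line as well as on the "{ path=... ; argv[]=... ; }"
/// form, because both contain "-c" followed by the path as separate tokens.
pub fn nginx_config_path(exec_start: &str) -> Option<String> {
    let mut tokens = exec_start.split_whitespace();
    while let Some(token) = tokens.next() {
        if token == "-c" {
            // Strip a trailing ';' in case the path is directly followed by one.
            return tokens
                .next()
                .map(|path| path.trim_end_matches(';').to_string());
        }
    }
    None
}
```

For example, both `ExecStart=/usr/sbin/nginx -c /etc/nginx/nginx.conf` and `{ path=/nix/store/abc-nginx/bin/nginx ; argv[]=/nix/store/abc-nginx/bin/nginx -c /nix/store/def-nginx.conf ; ignore_errors=no }` yield the path after `-c`.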
b0d7d5ce35 Testing 2025-10-13 11:23:49 +02:00
cd4764596f Implement comprehensive dashboard improvements and maintenance mode
- Storage widget: Restructure with Name/Temp/Wear/Usage columns, SMART details as descriptions
- Host navigation: Only cycle through connected hosts, no disconnected hosts
- Auto-discovery: Skip config files, use predefined CMTEC host list
- Maintenance mode: Suppress notifications during backup via /tmp/cm-maintenance file
- CPU thresholds: Update to warning ≥9.0, critical ≥10.0 for production use
- Agent-dashboard separation: Agent provides descriptions, dashboard displays only
2025-10-13 11:18:23 +02:00
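The maintenance mode in cd4764596f is keyed on the presence of a marker file. A minimal sketch of that check follows, with the path passed in so it can be tested; the function names are hypothetical, and in the agent the marker would presumably be `/tmp/cm-maintenance`.

```rust
use std::path::Path;

/// Maintenance mode is considered active while the marker file exists.
pub fn maintenance_active(marker: &Path) -> bool {
    marker.exists()
}

/// Suppress notifications during maintenance; otherwise notify on status changes.
pub fn should_notify(marker: &Path, status_changed: bool) -> bool {
    status_changed && !maintenance_active(marker)
}
```

A backup job could then create the marker before it starts and remove it when it finishes.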
bb69f0f31b Testing 2025-10-13 10:23:42 +02:00
42aaebf6a7 Testing 2025-10-13 09:57:43 +02:00
d76302e1c4 Testing 2025-10-13 08:45:12 +02:00
859df2dec1 Testing 2025-10-13 08:38:57 +02:00
5e8a0ce108 Testing 2025-10-13 08:31:18 +02:00
3de1e0db19 Updated readme 2025-10-13 08:13:20 +02:00
bab387c74d Refactor services widget with unified system metrics display
- Rename alerts widget to hosts widget for clarity
- Add sub_service field to ServiceInfo for display differentiation
- Integrate system metrics (CPU load, memory, temperature, disk) as service rows
- Convert nginx sites to individual sub-service rows with tree structure
- Remove nginx site checkmarks - status now shown via row indicators
- Update dashboard layout to display system and service data together
- Maintain description lines for connection counts and service details

Services widget now shows:
- System metrics as regular service rows with status
- Nginx sites as sub-services with ├─/└─ tree formatting
- Regular services with full resource data and descriptions
- Unified status indication across all row types
2025-10-13 08:10:38 +02:00
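The ├─/└─ tree formatting for sub-services in bab387c74d can be illustrated with a small helper. This is a sketch only; `tree_rows` is a hypothetical name and the real widget renders full table rows rather than plain strings.

```rust
/// Render a parent service followed by its sub-services with tree prefixes.
/// The last child gets "└─", all others get "├─".
pub fn tree_rows(parent: &str, children: &[&str]) -> Vec<String> {
    let mut rows = vec![parent.to_string()];
    let last = children.len().saturating_sub(1);
    for (index, child) in children.iter().enumerate() {
        let branch = if index == last { "└─" } else { "├─" };
        rows.push(format!("{} {}", branch, child));
    }
    rows
}
```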
c68ccf023e Testing 2025-10-13 00:28:06 +02:00
57b676ad25 Testing 2025-10-13 00:16:24 +02:00
9e344fb66d Testing 2025-10-12 22:31:46 +02:00
4d8bacef50 Testing 2025-10-12 20:35:09 +02:00
d9edcda36c Testing 2025-10-12 20:29:08 +02:00
d08d8f306a Implement comprehensive status calculation and notification system
Agent Changes:
• Add CPU status thresholds (warning: ≥5.0, critical: ≥8.0)
• Add memory status thresholds (warning: ≥80%, critical: ≥95%)
• Add service status calculation (critical if failed>0, warning if degraded>0)
• All collectors now calculate and include status in output

Dashboard Changes:
• Update system widget to use agent-calculated cpu_status and memory_status
• Update services widget to use agent-calculated services_status
• Remove client-side status calculations in favor of agent status
• Add status_level_from_agent_status helper function

Notification System:
• Add SMTP email notification system using lettre crate
• Auto-configure notifications: hostname@cmtec.se → cm@cmtec.se
• Smart change detection with rate limiting (30min cooldown)
• Only notify on transitions to/from warning/critical states
• Rich email formatting with host, component, metric details
2025-10-12 20:04:40 +02:00
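The thresholds and rate limiting described in d08d8f306a can be sketched as below. The threshold values and the 30-minute cooldown come from the commit message (cd4764596f later raises the CPU thresholds); the type and function names, and the choice to pass `now` explicitly for testability, are assumptions.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok,
    Warning,
    Critical,
}

/// CPU load thresholds from the commit: warning >= 5.0, critical >= 8.0.
pub fn cpu_status(load: f32) -> Status {
    if load >= 8.0 {
        Status::Critical
    } else if load >= 5.0 {
        Status::Warning
    } else {
        Status::Ok
    }
}

/// Memory thresholds from the commit: warning >= 80%, critical >= 95%.
pub fn memory_status(used_percent: f32) -> Status {
    if used_percent >= 95.0 {
        Status::Critical
    } else if used_percent >= 80.0 {
        Status::Warning
    } else {
        Status::Ok
    }
}

/// Hypothetical notifier that only fires on status transitions and
/// enforces a cooldown between notifications.
pub struct Notifier {
    cooldown: Duration,
    last_sent: Option<Instant>,
}

impl Notifier {
    pub fn new(cooldown: Duration) -> Self {
        Notifier { cooldown, last_sent: None }
    }

    pub fn should_notify(&mut self, previous: Status, current: Status, now: Instant) -> bool {
        // With three states, any change involves warning or critical on one side.
        if previous == current {
            return false;
        }
        if let Some(last) = self.last_sent {
            if now.duration_since(last) < self.cooldown {
                return false; // still inside the cooldown window
            }
        }
        self.last_sent = Some(now);
        true
    }
}
```

A notifier built with `Notifier::new(Duration::from_secs(30 * 60))` would match the cooldown stated in the commit.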
59bc3adad5 Testing 2025-10-12 19:57:05 +02:00
9c836e0862 Testing 2025-10-12 19:32:47 +02:00
fb91d8346f Testing 2025-10-12 19:13:19 +02:00
0b1d3ae0ad Testing 2025-10-12 19:02:50 +02:00
c8c91bdfec Testing 2025-10-12 18:58:07 +02:00
c312916687 Testing 2025-10-12 18:55:15 +02:00
75910610e4 Testing 2025-10-12 18:39:03 +02:00
0656af17f2 Testing 2025-10-12 18:31:44 +02:00
b65d29d86b Testing 2025-10-12 18:30:33 +02:00
53cb6510d0 Testing 2025-10-12 18:10:05 +02:00