Monitoring Overview
PMDaemon includes built-in monitoring for tracking the health and performance of your applications and of the daemon itself.
Built-in Monitoring Features
Process Metrics
- CPU Usage: Real-time CPU utilization per process
- Memory Usage: RAM consumption tracking with leak detection
- Uptime: Process runtime and restart statistics
- Status: Current process state (running, stopped, errored, etc.)
System Health Checks
- Health Endpoints: Configurable HTTP health check endpoints
- Custom Health Scripts: Execute custom validation scripts
- Resource Limits: Monitor and enforce CPU/memory limits
- Auto-restart: Automatic restart on failure or resource exhaustion
Log Monitoring
- Centralized Logging: Unified log collection and rotation
- Log Levels: Configurable log levels (error, warn, info, debug)
- Error Tracking: Automatic error detection and alerting
- Log Rotation: Automatic log file rotation and cleanup
Monitoring Endpoints
PMDaemon exposes several monitoring endpoints through its REST API:
Health Check Endpoint
GET /health
Returns the overall health status of the daemon and all managed processes.
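As a minimal sketch of probing this endpoint from a script: the base URL is a placeholder, and treating any 2xx status as "healthy" is an assumption, since the response body format is not specified in this section.

```python
import urllib.error
import urllib.request


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers with a 2xx status.

    Assumption: a 2xx status means healthy. The response body is ignored
    because its schema is not documented in this section.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a 4xx/5xx response.
        return False
```

Calling `check_health("http://localhost:8080")` (substitute the host and port your daemon's API actually listens on) gives a boolean that a cron job or an external supervisor could act on.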
Process Metrics
GET /api/processes/{id}/metrics
Returns detailed metrics for a specific process including:
- CPU usage percentage
- Memory usage (RSS, heap, etc.)
- Process uptime
- Restart count
- Error count
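A client for this endpoint might look like the sketch below. It assumes the response is a JSON document. The field names inside it are not listed in this section, so the code returns the decoded object as-is rather than guessing at keys.

```python
import json
import urllib.parse
import urllib.request


def metrics_url(base_url: str, process_id) -> str:
    """Build the metrics URL for one process, percent-encoding the id."""
    quoted = urllib.parse.quote(str(process_id), safe="")
    return f"{base_url.rstrip('/')}/api/processes/{quoted}/metrics"


def fetch_metrics(base_url: str, process_id, timeout: float = 5.0) -> dict:
    """GET the metrics for one process and decode the body.

    Assumption: the body is JSON; inspect the result to see which fields
    your PMDaemon version actually returns.
    """
    url = metrics_url(base_url, process_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Percent-encoding the id keeps the request well-formed even when a process identifier contains spaces or slashes.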
System Stats
GET /api/system/stats
Returns system-wide statistics:
- Total CPU usage
- Available memory
- Disk usage
- Network statistics
External Monitoring Integration
Prometheus Integration
PMDaemon can expose metrics in Prometheus format for integration with monitoring stacks:
[monitoring.prometheus]
enabled = true
port = 9090
endpoint = "/metrics"
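To check what the endpoint serves before pointing a Prometheus server at it, the text exposition format can be parsed with a few lines of Python. This is a simplified sketch of the general Prometheus text format, not a description of which metrics PMDaemon exports. It ignores optional timestamps and does not handle a `}` inside a label value.

```python
import re

# One sample line of the Prometheus text exposition format:
#   metric_name{label="value",...} 12.5
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list:
    """Parse exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples
```

With the configuration above, the text to parse would come from `http://localhost:9090/metrics`. Note that 9090 is also the default port of a Prometheus server, so the two may need different ports when they run on the same host.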
Custom Webhooks
Configure webhooks to send alerts to external systems:
[monitoring.webhooks]
enabled = true
url = "https://your-monitoring-system.com/webhook"
events = ["process_crash", "high_memory", "restart_loop"]
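On the receiving side, a webhook consumer could be sketched as follows. The payload schema is not documented in this section, so the `"event"` field used here is a hypothetical name, and the port is an arbitrary choice. Only the three event names come from the configuration above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Event names taken from the `events` list in the configuration above.
KNOWN_EVENTS = {"process_crash", "high_memory", "restart_loop"}


def classify(body: bytes) -> tuple:
    """Map a raw request body to (HTTP status, message).

    Assumption: the payload is a JSON object with an "event" field.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        return 400, "invalid JSON"
    event = payload.get("event") if isinstance(payload, dict) else None
    if not isinstance(event, str) or event not in KNOWN_EVENTS:
        return 422, f"unhandled event: {event!r}"
    return 200, f"received {event}"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, message = classify(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(message.encode("utf-8"))


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()
```

Keeping the decision logic in `classify` separates it from the HTTP plumbing, so it can be adjusted once the real payload shape is known.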
Log Forwarding
Forward logs to external logging systems:
[monitoring.logging]
forward_to = "syslog"
syslog_address = "localhost:514"
format = "json"
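For local testing of this setup, a small listener can stand in for a real syslog server. The sketch assumes that messages arrive over UDP and that each one carries a JSON object after the syslog header. Neither detail is confirmed by this section. Port 5514 is used because binding to 514 normally requires elevated privileges.

```python
import json
import socket


def extract_json(message: str):
    """Return the JSON object embedded in a syslog line, or None."""
    start = message.find("{")
    if start == -1:
        return None
    try:
        return json.loads(message[start:])
    except ValueError:
        return None


def listen(host: str = "127.0.0.1", port: int = 5514) -> None:
    """Print every JSON record received as a UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, _addr = sock.recvfrom(65535)
            record = extract_json(data.decode("utf-8", errors="replace"))
            if record is not None:
                print(record)
```

To try it, point `syslog_address` at `localhost:5514` and run `listen()` in a separate terminal.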
Alerting
Built-in Alerts
- Process crashes
- High memory usage
- CPU threshold breaches
- Restart loops
- Health check failures
Alert Configuration
[monitoring.alerts]
cpu_threshold = 80 # Percentage
memory_threshold = "1GB"
restart_threshold = 5 # restarts in 10 minutes
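To make the semantics of these settings concrete, here is one way such thresholds could be evaluated. This is an illustrative sketch, not PMDaemon's implementation. It assumes that `"1GB"` means 1024³ bytes and that the alert fires once the restart count within the window reaches the threshold.

```python
from collections import deque

# Assumption: binary multiples (1GB = 1024**3 bytes).
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text: str) -> int:
    """Convert a size such as "1GB" to a number of bytes."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)  # no suffix: treat the value as plain bytes


class RestartLoopDetector:
    """Flag a restart loop when too many restarts fall inside a time window."""

    def __init__(self, threshold: int = 5, window_seconds: float = 600.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._restarts = deque()

    def record(self, timestamp: float) -> bool:
        """Record a restart; return True if the threshold is reached."""
        self._restarts.append(timestamp)
        # Drop restarts that have aged out of the window.
        while self._restarts and timestamp - self._restarts[0] > self.window_seconds:
            self._restarts.popleft()
        return len(self._restarts) >= self.threshold
```

Five restarts one minute apart trigger the detector on the fifth call, while five restarts spread over more than ten minutes do not.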
Notification Channels
- Email notifications
- Slack integration
- Discord webhooks
- Custom HTTP endpoints
Performance Monitoring
Real-time Metrics
Monitor performance metrics in real time through the web dashboard or the CLI:
# View real-time process metrics
pmdaemon monitor
# Show detailed process info
pmdaemon show <process_name>
# Display system overview
pmdaemon status
Historical Data
PMDaemon can store historical performance data for trend analysis:
[monitoring.history]
enabled = true
retention_days = 30
storage_path = "/var/lib/pmdaemon/metrics"
Dashboard Integration
Web Dashboard
Access the built-in web dashboard at http://localhost:3000 (configurable) to view:
- Process status overview
- Real-time metrics graphs
- Log viewer
- Alert history
Third-party Dashboards
Integrate with popular monitoring dashboards:
- Grafana (via Prometheus)
- DataDog (via StatsD)
- New Relic (via HTTP API)
Monitoring Best Practices
- Set Appropriate Thresholds: Configure CPU and memory thresholds based on your application's normal behavior
- Enable Health Checks: Implement proper health check endpoints in your applications
- Monitor Resource Trends: Watch for gradual increases that might indicate memory leaks
- Set Up Alerts: Configure notifications for critical events
- Regular Log Review: Periodically review logs for patterns and issues
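The advice on resource trends can be made concrete with a least-squares slope over memory samples. The sketch below is one way to do this and assumes you already collect `(timestamp_seconds, memory_bytes)` pairs, for example from the process metrics endpoint. The 10 MiB per hour cutoff is an arbitrary illustration, not a recommended value.

```python
def memory_slope(samples: list) -> float:
    """Least-squares slope of memory over time, in bytes per second.

    `samples` holds (timestamp_seconds, memory_bytes) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("all samples share the same timestamp")
    return covariance / variance


def looks_like_leak(samples: list, min_bytes_per_hour: float = 10 * 1024**2) -> bool:
    """Return True if memory grows at least `min_bytes_per_hour` on average."""
    return memory_slope(samples) * 3600 >= min_bytes_per_hour
```

A fitted slope is less sensitive to a single spike than comparing the first and last sample, which helps keep a gradual leak distinct from normal fluctuation.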
Troubleshooting Monitoring Issues
Common Problems
- High monitoring overhead
- Missing metrics
- False positive alerts
- Dashboard connection issues
Solutions
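- High monitoring overhead: lengthen the monitoring intervals
- Missing metrics: check the endpoint configurations
- False positive alerts: review the alert thresholds
- Dashboard connection issues: verify network connectivity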
For more detailed monitoring configuration, see our Configuration Guide.