
Automated System Health Checker (Python)

Proactive monitoring script for Tier 1 & Tier 2 help desk environments

Overview

This script checks disk usage, CPU and memory utilization, and the state of the processes on a configurable watchlist. It writes results to rotating log files and can alert via console output, macOS desktop notifications, and an optional Slack webhook. Thresholds and watchlists are configured in config.yaml.

Repo & Quick Start

GitHub: cmwalls/system-health-checker

# after cloning the repo above: set up & run (macOS/Linux)
cd system-health-checker
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python health_check.py

Project Structure

system-health-checker/
├─ health_check.py        # main script
├─ config.yaml            # thresholds, process watchlist, alert settings
├─ requirements.txt
├─ logs/                  # rotating logs (gitignored)
└─ .gitignore

Configuration (config.yaml)

Edit thresholds and watchlists without touching code:

thresholds:
  cpu_percent: 85
  memory_percent: 85
  disk_percent: 90
process_watchlist:
  required: []                # must be running, e.g., ["redis-server", "postgres"]
  monitor:
    - name: "redis-server"    # alert if over CPU/MEM limits
      cpu_percent: 70
      memory_percent: 20
logging:
  file: "logs/health_check.log"
  level: "INFO"
alerts:
  desktop: true               # macOS desktop notifications
  slack: true                 # requires SLACK_WEBHOOK env var
  console: true               # print to stdout
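
One way to read this file is sketched below. It is not taken from health_check.py: the names DEFAULTS, merge_config, and load_config are hypothetical, PyYAML is assumed to be among the dependencies in requirements.txt, and the defaults simply mirror the sample above. The idea is that a partial config.yaml still yields a complete settings dict.

```python
import copy

# Defaults mirror the sample config.yaml; the script's real defaults may differ.
DEFAULTS = {
    "thresholds": {"cpu_percent": 85, "memory_percent": 85, "disk_percent": 90},
    "process_watchlist": {"required": [], "monitor": []},
    "logging": {"file": "logs/health_check.log", "level": "INFO"},
    "alerts": {"desktop": True, "slack": True, "console": True},
}


def merge_config(defaults, overrides):
    """Return a copy of defaults with overrides applied, recursing into nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in (overrides or {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path="config.yaml"):
    """Read config.yaml and fill in any missing keys from DEFAULTS."""
    import yaml  # PyYAML; assumed dependency, imported lazily

    with open(path, encoding="utf-8") as handle:
        user_config = yaml.safe_load(handle)  # None when the file is empty
    return merge_config(DEFAULTS, user_config)
```

With this approach, a config.yaml that sets only thresholds.cpu_percent keeps the default memory and disk limits instead of raising a KeyError later.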

What the Script Does
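
Going by the overview, one pass of the script reads the current disk, CPU, and memory figures, compares them with the configured thresholds, and checks that required processes are running. The sketch below illustrates that flow under stated assumptions: the function names are hypothetical, psutil is assumed (not confirmed) as the source of CPU and memory readings, and treating "at or above the limit" as a breach is a choice made here, not necessarily the script's.

```python
import shutil


def evaluate_thresholds(metrics, thresholds):
    """Return one alert message per metric that meets or exceeds its limit."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value >= limit:
            alerts.append(f"{name} at {value:.1f}% (limit {limit}%)")
    return alerts


def missing_processes(required, running_names):
    """Return the required process names that are not currently running."""
    running = set(running_names)
    return [name for name in required if name not in running]


def disk_percent(path="/"):
    """Percentage of the filesystem at `path` that is in use (stdlib only)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def collect_metrics():
    """Gather current readings; CPU and memory need psutil (assumed dependency)."""
    metrics = {"disk_percent": disk_percent()}
    try:
        import psutil
    except ImportError:
        return metrics  # disk only when psutil is unavailable
    metrics["cpu_percent"] = psutil.cpu_percent(interval=1)
    metrics["memory_percent"] = psutil.virtual_memory().percent
    return metrics
```

A caller could pass collect_metrics() and the thresholds block of config.yaml to evaluate_thresholds, and feed missing_processes a list of names gathered with psutil.process_iter(["name"]).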

Alerts

Logging

Logs are written to a rotating file (default logs/health_check.log, rotated at roughly 500 KB, 3 backups kept) and echoed to the console for visibility.
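
A setup matching those numbers can be built with the standard library's RotatingFileHandler. The sketch below is illustrative only: build_logger, the logger name, and the log format are assumptions, and 500_000 bytes stands in for "~500 KB".

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def build_logger(log_file="logs/health_check.log", level="INFO"):
    """Return a logger that writes to a rotating file and to the console."""
    logger = logging.getLogger("health_check")
    logger.setLevel(getattr(logging, level.upper(), logging.INFO))
    if logger.handlers:  # already configured; avoid duplicate output
        return logger

    # Create the logs/ directory on first run.
    os.makedirs(os.path.dirname(log_file) or ".", exist_ok=True)

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    # Roll over at about 500 KB and keep health_check.log.1 through .3.
    file_handler = RotatingFileHandler(log_file, maxBytes=500_000, backupCount=3)
    console_handler = logging.StreamHandler()
    for handler in (file_handler, console_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

The file and level arguments correspond to the logging block in config.yaml, so a caller could pass those two values straight through.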

Scheduling (cron)

Run every 10 minutes (adjust the absolute path):

crontab -e
*/10 * * * * /bin/zsh -lc 'cd /ABSOLUTE/PATH/system-health-checker && source .venv/bin/activate && python health_check.py'

Code Walkthrough (selected)

Why This Matters (Employer Signal)