SelfHeal Practical auto-remediation for SRE teams
Ganges.app 's self-healing engine for production systems.
Alert → Normalize → Decide → (Optional AI) → Execute → Audit.
Current version: v1.0.9
Download v1.0.9 Quickstart Full docs Plans & pricing
What SelfHeal does
- Ingests alerts from Prometheus Alertmanager, Datadog, and others.
- Normalizes alerts into a consistent incident model (fingerprint, labels, status).
- Applies deterministic DB rules and per-instance allowlists.
- Optionally calls AI (Advisor / Single / Loop) when it is safe to do so.
- Executes actions over SSH with DRY-RUN and maintenance windows.
- Logs everything into SQLite for audit, replay, and dashboards.
Architecture at a glance
- Execution: SSH to your nodes, no agents required.
- Guardrails: Instance allowlist,Simulate, Command blocklist, DRY-RUN.
- AI: Uses your own OpenAI API key (or compatible) with strict token caps.
- Licensing: Offline-capable ed25519-signed
license.json.
Which plan should I choose?
- Free up to 3 nodes, Advisor-only AI, perpetual, community support.
- Freemium 30 days full AI on up to 10 nodes.
- Premium all capabilities, you choose node count, with business-day support.
- Enterprise all capabilities, higher node counts, 24x7 support and onboarding help.
Start with Free or Freemium on the plans page, then upgrade to Premium or Enterprise when your team is ready.
Already running SelfHeal? Generate a license using your request code at SelfHeal → Generate License.