Operational procedures

Lekko has robust monitoring, alerting, and mitigation strategies to handle failures quickly.

Monitoring and alerting

Lekko uses the following observability tools to measure, log, and alert on critical operational metrics across the platform:

  • Rockset
  • Prometheus
  • AWS CloudWatch
  • Honeycomb
  • PagerDuty

Mitigation and recovery

Lekko maintains a weekly on-call rotation with escalation plans. On-call engineers are responsible for investigating and mitigating ongoing issues. After an issue is resolved, Lekko conducts a post-mortem analysis within 48 hours and communicates the results to affected users.
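
As a rough illustration of the timing rules above, the sketch below computes who is on call in a weekly rotation and when a post-mortem would be due. It is not Lekko's actual tooling: the roster names, the rotation start date, and the helper functions are all hypothetical, and it assumes the 48-hour window is measured from the time the issue is resolved.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical values for illustration only; not Lekko's real roster or schedule.
ROTATION = ["alice", "bob", "carol"]
ROTATION_START = datetime(2024, 1, 1, tzinfo=timezone.utc)  # a Monday
POST_MORTEM_WINDOW = timedelta(hours=48)


def on_call_engineer(at: datetime) -> str:
    """Return the engineer on call at `at`, rotating once per week."""
    # Dividing one timedelta by another with // yields a whole number of weeks.
    weeks_elapsed = (at - ROTATION_START) // timedelta(weeks=1)
    return ROTATION[weeks_elapsed % len(ROTATION)]


def post_mortem_due(resolved_at: datetime) -> datetime:
    """Return the post-mortem deadline, assuming the clock starts at resolution."""
    return resolved_at + POST_MORTEM_WINDOW
```

For example, under these assumptions an issue resolved at 12:00 UTC on 10 January 2024 would have its post-mortem due by 12:00 UTC on 12 January 2024.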