SRE Command Center

Reliability monitoring & agent communication

Total Jobs: --
Healthy: --
Degraded: --
Critical: --
Node.js: --
Python: --
Unresolved: --

Critical Issues

Recent Audits

# Type Status Findings When

Unresolved Findings

Grouped by repeated issue signature so duplicates can be resolved together.

Severity Issue Group Count Latest Action
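
As an illustration of grouping findings by a repeated issue signature, here is a minimal sketch. It is not the dashboard's actual implementation: the Finding fields, the signature() rule (severity plus the lowercased message with digit runs collapsed), and the severity ordering are all assumptions made for the example.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical severity ranking used only to sort the grouped rows.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Finding:
    job: str        # job name (hypothetical field)
    severity: str   # e.g. "critical", "medium"
    message: str    # free-text description of the issue
    found_at: str   # ISO 8601 timestamp, same format and timezone for all rows


def signature(finding: Finding) -> str:
    """Assumed signature rule: severity plus the message with numbers collapsed."""
    normalized = re.sub(r"\d+", "N", finding.message.lower()).strip()
    return f"{finding.severity}:{normalized}"


def group_findings(findings: list[Finding]) -> list[dict]:
    """Collapse findings that share a signature into one row per issue group."""
    groups: dict[str, list[Finding]] = defaultdict(list)
    for finding in findings:
        groups[signature(finding)].append(finding)

    rows = []
    for sig, members in groups.items():
        rows.append({
            "severity": members[0].severity,
            "issue_group": sig,
            "count": len(members),
            # ISO 8601 strings in one format compare correctly as plain strings.
            "latest": max(m.found_at for m in members),
            "jobs": sorted({m.job for m in members}),
        })
    # Most severe first, then the largest groups.
    rows.sort(key=lambda r: (SEVERITY_ORDER.get(r["severity"], 99), -r["count"]))
    return rows


if __name__ == "__main__":
    sample = [
        Finding("nightly-export", "critical", "Timeout after 30s", "2024-05-01T02:00:00Z"),
        Finding("hourly-sync", "critical", "Timeout after 45s", "2024-05-01T03:00:00Z"),
        Finding("log-rotate", "medium", "Disk usage at 91%", "2024-05-01T01:00:00Z"),
    ]
    for row in group_findings(sample):
        print(row["severity"], row["issue_group"], row["count"], row["latest"])
```

With a rule like this, resolving one group row would resolve every finding that shares the signature. How aggressively to normalize messages is a design choice: collapsing more detail merges more duplicates but risks grouping unrelated issues.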
Status Name Backend Schedule Criticality Last Run Failures Failure Rate

Audit History

Select an audit

Click an audit from the list to view details.

Disaster Recovery plans for all job categories. Click to expand recovery steps and escalation paths.

Day Summary

Activity Timeline

Time Severity Category Action Summary

Sessions

SRE Agent Chat

Ask about system status, give instructions, or brainstorm architecture changes

Hello! I'm your SRE Agent. I monitor all 42 background jobs across the Node.js and Python backends.

You can ask me about:
  • Current system health and job status
  • Specific job details or failures
  • Disaster recovery plans
  • Upcoming changes and their implications
  • Architectural improvements
How can I help you today?