Job Description
As a Principal member of the Site Reliability Engineering (SRE) team, you'll take ownership of highly available systems, influence service design, and work across teams to drive resiliency, automation, and operational excellence. This is a hands‑on engineering role where deep infrastructure knowledge meets software engineering expertise, ideal for experienced SREs ready to take the lead.
This is a hybrid role, not a fully remote one. It requires in‑office presence at least 3 days a week in Guadalajara.
Responsibilities
- Lead the design, automation, and support of OCI services with a focus on resiliency, security, scalability, and performance.
- Own and improve the end‑to‑end reliability metrics (SLOs, SLAs, KPIs) for your services.
- Design and implement high‑availability architectures and standards for large‑scale distributed systems.
- Serve as the ultimate escalation point for complex operationa...