Site Reliability Engineer

Insight Global

Remote, Brazil
Full-time
Posted June 12, 2026

Job Description

Job Title: Site Reliability Engineer (SRE) / Infrastructure Operations (Mid-Level)

Role Overview

Responsible for managing day-to-day infrastructure operations, including monitoring, alerting, and driving stability improvements across the environment.

Key Responsibilities

- Monitor overall infrastructure health and system performance
- Track key performance metrics such as CPU, memory, and disk utilization
- Tune alerts to improve signal-to-noise ratio and reduce alert fatigue
- Support disaster recovery (DR) rehearsals and readiness activities
- Maintain and update runbooks, documentation, and operational reports

Required Experience

- 4–6 years of experience in Site Reliability Engineering (SRE) or infrastructure operations
- Hands-on experience with VMware environments
- Experience with monitoring tools such as PRTG, Datadog, or similar platforms
- Strong incident management experience, including response and resolution processes

Core Skills & Competencies

- Solid understanding of infrastructure performance metrics (CPU,...