A leading IT company in Latin America is growing and is hiring for: SRE (Site Reliability Engineering)

Job description

What you will do
- Proactively build and implement services that help IT and support teams do their jobs better.
- Design and implement dashboards that provide valuable real-time insight into key platform metrics.
- Lead engagement with software developers, DevOps, and other infrastructure engineers to integrate software development and delivery from inception to full operation, ensuring robust released software and systems.
- Optimize on-call rotations and processes.
- Ensure incidents assigned to the team are managed within agreed SLAs.
- Ensure alarms are documented in up-to-date Knowledge Base articles.
- Conduct post-incident reviews to identify platform status.

Qualifications
- Bachelor's degree in computer science or equivalent.
- 7+ years of experience in Site Reliability Engineering or a related position on major cloud platforms.
- Involvement in the automation of multi-tenant systems, preferably in a cloud environment.
- Good understanding of SRE philosophies, technologies, platforms, and tools, including SLO management, incident resolution, and automation.
- Ability to explain technical concepts in clear, non-technical language.
- Experience building Infrastructure as Code.
- Experience with Docker, Kubernetes, and networking concepts.
- Experience with Grafana and Prometheus.
- Integration experience with PagerDuty, ServiceNow, and Datadog.
- Expertise with system and performance monitoring tools (Dynatrace, Splunk, etc.).
- Highly experienced us…
- ADVANCED CONVERSATIONAL ENGLISH ESSENTIAL (will be evaluated).

Job type and location
Job type: On site.
Location: Guadalajara.
Salary: $95,000 gross.
Benefits: Excellent, superior benefits.
Position type: Full-time.
Able to commute/relocate: Guadalajara, Jal.
Language: English (required).
Work location: On site.
SRE (Site Reliability Engineering), On Site, Guadalajara
LINK-WORLDWIDE
Guadalajara, Guadalajara
Posted 6 days ago