Job Summary
The Cloud & DevOps Engineering Principal leads operations that ensure the architecture, reliability, performance, and cost efficiency of our multi‑cloud infrastructure across AWS and GCP. The role manages a team of CloudOps and DevOps engineers, sets operating standards, and collaborates with Engineering, DevOps, and Security to guarantee secure, scalable, and highly available platforms. It supports the broader CloudOps and DevOps strategy, translates high‑level direction into operational practices, and drives continuous improvement across SaaS and on‑premise environments. The Principal oversees product development, ensuring new features comply with business processes and maintain performance at scale.

What You Will Do

Team & Organizational Leadership
- Lead, mentor, and grow CloudOps and DevOps teams, setting expectations and supporting professional development.
- Act as the primary escalation point for high‑impact technical and operational issues, providing direction during incidents.
- Coordinate priorities, workload distribution, and on‑call coverage to maintain operational readiness.
- Drive CloudOps and DevOps strategy and roadmap in alignment with product, engineering, and business objectives.
- Participate in product development, capturing deployment requirements, operational constraints, and runbook considerations.
- Enforce high standards for availability, performance, security, and compliance, ensuring practices meet or exceed SLA and security requirements.
- Partner with leadership to define, evolve, and reinforce operational standards across change management, release processes, and incident response.

Cloud Infrastructure Management
- Architect, scale, and maintain reliable, cost‑efficient infrastructure across AWS, Azure, and GCP.
- Evolve governance standards for core cloud services, defining secure, repeatable architectural patterns.
- Ensure all environments are monitored, secured, and aligned with product and customer requirements.
- Lead technical guidance for multi‑cloud and hybrid‑cloud strategies.
- Oversee the on‑premise footprint, including Windows build, packaging, and installer experiences.
- Define and enforce lifecycle processes for on‑premise applications, including build pipelines and supportability considerations.
- Maintain compatibility of cloud‑native designs with restricted on‑premise deployments.
- Drive continuous improvements to support scalable cloud deployments and secure on‑premise installations.

Kubernetes Orchestration & Automation
- Run daily infrastructure and application operations across Kubernetes platforms (EKS, GKE, AKS) and serverless workloads.
- Manage Kubernetes lifecycle, capacity planning, upgrades, and reliability.
- Govern networking, connectivity, and security infrastructure.
- Promote an automation culture using infrastructure‑as‑code and GitOps practices.
- Lead automation pipelines with Terraform, Terragrunt, ArgoCD, Helm, and Ansible.
- Define and enforce deployment strategies such as rolling updates and blue/green rollouts.
- Apply rigorous change‑management procedures to reduce release risk.
- Encourage continuous improvement and upskilling in Kubernetes and cloud‑native operations.

Monitoring, Alerting & Optimization
- Own monitoring and observability with Prometheus, Grafana, and OpenTelemetry.
- Define and refine SLOs, alerts, and performance metrics.
- Use metrics and incidents to drive preventive improvements.
- Track key CloudOps and DevOps KPIs such as uptime, MTTR, change failure rate, and deployment frequency.

Database & Data Platform Management
- Guide production operations for MongoDB, Redis, and MySQL, including scale, backup, failover, and performance optimization.
- Ensure databases are monitored, secure, and well integrated.
- Collaborate with data and application teams on capacity and growth planning.

Security & Compliance
- Champion secure cloud practices including RBAC, encryption, segmentation, and audits.
- Oversee single‑sign‑on and identity federation (OAuth, SAML, OIDC).
- Support security/compliance teams to meet required standards.
- Participate in security assessments, remediation, and process improvements.

Automation, CI/CD & Infrastructure‑as‑Code
- Promote infrastructure‑as‑code with Terraform, CloudFormation, and Ansible.
- Partner with DevOps and engineering on auditable CI/CD pipelines.
- Drive reuse of automation for provisioning, deployment, and configuration.
- Support governance of CI/CD and release processes, ensuring quality, security, and rollback controls.

Networking & Traffic Management
- Oversee cloud networking components such as VPCs, subnets, NATs, and ACLs.
- Guide service mesh and micro‑service traffic management.
- Standardize network patterns and connectivity across cloud, data centers, and partners.

Cost Management & Capacity Planning
- Collaborate with Finance and product/engineering to monitor and optimize cloud costs.
- Use tagging and reporting to attribute spend by product or environment.
- Work on capacity planning as usage and customers grow.
- Provide cost and capacity summaries with trends, risks, and optimization recommendations.

Cross‑Functional Collaboration & Communication
- Serve as a key partner for Product, DevOps, SRE, Security, and IT.
- Participate in planning to align CloudOps and DevOps initiatives with product and company priorities.
- Communicate platform health, risks, incidents, and improvements clearly.
- Act as operational escalation point for major incidents and releases, coordinating resolution with leadership.

Requirements

Experience and Skills
- 8–10+ years in CloudOps, DevOps, or SRE roles, including leadership of engineers or small teams.
- Hands‑on expertise with Azure, GCP (GKE, Cloud Run, App Engine, BigQuery), and AWS (EKS, EC2, IAM, VPC, Route 53, security groups).
- Deep knowledge of Kubernetes operations, containerized microservices, and tooling (Helm, etc.).
- Proven experience with monitoring and observability frameworks (Prometheus, Grafana, OpenTelemetry) and with defining SLOs and alerting.
- Advanced knowledge of MongoDB, Redis, PostgreSQL, RabbitMQ, Kafka, and MySQL administration.
- Strong skills in Terraform, ArgoCD, Helm, Ansible, and CI/CD for infrastructure automation.
- Solid understanding of OAuth, SAML, OIDC, identity federation, and secure infrastructure design.
- Demonstrated use of operational metrics to guide cloud improvements.
- Experience with multi‑cloud or hybrid‑cloud environments, assessing performance, risk, and feasibility.
- Experience tracking and reporting CloudOps/DevOps KPIs (uptime, MTTR, deployment frequency, change failure rate).

Soft Skills
- Leadership and people‑development capabilities.
- Analytical thinker, able to decompose complex problems.
- Detail‑oriented, results‑driven, passionate about system reliability.
- Clear communicator across technical and business teams.
- Enthusiastic about rapid learning and technology adaptation.
- Effective collaboration in agile and cross‑functional environments.

Education
- Bachelor’s Degree in Computer Science, Information Technology, Software Engineering, or related fields.
- Certifications such as AWS Certified DevOps Engineer – Professional, CKA, Microsoft DevOps Engineer, and GCP DevOps Engineer are a plus.
- Master’s Degree in Cloud Computing, Cybersecurity, or Software is a plus.

Terms of Employment
This is a full‑time contractor position.

Our Culture
We offer a dynamic, culturally diverse, and engaging environment where energy, creativity, and passion drive product excellence. We value and respect individual backgrounds, fostering inclusion and diversity.

Equal Employment Opportunity
Seagull Software, LLC, is proud to provide equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, veteran status, sexual orientation, or gender identity or expression. This policy applies to all terms and conditions of recruiting and employment.
Cloud & DevOps Engineering, Principal
MOJIX
Distrito Federal, Distrito Federal
Posted 22 days ago