Senior Platform Engineer (Focus on SRE & Observability)

Posted 2 weeks ago

As a Senior Platform Engineer, you will ensure the reliability, scalability, and security of production systems. You will manage and optimize AWS infrastructure, automate CI/CD and Terraform workflows, implement observability solutions, and contribute to incident response and operational excellence.

Key Responsibilities

  • Improve overall production readiness
  • Define and implement the observability strategy (monitoring, alerting, dashboards)
  • Drive reliability enhancements and actively support incident response
  • Support and optimize AWS infrastructure
  • Harden and secure CI/CD pipelines
  • Improve Terraform governance and automation processes
  • Contribute to identity and security integrations (Auth0)

Technical Requirements

Cloud & Infrastructure:

  • Strong AWS expertise (EKS/ECS/EC2, ALB/NLB, IAM, VPC, CloudWatch)
  • Infrastructure as Code using Terraform (state management, modular design, remote backends, CI validation, best practices)
  • CI/CD pipelines (GitHub Actions preferred): safe deployments, rollback strategies, automation

Observability & Reliability:

  • Metrics, logs, and traces (CloudWatch, OpenTelemetry, SigNoz, Grafana)
  • Alerting strategies, SLO/SLI definition, error budgets
  • Designing and implementing production-grade monitoring from scratch

SRE & Operational Excellence:

  • Incident management and structured root cause analysis (RCA)
  • Reliability, scalability, and performance tuning
  • Production hardening and high-availability design

Automation, Identity & Security:

  • Python for operational tooling and automation
  • Auth0 knowledge (tenant management, RBAC, integrations, security best practices)
  • Security fundamentals (least-privilege IAM, secrets management, audit logging, compliance awareness)

Required Experience

  • 5+ years of experience
  • Hands-on support of production systems
  • Active participation in incident response and postmortems
  • Experience building or improving observability frameworks
  • Exposure to cloud-native architectures
  • Close collaboration with software engineers to improve deployments and system reliability
  • Experience with high-availability, customer-facing systems is strongly preferred
