Responsibilities
- Design, build, and maintain scalable, secure cloud infrastructure on AWS using Terraform and Terragrunt (IaC).
- Manage and evolve Kubernetes (EKS) clusters, including node group management, autoscaling with Karpenter, and workload reliability.
- Own and improve CI/CD pipelines (GitLab CI), ensuring fast, reliable, and secure delivery.
- Drive observability initiatives: metrics, logging, alerting, and dashboards using Prometheus, Grafana, and related tooling.
- Support and evolve Kafka infrastructure in collaboration with backend engineering teams.
- Champion infrastructure-as-code practices, ensuring consistent, reviewed, and well-documented changes.
- Respond to production incidents, lead post‑mortems, and drive improvements in incident response processes.
- Collaborate with backend, security, and product engineering teams to support their infrastructure needs.
- Leverage AI‑assiste...