AWS Cloud Site Reliability Engineer

April 28, 2025
Application ends: December 4, 2026

Job Description

Our client is seeking a highly skilled AWS Cloud Site Reliability Engineer (SRE) to join their infrastructure and operations team. In this role, the candidate will be responsible for designing, building, and maintaining scalable and reliable cloud systems on AWS. They will work closely with development, DevOps, and security teams to ensure high availability, performance, and cost-efficiency of the client's cloud infrastructure and services.

Responsibilities:

  • Design and implement scalable, secure, and highly available AWS cloud infrastructure.
  • Develop infrastructure-as-code using tools like Terraform, AWS CloudFormation, or CDK.
  • Monitor system performance, troubleshoot issues, and perform root cause analysis to resolve incidents.
  • Build and manage CI/CD pipelines to automate deployments and testing.
  • Collaborate with development teams to improve service reliability, observability, and operational excellence.
  • Implement and maintain monitoring, alerting, and logging solutions using CloudWatch, Prometheus, Grafana, etc.
  • Participate in on-call rotations and incident response to ensure uptime and reliability of services.
  • Continuously improve automation, tools, and documentation for operations and deployments.
  • Apply security best practices and assist in audits and compliance efforts (e.g., IAM, encryption, logging).

Requirements:

  • 3+ years of experience in an SRE, DevOps, or cloud infrastructure role.
  • Strong hands-on experience with Amazon Web Services (AWS), including services such as EC2, S3, RDS, Lambda, VPC, IAM, CloudWatch, ECS/EKS.
  • Proficiency with infrastructure-as-code tools (e.g., Terraform, CloudFormation).
  • Experience with monitoring and observability tools (e.g., Datadog, New Relic, Prometheus, Grafana).
  • Proficient in scripting or programming (Python, Bash, Go, etc.).
  • Solid understanding of networking, security, and system administration.
  • Familiarity with CI/CD tools like Jenkins, GitLab CI, or AWS CodePipeline.
  • Experience with containers and orchestration platforms like Docker and Kubernetes.
  • Strong problem-solving skills and ability to work in a fast-paced environment.

Preferred Qualifications:

  • AWS certifications (e.g., AWS Certified SysOps Administrator, Solutions Architect, DevOps Engineer).
  • Experience with incident response and postmortem analysis.
