Description

*Our roles are remote-first and can be based anywhere in India (#LI-Remote).

 

Responsibilities

  • Monitor and continually improve the capacity of our production environment
  • Design and implement scalable, reliable, and efficient infrastructure using Kubernetes, Terraform, and AWS resources
  • Partner with development teams to improve services through rigorous testing and release procedures with CI pipelines (GitHub Actions, Dockerfiles)
  • Gain a deeper understanding of RudderStack infrastructure and help debug incidents
  • Proactively build software to help operations and support teams
  • Identify opportunities for process improvements, automation, and cost savings

Requirements

  • A Bachelor's or Master's degree in Computer Science, or equivalent experience, is required
  • 5+ years of experience as a Site Reliability Engineer, Internal Platform Developer, or in a similar role
  • Strong understanding of cloud computing, containers, and DevOps practices
  • Demonstrated Linux experience
  • Excellent debugging skills
  • Experience with scripting and infrastructure automation
  • Familiarity with distributed systems design patterns using tools such as Kubernetes
  • Familiarity with AWS, Azure, or Google Cloud
  • Excellent verbal and written communication skills
  • Familiarity with networking concepts such as VPCs, proxies, and CDNs

Here are examples of things we've worked on:

  • Built and maintained a Kubernetes platform to deploy all our applications with high availability
  • Built a Kubernetes operator to automate hundreds of deployments
  • Managed hundreds of highly available PostgreSQL instances for our deployments
  • Provisioned and managed air-gapped on-premises deployments in diverse environments
  • Managed a multi-region, multi-cluster environment with hundreds of customer deployments in single-tenant and multi-tenant models
  • Defined our complete infrastructure as code, enforced through a GitOps model
  • Automated migrations of complex, highly available services
  • Worked on compliance (SOC 2 Type 2, HIPAA), security, scalability, and many other aspects of delivering top-class, secure software
  • Followed FinOps practices to continuously optimize our cloud costs

How we achieve results:

  • Empathy for the problems encountered by our customers.
  • Collaboration with engineering teams to achieve results.
  • Deep care for the quality of your own and the team's code.
  • Curiosity and understanding for investigating causes and finding effective solutions.
  • An output-driven focus on providing value to our customers in a significant, measurable, and positive way.
  • Focus on writing testable, performant, bug-free code that solves the right problems.

