Staff Site Reliability Engineer

Collective Health

Software Engineering
San Francisco, CA, USA
Posted on Friday, October 27, 2023

We all depend on healthcare throughout our lifetimes, for ourselves and for our families and friends, but it is notoriously difficult to navigate and understand. Healthcare comprises 20% of the US economy, and we think it should work better for all of us. At Collective Health, we believe it’s time for a new day in healthcare, where we as members are informed and empowered to make the right care choices when the decisions are urgent and critical.

Site Reliability Engineering at Collective Health is a discipline combining software and systems engineering skills. We apply modern infrastructure, systems, software, architecture, and development practices to give our customers a more reliable healthcare management experience. Partnering with engineering teams, Site Reliability Engineers build on public cloud services to deliver a comprehensive platform that enables our developers to rapidly deliver high-quality, impactful, scalable, and reliable services. As a broader group of Site Reliability Engineers including those focused on infrastructure and those embedded in other engineering teams, we collaborate and identify themes and solutions to benefit Collective Health at large, engage in regular knowledge sharing activities and retrospectives, and relentlessly support one another in order to gain knowledge, remove barriers, and grow as individuals and a team.

Together, we’re building the next-generation healthcare platform, and we’re proud to be on the leading edge of this important mission.

Responsibilities

On any given day you may need to...

  • Collaborate on and/or lead engineering efforts from requirements to production, solving problems of developer productivity and presenting complex technical concepts to the team, engineering org, and leadership audiences.
  • Write code that is well-tested, easily understood, and maintainable by others.
  • Troubleshoot and fix complex production issues related to availability or performance, even if they are outside your comfort zone.
  • Apply software engineering principles to the operations of our systems in order to reduce toil.
  • Advise, critique, or comment on engineering designs.
  • Help our internal customers solve their problems in as efficient and future-proof a manner as possible.
  • Create and execute plans that ensure our existing infrastructure remains up-to-date, compliant, and secure.
  • Work independently and autonomously.

Qualifications

  • 12+ years of experience in DevOps, Site Reliability Engineering, or Platform Engineering.
  • Production experience building and supporting Kubernetes clusters.
  • Familiarity with Infrastructure as Code and CI/CD technologies (e.g., Terraform, Ansible, ArgoCD, Jenkins).
  • Good understanding of private and public cloud design considerations and limitations in the areas of infrastructure, distributed systems, data storage, Linux-based operating systems, and security.
  • Ability to drive projects that involve multiple internal and external stakeholders to completion.
  • Experience creating and monitoring SLIs and SLOs in order to set and remain within error budgets.
  • Proven technical domain leadership: decomposing tasks, setting priorities, triaging incoming bugs and requests.
  • Experience in supporting customer-facing production systems and responding to incidents as part of an on-call rotation.
  • Knowledge of data structures, algorithms, distributed systems, and information retrieval.
  • Experience developing in one or more general purpose programming or scripting languages, including but not limited to: Java, GoLang, JavaScript, Groovy, Python, Shell Scripting, Rust.
  • Experience in diagnosing and resolving incidents that involve application, OS, network, infrastructure, partners, people, and process.
  • Understanding of networking concepts such as routing, firewalls, load balancers, and secure communication -- especially in the context of cloud infrastructure.
  • Methodical problem-solving approach, coupled with strong communication skills and an ability to own and drive projects to completion.
  • Demonstrated technical mentorship and ability to increase the abilities of those on and outside the team.

Pay Transparency Statement

This is a hybrid position based out of our San Francisco office, with the expectation of being in office at least two weekdays per week. #LI-hybrid

The actual pay rate offered within the range will depend on factors including geographic location, qualifications, experience, and internal equity. In addition to the salary, you will be eligible for stock options and benefits like health insurance, 401(k), and paid time off. Learn more about our benefits at https://jobs.collectivehealth.com/#benefits.

San Francisco, CA Pay Range
$184,400 - $242,025 USD

About Collective Health

Collective Health simplifies employee healthcare with an integrated technology solution that makes healthcare work for everyone. With 400,000 member lives and over 70 clients—including Driscoll’s, Pinterest, Red Bull, Restoration Hardware (RH), and more—Collective Health is reinventing the healthcare experience for forward-thinking organizations and their people across the U.S. The company has developed an integrated health benefits platform, and partnered with innovative companies across care delivery and diagnostics to meet the most pressing healthcare challenges for employers today.

Privacy Notice

For more information about why we need your data and how we use it, please see our privacy policy: https://collectivehealth.com/privacy-policy/.