Senior Cloud Software Engineer - Observability (Remote)
We are the company behind the popular open-source, high-performance columnar OLAP database management system for real-time analytics. ClickHouse works 100-1000x faster than traditional approaches. By offering a true column-based DBMS, it allows systems to generate reports from petabytes of raw data with sub-second latencies. With an amazing community already adopting our open-source technology, we are now on a journey to deliver cloud-first solutions that delight our customers.
With top adopters such as Uber, Cisco, and eBay - not only do our products work at lightning speed, so do we.
We are an open and collaborative company. Our colleagues are curious, engaged and excited about what they do. If you want to work in an environment where you can learn, grow, be an agent of change and have your voice heard - then please read on!
About the team:
The Observability team is fully committed to optimizing our internal systems and empowering other teams at ClickHouse by providing them with the necessary visibility to ensure optimal performance. Our overarching mission is to foster a strong observability culture, enabling all ClickHouse teams to create, monitor, and maintain systems that are not only dependable but also easily comprehensible and diagnosable. We are actively seeking exceptionally talented and seasoned software and site reliability engineers to join our dynamic team. As part of this role, you will be responsible for architecting, developing, and overseeing the observability platform we are building at ClickHouse.
What will you do:
- Design and build a petabyte-scale observability platform for all engineering teams to consume.
- Develop and improve instrumentation for monitoring and logging the health and availability of services.
- Design and develop tools for metric collection, analysis and reporting.
- Educate and lead efforts to improve observability among all engineering teams.
- Systematically improve availability by applying industry and distributed systems best practices.
- Improve performance and cost efficiency of our infrastructure.
About you:
- You have 5+ years of relevant software development industry experience building and operating scalable, fault-tolerant, distributed systems.
- Software development experience in Go, Python, C/C++ or similar languages.
- Experience with one or more Cloud Service Providers such as AWS, GCP, or Azure.
- Experience with technologies such as Kubernetes, Helm, ArgoCD, and Temporal, as well as infrastructure-as-code tools such as Terraform or CloudFormation.
- Understanding of network topologies, protocols, and security principles, such as VPNs, firewalls, and load balancers.
- Knowledge of cloud security best practices, including encryption, access controls, and compliance standards like SOC2 and GDPR.
- You have excellent communication skills and the ability to work well within a team and across engineering teams.
- You are passionate about observability, efficiency, availability, scalability and data governance.
- You are a strong problem solver and have solid production debugging skills.
- You thrive in a fast-paced environment and see yourself as a partner to the business, sharing the goal of moving it forward.
- You have a high level of responsibility, ownership, and accountability.
- Experience building or maintaining observability solutions, such as designing and/or building telemetry collection, transport, and storage solutions
- Familiarity with open source and/or commercial observability technologies; familiarity with solutions in metrics, logs, and tracing domains
- Experience leading and shipping large scope technical projects in collaboration with other engineers.
- Experience with the ClickHouse database.
This role offers cash compensation and a stock options grant. For roles based in the United States, you can find above our typical starting salary ranges for this role, depending on your specific location.
The positioning of offers within a certain range depends on various factors, including: candidate experience, qualifications, skills, business requirements and geographical location.
- Flexible work environment - ClickHouse is a distributed company offering remote-first work to all employees
- Healthcare - Employer contributions towards your healthcare.
- Equity in the company - Every new team member who joins our company receives stock options.
- Time off - Flexible time off in the US, generous entitlement in all countries.
- A $500 home office setup if you’re a remote employee.
- Employee-driven international mobility - we enable you to relocate internationally if you wish (within certain countries and timelines and subject to role requirements, time zones and work permit considerations).
Culture - We All Shape It
As part of our first 200 employees, you will be instrumental in shaping our culture.
We look for candidates who are:
- Motivated by doing great work as part of a team :)
- Open to learning from others and sharing with others
- Team Players: helpful, resourceful, responsive
- Respectful and see feedback as an opportunity to grow
Are you interested in finding out more about our culture? We are a one-year-old company, so we are still building it together. Our first 200 employees are the culture shapers of our future. Check out our blog posts or follow us on LinkedIn to find out more about what’s important to us, and to see whether you’d like to come and help build our culture with us!