Staff Site Reliability Engineer - SaaS

North America | Remote

Datto, the world’s leading provider of IT solutions delivered through managed service providers (MSPs), is looking for a Staff Site Reliability Engineer (SRE) to join a rapidly growing team. Learn more at datto.com.

The Position

You will be joining the Site Reliability Engineering team, where you will protect, provide for, and progress the software and systems behind Datto’s SaaS Protection services, with an ever-watchful eye on their availability, latency, performance, and capacity.

Who You Are

  • You never place blame during postmortems.
  • You learn something new every day.
  • You give time and effort to assist and mentor teammates.
  • You care about the work you provide to customers.
  • You are up for any challenge.

What You’ll Be Doing

  • Write code to automate operational tasks and reduce toil.
  • Drive product reliability improvements through monitoring, alerting, and application of software development best practices.
  • Identify creative ways to break products, uncover and report defects, and validate that systems and solutions are operating as intended.
  • Write, review, and execute test plans/strategies for validating product/system performance, scalability, and reliability.
  • Work with infrastructure teams to build, configure, and monitor Kubernetes clusters.
  • Participate in incident postmortems and perform root cause analysis.
  • Analyze production logs, alerts, and metrics in order to identify potential issues and implement service improvements.
  • Participate in an on-call rotation while helping to reduce MTTR and achieve SLO targets.

What We’re Looking For

  • 5+ years in an SRE or DevOps role.
  • Proficiency in Python and/or Go.
  • Proficiency in one or more Infrastructure as Code technologies (Ansible, Terraform, Salt, etc.).
  • Experience with Kubernetes clusters.
  • Experience with one or more observability technologies (VictoriaMetrics, Prometheus, Grafana, Datadog, Zabbix, etc.).
  • Familiarity with SLIs and SLOs.
  • Experience with relational databases at scale.

The Company

Datto is the world’s leading provider of cloud-based software and security solutions purpose-built for delivery by over 17,000 managed service providers (MSPs), reaching more than a million small-to-medium businesses. Datto is a leader in the $28B global managed services market, operating at exabyte scale and experiencing rapid year-over-year growth since becoming a public company in 2020.

Datto SaaS Protection is a cloud-to-cloud storage solution that is purpose-built for M365 and G-Suite. MSPs know that, under the shared responsibility model, backing up these applications can save their clients time and money by protecting the data that matters most to them. To date, over 2 million SaaS Protection users are backing up to Datto Private Cloud 24x7x365. In the event of loss, MSPs can quickly recover their clients’ data through a single, easy-to-use pane of glass.

Benefits

At Datto, we believe our employees are our greatest asset and offer all full-time employees a wide-ranging benefits package, including:

  • Comprehensive health-care benefits
  • Flexible paid time off policy
  • Charity match program
  • Education reimbursement
  • ...and more!

By submitting an application, you acknowledge we will process your data in order to consider you for the position you apply for and for other open positions within our company for which you may be suited. We collect and store your data in accordance with our Recruiting Privacy Practices.

Datto is an equal opportunity employer.
