Location: Remote

Employment Type: Full-Time

Team: Operations

As a Site Reliability Engineer (SRE) at Circonus, you will be responsible for keeping Circonus SaaS and on-premise customers up and running as well as improving the automation, scalability, and performance of systems. This is an unparalleled opportunity to grow on a small, collaborative, and friendly team with established leadership in the field of SRE.

A successful candidate communicates effectively across multiple departments and with customers, can shift gears at a moment’s notice, and enjoys the challenges of supporting enterprise clients. This is a client-facing role where presentation skills are important. The role also includes participation in a support rotation.

This position is 100% remote.

Job Responsibilities

    Install, upgrade and manage systems powering customer infrastructure running Circonus software

    Troubleshoot availability and performance issues

    Diagnose production issues and perform front-line remediation

    Communicate with management and customers regarding aberrant system behavior

    Influence software and architecture design based on system and architecture observations related to performance and reliability

    Participate in an on-call schedule

Job Requirements

    Linux (RHEL, CentOS, Ubuntu)

    Experience working with cloud service providers such as AWS, Azure, or GCP

    Ansible, Chef or similar configuration management system

    HAProxy, PostgreSQL, Apache or similar technologies

    Strong networking knowledge: firewalls, TCP & UDP, DNS, SSL/TLS

    Strong understanding of monitoring principles

    Familiarity with using REST and REST-like APIs for operations tasks

    UNIX troubleshooting skills: tcpdump, strace, bpftrace, etc.

    Fluency in one or more of the Git, Subversion or Mercurial version control systems

Preferred Experience

    7+ years’ experience in the technology industry

    Experience and/or senior technical knowledge of monitoring and analytics solutions

    Experience with Docker, Kubernetes and containers

    Terraform, Chef and Ansible experience

    OpenSearch experience

    The right person will be highly technical and analytical, much like the company itself

Circonus offers a powerful telemetry intelligence platform to handle the world's most demanding use cases. From mission-critical IT infrastructure to data-intensive IoT applications, Circonus works with any tech and at any scale. Circonus uses advanced data science and patented technology to ingest and analyze telemetry data to deliver unmatched clarity, insights, and performance. From real-time alerts and fault detection to ML-based predictive analytics, Circonus helps companies optimize operations and deliver exceptional user experiences with confidence.

We recently raised a $10M Series B round led by Baird Capital with participation from our existing investors NewSpring Capital, Osage Venture Partners, and Bull City Venture Partners. This new funding is earmarked to further accelerate our growth, scale product innovation, and build upon the company’s record-setting performance in 2021.

Culturally, we operate like a startup: small, agile teams with quick decisions and short, iterative cycle times. We relish our core values of respect, integrity, value, and growth, among others.

All of our positions include a discretionary PTO policy, generous employer health and dental insurance, an employer-matched 401(k) plan, and more.

Job Summary

Job Title: Site Reliability Engineer

Company: Circonus

Location: Remote

Employment Type: Full-Time
