Staff Site Reliability Engineer at Matillion | Powderkeg

Location: Ireland

Employment Type: Full time

Team: Site Reliability Engineering

Matillion is The Data Productivity Cloud.

We are on a mission to power the data productivity of our customers and the world, by helping teams get data business ready, faster. Our technology allows customers to load, transform, sync and orchestrate their data.

We are looking for passionate, high-integrity individuals to help us scale up our growing business. Together, we can make a dent in the universe bigger than ourselves.

We are now looking for a Staff Site Reliability Engineer to join our SRE team.

About the Role

The SRE org at Matillion is made up of multiple teams that, combined, own the operation and efficiency of our cloud platforms and services. It covers everything from the build, provisioning and maintenance of our cloud infrastructure to the reliability, capability management, observability, monitoring and metrics of our SaaS platform.

Reporting to the Director of SRE and Observability, you will use your experience across all pillars of Site Reliability Engineering to drive best practice aimed at enhancing our ability to build truly reliable, observable and performant infrastructure for all our core services. Your experience building modern, multi-cloud platforms will play a pivotal role as we continue to modernise our stack and implement a wide range of new tools around logging, monitoring, metrics and alerting.

This role can be performed remotely from Ireland, with occasional in-person meetings required at either our Manchester, UK or Denver, US HQ.

Technologies You’ll Use... Kubernetes, AWS, ArgoCD, Terraform, Datadog, Prometheus, Golang/Python.

What You’ll Be Doing:

  • Leading the design of major software components, systems, and features to improve the availability, scalability, latency, and efficiency of Matillion’s SaaS services
  • Driving the design, implementation and management of our expanding observability infrastructure, keeping up to date with new tools and technologies, and being a recognised member of the broader Observability community
  • Leading sustainable incident response, blameless postmortems, and production improvements that result in direct business opportunities for Matillion
  • Defining and documenting best practices across all pillars of SRE
  • Providing guidance and mentorship to other team members on managing end-to-end availability and performance of critical services, design techniques and coding standards to cultivate innovation and collaboration across the business
  • Balancing competing priorities as you manage a range of individual projects, deadlines, and deliverables

What we’re looking for:

  • A passion for everything performance, observability, availability, scalability and security, with experience owning and delivering projects using Agile methodologies
  • Extensive experience with Kubernetes and its surrounding ecosystem, including tools such as Linkerd, Traefik and ArgoCD (a must)
  • Previous experience of large-scale web operations in a public cloud environment
  • Competence in Ruby, Go, Java, Python or an equivalent programming language
  • Experience with some of the following key technologies: Prometheus, Grafana, Elasticsearch, Logstash, Kibana, OpenTelemetry, Micrometer, New Relic, Datadog
  • The ability to manage and provision infrastructure as code with Terraform or CloudFormation

At Matillion, we are committed to providing competitive salaries in line with market standards. Our estimated compensation range for this position is €84,000 - €126,000, but the final salary will be based on the relevant skills, experience and qualifications you demonstrate in the hiring process.

Matillion has fostered a culture that is collaborative, fast-paced, ambitious, and transparent, and an environment where people genuinely care about their colleagues and communities.

Our 6 core values guide how we work together and with our customers and partners. We operate a truly flexible and hybrid working culture that promotes work-life balance, and are proud to be able to offer the following benefits:

- Company Equity

- 30 days holiday + bank holidays

- 5 days paid volunteering leave

- Health insurance

- Life Insurance

- Access to mental health support

- Career development with access to a Udemy account, Blinkist and much more!

More about Matillion

Thousands of enterprises including Cisco, DocuSign, Pacific Life, Slack, and TUI trust Matillion technology to load, transform, sync, and orchestrate their data for a wide range of use cases from insights and operational analytics, to data science, machine learning, and AI.

With over $300M raised from top Silicon Valley investors, we are on a mission to power the data productivity of our customers and the world.

We are passionate about doing things in a smart, considerate way. We’re honoured to be named a great place to work for several years running by multiple industry research firms.

We are dual headquartered in Manchester, UK and Denver, Colorado.

We are keen to hear from prospective employees, so please apply and a member of our Talent Acquisition team will be in touch. Alternatively, if you are interested in Matillion but don't see a suitable role, please email talent@matillion.com

Matillion is an equal opportunity employer. We celebrate diversity and we are committed to creating an inclusive environment for all of our team. Matillion prohibits discrimination and harassment of any type and does not discriminate on the basis of race, colour, religion, age, sex, national origin, disability status, genetics, sexual orientation, gender identity or expression, or any other characteristic protected by law.

Job Summary
  • Job Title
    Staff Site Reliability Engineer
  • Company
    Matillion
  • Location
    Altrincham, CH
  • Employment Type
    Full time