Site Reliability Engineer at TrueAccord | Powderkeg

Location: Remote, USA

Employment Type: Full-time

Team: Data Streaming Platform

Why TrueAccord?

TrueAccord is a category-defining company. We combine machine learning with a human-based approach to transform debt resolution and to get people on the path towards financial health. Every year, more than 70 million Americans have negative experiences dealing with debt. We are changing this by providing a personalized digital experience that guides lenders and consumers through this challenging financial process.

With a world-class leadership team, passionate team members, and proprietary predictive models trained on years' worth of transactional data, TrueAccord is well-positioned to deliver on a huge opportunity: helping millions of consumers to regain and keep their financial footing while lowering the cost of doing business for creditors across many industries.

The Opportunity:

You will be joining the Data Streaming Platform team. You will be part of a highly collaborative and competent team where everyone owns a piece of TrueAccord’s success. The main goal of this team is to make TrueAccord’s data layer more accessible. You will help design a distributed query system based on Spark and Kafka running on Kubernetes. You will also help drive data liberation initiatives and CDC pipelines for data synchronization that help different teams build new features faster.

Your average workday will involve writing and reviewing infrastructure code, scoping and validating ideas, and mentoring those around you. You will be designing infrastructure constructs using CDK and Terraform, reusing technologies that are a part of our current toolkit, or introducing new ones.

What We're Looking For:

  • Solid understanding of container orchestration via Kubernetes, primarily using AWS EKS.
  • Cloud Services and Architecture expertise, preferably AWS
  • Knowledge of Infrastructure as Code principles and the following frameworks: AWS CDK, Terraform, Helm
  • CI/CD expertise to streamline build pipelines and reduce friction when moving from development to production
  • A good understanding of observability patterns and principles, and knowing how to implement them in practice
  • The confidence to work with a high level of autonomy
  • The ability to quickly identify issues in production and design long-term improvements to prevent them

You might also have:

  • Experience with distributed systems and hands-on experience with at least a few big data technologies, such as Kafka or Spark
  • Experience with SRE goals, processes, and culture
  • Experience working in database, SRE, or infrastructure teams, or operating a data storage system such as MySQL
  • Experience mentoring and educating those around you

Benefits, Perks, and Culture

- Everything you need to work remotely

- Work with talented and motivated people in a fast-paced, mission-driven environment

- Medical/dental/vision insurance, 401k (with match), flex spending plan, and life insurance

- Family-friendly policies: parental leave, flexible work from home

- Unlimited PTO

- Transportation benefits

- Paid time off to do volunteer work in your community!

We are a dynamic group of people who are subject matter experts with a passion for change. Our teams are crafting solutions to big problems every day. If you’re looking for an opportunity to do impactful work, join TrueAccord and make a difference.

Our Dedication to Diversity & Inclusion

TrueAccord is an equal opportunity employer. We promote, value, and thrive with a diverse & inclusive team. Different perspectives contribute to better solutions and this makes us stronger every day. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Job Summary
  • Job Title: Site Reliability Engineer
  • Company: TrueAccord
  • Location: Lenexa, KS
  • Employment Type: Full time