Principal Site Reliability Engineer at Xometry | Powderkeg

Location: Atlanta, GA

Employment Type: Full-Time

Team: DevOps/Site Reliability

Xometry (NASDAQ: XMTR) powers the industries of today and tomorrow by connecting the people with big ideas to the manufacturers who can bring them to life. Xometry’s digital marketplace gives manufacturers the critical resources they need to grow their business while also making it easy for buyers at Fortune 1000 companies to tap into global manufacturing capacity.

Xometry is looking for an experienced Site Reliability Engineer who is excited about containers and container orchestration with Kubernetes, understands microservices, and has a passion for infrastructure as code. This person also has a passion for building tooling that makes it easier for others to build, deploy and scale their software in a cloud environment.

What You'll Do

  • Automate all the things
  • Build new tools and platforms when you see repeatable patterns across team workflows
  • Coach Software Engineering and Data Science teams on best practices and architectural decisions
  • Own the security operations that protect our customer data while maintaining development velocity
  • Obsess over feedback loops: build, measure, and improve
  • Resolve reliability issues and identify strategies to prevent them from recurring
  • Enable the software engineering community to build faster with less friction
  • Participate in on-call support rotations

What We’re Looking For

  • 5+ years of experience as an SRE or DevOps engineer in an eCommerce, API-based, or B2C platform company. Said differently: this isn’t your first SRE rodeo
  • Architectural experience designing highly available and secure internet-facing web-based services
  • Strong experience with AWS (preferred), Azure, or Google Cloud infrastructure
  • Strong container management expertise with Docker, Kubernetes, Helm, Service Mesh, and Microservices
  • Versed in automating infrastructure (Terraform preferred, though similar experience with Ansible, CloudFormation, etc. considered)
  • Experienced with CI/CD pipeline creation and operations
  • Knowledgeable in full system monitoring, metrics, KPIs, and reporting (Datadog preferred, though not necessary)
  • Strong experience with API fundamentals (REST and GraphQL in particular)
  • Experience developing software tools to support operations and development (language agnostic)
  • A master of root cause analysis, especially of complex distributed systems
  • Capable of writing documentation on complex topics for easy digestion
  • Able to clearly present technical project plans, issues, system status, policies and procedures, etc. to all levels of management
  • Excellent understanding of Internet technologies and protocols (TCP/IP, DNS, HTTP, SSL/TLS, etc.)

If this job isn’t for you but you have a friend who may be a perfect fit, share this job with them!

Xometry is an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.

Xometry participates in E-Verify and after a job offer is accepted, will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S.
