Site Reliability Engineer - Cloud Platform at Cyence | Powderkeg

Location: India - Bangalore

Employment Type: Full-time

Team: Product Development Operations

At Guidewire, we make software that offers Property and Casualty (P&C) Insurance companies the tools to take care of their customers when they need it the most, whether that’s a time of crisis, a natural disaster, an accident, or exposure to cyber risks. We build the core applications that insurance companies use to sell and underwrite policies, settle claims, and bill their customers. We also have a portfolio of innovative products serving the needs of P&C insurance companies in areas such as data management, digital online portals, and predictive analytics. We run these products on the Guidewire Cloud Platform, and we help hundreds of insurance providers all over the world to handle billions of dollars of business.

We are proud to be voted a Top Cloud Employer on Glassdoor by our own employees and positioned as a market leader by industry experts like Gartner. We have a fun work environment and a culture that lives by our core values of integrity, rationality, and collegiality.

We’re searching for people who are as passionate about working together to deliver quality products and support as we are. Join us and enjoy a career where you can make an impact. You’ll be inspired by those around you, and you’ll be trusted and empowered to go further.

Guidewire’s Platform Team is part of the Product Development, Cloud Common Services organization, delivering 24x7 platform services to Guidewire’s flagship Cloud Platform.

As a Site Reliability Engineer, you will be part of a team that is passionately automating everything possible to make Guidewire systems run more efficiently. The Platform team is dedicated full-time to creating and running software that improves the reliability of systems in production, serving hundreds of customers and supporting millions of transactions each day. You will be ensuring the reliability of Guidewire’s flagship cloud platform and InsuranceSuite products and building tooling to help ensure efficient operations and optimal availability of all SaaS multi-tenant and customer-focused systems. Platform SREs collaborate closely with Guidewire’s core product developers to ensure that the Guidewire core cloud products address functional and non-functional requirements such as availability, performance, observability, and maintainability.

This role requires a high degree of collaboration, teamwork, ownership, and responsibility. If you like to be challenged and have a passion for solving problems at scale with systems like AWS, Kubernetes, and Aurora, then we would love to hear from you. The ideal candidate is someone who exemplifies the ethic of "If you have to do something more than once, automate it," and who can rapidly self-educate on new concepts and tools. Bonus points if you have prior experience doing production support of a SaaS platform and are comfortable working with bleeding-edge, highly containerized cloud-native environments in AWS.

How you'll make an impact:

  • Deploy and take an SRE approach to shared multi-tenant infrastructure for resilient SaaS microservice-based containerized systems in addition to customer-centric application environments
  • Oversee and automate the team’s growing presence in AWS
  • Contribute to core infrastructure systems development with features, bug fixes, reliability improvements, etc.
  • Oversee at a platform level a complex single sign-on SAML/OAuth-based central authentication platform
  • Develop and deploy tooling to aid in driving 24x7x365 service operations of critical worldwide systems
  • Automate deployment tasks for core product and infrastructure tools and maintain automation infrastructure
  • Create system documentation and training materials to empower and educate our fellow team members
  • Build and maintain observability tooling, metrics, and dashboarding for a global core platform infrastructure

Education and Work Experience

  • Bachelor’s Degree in Computer Science or related field
  • Must be able to write code and program – expected of all team members
  • Familiarity with the Agile software development lifecycle
  • Deep background with Linux system administration and engineering
  • Near expert-level understanding of and experience with engineering on Amazon Web Services (AWS)
  • Software engineering and task automation skills with Bash, Python, and/or Go are a must.
  • Experience supporting web applications running on Java / Apache / Tomcat in a live production environment
  • Demonstrable experience with automating systems and infrastructure with Terraform
  • Background supporting production at scale in a heavily microservice-based environment
  • Working with and engineering Kubernetes hands-on in a “been there, done that” manner
  • Strong understanding of single sign-on, SAML, and OAuth (bonus for hands-on experience with Okta)
  • Familiarity and direct hands-on experience with DevOps tools: SCM (Git, Bitbucket) and CI/CD (TeamCity)
  • Seasoned expertise around X.509 certificate technology and basic concepts of encryption
  • Solid understanding of concepts surrounding containerized networking and all things IP
  • Experience working with Relational Databases such as Aurora Postgres and/or Oracle RDS
  • Advanced exposure to broad technical skills such as application development, web UI (design and development), JSON, application architecture
  • Ability to read and interpret application server thread dumps, Catalina outputs, CloudTrail, and other critical logging outputs.
  • Strong experience using observability tools (logging/APM) like Datadog, CloudWatch, and PagerDuty.

Personal Qualities and Soft Skills

  • You greatly prefer CLIops to ClickOps
  • You enjoy teaching and being a mentor to others
  • Outstanding troubleshooting skills; ability to think critically and display an aptitude for problem-solving
  • Strongly analytical mind with a penchant for process development and enhancement
  • Display a strong work ethic and do whatever it takes to get the job done
  • A highly positive can-do attitude with a knack for being a team player
  • Excellent communication skills and ability to explain complex technical concepts to a varied audience
  • Demonstrate strong follow-through and consistently keep commitments to customers and employees

Other Requirements

  • Ability to read, write, and speak fluent English.
  • This position is regularly involved in pair programming.
  • We provide 24x7 support to our customers, so you'll take turns with your teammates being on-call for weekend production emergencies or providing rotating weekend operational support.
  • This is an office-based role; you will be working onsite in our Bangalore office.
  • May involve occasional travel (less than 5%) to other Guidewire offices for training and team meetings.

About Guidewire

Guidewire is the platform P&C insurers trust to engage, innovate, and grow efficiently.

Guidewire combines core, data, digital, analytics, and AI to deliver our platform as a cloud service. More than 400 insurers, including the largest and most complex in the world, run on Guidewire.

As a partner to our customers, we continually evolve to enable their success. We are proud of our unparalleled implementation track record with 1000+ successful projects, supported by the largest R&D team and partner ecosystem in the industry. Our Marketplace provides hundreds of add-ons that accelerate integration, localization, and innovation.

Guidewire Software Inc. provides equal employment opportunities to all applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. All offers are contingent upon passing a criminal history and other background checks where it's applicable to the position.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Job Summary
  • Job Title: Site Reliability Engineer - Cloud Platform
  • Company: Cyence
  • Location: San Mateo, CA
  • Employment Type: Full-time