Site Reliability Engineer
Job Details
Remote - Fort Mill, SC
Fully Remote
Full Time

Position Overview:

AccessOne is a leading provider of flexible, co-branded patient financing solutions. Founded by providers, we offer a consumer-focused experience that drives high patient satisfaction for our clients. We have helped over one million consumers afford out-of-pocket medical expenses for health systems nationwide.

We have an immediate need for a Site Reliability Engineer responsible for the reliability, health, and performance of critical applications in the cloud. This opportunity is remote-eligible from the Eastern or Central time zones in the United States. It can also be based out of our headquarters in Fort Mill, SC.

What you can expect:

  • Learning, advocating for, and adopting processes and industry best practices
  • Gaining deep knowledge and understanding of our application ecosystem and services
  • Setting a high bar for reliability, quality, and operational efficiency through continuous improvement
  • Thinking, innovating, and engineering solutions to detect and solve complex problems that are hard to solve with conventional tools
  • Analyzing, designing, and implementing strategies across our cloud infrastructure with emphasis on security, traffic management, cluster configuration, monitoring, and operations
  • Investigating issues and driving tickets from triage through to resolution
  • Conducting system tests and putting processes in place to monitor security, performance, and availability of the service
  • Providing continuous feedback from the production environment to the development team
  • Setting up telemetry (logs, metrics, and events) on production systems and deployment pipelines to create continuous, real-time feedback mechanisms that give the development team insights
  • Driving automation efforts, including but not limited to health checks, data corrections, and self-healing
  • Handling requests from the engineering team for configuration changes and permissions or access, in line with established security policies
  • Making recommendations to the development team on the reliability, maintainability, availability, security, and performance of the system, as well as the efficiency of the team
  • Reporting metrics and creating dashboards
  • Performing root cause analysis and working to implement preventative measures
  • Managing incident, problem, and change tickets
  • Coordinating release and change activity
  • Communicating with IT leadership and business partners regarding major incidents

What you bring:

  • Passion for enabling teams to build, test, and deploy software faster and more reliably
  • 5+ years of enterprise experience across a diverse mix of SRE, DevOps, and system administration
  • Must be experienced in working with containers, orchestration (Kubernetes), and the supporting monitoring and configuration tools within the ecosystem
  • Must be an experienced script developer, comfortable writing shell scripts and working in at least one scripting language
  • Must be knowledgeable about integrating, configuring, deploying, and managing centrally provided common cloud services (e.g., IAM, networking, logging, operating systems, containers)
  • Knowledge of network, server, and application-status monitoring
  • Comfort deploying and configuring tools and dashboards to suit evolving needs (e.g., New Relic, CloudWatch)
  • Day-to-day experience with Git (version control), with knowledge of versioning, release, and change management
  • Knowledge of automated configuration management, infrastructure-as-code, and orchestration (e.g., CloudFormation, Terraform, Ansible, ARM)
  • Ability to participate in an on-call rotation supporting a production environment
  • Experience managing and monitoring real-world applications and services in the cloud
  • Experience building CI/CD pipelines
  • Security awareness throughout the stack
  • Experience working in a remote Agile environment
  • Exceptional troubleshooting skills with the ability to spot issues before they become problems
  • Curiosity and the desire to learn

Bonus points for:

  • Experience with microservices and serverless functions
  • Experience with cloud platforms such as AWS and/or Azure
  • Experience with SOC2, HIPAA, PCI, or other regulatory or compliance standards
  • Experience with databases and data stores such as MS SQL, MySQL, NoSQL systems, and Elasticsearch

We understand the importance of offering quality compensation and benefits to our outstanding employees. Our commitment to your success is backed by competitive compensation and an extensive benefits package. As an AccessOne employee, you will have access to top-of-the-line medical, dental, and vision benefits on day one. Other benefits available to all employees include, but are not limited to, 401(k) with company match, company-paid life insurance, paid holidays, generous paid time off, tuition reimbursement, paid parental leave, and flexible work schedules.

We work to maintain the best possible environment for our employees, where people can learn and grow with the company, with offerings such as remote work, a casual dress code, volunteer opportunities, company competitions, career development, employee engagement, rewards & recognition, and more. We strive to provide a collaborative, creative environment where each person feels encouraged to contribute to our processes, decisions, planning, and culture.

AccessOne is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, or any other characteristic protected by law.