Our client is seeking an experienced SRE to deliver insights from our industry-leading C2P platform in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to help our customers succeed.
- Run the production environment by monitoring availability and taking a holistic view of system health
- Build software and systems to manage platform infrastructure and applications
- Improve reliability, quality, and time-to-market of our suite of SaaS software solutions
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
- Provide primary operational support and engineering for multiple large distributed software applications
- Gather and analyze metrics from all layers of the C2P platform (AWS, OS, application, DB) to assist in performance tuning and fault finding
- Participate in a follow-the-sun support model to ensure the reliability and uptime of the C2P platform
- Partner with engineering teams to improve services through rigorous testing and release procedures using a dynamic DevOps model
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplifts
- Balance feature development speed and reliability with well-defined service level objectives
- Bachelor’s degree in computer science, IT, or another highly technical or scientific discipline
- 3+ years of experience in a Site Reliability role
- Experience with automation, scripting languages, and infrastructure-as-code tools, including CloudFormation and Terraform
- Experience with Amazon Web Services, including supporting IAM, networking, S3, Lambda, EC2, and RDS services
- Experience with configuration management tools in a DevOps environment
- Experience with Elasticsearch, Splunk, Syslog, Logstash, and related logging and monitoring tools
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
- Previous success in engineering and supporting a complex SaaS solution
- Coding experience beyond simple scripts