The SaaS Engineering organisation is seeking an SRE Delivery Lead who enjoys solving problems and working with customers, and who has a technical background spanning a variety of fields, including Linux/Windows systems administration, Cloud design and optimization, network administration, and DevOps.
The role will involve implementing SaaS delivery in a secure and operationally sound manner, and the successful candidate will lead a team of SRE engineers who implement SRE best practices before rolling them out across the entire SaaS Engineering team. As the leader of the SRE team, you will be a key part of the SaaS Engineering leadership team and will shape the future of SaaS Engineering.
• Development skills are needed, as we do all our infrastructure setup and configuration via Infrastructure as Code in TypeScript using the Pulumi framework.
• Hands-on experience mentoring and leading infrastructure teams who own, build, and run their resources.
• Production experience in a Cloud DevOps engineering or similar role, and experience dealing with universally available, resilient systems that can fail.
• We use AWS for our end-to-end SaaS infrastructure, so expert knowledge of working with the AWS stack is a big plus.
• Capability to react to failures and resolve them with composure and robust, repeatable practices, including active participation in our 24/7 production support processes.
• Willingness to take ownership and ability to show strong personal commitment to the department and team goals; must be comfortable being a reliable, proactive, and influential technical lead who is not afraid to take on responsibility in the team.
• As a technical leader, this person is expected to drive technical excellence at every level while leading the team to identify and address technical risks early through rigorous design reviews.
• Hands-on experience leading Cloud architecture design, development, and deployment, with the ability to dive deep into technical issues and provide technical leadership.
• Unix / Linux administration skills and experience working with containerised applications running in serverless container hosts.
• Effectively communicate progress toward project/program goals.
• Strong business acumen and customer focus.
• Participate in the on-call rotation for out-of-hours support.
• May be required to travel occasionally (15%).
• Develop and propose CI/CD improvements including technology choices that can help our engineering teams deliver more effectively.
• Take ownership of the SRE aspects of a highly serverless, cloud native SaaS platform that has been built from the ground up in AWS.
• Propose AWS service selections and enhancements to our overall product stack and support delivery of those technologies all the way to production.
• Help projects adopt cloud technologies, supporting them in defining requirements and building prototypes and proof-of-concept environments.
• Troubleshoot complex deployment issues on a multi-tier cloud enterprise solution.
• Proactively automate infrastructure and services to enable a small team to deliver value to a global enterprise.
• Provide input on automation and on a systematic approach to configuration, deployment, and infrastructure maintenance and recovery, while continuing to improve the performance and reliability of the network and the overall service.
• Work directly with customers, operations, and engineering to research, troubleshoot, and resolve performance issues in a timely manner.
• Help define technical solutions to meet business needs through an Agile process.
• Gather and analyse data to aid informed decision-making while providing detailed, realistic estimates.
• Interact skilfully with business stakeholders and third-party technical organizations.
• Work closely with architecture and security teams to deliver security by design for our client’s wider IT teams.
• Document and present DevOps practices to internal and wider teams to promote Cloud DevOps and SRE best practices.
• Degree or Diploma in Computing or similar related qualification.
• AWS Certification.
• Red Hat Linux certification would be a bonus.
• Datadog experience would be a bonus.