Site Reliability Manager/Owner

  • Contract
  • Roseland, NJ
  • Applications have closed

MaxJobsClub US Contractor

Minimum of 10 years of software engineering and/or infrastructure experience, with an in-depth understanding of cloud.

Must have Azure/AWS and VMware experience.

The System Reliability Owner will be responsible for managing the company's GETS Infrastructure agenda for achieving a fundamentally better reliability posture. System Reliability Owners use an engineering and development mindset to determine solutions that will ultimately improve the availability, performance, and security of the external-facing software products and internal applications used by ***’s clients, partners and associates.
We believe people make great companies, not the other way around. Our people make all the difference in delivering innovative HR technologies and solutions that help employees all over the world do their jobs better. The result? We’re building the next generation of *** technologies.
Responsibilities:
•Work with Major Incident Managers during and after the incident recovery lifecycle.
•Identify key priority initiatives to significantly improve reliability, both proactively and reactively.
•Provide leadership and direction to GPT System/Site Reliability Engineers and Software Engineers to define and implement sustainable and scalable solutions.
•Publish After Action Reviews that are timely, clearly understood by both technical and business personnel, and include accurate root causes and concrete follow-up items with clear owners.
•Design and implement stability and reliability best practices and proactive solutions to potential issues by collaborating with GPT partners.
•Define, track, review and report on Service Level Objectives (SLOs), Service Level Indicators (SLIs), System Availability, and the progress and outcomes related to reliability initiatives.
•Mitigate further impact and risk to the company and our clients.
•Understand incident situations, recovery, and plans to prevent recurrence, and explain them to business stakeholders and clients.
•Summarize complex technical issues into concise recaps for client associates and client leaders.
•Assume an advisory role to ensure emergency changes adhere to a defined policy.
•Develop and maintain strong, effective working relationships with stakeholders both inside and outside the company.
•Proactively identify opportunities for process improvement.
•Participate in special projects and perform other duties as assigned.
•Participate with other senior managers to establish strategic plans and objectives.
•Make final decisions on administrative or operational matters and ensure effective achievement of objectives.
•Participate in the development of methods, techniques and evaluation criteria for projects, programs, and people.
•Ensure budgets and schedules meet requirements. Act as a lead within the team and share accountability with the Manager for coaching and mentoring other team members.