Manager, Site Reliability Engineering


Location
Palo Alto, CA 94301
Industries
Internet Services
Job Type
Full Time
Employee
Relevant Work Experience
7+ to 10 Years
Education Level
Bachelor's Degree
Career Level
Manager (Manager/Supervisor of Staff)
Salary

Generous bonus and stock package

About the Job
The Manager of Site Reliability Engineering is responsible for the day-to-day health and uptime of all Facebook services. As the team's leader, you are responsible for maintaining and improving service uptime and stability, growing headcount, and managing personnel. This role also handles both planned and unplanned maintenance events and executes capacity and capability growth as Facebook expands. This position is located at our Palo Alto, CA headquarters.
Responsibilities

* Responsible for directing and growing a team of engineers across many time zones who analyze and maintain service stability, and who document policies and best practices, in a 24x7x365 operation
* Responsible for the day-to-day health of all network, server, storage, and ancillary infrastructure
* Focus on the full lifecycle (deployment, maintenance, management, and decommissioning) of applications, components, and processes for Facebook products and services
* Work closely with cross-functional teams to negotiate requirements, specifications, schedules, quality, and acceptance criteria
* Work closely with engineering, project management, and operational peers to develop innovative technical solutions that meet Facebook’s needs with respect to functionality, performance, scalability, and reliability
* Identify tactical issues and emerging areas of concern
* Work with regional leads to establish organizational goals, meet recruiting objectives, and fulfill the mission of unyielding site stewardship
* Participate in recovery from and forensic examination of major site incidents
* Develop reports and feedback to inform technical solutions that meet design needs


Requirements

* 4-6 years of experience managing an Operations organization
* A natural team leader who can motivate and encourage personal advancement
* Excellent project management skills and the ability to work in a fast-paced and hectic work environment
* Ability to prioritize tasks effectively
* Excellent written and verbal communication skills and an ability to work seamlessly with organizational partners and peers
* A minimum of 4-6 years of experience planning and rolling out infrastructure in a global enterprise environment
* Must demonstrate experience with: server OS and application management in large-scale production environments; global infrastructure management in 24x7 co-located environments; network and system troubleshooting and maintenance practices; and management of engineering leads and support staff
* Must be willing to travel to domestic and international datacenter and office locations
* Understanding of best-practice concepts, change management, SLAs, policies, procedures, and design-review-driven standards

PLEASE APPLY ONLINE AT:
www.facebook.com/careers/apply.php?id=558&jobBoardId=1

 