Salesforce.com, Inc Senior Manager - Site Reliability Engineering in Hyderabad, India
Job Category: Products and Technology
Salesforce is seeking a Manager to join the Site Reliability organization in our Hyderabad office, working closely with counterparts in the Infrastructure and R&D organizations. This organization provides a global team of engineers who monitor cloud service availability and stand ready to swiftly repair any service-impacting issues. Seven days a week, 24 hours a day, the Site Reliability team keeps the Salesforce cloud and our customers protected. As a Manager of the Site Reliability team, you will lead a team whose primary task is detecting and resolving incidents within minutes. This objective is met by monitoring the services, reacting to problems, and proactively addressing issues before they affect performance or availability.
Responsibilities:
We’re looking for someone with a balance of technical expertise, leadership skills and managerial experience. Your operational skills must be strong enough to let you set technical direction on incident bridges and marshal resources accordingly, and to ensure that investigations follow appropriate troubleshooting paths and that monitoring, triage and change execution remain optimal. The ideal candidate is someone who will drive continuous improvement while streamlining how we do Operations. This position will involve fostering and maintaining strong relationships with other connected areas of the business, ensuring that the SRE team is a vital stakeholder in any process and procedural enhancements. The leader in this role must demonstrate a strong focus on engineering practices, service ownership and agile leadership, along with people management skills. You will be responsible for managing and supervising the day-to-day responsibilities of front-line Site Reliability Engineers.
Incident management - Act in key support roles during major incidents e.g. Sev0, Sev1, Sev2.
Problem Management - Populate and participate in RCAs and partner with the Global Solutions team to permanently fix issues
Drives the team to be proactive in diagnostics, detection and configuration of applications. Develops service-ownership to fill gaps and provides excellent customer experience
Oversees process improvement and change management
Creates capabilities that enable the SR team to respond to incidents in a timely manner and find root cause
Works successfully with other cross-cloud service owners (Developers, DBAs, Network, etc.), maintaining positive relationships while exercising influence
Proactive measures to impact customers beyond the current SRE team - we want to actually solve the problems and configure visibility
Involved in automation and tooling of manual and repetitive processes
Collaborate with SR dashboards and analytics to give predictive insights on data center environments for customers
10+ years of Infrastructure Engineering or Operations experience
5+ years managing Site Reliability, NOC, or mixed Operations teams preferably in globally distributed environments
Experience in working in a 24/7/365 Operations team, managing large data centers and infrastructure
Past experience in Incident Management and a strong understanding of ITIL service operations and Scrum methodologies
Expertise with enterprise monitoring systems such as Nagios and Splunk
Strong understanding of monitoring implementations and administration
Has a passion for: Teamwork and collaboration, Adaptability, Communication, Problem Solving, Customer Focus, Results, and Innovation.
Passionate about employee development with experience successfully coaching individuals to achieve goals
Strong communication, organizational, analytical and problem solving skills and attention to detail
Experience in a large-scale Linux data center environment, with knowledge of administration and troubleshooting
Entrepreneurial-spirited, Results-driven, communicator, aloha spirit
Passionate about engineering productivity and service ownership and customer success
Experience designing, developing, debugging, and operating resilient distributed systems that run across thousands of compute nodes in multiple data centers
Experience with traditional data centers as well as some knowledge of or experience with any of the public clouds (AWS, GCP, or Azure)
Windows Systems knowledge as well
Experience with integrating new functions and onboarding new services into Site Reliability
MS in Computer Science or related field, or
BS in Computer Science plus relevant job-related experience
Salesforce.com and Salesforce.org are Equal Employment Opportunity and Affirmative Action Employers. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Headhunters and recruitment agencies may not submit resumes/CVs through this Web site or directly to managers. Salesforce.com and Salesforce.org do not accept unsolicited headhunter and agency resumes. Salesforce.com and Salesforce.org will not pay fees to any third-party agency or company that does not have a signed agreement with Salesforce.com or Salesforce.org.
Founded in 1999, Salesforce is the global leader in Customer Relationship Management (CRM). Companies of every size and industry are using Salesforce to transform their businesses, across sales, service, marketing, commerce, and more by connecting with customers in a whole new way. We harness technologies that can revolutionize companies, careers, and, hopefully, our world.
Salesforce is built on a set of four core values: Trust, Customer Success, Innovation, and Equality. By making technology more accessible, we're helping create a future with greater opportunity and equality for all. This has taken our company to great heights, including being ranked by Fortune as one of the “Most Admired Companies in the World” and one of the “100 Best Companies to Work For” eleven years in a row, and named “Innovator of the Decade” and one of the “World’s Most Innovative Companies” eight years in a row by Forbes.
There are those who choose to work with the best and brightest. And then, there are those who want to do more than just a job. They are the ones improving lives, not only their careers. Having an impact now instead of later. Doing something that’s so much bigger than themselves, an industry, and their company.
We believe everyone can be a Trailblazer. Join Salesforce and discover a future of new opportunities.