Site Reliability Engineer II: Dynamics Service Engineering
The Microsoft Social Engagement - Site Reliability Engineering team is looking for talented engineers to design, build, and operate large-scale, business-oriented services for Microsoft. Our team has a global presence with engineers in the US, Switzerland, and Ireland. The ideal candidate is a deep technologist with a proven track record of building large, scalable services, as well as a creative thinker, problem solver, teacher, and learner. The successful candidate will be at ease leading incident investigations and capable of identifying root causes in a complex, distributed environment.
In this role you will use best-of-breed technology, including Microsoft and open source projects. To ensure smooth operation of the service, we work closely with the development teams to identify and close gaps in the service. We also design and support the deployment pipeline together with release engineering and development.
This is an awesome opportunity in an exciting division inside Microsoft. You'll work on some of our hardest problems, building high-quality, architecturally sound systems that are aligned with business needs. You'll think globally when building systems, ensuring we build high-performing, scalable systems that fit well together. Although Microsoft is massive, our team is small and agile; if you are an amazing engineer with proven experience in a Site Reliability Engineering role, please apply today!
Technologies
• Microsoft Azure
• Ubuntu Linux
• Puppet, r10k
• Nginx, HAProxy
• Elasticsearch, Watcher, Marvel
• Nagios, Logstash, Kibana
• Python, Java, Shell Scripts, Ruby, PowerShell, C#
• T-SQL (Azure SQL, MSSQL), Redis
• Docker, Mesos, Swarm, Etcd
Responsibilities
• Influence feature design, architecture, standards, and processes to ensure security, performance, operability, and scale.
• Build tools, develop efficient processes, and contribute to our knowledge base.
• Conduct performance analysis/tuning and ensure accurate service capacity planning.
• Identify gaps in current technology and processes and recommend improvements.
• Participate in on-call rotation duties.
• Optimize monitoring and self-healing capabilities.
• Collaborate at depth with peers in Development, Release Engineering, and Program Management.
• Plan and conduct proof of concept projects with new technology and processes.
Required Experience and Skills
• You have experience with continuous delivery, deployment automation, and configuration management.
• You are able to write maintainable Python, Ruby, or Java code.
• You have deep hands-on technical expertise in large-scale systems engineering.
• You have a solid understanding of HTTP, load balancing, and Internet transport protocols.
• You have a deep understanding of Git.
• You care about the customer experience and usability.
• You are able to manage multiple priorities, commitments, and projects.
• You are curious and want to know how things work.
• You are familiar with Kanban, Scrum, or Extreme Programming.
• You have 5+ years of experience in large-scale internet service design and implementation.
• You know how to monitor and instrument a large-scale web service.
• An academic background in computer science (BSc or MSc) is a plus.
Microsoft is an equal opportunity employer and supports workforce diversity. All applications for vacant positions are welcome and will be considered on the relative merits of the applicant against the role profile for the position, regardless of colour, race, nationality, ethnic origin, sex, gender, sexual orientation, marital status, disability, parental responsibilities, age, religion, or belief.