Site Reliability Engineering (SRE) at VMware fills the mission-critical role of ensuring that the infrastructure supporting our large-scale, multi-tenant, cloud-hosted desktop and application service is efficient, healthy, monitored, automated, and designed to scale for many customers worldwide across various cloud platforms. As a member of this newly formed team, you'll use your operations-focused developer background to work closely with our core platform development and operations teams, from the early stages of design all the way through identifying and resolving production issues.
● Serve as a primary point of responsibility for the overall health, performance, and capacity of our cloud-first solutions.
● Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
● Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale Linux environment.
● Work closely with development teams to ensure that platforms are designed with "operability" in mind.
● Function well in a fast-paced, rapidly changing environment where issues need to be sorted out as they arise.
Basic Qualifications:
• Experience with Python (strongly preferred), Perl, Ruby, or Java/C++ (at least one OOP language)
• Experience with virtualisation highly desirable (vSphere, NSX, vSAN, View, DaaS)
• Experience with any config management tool - Ansible (preferred), Salt, Puppet…
• Experience with build and release management tools such as Jenkins
• Experience with Linux systems and/or systems administration.
• Experience designing and operating critical web services, and the ability to dig through complex code to find and fix issues
• Prior experience with at least one cloud platform: vCA, AWS, SoftLayer, Azure
• Experience with general performance tuning and optimization of all aspects of platforms and services (systems, network, code).
Preferred Qualifications:
● 10+ years in a Linux-based large-scale SaaS operations role
● B.S. or higher in Computer Science or other technical discipline.
● Previous experience working with a geographically distributed team.
● Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, OpsAutomation, Engineers, Product Managers, etc.
● Ability to work independently and available
● Good RESTful API and systems design sensibilities.
● Broad understanding of Internet protocols and network programming.