Our Careers

Escalation Engineer

As members of the Cloud Support Escalation & Managed Services (MS) team, we work to identify widespread and systemic customer-facing problems for AWS or other cloud hyperscalers. We are responsible for monitoring internal tools to identify customer-impacting issues. When a problem is identified, we ensure the appropriate parties are engaged to drive its resolution, and we act as the customer's advocate, both reporting on and managing the customer experience. Because of our unique role as Escalation Engineers, we have front-and-center exposure to all things AWS and cloud hyperscalers, including numerous leading-edge technologies.

Responsibilities

  • Every day will bring new and exciting challenges that include elements of:
    · Monitor telemetry and incoming alarms in real time
    · Detect and respond to internal services experiencing customer-impacting events
    · Provide critical incident response/management focused on customer communications for AWS Service Teams
    · Drive down mean time to engagement and communication for all incident types
    · Monitor and manage communications during high-impact events via relevant channels
    · Facilitate Post-Mortem/Root Cause Analysis after each event to mitigate problem recurrence
    · Prioritize, manage and own issues impacting AWS and public cloud customers from detection to resolution
    · Provide crisp and timely communication on developing issues to relevant stakeholders
    · Work with key stakeholders across AWS to improve the customer experience and develop mechanisms that support operational excellence
    · Analyze data trends on internal tickets, customer contacts, social media, and network monitors to identify potential issues
    · Build a broad understanding of AWS architecture and service inter-dependencies
    · Maintain composure in dynamic and high-pressure situations
    · Guide and mentor other level 1 and level 2 support engineers
    · Other duties as required by the organization

     

Requirements and Qualifications

  • 5+ years of experience with incident management for mission-critical services
  • 5+ years of experience in Systems (Windows/Linux) operations and/or Networking with an emphasis on monitoring and alarming
  • 5+ years of experience building or supporting customer solutions in the cloud
  • 5+ years of experience leading and managing critical incident internal communications
  • Bachelor’s degree in Information Science / Information Technology, Computer Science, Engineering, Mathematics, Physics, or a related field (or 6+ years of relevant work experience)

 

Preferred Qualifications

Candidates who have been most successful after joining our team have demonstrated capabilities in one or more of these areas:

  • Industry-specific accredited certification(s)
  • Experience with Python, Ruby, Perl, Node.js or shell scripting
  • Knowledge of ITIL/Lean processes
  • Excellent written and oral English communication skills
  • Ability to review complex details regarding ongoing issues/events and convey the key details to senior stakeholders to facilitate real-time decision making
  • Effective prioritization and time management skills
  • Ability to work in ambiguous environments
  • Demonstrated critical thinking and logical problem-solving skills
  • Familiarity with AWS application architecture with a focus on high-availability and fault-tolerant design

164 Kallang Way

Solaris @ Kallang 164

#04-17 Singapore 349248

Managed Services Team