Resiliency Engineer

Renaissance InfoSystems

About Us: Renaissance InfoSystems is a technology and digital recruitment agency that connects contract and permanent professionals with clients across Asia-Pacific. We aim to differentiate ourselves through our responsiveness and through the understanding that comes from being an IT recruitment agency with roots in the IT industry. Our recruiters combine sophisticated and simple interpersonal techniques to maintain a strong candidate network. Learn more: http://www.reninfo.com.au

Job Description: L2 Resiliency Engineer

Role Overview

The L2 Resiliency Engineer is responsible for supporting the availability, stability, and recoverability of systems. The role focuses on executing resilience processes, supporting incident resolution, and maintaining disaster recovery (DR) and business continuity planning (BCP) capabilities across infrastructure and cloud environments.

Key Responsibilities

  • Support implementation and maintenance of high availability and resilience solutions
  • Execute DR/BCP procedures and assist in meeting RTO/RPO targets
  • Handle L2 incident support and escalate complex issues to L3
  • Perform initial root cause analysis and contribute to problem management
  • Assist in DR drills, failover testing, and recovery validation
  • Monitor system health and respond to alerts using observability tools
  • Maintain backup, recovery, and resilience documentation/runbooks
  • Support compliance with ISO 22301, ISO 27001, and regulatory requirements
  • Collaborate with L1/L3 teams to improve operational resilience

Key Skills

  • Basic to intermediate knowledge of cloud (Azure/AWS) and infrastructure
  • Understanding of backup/recovery, DR, and high-availability concepts
  • Familiarity with monitoring tools (e.g., Splunk, Dynatrace)
  • ITIL fundamentals (incident and problem management)
  • Basic scripting/automation skills (PowerShell, Python – desirable)
  • Strong troubleshooting and analytical skills

Experience

  • 3–5 years of experience in IT operations, infrastructure, or support roles
  • Exposure to DR, BCP, or resilience practices preferred

Role Summary

  • Level: L2 (Mid-level Engineer)
  • Focus: Execution and operational support
  • Scope: Incident handling, DR support, monitoring, and escalation to L3

Job Description: L3 Resilience Manager

Role Overview

The L3 Resilience Manager is a senior technical resource responsible for ensuring system availability, resilience, and recoverability. The role focuses on disaster recovery (DR), incident resolution, and strengthening system reliability across infrastructure and cloud platforms.

Key Responsibilities

  • Design and implement high availability, failover, and resilience solutions
  • Support and execute DR/BCP strategies; ensure RTO/RPO targets are met
  • Act as L3 escalation for major incidents and lead root cause analysis (RCA)
  • Conduct DR drills, failover testing, and recovery validation
  • Implement monitoring, alerting, and automation for proactive resilience
  • Strengthen resilience across cloud (Azure/AWS), infrastructure, and networks
  • Support compliance with ISO 22301, ISO 27001, and APRA CPS 230/CPS 234
  • Collaborate with cross-functional teams to improve resilience posture

Key Skills

  • Strong cloud and infrastructure knowledge (Azure/AWS, networking, storage)
  • Experience with DR, backup/recovery, and high availability architectures
  • Familiarity with monitoring tools (e.g., Splunk, Dynatrace)
  • ITIL knowledge (incident and problem management)
  • Scripting/automation skills (PowerShell, Python)
  • Strong troubleshooting and problem-solving ability
  • Dependency mapping and service architecture awareness
  • DevOps/SRE integration
  • Chaos engineering
  • Fault injection testing

Experience

  • 8+ years in IT operations, SRE, or infrastructure engineering
  • Hands-on experience with DR and resilience practices

Regards,

Reshu Seth

Recruitment Consultant

Renaissance InfoSystems

M: +61 478 487 026

E: ***email_hidden***

W: http://www.reninfo.com.au