DevOps Cloud Infrastructure Engineer
1QLabs
Build the Platform That Powers Multiple Ventures
We're an Australian venture studio that launches AI-powered businesses on enterprise-grade infrastructure. Every product we ship runs on a cell-based architecture in Google Cloud Platform—cells can be siloed single-tenant or shared multi-tenant, and the same application deploys to any supported GCP region. Data sovereignty, compliance, security, and portability are structural requirements—not afterthoughts. We need someone who can own that foundation.
You'll design, provision, and operate the GCP infrastructure that our engineering teams build on: Terraform-managed environments, CI/CD pipelines, Kubernetes clusters, observability stacks, and the security controls that let us onboard institutional clients. When we spin up a new venture or a new region, you make sure it has production-grade infrastructure from day one.
"By 2029, 70 per cent of enterprises will deploy agentic AI as part of IT infrastructure operations—up from less than 5 per cent in 2025." — Gartner, Predicts 2026: AI Agents Will Reshape Infrastructure & Operations
What You'll Do
- Design, provision, and maintain cell-based GCP infrastructure using Terraform—each cell is an isolated GCP project with its own databases, compute, networking, and security boundary
- Build and maintain CI/CD pipelines (Cloud Build, GitHub Actions, or similar) that deploy across multiple ventures and cells with proper environment promotion (dev → staging → production)
- Manage Kubernetes clusters (GKE) including node pools, autoscaling, resource quotas, network policies, and workload identity federation
- Implement and operate observability: structured logging (Cloud Logging), metrics (Cloud Monitoring/Prometheus), distributed tracing, alerting, and incident response runbooks
- Enforce security controls: IAM policies, VPC Service Controls, private networking, Cloud KMS (including customer-managed encryption keys), Assured Workloads for region-specific data residency, and zero-trust access patterns
- Own disaster recovery: configure cross-region replication (e.g. Sydney primary, Melbourne standby), test failover procedures, meet RPO/RTO targets, and support multi-region cell deployments as ventures expand internationally
- Automate everything: infrastructure drift detection, compliance policy audits, secret rotation, certificate management, and cost optimisation
"Platforms aren't magic APIs. They're agreements that make engineers faster at delivering business value." — Kelsey Hightower
What This Role Requires
Required: Core Infrastructure Skills
- Terraform (IaC) — Deep experience writing, reviewing, and maintaining production Terraform. You structure modules, manage state, and handle multi-environment deployments confidently
- Google Cloud Platform — Strong working knowledge of GCP core services: Compute Engine, GKE, Cloud SQL, Firestore, Cloud Run, Pub/Sub, Cloud Storage, VPC networking, IAM, Cloud KMS, and Cloud Build
- Kubernetes & containers — Production experience with GKE or equivalent. You understand pod security, network policies, resource management, Helm charts, and workload identity
- CI/CD pipelines — You've built and maintained deployment pipelines with proper gating, rollback, canary/blue-green strategies, and environment promotion
- Networking & security — VPCs, firewall rules, private service connections, load balancers, DNS, and TLS certificate management
Required: Operational Skills
- Observability — You've set up and maintained monitoring, logging, and alerting at scale. You write useful runbooks and know how to debug production incidents under pressure
- Security & compliance mindset — Experience implementing security controls in a regulated or compliance-driven environment. You understand least privilege, audit trails, encryption at rest/in transit, and access governance
Nice to Have
- Experience with Assured Workloads, IRAP PROTECTED, or equivalent sovereign cloud requirements
- SOC 2 Type II preparation or audit experience—control mapping, evidence collection, and continuous compliance monitoring
- Cell-based architecture experience—siloed and multi-tenant models, multi-region deployment
- Cost optimisation: committed use discounts, resource right-sizing, FinOps practices
- Experience supporting AI/ML workloads: Vertex AI, GPU provisioning, model serving infrastructure
Valued Certifications
We're a GCP shop. The following Google Cloud certifications are recognised and valued, though not strictly required if you can demonstrate equivalent hands-on experience:
- Google Professional Cloud DevOps Engineer — Validates SRE practices, CI/CD, monitoring, and incident management on GCP
- Google Professional Cloud Architect — Validates ability to design secure, scalable, and compliant cloud solutions on GCP
- Google Professional Cloud Security Engineer — Validates IAM, VPC security, data protection, and compliance controls on GCP
- Google Associate Cloud Engineer — Foundation-level cert demonstrating GCP operational competence
- HashiCorp Terraform Associate or Engineer — Validates IaC skills that are central to this role
- Certified Kubernetes Administrator (CKA) — Validates production Kubernetes operations skills
A Note on Agentic Workflows
This is primarily a hands-on infrastructure role. That said, Terraform is increasingly managed through AI-assisted workflows—generating modules, reviewing plans, and iterating on configurations. Familiarity with agentic coding tools for infrastructure-as-code is a plus, but deep DevOps and GCP expertise is what matters most.
Your Growth Path
- 30 Days: Understand the cell-based architecture, existing Terraform modules, CI/CD pipelines, and security controls. Ship your first infrastructure change through the full pipeline
- 90 Days: Own cell provisioning end-to-end, improve observability and alerting coverage, and contribute to SOC 2 control implementation
- 6 Months: Lead infrastructure decisions across ventures, own the DR strategy, and help shape our compliance posture for institutional client onboarding
Why This Role Is Unique
- Multiplied Impact: Your infrastructure powers multiple startups simultaneously—not just one product
- Enterprise-Grade from Day One: Build infrastructure that meets institutional standards (IRAP PROTECTED, SOC 2, CMEK)—rare for an early-stage environment
- Ownership: You own the platform. Infrastructure decisions are yours to make, defend, and iterate on
- Cutting-Edge Stack: GCP Assured Workloads, cell-based isolation, AI/ML infrastructure—this isn't a legacy migration role
Compensation
- Base: $100K–$140K AUD (based on experience), plus bonus
- Benefits: Flexible PTO, professional development budget (including certification costs), remote-first culture
Interview Process
Our process is practical and skills-focused.
- Portfolio / Experience Review (async) — Share examples of infrastructure you've built: Terraform repos, architecture diagrams, CI/CD pipelines, or write-ups of production incidents you've resolved
- Intro Call (30 min) — Get to know each other and explore fit
- Infrastructure Challenge (take-home) — Design a cell-based GCP environment given a set of requirements. We evaluate your Terraform structure, security decisions, networking design, and operational thinking
- Technical Interview(s) (1–2 rounds) — Architecture deep-dive, incident response scenario, security and compliance discussion
- Reference Checks — 2–3 professional references
How to Apply
Email ***email_hidden*** with:
- Your resume/CV and any relevant certifications
- Examples of infrastructure work: Terraform repos (even sanitised samples), architecture diagrams, or a brief write-up of a production system you've built or operated
- Your timezone and the hours you're available in Sydney time
We review applications weekly and respond within 5 business days.
1QLabs welcomes diverse perspectives. We're building a team that combines deep infrastructure expertise with enterprise-grade standards to power the next generation of AI-powered businesses.