Still looking for a job through search engines? It's time to upgrade!
Instead of going through thousands of listings on your own, Jobify analyzes your résumé and shows you only the jobs that truly fit you.
Over 80,000 jobs • 4,000 new ones every day
Free. No ads. No fine print.
About Us
We are a fast-growing startup building a next-gen platform for the AI era.
The Role
We are looking for an exceptional Site Reliability Engineer to establish and lead the SRE discipline within our organization. This is a unique opportunity to define what reliability means at our scale — building the practices, standards, and tooling from the ground up, with high visibility and impact from day one.
We run a lean SRE team with high expectations — and we back that up with an AI-first approach to tooling and operations.
If you're the kind of engineer who thrives on ownership, thinks in systems, builds their own tools, and wants to leave a real mark on a company's technical foundation — this role is for you.
What You'll Do
- Develop deep product knowledge across our platform — understanding its internals, failure modes, and operational behavior well enough to own incident resolution end-to-end.
- Define and track SLOs/SLIs across critical platform services, and use error budgets to drive engineering decisions.
- Own live site reliability — including on-call rotations, incident response, and post-mortems — with a focus on minimizing MTTR and preventing recurrence through systemic fixes, not just firefighting.
- Lead capacity planning, performance analysis, and proactive risk identification across our multi-cloud environments.
- Work hand-in-hand with engineering teams across the stack — infrastructure, application, and business layers — to embed reliability requirements everywhere.
- Lay the groundwork for a future SRE team — designing processes and tooling that scale beyond a single person.
What We're Looking For
- Proven experience as an SRE in a high-scale production environment, with hands-on ownership across the full stack — infrastructure and application layers; business-level reliability experience is a bonus.
- Strong AWS expertise is a must; GCP experience is a significant advantage.
- Strong coding skills and a software engineering mindset: you build your own tools rather than waiting for someone else to build them.
- Familiarity with infrastructure-as-code and container orchestration — enough to collaborate effectively with the teams that own them.
- Rust experience is a strong bonus given that the majority of our codebase is written in Rust.
- Experience building with AI — working with LLMs, designing agents, or integrating AI into operational tooling — is a strong bonus.
- A true owner — you take end-to-end accountability for what you build and operate, and you don't wait to be asked.
- A "can-do" partner to engineering and product — your default is to find a way, not to say no. You raise concerns early and constructively, but you're known for unblocking teams, not gatekeeping them.
Why Join Us
- Be the person who defines what reliability means at a cutting-edge AI and big data platform.
- Work alongside a world-class engineering team on genuinely hard problems across the full stack.
- High ownership, real impact, and room to grow — from day one.
- Based in Israel, with global reach and a multi-cloud environment at massive scale.
30,000-45,000 ₪