At JFrog, we’re reinventing DevOps to help the world’s greatest companies innovate -- and we want you along for the ride. This is a special place with a unique combination of brilliance, spirit and just all-around great people. Here, if you’re willing to do more, your career can take off. And since software plays a central role in everyone’s lives, you’ll be part of an important mission. Thousands of customers, including 75% of the Fortune 100, trust JFrog to manage, accelerate, and secure their software delivery from code to production -- a concept we call “liquid software.” Wouldn't it be amazing if you could join us in our journey?
As a Senior SRE Engineer, you will play a pivotal role in ensuring the reliability, resilience, and scalability of our complex systems. You will lead the design, implementation, and execution of chaos engineering practices to proactively identify weaknesses, vulnerabilities, and potential failure points within our software and infrastructure. Your expertise will contribute to enhancing our system’s overall stability, robustness, and performance in production environments.
As a Senior SRE Engineer at JFrog you will…
- Lead the team toward technical solutions, guided by a strong understanding of the latest chaos engineering technologies and tools
- Develop and refine the chaos engineering strategy, methodologies, and best practices tailored to our specific systems and applications
- Design controlled chaos experiments that simulate real-world failures, ensuring that these experiments align with business goals and risk tolerance
- Collaborate with the engineering and operations teams to build and maintain chaos engineering tooling and automation frameworks, making it easier to carry out experiments and analyze results
- Lead the execution of chaos experiments in production and non-production environments, closely monitoring the impact and gathering data on system behavior during simulated failures
- Analyze the results of chaos experiments to identify weaknesses, bottlenecks, and failure patterns. Collaborate with cross-functional teams to address these issues and enhance system resilience
- Collaborate with software engineers, DevOps teams, and system architects to incorporate resilience and reliability measures directly into the development lifecycle
- Work with incident response teams to simulate and practice response procedures for various failure scenarios, ensuring preparedness for real-world incidents
- Define and implement relevant monitoring and metrics to measure the success of chaos engineering efforts and the overall health of the system
To be a Senior SRE Engineer at JFrog you need…
- 5+ years of relevant DevOps experience in large-scale production environments, chaos engineering, site reliability engineering, or a related field
- Proficiency in utilizing chaos engineering tools and frameworks (e.g., Chaos Monkey, Gremlin, LitmusChaos)
- Deep understanding of distributed systems, microservices architecture, and cloud infrastructure (AWS, Azure, Google Cloud, etc.)
- Strong programming/scripting skills (Python, Go, Java, etc.) for building and automating chaos experiments
- Solid grasp of networking, security, and application performance monitoring
- Excellent problem-solving skills, with the ability to analyze complex system behaviors and identify areas for improvement
- Strong communication and collaboration skills to work effectively across multidisciplinary teams