Still looking for a job through search engines? Time to upgrade!
Instead of going through thousands of listings on your own, Jobify analyzes your résumé and shows you only the jobs that truly fit you.
Over 80,000 jobs • 4,000 new every day
Free. No ads. No fine print.
About Us:
DigitalOwl offers a revolutionary solution for analyzing and summarizing medical records for insurance underwriters and claim adjusters.
Our technology is the first and only machine learning platform that interprets medical records, drastically reducing time and cost while minimizing the risk of human error.
We are always growing and looking for motivated new talent to join our journey and lead us to even greater success! At the core of our values is keeping our work environment fun, challenging, and friendly, with a strong focus on personal and professional growth.
Job Overview:
Join our team as a Site Reliability Engineer (SRE) and play a key role in keeping our systems reliable, scalable, and efficient!
We are looking for a proactive and detail-oriented professional who enjoys problem-solving and optimizing complex systems.
In this role, you’ll work closely with developers and operations teams to improve system performance, automate processes, and ensure smooth and secure platform operations.
We value teamwork, adaptability, and a supportive work environment where your contributions make a difference.
If you’re passionate about building resilient systems and driving operational excellence, we’d love to have you on board!
Key Responsibilities:
- Design and implement robust and scalable monitoring and observability solutions for DigitalOwl’s cloud services to ensure high availability and system health.
- Analyze, troubleshoot, and mitigate production incidents with a focus on root cause analysis, long-term remediation, and prevention strategies to minimize downtime and operational risk.
- Improve system reliability by developing automated detection, response, and resolution mechanisms, reducing manual intervention.
- Collaborate closely with development, security, and operations teams to integrate reliability best practices for high availability and fault tolerance.
- Optimize system performance, scalability, and capacity planning.
Requirements:
- Minimum 2 years of experience in an SRE role.
- Proficiency in monitoring, logging, and alerting solutions (must-have).
- Ability to quickly adapt to new technologies and proactively seek improvements to existing infrastructure and processes.
- Knowledge of Kubernetes and Docker.
- Experience with cloud platforms (AWS, GCP, or Azure).
- Strong scripting and automation skills.
- Solid understanding of networking principles, security best practices, and standards in cloud environments.