Senior Big Data Engineer

OriginAI

Ramat Gan

Hybrid
22,000-32,000 ₪ (AI-based estimate, not a salary provided by the employer)

About us:

OriginAI is a leading AI research lab dedicated to tackling challenges in the field of AI, including key domains such as computer vision, robotics, speech processing, and natural language processing (NLP).

Job Description:

We are looking for an experienced Data Engineer to join our engineering team. In this role, you will design and implement scalable data pipelines and infrastructure, enabling fast, reliable access to data for analytics and machine learning applications. You’ll work with a modern data stack to process large-scale datasets, optimize data storage, and support complex workflows across real-time and batch environments.

Key Responsibilities:

Implement scalable data processing pipelines using Apache Spark for both batch and streaming use cases.
Manage and optimize storage and retrieval of high-dimensional data using vector databases.
Develop flexible data models and maintain large-scale datasets using NoSQL databases.
Build and maintain core data services and tools using Java and Python, with a focus on performance and maintainability.
Deploy, scale, and manage containerized applications using Docker and Kubernetes (k8s).
Collaborate with data scientists, ML engineers, and platform teams to deliver high-quality data solutions.
Apply best practices in data governance, quality assurance, and operational monitoring to ensure data integrity and reliability.

Requirements:

4+ years of hands-on experience in big data engineering or a similar role.
Proficiency in Apache Spark, including batch/streaming and performance tuning.
Strong programming skills in Java and Python.
Proven ability to design and manage workflows using Apache Airflow.
Hands-on experience with containerization and orchestration tools: Docker and Kubernetes.
Solid understanding of distributed systems and scalable data architectures.
Familiarity with CI/CD processes and tools.

Nice-to-Have:

Proficiency with Elasticsearch, Vespa.ai, or any other vector database is a big advantage.
Background in NLP, computer vision, or other relevant ML fields.
Familiarity with the Hadoop ecosystem.