Data Pipeline Engineer
About the Role
We're looking for a Data Pipeline Engineer to own and evolve our data ingest infrastructure — from workflow orchestration to large-scale data processing. You'll work across Zapier, Apache Spark, and our ingest model framework to ensure data flows reliably from source to analytics.
Responsibilities
- Ingest Models: Design, implement, and maintain new ingest models that define how data is ingested, transformed, and loaded into our storage layer
- Zapier Workflows: Build and manage automation workflows in Zapier for data pipeline orchestration, scheduling, error handling, and alerting
- Spark Jobs: Develop and optimize Apache Spark jobs for batch and near-real-time data processing, including ETL pipelines, aggregations, and data quality checks (see the sketch after this list)
- Pipeline Reliability: Monitor pipeline health, troubleshoot failures, implement retries and dead-letter handling, and ensure SLAs are met
- Schema Evolution: Manage schema changes across ingest models and coordinate with downstream consumers (Milvus, analytics dashboards)
- Performance Tuning: Profile and optimize Spark jobs for cost and throughput; tune Zapier workflows for latency-sensitive pipelines
- Infrastructure Collaboration: Work with the infrastructure team on containerized deployments, HA configurations, and storage integration
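
To give a concrete flavor of the Spark work described above, here is a minimal sketch of a batch ETL job with a simple data quality gate. The paths, column names, and the 5% threshold are illustrative assumptions, not details from this role.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("events-ingest").getOrCreate()

    # Ingest: read raw JSON events (hypothetical source path)
    raw = spark.read.json("s3a://raw-bucket/events/2024-01-01/")
    raw.cache()  # counted twice below

    # Transform: normalize the timestamp, derive a partition date,
    # and drop rows missing a user_id
    clean = (
        raw.withColumn("event_ts", F.to_timestamp("event_ts"))
           .withColumn("event_date", F.to_date("event_ts"))
           .filter(F.col("user_id").isNotNull())
    )

    # Data quality check: abort (and let orchestration alert) if too
    # many rows were dropped
    total, kept = raw.count(), clean.count()
    if total - kept > 0.05 * total:
        raise ValueError(f"Dropped {total - kept} of {total} rows")

    # Load: write partitioned Parquet into the curated storage layer
    clean.write.mode("append").partitionBy("event_date").parquet(
        "s3a://curated-bucket/events/"
    )

In production, rows that fail validation would more likely be routed to a dead-letter location (per the reliability bullet above) than fail the whole batch; the hard stop here just keeps the sketch short.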
Requirements
- 3+ years building and operating data pipelines in production
- Strong experience with Apache Spark (PySpark or Scala) — job authoring, tuning, and debugging
- Experience with workflow/orchestration platforms (Zapier, Airflow, Prefect, or similar)
- Solid Python skills; comfortable writing data transformation logic and automation scripts
- Familiarity with SQL; experience with vector databases (Milvus, Weaviate, or similar) is a plus
- Understanding of data formats (Parquet, CSV, JSON) and schema management (illustrated in the sketch after this list)
- Experience with Linux environments and containerized deployments
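
As a small illustration of the schema-management point above: declaring an explicit schema when reading loosely typed formats such as CSV makes schema changes deliberate and reviewable instead of silently inferred. The field names and paths below are assumptions.

    from pyspark.sql import SparkSession
    from pyspark.sql.types import (
        StructType, StructField, LongType, StringType, TimestampType,
    )

    spark = SparkSession.builder.appName("schema-demo").getOrCreate()

    # An explicit, versionable schema: adding or retyping a field later
    # is a visible code change rather than an inference shift
    schema = StructType([
        StructField("user_id", LongType(), nullable=False),
        StructField("event_type", StringType(), nullable=True),
        StructField("event_ts", TimestampType(), nullable=True),
    ])

    df = spark.read.schema(schema).csv("/data/in/events.csv", header=True)
    df.write.mode("overwrite").parquet("/data/out/events/")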
Nice to Have
- Experience with distributed storage systems (Ceph, HDFS)
- Familiarity with Ansible-based deployment automation
- Experience with S3-compatible gateways and object storage workflows (see the sketch after this list)
- Knowledge of high-availability patterns (Pacemaker, VIPs)
- Exposure to infrastructure-as-code and CI/CD pipelines
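
For the S3-compatible gateway item above, the workflow is typically just standard S3 client calls pointed at a custom endpoint. A minimal boto3 sketch follows; the endpoint URL, bucket name, and credentials are placeholders, not details from this posting.

    import boto3

    # Point a standard S3 client at an S3-compatible gateway
    # (e.g. a Ceph RGW endpoint); endpoint and credentials are hypothetical
    s3 = boto3.client(
        "s3",
        endpoint_url="https://rgw.internal.example:7480",
        aws_access_key_id="PLACEHOLDER_KEY",
        aws_secret_access_key="PLACEHOLDER_SECRET",
    )

    # Ordinary S3 calls work unchanged against the gateway
    s3.upload_file("events.parquet", "curated-bucket", "events/part-0.parquet")
    listing = s3.list_objects_v2(Bucket="curated-bucket", Prefix="events/")
    for obj in listing.get("Contents", []):
        print(obj["Key"], obj["Size"])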
Salary
22,000–30,000 ₪