RELOCATION: Senior Staff AI Researcher (KV cache optimization)
Sydney, NSW, Australia, with the option to work from Israel as well.
About the job
We’re hiring an AI Researcher to drive advanced research in neural network and LLM optimization, identify the most promising opportunities, and translate them into production-ready innovations. You will evaluate and apply emerging techniques (pruning, quantization, inference acceleration, memory optimization, scheduling, runtime tuning) and determine what will materially improve LLM inference performance, model efficiency, and cost, then partner with engineering to ship the changes that make a measurable difference.
Responsibilities:
- Inference & Compute Optimization: Design and implement highly optimized inference pipelines and computational kernels to accelerate LLM and neural network workloads, leveraging low-level techniques such as SIMD vectorization, cache-aware memory access patterns, and hardware-specific tuning.
- Neural Network Compression & Model Optimization: Research and implement pruning, quantization, and other compression techniques to reduce model size and accelerate inference while preserving accuracy. Apply both in-training and post-training optimization methods across LLM and vision model workloads.
- Profiling & Observability: Build and utilize advanced profiling tools to identify bottlenecks across the inference and training stack—from memory bandwidth and cache utilization to CPU-side data preprocessing stalls and end-to-end pipeline throughput.
- Evaluation & Benchmarking: Design and maintain rigorous evaluation and benchmarking frameworks for systematic model comparison across optimization configurations. Develop automated pipelines (e.g., LLM-as-a-judge) to measure the impact of optimization techniques on model quality and performance.
- Mentorship: Act as a technical lead for engineers and researchers, fostering a culture of high-performance code, rigorous benchmarking, and research-to-production excellence. Drive team growth, technical interviews, and cross-functional collaboration.
Required Qualifications:
- Deep Systems Expertise: 8+ years of experience in high-performance computing, AI systems, or low-level software optimization. Deep familiarity with performance-critical development including CPU/GPU architecture, memory hierarchies, SIMD/vectorization, and profiling-driven tuning.
- LLM & NN Optimization Track Record: Proven experience optimizing neural networks and LLMs through techniques such as pruning, quantization, and inference acceleration, with a demonstrated path from research to production deployment.
- Communication: Ability to translate complex systems-level constraints and optimization trade-offs into actionable research directions for modeling and engineering teams.
- Experience building evaluation frameworks, ML observability, or developer tools that help researchers understand and compare model performance across optimization configurations.
- A history of working on neural network compression, inference acceleration, or applied AI research problems that required bridging algorithmic research with high-performance implementation.
- Patent authorship or published research in AI/ML optimization.
- Experience with C/C++ inference engines, x86 intrinsics, or similar low-level performance work is a strong plus.
About the Company
An Australian startup with an office in Israel is developing a compute-optimization engine. Backed by leading U.S. venture capital, we’re looking for exceptional talent to join us as true partners on our journey.
Questions and answers for the position RELOCATION: Senior Staff AI Researcher (KV cache optimization)
The role involves driving advanced research in neural network and large language model (LLM) optimization, identifying promising opportunities, and translating them into production-ready innovations. This includes inference and compute optimization, neural network compression, profiling, evaluation and benchmarking, as well as mentoring engineers and researchers.