Senior LLM Agents Architect


NVIDIA AI

Ra'anana
Full-time, hybrid
₪35,000-55,000 (AI-based estimate, not a salary figure provided by the employer)

Job Requisition ID: JR2005216
Job Category: Engineering
Time Type: Full time

We don't just build the hardware and software that power the AI revolution; we are building the AI that designs the next generation of both. Our team sits at the intersection of inference software and GPU architecture, creating autonomous LLM-driven systems that reason about hardware, write high-performance CUDA, and automate the complex loops of architectural simulation, analysis, and optimization.

We are looking for a senior LLM Agents Architect to work hands-on with hardware architects, verification engineers, GPU performance experts, and software developers to build end-to-end agent flows that drive significant improvements in kernel optimization, architectural exploration, and developer efficiency.

What You'll Be Doing

Design and build agentic AI systems that generate, analyze, and optimize GPU compute kernels — targeting speed-of-light performance on NVIDIA hardware.
Collaborate with GPU architects and performance engineers to encode domain expertise — memory hierarchy trade-offs, occupancy tuning, instruction-level reasoning — into agent workflows that rival hand-tuned optimization.
Build automated performance forensics agents capable of ingesting large-scale simulation traces and Nsight profiler data to identify bottlenecks and propose architectural or software mitigations.
Partner with HW architects to develop agentic flows for GPU architectural studies — enabling rapid what-if analysis across micro-architecture configurations such as cache sizing, memory controller design, and compute unit scaling.
Explore agentic approaches to HW/SW co-design challenges, including replacing or augmenting graph-compiler functionality (e.g., TorchInductor) with LLM-driven optimization and code-generation pipelines.
Rapidly prototype and thoughtfully productize: integrate with internal services, make full use of GPU capabilities, remove bottlenecks, and deliver solutions that fit the problem.
Set up an evaluation backbone using offline golden sets and online telemetry to support confident iteration, cost control, and safe improvements.
Mentor teams and raise their skills in agent orchestration, prompting, RAG, and observability, and write documentation and playbooks for NVIDIA's teams.

What We Need To See

7+ years in applied ML/AI or large-scale systems, with 2+ years crafting agentic or LLM-powered applications in production environments.
B.Sc. in Computer Science or Electrical Engineering.
Solid grounding in computer architecture: memory hierarchies, parallelism models, pipelining, and cache behavior. Specific familiarity with NVIDIA GPU architecture — streaming multiprocessors, warp scheduling, shared/global memory model, and occupancy reasoning — is essential.
Hands-on CUDA programming experience: writing, profiling, and optimizing GPU kernels — not just calling into CUDA-accelerated libraries. Comfortable with tools such as Nsight Compute, Nsight Systems, or equivalent profiling workflows.
Proven ownership of at least one end-to-end agentic system or LLM application: requirements, architecture, implementation, evaluation, and incremental hardening in production — not just experience with off-the-shelf frameworks.
Strong software engineering skills in Python and one systems language (C++ preferred).
Proficient in tool use, RAG pipelines, and model adaptation techniques for building agentic systems.
Demonstrated ability to collaborate with HW/SW domain experts and translate their heuristics into deterministic tools, constraints, and evaluation metrics.
Excellence in communication and facilitation: aligning diverse collaborators, documenting decisions/assumptions, and influencing without authority.
Track record of building observability for AI systems: dataset/version management, offline test suites, online telemetry, guardrails/safety checks, and rollback plans.

Ways To Stand Out From The Crowd

Familiarity with the PyTorch compilation and lowering stack (torch.compile, TorchDynamo, TorchInductor, Triton, down to PTX), and with GPU graph compilers, kernel fusion strategies, or auto-tuning frameworks.
Background in performance engineering for HPC or GPU-accelerated workloads, including experience with performance modeling or hardware simulators.
Familiarity with distributed processing, multi-GPU workloads, and networking (e.g., NVLink, InfiniBand).
Familiarity with frontier agentic coding tools (e.g., Claude Code, Codex, Cursor) — understanding their underlying architecture: tool orchestration, context management, and autonomous task execution patterns.
Hands-on experience building a domain-specific coding agent — whether on top of frontier agentic harnesses (e.g., Claude Code, Codex SDK) or lower-level agent frameworks (e.g., LangChain/LangGraph deep agents, CrewAI). Comfortable with the design choices that make a coding agent useful in practice: task scoping, tool and context curation, evaluation, and failure recovery.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer you and your family at www.nvidiabenefits.com.

Questions and Answers for the Senior LLM Agents Architect Position

What is the central role of the Senior LLM Agents Architect at NVIDIA AI in developing LLM-driven systems?

The central role of the Senior LLM Agents Architect at NVIDIA AI is to work hands-on with hardware architects, verification engineers, GPU performance experts, and software developers to build end-to-end agent flows that drive significant improvements in kernel optimization, architectural exploration, and developer efficiency. The goal is to create autonomous LLM-driven systems that can reason about hardware, write high-performance CUDA code, and automate complex loops of architectural simulation, analysis, and optimization.
