NextSilicon is reimagining high-performance computing (HPC & AI). Our accelerated compute solutions leverage intelligent adaptive algorithms to vastly accelerate supercomputers, driving them forward into a new generation. We have developed a novel software-defined hardware architecture that is achieving significant advancements in both the HPC and AI domains.
At NextSilicon, everything we do is guided by three core values:
- Professionalism: We strive for exceptional results through professionalism and unwavering dedication to quality and performance.
- Unity: Collaboration is key to success. That's why we foster a work environment where every employee can feel valued and heard.
- Impact: We're passionate about developing technologies that make a meaningful impact on industries, communities, and individuals worldwide.
The ideal candidate combines strong technical depth in AI/ML systems, hands-on experience with LLM workloads, and leadership capability to guide a high-performance engineering team.
Requirements:
- 5+ years of experience in AI/ML engineering, performance optimization, or ML systems.
- Deep understanding of LLM architectures, training & inference mechanics, and modern ML frameworks.
- Strong proficiency in the PyTorch ecosystem, with a specific focus on performance tuning via Triton, CUDA, or MLIR-based compiler frameworks.
- Hands-on expertise in profiling and optimizing kernels (GEMM, attention, softmax, token pipelines).
- Demonstrated experience running or tuning MLPerf or similar large-scale benchmarks.
- Strong Python and C++ development skills.
- Proven leadership experience: mentoring, guiding, or managing engineers.
Responsibilities:
- Lead and mentor a team of AI application and performance engineers.
- Run and optimize AI workloads (LLaMA, DeepSeek, etc.) and execute MLPerf benchmarks.
- Analyze end-to-end performance and identify HW/SW bottlenecks.
- Develop optimization strategies across models, kernels, frameworks, and runtime.
- Build profiling, debugging, and validation tools for large-scale AI workloads.
- Collaborate with hardware, compiler, and device software teams to improve performance.