Senior Gen-AI Algorithm Engineer – Edge AI & In-Memory Computing
Company Overview:
GSI Technology is a leader in associative in-memory computing (AIMC), developing advanced AI hardware and software platforms for edge Gen-AI applications in robotics, drones, defense, and real-time systems. Our latest chips (Plato™, Gemini-2™) deliver ultra-low power and high-throughput performance for LLM, vision, and multimodal workloads.
We are expanding our Gen-AI team and seeking a talented Algorithm Engineer with deep knowledge of large models and edge implementation constraints.
Position Summary:
We are looking for a highly skilled algorithm expert who can bridge the gap between advanced Gen-AI models (LLM, LVM, multimodal) and efficient hardware deployment on GSI’s proprietary in-memory computing platform. The candidate will be responsible for developing reference implementations, optimizing quantization and dataflow, ensuring model accuracy, and preparing algorithms for high-performance, low-power edge inference.
Key Responsibilities:
- Design and implement dataflow, memory access patterns, and quantization strategies for Gen-AI models (LLM, LVM, diffusion, MoE) targeting GSI hardware.
- Develop accurate and efficient reference implementations in C/C++ for hardware and firmware teams to use in on-chip deployment.
- Analyze and optimize model quantization (INT8, mixed-precision, ternary) to balance accuracy, speed, and power consumption.
- Evaluate model performance and accuracy using standard metrics (e.g., perplexity, BLEU, mAP), and refine algorithms to meet strict power and bandwidth constraints.
- Collaborate closely with AI researchers, compiler engineers, and device firmware teams to align algorithmic and hardware design.
- Document algorithm flows, assumptions, and accuracy trade-offs clearly and precisely.
Required Qualifications:
- MSc or PhD in Computer Science, Electrical Engineering, or a related field.
- 5+ years of experience in AI algorithm development, with a strong focus on inference optimization for edge or embedded devices.
- Deep understanding of transformer-based models, including LLMs (e.g., LLaMA, Gemma, Mixtral) and vision-language models (e.g., SigLIP, Flamingo).
- Experience in model quantization and compression: PTQ, QAT, ternary or mixed-precision flows.
- Strong C/C++ coding skills with a focus on clean, modular, and hardware-friendly code.
- Familiarity with PyTorch and ONNX or equivalent frameworks for model analysis and conversion.
- Solid background in memory-efficient computing and embedded system constraints (latency, power, DRAM/IO bandwidth).
- Excellent communication skills and proven ability to work across teams (research, hardware, and software).
Preferred Qualifications:
- Experience with in-memory computing or neuromorphic/edge AI accelerators.
- Knowledge of embedded platforms (ARM, DSP, RISC-V, FPGA, etc.) and MIPI/RF data interfaces.
- Contributions to AI compiler stacks or quantization libraries (e.g., TVM, Glow, MLIR).
- Familiarity with model deployment on resource-constrained systems such as drones, UGVs, or handheld ISR tools.
What We Offer:
- Opportunity to work on cutting-edge AI hardware and contribute to next-generation Gen-AI solutions at the edge.
- Dynamic, fast-paced environment with a team of experts across AI, embedded systems, and silicon design.
- Flexible hybrid work policy with strong support for innovation and technical ownership.
Our Privacy Policy: Your resume and information will be kept confidential.