A company that's revolutionizing AI-driven communications across phone calls, internet calls, and chat is looking for an experienced Machine Learning Engineer who can build, fine-tune, and optimize LLMs for client-specific use cases, integrating the latest AI frameworks and tools.
As they expand, they are focusing on custom Large Language Model (LLM) training tailored to client-specific domains and industry needs. They aim to push the boundaries of AI adaptability, performance, and usability for real-world applications.
What You’ll Do
Train and fine-tune Large Language Models (LLMs) based on client domains and industry-specific data.
Design, develop, and optimize custom AI workflows that integrate LLMs into production environments.
Utilize LangChain, CrewAI, and LangFlow to orchestrate complex LLM-based applications.
Implement and optimize retrieval-augmented generation (RAG) techniques for better contextual responses.
Work on data preparation pipelines, including tokenization, augmentation, and embedding optimizations.
Develop scalable and efficient inference pipelines for deploying LLMs in production.
Collaborate with software engineers to integrate AI models into real-world applications.
Optimize model performance, latency, and cost to ensure smooth deployment at scale.
Research and experiment with cutting-edge AI advancements in LLM fine-tuning and prompt engineering.
What You’ll Bring
3+ years of experience in Machine Learning & NLP, with a focus on LLM training and deployment.
Experience with LLM fine-tuning techniques such as LoRA, PEFT, and instruction tuning.
Proficiency in Python, PyTorch, TensorFlow, and Hugging Face Transformers.
Hands-on experience with LangChain, CrewAI, and LangFlow (bonus points for deep expertise).
Strong understanding of vector databases (Pinecone, Weaviate, FAISS) and embedding models.
Experience building production-ready AI products, ensuring scalability and reliability.
Deep knowledge of prompt engineering, tokenization strategies, and data augmentation for LLMs.
Familiarity with MLOps best practices, cloud-based AI deployments, and GPU optimizations.
A passion for AI-driven automation, custom model development, and pushing the boundaries of LLM capabilities.
Bonus Points
Experience deploying LLMs in low-latency, real-time environments.
Strong background in serverless AI architectures and containerized deployments.
Hands-on experience with Kubernetes, Docker, and cloud-based ML workflows (AWS/GCP/Azure).
Knowledge of speech-to-text (STT), text-to-speech (TTS), or conversational AI.