Job Description
We are seeking a visionary Senior Generative AI Engineer to join our elite research division. As we accelerate towards our 2026 product launch, we need an expert who can architect the next generation of Large Language Models (LLMs) and multimodal systems. If you are passionate about pushing the boundaries of artificial intelligence and building the future of human-machine interaction, this is your opportunity to lead.
Why Join Us?
- Work on cutting-edge technology that will define the AI landscape of 2026 and beyond.
- Competitive compensation package including equity and remote-first flexibility.
- Collaborate with world-class researchers and engineers.
The Role
You will be responsible for designing, training, and deploying state-of-the-art generative models. You will work directly with our product team to translate theoretical research into scalable, production-ready AI solutions.
Responsibilities
- Model Architecture: Design and implement novel neural network architectures for large-scale generative tasks, specifically focusing on LLMs and diffusion models.
- Training & Fine-tuning: Lead the training pipelines for our proprietary models, optimizing for performance, accuracy, and inference speed.
- Research & Innovation: Stay ahead of the curve with the latest academic research, adapting breakthrough techniques for our 2026 roadmap.
- Deployment: Engineer robust, scalable APIs and deployment strategies for AI models in cloud environments (AWS/GCP).
- Collaboration: Partner with data scientists, product managers, and engineers to integrate AI capabilities into user-facing products.
- Mentorship: Guide junior engineers and researchers, fostering a culture of innovation and technical excellence.
Qualifications
- Education: Master's or PhD in Computer Science, Mathematics, or a related field with a focus on Machine Learning or Artificial Intelligence.
- Experience: 5+ years of professional experience in machine learning engineering or applied research.
- Technical Skills: Deep expertise in PyTorch or TensorFlow, and experience with Hugging Face Transformers.
- LLM Knowledge: Proven track record of working with LLMs (GPT, Llama, BERT), fine-tuning techniques, and prompt engineering.
- Programming: Proficiency in Python and C++, plus experience building distributed systems.
- Communication: Excellent verbal and written communication skills, with the ability to explain complex technical concepts to non-technical stakeholders.