AI/ML Engineer

Beacon Talent
Location: Houston, TX, USA
Published: 6/14/2022
Engineering
Full Time

Job Description

AI/ML Engineer – Production AI Agents & RAG Systems

Location: [SF Bay Area/Remote] | Full-Time

About the Client

Our client is an innovative technology company focused on building intelligent, real-world AI systems. They specialize in deploying advanced machine learning solutions that move beyond the lab and into production, delivering value across a wide range of user-facing applications. Their work spans research, prototyping, and full-scale deployment of AI-powered agents capable of solving complex, dynamic problems at scale.

About the Role

This is a unique opportunity for an AI/ML Engineer to lead the development of robust AI agents and systems powered by retrieval-augmented generation (RAG). The role requires full-stack capabilities, blending research-driven experimentation with strong engineering execution to bring scalable, production-grade AI solutions to life.

You’ll work across multiple layers of the technology stack—from model orchestration and data pipelines to API development and UI integration—ensuring high performance, reliability, and user value.

Responsibilities

  • Develop RAG Architectures: Design and implement retrieval pipelines using vector search, hybrid search, embeddings, and reranking techniques to enhance LLM performance.

  • Deploy AI Agents: Build and scale agentic workflows that involve tool usage, multi-step reasoning, and persistent memory, ensuring they are reliable in production.

  • Contribute Across the Stack: Support the development of backend services, APIs, and orchestration layers that integrate AI capabilities into user-facing products.

  • Optimize System Performance: Improve latency, throughput, and reliability of deployed AI models through monitoring, tuning, and system design.

  • Build Tooling & Infrastructure: Create internal tools and frameworks to support evaluation, observability, and rapid iteration cycles for model and system updates.

  • Collaborate Cross-Functionally: Partner with product teams, designers, and researchers to translate user needs into performant AI-powered features.

Requirements

  • Proficiency in software engineering with experience in Python, TypeScript/Node.js, or similar languages.

  • Direct experience working with large language models (LLMs), retrieval-augmented generation (RAG), and vector databases such as FAISS, Pinecone, Weaviate, pgvector, or Milvus.

  • Familiarity with frameworks like LangChain, LlamaIndex, or other agent-based libraries.

  • Strong understanding of cloud platforms (AWS, GCP, or Azure), container orchestration (Kubernetes), and MLOps practices (CI/CD, monitoring, evaluation).

  • Ability to move fluidly between research and production—capable of testing cutting-edge ideas while delivering reliable systems.

  • Experience integrating external APIs and tools into agentic workflows.

Nice to Have:

  • Background in information retrieval, natural language processing, reinforcement learning, or distributed systems.

Benefits & Why Join

  • Work on truly production-grade AI systems that go beyond demos and deliver real-world impact.

  • Join a lean, fast-paced team where your contributions directly shape product direction and company success.

  • Enjoy competitive salary, equity, and the chance to grow within a forward-thinking, AI-first environment.
