AI Evaluation Engineer (Cequeliños)

FirstIgnite

Arbo, Galicia, Spain · Full-time · June 12, 2026

Job Description

AI Evaluation Engineer

We’re hiring an AI Evaluation Engineer to own the quality bar for every LLM-powered feature we ship. You will design, build, and scale the infrastructure that tells us, with evidence, whether a prompt change, model swap, or agent refactor made things better or worse.

Responsibilities

  • Build evaluation infrastructure: Design and maintain eval suites using Promptfoo, LLM-as-judge methodologies, and custom harnesses for features such as our expert search system, natural language grants search, and AI SDR agents.
  • Define what good means: Partner with product and domain experts to translate vague customer outcomes (does this surface the right principal investigator?) into precise, measurable rubrics.
  • Own the feedback loop: Instrument production traffic, curate golden datasets from real customer interactions, and build pipelines that turn user behavior into regression tests.
  • Ship quickly under uncertainty...
