Senior AI QA Engineer
Europe
Remote
Role Description

You will work within an AI Incubator program, scouting, incubating, and validating client and internal AI concepts on a 3–5-year horizon. The focus is on building advanced AI prototypes, agentic workflows, and new AI-powered products and services while continuously exploring frontier AI capabilities.

Role Summary

The AI QA Engineer is the quality, safety, and reliability backbone of delivery. You ensure that agentic workflows, RAG systems, models, data pipelines, APIs, and UX layers behave reliably, safely, and consistently under real-world conditions.

This is not traditional QA. You design and own continuous evaluation strategies for non-deterministic AI systems and act as the final line of defense before solutions reach production environments.

Key Responsibilities

  • Own the end-to-end QA strategy across UI, backend, data, retrieval, and AI layers
  • Design and maintain LLM, RAG, and agent evaluation frameworks
  • Build automated test harnesses for Python services, APIs, agents, and pipelines
  • Integrate testing into CI/CD pipelines and prevent regressions
  • Validate data quality, embeddings, retrieval accuracy, and ranking performance
  • Identify hallucinations, reasoning failures, bias, and model drift
  • Design red-team and edge-case scenarios for safety and robustness
  • Define observability metrics for behavior, latency, cost, and failures
  • Run defect triage and deliver clear, actionable defect reporting
  • Collaborate closely with Tech Leads, engineers, and AI Ops

Required Skills & Experience

Technical

  • Strong Python scripting for automation and evaluation
  • Experience with LLM / RAG testing, ML evaluation, or model benchmarking
  • Familiarity with vector databases, retrieval systems, and agent workflows
  • Experience with CI/CD, DevOps tooling, and observability platforms
  • Ability to validate embeddings, precision/recall, and ranking metrics

QA & Risk

  • 5–6+ years in QA, SDET, testing, or ML evaluation roles
  • Experience testing non-deterministic or probabilistic systems (preferred)
  • Strong instincts for edge cases, failure modes, and adversarial risks

Mindset

  • Curious, skeptical, and systematic
  • High ownership and strong communication skills
  • Comfortable defining what “quality” means for AI systems

Apply for this position

Our team will review your application within the next 5 days.
