Mindrift

Freelance Agent Evaluation Engineer

📍 Location
Buenos Aires, Espírito Santo
⏰ Job Type
Full-time
📅 Posted
May 28, 2026

Job Description

Please submit your CV in English and indicate your level of English proficiency.

What This Opportunity Involves

  • Build virtual companies from a high-level plan: the codebase, infrastructure, and context (conversations, documentation, tickets) that together form a realistic environment with a development history
  • Assemble and calibrate tasks from intermediate states of the virtual company: craft the prompt, define evaluation criteria, and ensure the task is solvable and the evaluation is fair
  • Design tasks set in isolated environments that emulate a developer's workstation: a Linux machine with development tools (terminal, CLI), MCP servers (repository, task tracker, messenger, documentation, etc.), and a real web application codebase
  • Write tests that accept all correct solutions and reject incorrect ones - neither too strict (breaking on valid approaches) nor too lenient (passing bad ones)
  • Iterate with an AI agent on tests - ...
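
As a rough illustration of the test-writing responsibility above, the sketch below checks only observable behavior, so different valid implementations pass while an incorrect one fails. The task (deduplicate a list while preserving order), the function names, and the test cases are all hypothetical and are not taken from Mindrift's actual environments.

```python
from typing import Callable, List

# Hypothetical task: remove duplicates from a list, keeping first-seen order.
Solution = Callable[[List[int]], List[int]]


def solution_a(items: List[int]) -> List[int]:
    # Valid approach 1: dict keys preserve insertion order.
    return list(dict.fromkeys(items))


def solution_b(items: List[int]) -> List[int]:
    # Valid approach 2: explicit loop with a "seen" set.
    seen = set()
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def bad_solution(items: List[int]) -> List[int]:
    # Incorrect: removes duplicates but loses the original order.
    return sorted(set(items))


# Behavior-only cases: (input, expected output). Illustrative values.
CASES = [
    ([3, 1, 3, 2, 1], [3, 1, 2]),
    ([], []),
    ([5, 5, 5], [5]),
]


def passes(solution: Solution) -> bool:
    """Return True if the solution produces the expected output on every case."""
    for given, expected in CASES:
        try:
            if solution(list(given)) != expected:
                return False
        except Exception:
            return False
    return True


if __name__ == "__main__":
    for fn in (solution_a, solution_b, bad_solution):
        print(fn.__name__, passes(fn))
```

A test that inspected how the result was computed (for example, requiring a specific helper) would be too strict, and one that only compared the set of returned values would be too lenient because it would also accept `bad_solution`.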
