Bonus (in case you actually read the posting!):

If you email us at contact@judgmentlabs.ai to let us know that you’ve taken a look at our open-source agent post-building SDK and given it a star, we’ll bump you up in our queue! https://github.com/JudgmentLabs/judgeval

Company Description

Judgment Labs is an infrastructure provider specializing in evaluation, monitoring, and reward modeling for long-trajectory agents. Leading agent teams use Judgment Labs for testing, monitoring, and optimization loops. Founded by LLM researchers from Stanford AI Lab, Berkeley AI Research, and Together AI, Judgment Labs is dedicated to unleashing self-improving agents.

Role Description

This is a full-time, on-site role for an AI Engineer at Judgment Labs in San Francisco, CA.

The work of AI Engineers at Judgment Labs spans creating core infrastructure for capturing agent telemetry, building custom evaluation pipelines, and applying state-of-the-art optimization methods (RL, SFT, DPO) to agents.

Qualifications

Strong background in Computer Science and Software Development
Knowledge of large language model evaluation techniques (algorithmic, LLM-judge, etc.)
Excellent problem-solving and analytical skills
Maniacal work ethic
Bachelor’s degree or higher in Computer Science, Artificial Intelligence, or a related field