A Year of Evals
Was 2025 the year of Agents? That might be 2026. For me, 2025 was the year of Evals, or more accurately, the year in Evals. Evals are the second-best tool for measuring feature quality. The best is to put the feature in the hands of actual users, but that’s riskier, so evals are the best safe way to measure it. This was the year I started working on a new project, a new GenAI project, and my role was to take charge of our existing evals process and do everything that needed to be done so that our feature’s quality improved....