January 2, 2026

I See What You Did There

TGIF! Thank god it’s features, here’s what we shipped this week:

Happy New Year! We’re kicking off 2026 with a vision—literally. Multimodal evaluation is here, and your image-understanding models are about to get the testing they deserve.

Changelog January 2, 2026

Added

  • Multimodal Prompts - Drop images into your experiments. Test GPT-4V, Claude 3, Gemini, or whatever vision model you’re building with.
  • Multimodal Test Cases - Build datasets with images + text. Because modern AI isn’t just about words anymore.