Labs

Tools and research from the Unsupervised team.

DeepWork

Structured workflows for long-running AI agents. Describe a process in plain English, and DeepWork turns it into a reusable skill with quality gates. Works as a plugin for agent harnesses like Claude Code, Codex, and Gemini CLI.

Learn more

DA-Bench

A public benchmark for evaluating how well AI tools handle real data analysis tasks. It tests dozens of prompts across 9 categories, with hallucination-focused scoring: tools that fabricate answers lose all points. Includes a leaderboard with scores and video recordings of every test.

Visit dabench.com