The AI definition of done
As AI assistants take on longer tasks, the best prompt is not just a request. It is a goal plus clear success criteria.
A pattern is showing up across current AI product news: assistants are being designed for longer, more goal-directed work. OpenAI's May release notes describe Codex goal mode, richer context through appshots, and browser improvements for keeping tasks moving. Anthropic's KPMG and PwC announcements show AI being embedded into large professional workflows, where a vaguely specified task gives the assistant too little to work from. Google is also pushing Gemini toward proactive, ongoing help.
The practical skill is to define "done" before the AI starts. A clear definition of done prevents the assistant from polishing the wrong output, wandering into extra work, or stopping too early with a confident but incomplete answer.
The skill
An AI definition of done is a short success contract for any task that takes more than a few minutes. Use it for research, writing, reporting, analysis, code fixes, spreadsheet cleanup, slide drafts, or workflow redesign.
AI definition of done
Goal:
{what should be true when this task is complete}
Success criteria:
- {observable check}
- {observable check}
- {observable check}
Non-goals:
- {what not to do}
- {what not to change}
Required evidence:
- {sources, files, checks, citations, tests, screenshots, examples}
Review output:
- What changed or was produced
- Which success criteria were met
- Which checks were run
- What remains uncertain
Stop condition:
{when the AI should pause and ask instead of guessing}
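If you reuse the template often, it can help to keep it as structured data instead of retyping it. The following is a minimal Python sketch, not a standard format: the `DefinitionOfDone` class and its `render` method are hypothetical names introduced here for illustration, the fields simply mirror the template headings above, and the sample values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class DefinitionOfDone:
    """Hypothetical container whose fields mirror the template headings."""
    goal: str
    success_criteria: list[str]
    non_goals: list[str] = field(default_factory=list)
    required_evidence: list[str] = field(default_factory=list)
    review_output: list[str] = field(default_factory=list)
    stop_condition: str = ""

    def render(self) -> str:
        """Return the contract as plain text in the template's layout."""
        def bullets(items: list[str]) -> list[str]:
            return [f"- {item}" for item in items]

        lines = ["Goal:", self.goal, "Success criteria:"]
        lines += bullets(self.success_criteria)
        lines += ["Non-goals:"] + bullets(self.non_goals)
        lines += ["Required evidence:"] + bullets(self.required_evidence)
        lines += ["Review output:"] + bullets(self.review_output)
        lines += ["Stop condition:", self.stop_condition]
        return "\n".join(lines)


# Illustrative values only; replace them with your own task.
dod = DefinitionOfDone(
    goal="The customer spreadsheet has exactly one row per customer.",
    success_criteria=[
        "No duplicate customer IDs remain",
        "Row count before and after is reported",
    ],
    non_goals=["Do not change column names"],
    required_evidence=["Original export file"],
    review_output=["Which success criteria were met"],
    stop_condition="If two rows conflict, ask which one to keep.",
)
print(dod.render())
```

Keeping the goal and success criteria as required fields is a deliberate choice in this sketch: a contract without them cannot be constructed, while the other sections can start empty and be filled in during review.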
A worked example: clean up a monthly report
A weak task is: "Improve this monthly report." That can mean better writing, new charts, fewer pages, deeper analysis, or a total redesign. The assistant may do something impressive and still miss the actual need.
The definition-of-done version is tighter:
Goal:
Make the monthly operations report ready for the leadership meeting.
Success criteria:
- Executive summary is under 120 words
- Top 3 risks are visible on the first page
- Every metric has a period label and source
- No new claims are added without evidence
- Follow-up actions have owners or are marked "owner needed"
Non-goals:
- Do not redesign the template
- Do not add new metrics
- Do not change the underlying data
Required evidence:
- Current report draft
- Metrics export
- Risk log
Review output:
- Summary of edits
- List of unresolved gaps
- Any metrics that need human confirmation
Stop condition:
If a metric source is missing, flag it instead of inventing one.
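Several of the success criteria above are observable enough to check mechanically. Here is a minimal sketch, assuming the report has already been parsed into simple Python values. The `check_report` function, the dictionary keys (`period`, `source`, `owner`), and the sample data are all hypothetical and not tied to any real reporting tool.

```python
def check_report(
    summary: str, metrics: list[dict], actions: list[dict]
) -> dict[str, bool]:
    """Evaluate three of the example's success criteria; True means met."""
    return {
        "Executive summary is under 120 words": len(summary.split()) < 120,
        "Every metric has a period label and source": all(
            m.get("period") and m.get("source") for m in metrics
        ),
        'Follow-up actions have owners or are marked "owner needed"': all(
            a.get("owner") for a in actions
        ),
    }


# Hypothetical sample data; the second metric is deliberately missing a source.
results = check_report(
    summary="Throughput held steady while two supplier risks grew.",
    metrics=[
        {"name": "On-time delivery", "period": "May", "source": "metrics export"},
        {"name": "Backlog", "period": "May", "source": ""},
    ],
    actions=[{"task": "Confirm backlog figure", "owner": "owner needed"}],
)
for criterion, passed in results.items():
    print("MET" if passed else "NOT MET", "-", criterion)
```

With this sample data the metric criterion is reported as not met, which is the situation the stop condition describes: the gap is flagged for a human instead of being filled in.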
The prompt
Paste this before any AI task where quality matters:
Before starting, convert my request into a definition of done.
Request:
{task}
Create:
1. Goal
2. Success criteria
3. Non-goals
4. Required evidence
5. Review output
6. Stop condition
Then ask me to approve or edit it before doing the work.
When the work is complete, report back against each success criterion instead of only summarizing what you did.
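If you send this prompt from a script instead of pasting it by hand, the wrapper can be a plain string template. This is a minimal sketch: `build_prompt` is a hypothetical helper name, and how the resulting text reaches your assistant is left out because it depends on the tool you use.

```python
PROMPT_TEMPLATE = """Before starting, convert my request into a definition of done.
Request:
{task}
Create:
1. Goal
2. Success criteria
3. Non-goals
4. Required evidence
5. Review output
6. Stop condition
Then ask me to approve or edit it before doing the work.
When the work is complete, report back against each success criterion instead of only summarizing what you did."""


def build_prompt(task: str) -> str:
    """Insert the task into the prompt text, rejecting empty requests."""
    task = task.strip()
    if not task:
        raise ValueError("task must not be empty")
    return PROMPT_TEMPLATE.format(task=task)


print(build_prompt("Improve this monthly report."))
```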
The review checklist
- Goal: Does the goal describe the finished state, not just the activity?
- Criteria: Can you objectively tell whether each criterion was met?
- Non-goals: Did you protect unrelated files, sections, data, or decisions?
- Evidence: Did you name the sources or checks that prove the work?
- Stop condition: Does the AI know when to pause instead of guessing?
Why it works
AI tools are becoming better at taking initiative. That is useful only if the initiative is pointed at the right target. A definition of done gives the model a finish line and gives you a review checklist.
The habit also makes collaboration calmer. You can let the AI work through the task without micromanaging every sentence, because the success criteria were agreed before the work started.