Comparing Cursor's Prompts Across Models
How Cursor’s system prompts & tools change across GPT-5 and Claude Sonnet 4.5.
I was curious whether Cursor loads different prompts and tools for different models, so I reverse-engineered it for GPT-5 and Claude Sonnet 4.5. "Reverse-engineered" being a fancy phrase for simply asking for them (GPT-5 required some negotiation, but finally relented when I shared my intention of debugging the system prompt).
The prompts reveal a lot about how a frontier product manages varying behaviors across underlying models. There's not much value in the prompts themselves, but the process by which these prompts were arrived at is valuable! I'm curious to dig into that next.
Overall, GPT-5 is given a prescriptive prompt with many processes spelled out, while Claude gets a much simpler one. Here's a stark example: the instructions related to the todo management system.
Claude (71 tokens):
<task_management>
You have access to the todo_write tool to help you manage and plan tasks. Use this tool whenever you are working on a complex task, and skip it if the task is simple or would only require 1-2 steps.
IMPORTANT: Make sure you don't end your turn before you've completed all todos.
</task_management>
GPT-5 (668 tokens):
<tool_calling>
...
9. Whenever you complete tasks, call todo_write to update the todo list before reporting progress.
...
11. Gate before new edits: Before starting any new file or code edit, reconcile the TODO list via todo_write (merge=true): mark newly completed tasks as completed and set the next task to in_progress.
12. Cadence after steps: After each successful step (e.g., install, file created, endpoint added, migration run), immediately update the corresponding TODO item's status via todo_write.
</tool_calling>
<todo_spec>
Purpose: Use the todo_write tool to track and manage tasks.
Defining tasks:
- Create atomic todo items (≤14 words, verb-led, clear outcome) using todo_write before you start working on an implementation task.
- Todo items should be high-level, meaningful, nontrivial tasks that would take a user at least 5 minutes to perform. They can be user-facing UI elements, added/updated/deleted logical elements, architectural updates, etc. Changes across multiple files can be contained in one task.
- Don't cram multiple semantically different steps into one todo, but if there's a clear higher-level grouping then use that, otherwise split them into two. Prefer fewer, larger todo items.
- Todo items should NOT include operational actions done in service of higher-level tasks.
- If the user asks you to plan but not implement, don't create a todo list until it's actually time to implement.
- If the user asks you to implement, do not output a separate text-based High-Level Plan. Just build and display the todo list.
Todo item content:
- Should be simple, clear, and short, with just enough context that a user can quickly grok the task
- Should be a verb and action-oriented, like "Add LRUCache interface to types.ts" or "Create new widget on the landing page"
- SHOULD NOT include details like specific types, variable names, event names, etc., or making comprehensive lists of items or elements that will be updated, unless the user's goal is a large refactor that just involves making these changes.
</todo_spec>
<completion_spec>
...
1. Confirm that all tasks are checked off in the todo list (todo_write with merge=true).
2. Reconcile and close the todo list.
...
</completion_spec>
<flow>
...
2. For medium-to-large tasks, create a structured plan directly in the todo list (via todo_write). For simpler tasks or read-only tasks, you may skip the todo list entirely and execute directly.
3. Before logical groups of tool calls, update any relevant todo items, then write a brief status update per <status_update_spec>.
4. When all tasks for the goal are done, reconcile and close the todo list, and give a brief summary per <summary_spec>.
- Enforce: status_update at kickoff, before/after each tool batch, after each todo update, before edits/build/tests, after completion, and before yielding.
</flow>
<non_compliance>
If you fail to call todo_write to check off tasks before claiming them done, self-correct in the next turn immediately.
If you used tools without a STATUS UPDATE, or failed to update todos correctly, self-correct next turn before proceeding.
...
</non_compliance>
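To make the `<todo_spec>` rules concrete, here's a rough Python sketch of a `todo_write` call that follows them. The payload shape (`merge`, `todos`, the status values) is guessed from the prompt text, not Cursor's actual schema:

```python
# Hypothetical stand-in for Cursor's todo_write tool; the real schema is unknown.
def todo_write(merge, todos):
    """Validate a todo payload against the rules in <todo_spec> and echo it."""
    for todo in todos:
        # <todo_spec>: atomic, verb-led items of at most 14 words.
        assert len(todo["content"].split()) <= 14, "todo items must be <= 14 words"
        assert todo["status"] in {"pending", "in_progress", "completed"}
    return {"merge": merge, "todos": todos}

# "Gate before new edits": reconcile with merge=True, marking finished work
# completed and setting the next task to in_progress.
result = todo_write(
    merge=True,
    todos=[
        {"id": "1", "content": "Add LRUCache interface to types.ts", "status": "completed"},
        {"id": "2", "content": "Create new widget on the landing page", "status": "in_progress"},
    ],
)
```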
Tools
- File editing: the models are given different tools, presumably because each has been trained on those specific tools:
  - GPT-5 is given `apply_patch` and `edit_file` tools. The preference is to use `apply_patch` first, but if it fails 3 times, fall back to `edit_file`. The outputs from `edit_file` are processed by a less intelligent model to quickly apply the edit.
  - Claude is given `search_replace` and `write` tools. `search_replace` replaces a string within a file, and `write` overwrites the entire file.
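The patch-then-fallback behaviour described for GPT-5 amounts to a simple retry loop. A sketch under assumed semantics; the real tools' signatures are unknown, so they're injected as stubs here:

```python
class PatchError(Exception):
    """Stand-in for an apply_patch failure (e.g. patch context mismatch)."""

def edit_with_fallback(apply_patch, edit_file, change, max_attempts=3):
    """Prefer apply_patch; after max_attempts failures, fall back to edit_file."""
    for _ in range(max_attempts):
        try:
            return apply_patch(change)
        except PatchError:
            continue
    # In Cursor, edit_file output is applied by a less intelligent model.
    return edit_file(change)

# Demo: a patch that never applies is retried 3 times, then routed to edit_file.
attempts = []
def always_fails(change):
    attempts.append(change)
    raise PatchError("context mismatch")

result = edit_with_fallback(always_fails, lambda change: f"edit_file applied {change}", "fix.diff")
```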
- Parallel tool calling:
  - GPT-5 has access to a special `multi_tool_use.parallel` tool to make multiple tool calls in parallel. Claude has no such tool; presumably it's built into the `antml` schema.
  - Both prompts state a preference for parallel tool calls. Interestingly, GPT-5's prompt contains an explicit override: "Do this even if the prompt suggests using the tools sequentially."
- Claude uses a special `antml:function_call` format.
- GPT-5 is provided with many good & bad usage examples in the tool descriptions.
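The exact contract of `multi_tool_use.parallel` isn't public, but as a rough analogy, parallel tool calling means fanning independent calls out together instead of awaiting each one in turn. A generic sketch, not Cursor's implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(tool_calls):
    """Dispatch independent (fn, args) tool calls concurrently,
    returning results in the original call order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, *args) for fn, args in tool_calls]
        return [f.result() for f in futures]

# Two independent "reads" issued together rather than sequentially.
results = run_parallel([(len, ("abc",)), (sum, ([1, 2, 3],))])
```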
General instructions
There are many differences in specific parts of the instructions provided to each model. Some of the interesting ones:
- GPT-5 is given explicit recipes to follow. E.g. how to understand codebase context.
- GPT-5 is explicitly instructed to prefer the semantic `codebase_search` tool over `grep`.
- GPT-5 is provided with more instructions about Markdown formatting.
- Claude is provided with more instructions about commands that affect git.
Metadata
The GPT-5 prompt contains some configuration parameters:
- Oververbosity level: 1-10
- Juice: 64
- Channels: each message must specify a channel (analysis, commentary, final).
Claude's prompt defines a thinking budget of 1,000,000.