Most people are still using coding agents like chat windows.
Ask. Wait. Interrupt. Correct. Re-explain. Repeat.
This is the wrong frame.
The better frame is not conversation. It is orchestration.
A tool like Codex should not be treated as a clever autocomplete box. It should be treated as a semi-autonomous worker sitting inside a queue. The human does not hover over every motion. The human defines the job, constrains the environment, waits for the artifact, then reviews the result.
That sounds small. It is not small.
It changes the role of the user from typist to dispatcher.
Steering interrupts. Queueing preserves motion.
The usual instinct is to steer constantly.
You see the agent start doing something slightly wrong, so you interrupt. You clarify. You correct. You add another requirement. You change the scope halfway through.
Sometimes that is necessary. Usually it is expensive.
An agent in motion is holding a temporary map in its head: files it inspected, assumptions it made, errors it saw, possible fixes it considered. When you interrupt too aggressively, you break that map. The agent now has to rebuild context while also reconciling your new instruction with its half-finished plan.
Queueing is different.
Queueing says: finish the current unit of work. Then take the next instruction.
This is closer to how operating systems, CI/CD pipelines, factories, and competent teams work. You do not stop the compiler halfway through to ask it to also write the README. You let the current job complete, then pass the next job into the pipeline.
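As a rough illustration of the difference, here is a minimal sketch of a dispatcher in which a new instruction never interrupts the running job and simply joins the backlog. The `InstructionQueue` class and the `fake_agent` function are hypothetical stand-ins invented for this example; they are not part of Codex or any real agent API.

```python
from collections import deque
from typing import Callable, Deque, List


class InstructionQueue:
    """Hypothetical dispatcher: new instructions wait until the current job finishes."""

    def __init__(self, run_job: Callable[[str], str]) -> None:
        self._run_job = run_job  # stand-in for handing one instruction to an agent
        self._pending: Deque[str] = deque()
        self.completed: List[str] = []

    def submit(self, instruction: str) -> None:
        # Submitting never interrupts a running job; it only appends to the backlog.
        self._pending.append(instruction)

    def drain(self) -> List[str]:
        # Run jobs strictly one at a time, in first-in, first-out order.
        while self._pending:
            instruction = self._pending.popleft()
            self.completed.append(self._run_job(instruction))
        return self.completed


def fake_agent(instruction: str) -> str:
    # Placeholder for a real agent call (an assumption for this sketch).
    if instruction == "Implement the onboarding modal":
        # The human thinks of a follow-up mid-job. It is queued, not injected.
        queue.submit("Also write the README")
    return f"done: {instruction}"


queue = InstructionQueue(fake_agent)
queue.submit("Implement the onboarding modal")
queue.submit("Review the changes")
results = queue.drain()
print(results)
```

The follow-up submitted while the first job is running is executed last, after everything that was already in line. The current unit of work is never disturbed.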
The human should not be the bottleneck
The old model spends human attention on supervision.
Watch every token. Catch every deviation. Keep the conversation alive. Nudge constantly.
The new model spends human attention on architecture.
Define the objective. Define the acceptance criteria. Define the boundaries. Let the agent execute. Review the artifact.
The agent does the throughput work.
The human does the taste work.
This is the more durable split.
Write Codex tasks like production tickets
A vague prompt creates vague labor.
“Make this better” is not a task. It is a mood.
A better Codex task looks like this:
Task: Implement the onboarding modal using the existing component patterns.
Context:
- Repo already uses shadcn components.
- Auth flow must not be changed.
- Preserve the current visual style.
Acceptance criteria:
- Modal appears after first login only.
- User can dismiss it.
- Dismissal state persists.
- Existing tests still pass.
Constraints:
- Do not introduce a new UI library.
- Do not modify unrelated files.
- Summarize changed files at the end.
That is not chatting. That is dispatching.
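The ticket has a fixed shape: a task, some context, acceptance criteria, and constraints. One way to keep that shape consistent is to build tickets from a small data structure and render them to text. The sketch below is only an assumption about how that might look; the `Ticket` class and its `render` method are hypothetical names, and the rule that a ticket must have at least one acceptance criterion is a design choice made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Ticket:
    """Hypothetical production-ticket prompt with the four sections used in this post."""

    task: str
    context: List[str] = field(default_factory=list)
    acceptance_criteria: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Assumption for this sketch: a ticket without acceptance criteria is rejected.
        if not self.acceptance_criteria:
            raise ValueError("a ticket needs at least one acceptance criterion")
        lines = [f"Task: {self.task}"]
        sections = (
            ("Context", self.context),
            ("Acceptance criteria", self.acceptance_criteria),
            ("Constraints", self.constraints),
        )
        for heading, items in sections:
            if items:  # skip empty optional sections
                lines.append(f"{heading}:")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


ticket = Ticket(
    task="Implement the onboarding modal using the existing component patterns.",
    context=["Repo already uses shadcn components.", "Auth flow must not be changed."],
    acceptance_criteria=["Modal appears after first login only.", "User can dismiss it."],
    constraints=["Do not introduce a new UI library."],
)
prompt = ticket.render()
print(prompt)
```

Rejecting a ticket with no acceptance criteria is one cheap way to stop "make this better" from ever reaching the agent.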
Once the first task is done, queue the next one:
Review your previous changes. Look for regressions, duplicated logic, and missed edge cases. Do not implement fixes yet. Produce a short review report only.
Then:
Implement only the fixes from the review report. Keep the diff small. Run tests or explain why they cannot be run.
Then:
Write a commit message and a concise PR summary with testing notes.
This is the real workflow:
plan → implement → test → review → fix → document → commit
Not:
prompt → panic → interrupt → re-prompt
Specialize the threads
A thread is not just a place where messages happen.
A thread becomes a cognitive environment.
It accumulates assumptions, terminology, architecture, naming conventions, and local memory. If every thread becomes a general dumping ground, the agent becomes worse. If each thread has a bounded identity, the agent becomes more useful.
Use one thread for frontend UI.
Use another for database and schema work.
Use another for QA and regression testing.
Use another for copy, onboarding, documentation, or launch material.
The point is not to create complexity. The point is to stop mixing incompatible mental contexts.
A frontend worker should not also be asked to redesign the business model, debug Supabase policies, write landing page copy, and invent a pricing system in the same breath.
That is how agents become soup.
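A bounded thread identity can be enforced mechanically. The following is a minimal sketch, assuming each piece of work is tagged with a scope and each scope belongs to exactly one thread. The scope names, the thread names, and the `dispatch` function are all hypothetical and only mirror the four threads listed above.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List

# Hypothetical scope-to-thread mapping, modeled on the four threads listed above.
THREAD_FOR_SCOPE: Dict[str, str] = {
    "ui": "frontend",
    "schema": "database",
    "regression": "qa",
    "copy": "docs",
}

# Each thread keeps its own backlog, so contexts never mix.
backlogs: DefaultDict[str, List[str]] = defaultdict(list)


def dispatch(scope: str, instruction: str) -> str:
    """Append the instruction to the one thread that owns this scope."""
    try:
        thread = THREAD_FOR_SCOPE[scope]
    except KeyError:
        # Refuse to dump unowned work into an existing thread.
        raise ValueError(f"no thread owns scope {scope!r}") from None
    backlogs[thread].append(instruction)
    return thread


print(dispatch("ui", "Add the onboarding modal"))
print(dispatch("schema", "Add a dismissed_at column"))
```

Work with an unknown scope raises an error instead of landing in whichever thread happens to be open, which is the point: a new kind of work gets a new thread.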
The queue is the product
The interesting interface is no longer the chat box.
The interesting interface is the queue.
A queue lets you preserve momentum. It lets you line up work without collapsing the current task. It lets the human think in pipelines instead of interruptions.
For coding work, the queue might look like this:
1. Inspect repo and produce architecture map.
2. Identify the smallest implementation path.
3. Implement feature behind a flag.
4. Run tests.
5. Fix failures.
6. Refactor only touched code.
7. Produce PR summary.
8. Create follow-up issues.
Each step has a bounded output.
Each output becomes the input to the next step.
This is how AI work becomes operational instead of conversational.
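The idea that each output becomes the input to the next step can be sketched as a simple chain. This is an illustration under stated assumptions, not a real agent integration: `run_pipeline` is a hypothetical helper, the lambdas stand in for agent calls, and stopping on an empty artifact is a choice made for this example.

```python
from typing import Callable, List, Tuple

# A step is a name plus a function from the previous artifact to the next one.
Step = Tuple[str, Callable[[str], str]]


def run_pipeline(steps: List[Step], initial_input: str) -> List[Tuple[str, str]]:
    """Run steps in order, feeding each step's artifact into the next step."""
    log: List[Tuple[str, str]] = []
    artifact = initial_input
    for name, step in steps:
        artifact = step(artifact)
        if not artifact:
            # Assumption: a step with no bounded output halts the queue for review.
            raise RuntimeError(f"step {name!r} produced no artifact")
        log.append((name, artifact))
    return log


# Stand-in steps; a real setup would call an agent here.
steps: List[Step] = [
    ("inspect", lambda repo: f"architecture map of {repo}"),
    ("plan", lambda arch_map: f"smallest path from {arch_map}"),
    ("implement", lambda plan: f"diff for {plan}"),
]

log = run_pipeline(steps, "my-repo")
for name, artifact in log:
    print(f"{name}: {artifact}")
```

The log of named artifacts is what the human reviews afterwards, rather than a transcript of interruptions.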
Do not ask agents to “build the startup”
Agents are powerful, but they still degrade under ambiguity.
They do best when the work is narrow, local, verifiable, and bounded.
Bad:
Build my app.
Better:
Create the empty dashboard route, using the existing layout shell, with placeholder cards for the three metrics already defined in the product spec. Do not connect live data yet.
Bad:
Fix the auth.
Better:
Trace the login flow and identify where the session is lost after redirect. Do not change code. Return the relevant files and the most likely cause.
Bad:
Make it professional.
Better:
Rewrite the empty-state copy in a calm, product-led tone. Keep all strings under 90 characters. Return only the replacement strings.
The smaller the task, the better the agent.
The better the task, the less supervision it needs.
The operating model
The model I want to use:
- One repo has one persistent architecture thread.
- Each feature gets its own execution thread.
- Each thread receives production-ticket-style prompts.
- Interruptions are minimized.
- Review happens after artifacts exist.
- Agents produce summaries, diffs, tests, and next actions.
- The human decides priority, taste, and direction.
The human should not be typing every brick into place.
The human should be deciding what kind of building is worth constructing.
Codex is not a magic engineer. It is not a person. It is not a teammate in the romantic sense.
It is closer to delegated computational labor.
Used badly, it becomes another notification surface.
Used well, it becomes a small factory for turning clear intent into working artifacts.
The trick is to stop prompting like a user.
Start dispatching like an operator.