Tips for Working with AI Coding Agents
Using AI agents for coding tasks unquestionably the hottest trend in tech. After spending a good deal of time trying these new tools like Claude Code, Windsurf, and Codex out, and even experimenting with building some of my own, I’ve come to realize there’s a unique art all of its own to conduct these agents effectively. The idea that you can just “set it and forget it” and vibe code your way to fame and fortune is naive at best. Careful strategies must be implemented and environments set up to effectively channel these agents. Here’s a list of some of my most valuable discoveries working with AI agents.
Note: This advice is targeted at those seeking to develop advanced projects using the new technology. If all you want is a simple landing page, a few descriptive prompts may be enough. The question on every developer’s minds is whether these agents can be used to build anything new and interesting. I believe it’s possible (especially with the latest generation of reasoning models), but it requires a sophisticated skillset, a deep understanding of these models and their limitations, strong tech literacy, and good software architecture knowledge. Working with these agents effectively is a whole new way of thinking and acclimating to it takes some practice.
Additional note: this advice is mainly addressed to AI coding agents-my area of expertise-and also where agents are currently the most useful. However, some of this advice likely transfers to wherever agents end up used. This advice also represents an averaging of my usage across most commercially available models: some models might emphasize one dimension or downplay another aspect.
Final note: This advice largely applies to my experience as an individual developer and experimentalist. Organizations with enterprise budgets and resources might be able to pull off more at bigger scales. That said, I think this advice is sufficiently general for most use cases.
1. Delay asking for code as long as possible. This might sound counterintuitive at first. But you really don’t want to send off agents to go build something without first having worked out an understanding of the problem, its requirements, the possible solutions, and a locked-in implementation plan. It’s the same way you’d want to approach building anything with other people. You don’t just start: you think about it first. So spend some time at first gathering contextual resources and building up an understanding with the bot that meets your objectives. This careful, delayed method has the added benefit of creating a plan to hold the outcomes accountable to. So wait for as long as you can, learn as much as you can, and then pull the trigger.
2. Utilize AI generated readmes for context preservation. Software development was always very literate, but now more so. Success in the age of AI coding is a deeply literate activity. It involves significant reading and writing ability, and not just the ability to read and write code but also articulating oneself verbally. Now the literacy is now two-sided and for benefit of both human and machine. Readmes, which traditionally there is typically only one for a project, now become useful both for human and machine readership. I now find myself having AI generate readmes for every subdirectory or module. This is not only great for documentation and great practice in general, it also pitches signposts. These docs become notes-to-self for future agents or sessions that you can direct their attention to when you need to quickly bring them up to speed. They are also useful for you. Codebases with a lot of AI activity tend to have high code churn rates, however, so it’s important to ensure these are kept up to date. Given these models’ serious amnesia issues, strategically weaving context like this into your project is a win. Readmes serve as context pointers for AI to quickly get them up to speed. I see the age of AI coding as being a more literate activity, where we increasingly understand codebases by verbal proxy as mediated through AI—so anything that increases the literate surface for both you and the model to work with is advantageous.
3. Don’t just say “continue”. Unless you’re monitoring and following along with what the agent’s doing, telling them to simply “continue” when they are in the middle of generating can be an invitation for them to keep spinning their wheels and waste tokens, often on simple token processing gotchas that are easy for people to spot (like formatting issues) but throw transformers off completely. Sometimes, it helps to ask it to give a report on the current progress, or to review the current state of the solution, or to provide any extra context and commentary you can on how the situation unfolds to jiggle it a little and get it to think differently. Simply stirring it up a little can sometimes break the model out of repetitive failure states. If you can tell it’s making progress, just saying “continue” is fine.
4. Synergistically multi-task. These agents create wonderful opportunities to strategically delegate and coordinate work across LLMs. This means you can be setting up a new workload for another agent, working on a separate file yourself, multi-thread processes so different agents are working on different, complementary parts of the codebase simultaneously. If you’re just sitting there watching the agent do what used to be your job, that’s probably not the best use of your time.
5. Test. Saying that you should test seems like generic advice at first blush, but now that these AI coding tools can save so much time, there’s no excuse not to do so extensively. You’re also going to need extra tests to check against all that dubious, precariously vibe coded code you should have vetted better to begin with. I believe AI code assistants can actually make test driven development viable. Just make sure the bots are actually writing meaningful tests; they can and will reward hack by fudging the tests to pass trivial conditions or writing meaningless mockups if you let them get away with it. (Admittedly this is certainly something human developers would do, too.)
6. Adopt a skeptical attitude. These agents will almost always upsell their work, and will quickly tell you that everything they have done is production-grade and elite in quality. However, these models love to “reward hack”, that is, they like to cheat when they can get away with it and bypass loopholes in the fact that their reward is human approval. (Some models are better or worse at this.) Often they will fudge certain results or do unacceptable workarounds that they will then fail to report unless goaded into doing so. You may often feel gas-lighted by relentless cheeriness of the AI. Often I find simply expressing a skeptical attitude toward their outputs will cause them to fess up and revisit their initial work more honestly. Of course it’s always better if you actually discover these problems yourself because you’re actually paying attention to what they’re doing. I can’t overstate how much more value you can get out of these models just by pressing them on their claims. You’ll inevitably hear “you’re absolutely right to question this,” and then squeeze better results out of it, and spare yourself from future disappointment or embarrassment.
7. Play models off each other. Use different models to review and work off each other. Have one model review another. Use smaller models for smaller tasks, larger models for larger tasks. Use models that are stronger at one thing for one task and another for a different one. It’s always possible to get a second or third opinion this way. Reflecting their outputs back at each other can be an excellent way to catch flaws in code and improve overall code quality. Don’t concentrate too much on your favorite model; it tends to lead to a uniformity and sterility of perspective that can expose you to certain gaps where blindspots can grow. The goal is to set up chained domino effects between different models. Multi-agentic workflows seem to be the way to go.
8. Take your time. The speed at which AI coding tools work can encourage rushed development, and that leads to real headaches and outright failures as a project grows. Slow down and be methodical and you will thank yourself in the end. Don’t be seduced by the LLM’s superhuman coding speed. It’s about quality over quantity. If you aren’t watchful, AI can produce an illusion of productivity, as it spits out “all filler but no killer.” By stepping in and having a more meticulous, watchful role in production, you can assure actual meaningful work is getting done.
9. Use memories, or ruleset files to your advantage to impose contextual invariants across sessions. As everybody knows by this point this generation of AI has a serious amnesia problem. If you find it’s repeating past mistakes or drifting from an architecture plan, take advantage of global or project based context injection to impose a structure on all future sessions as much as you can. Make sure these files stay up to date as the project evolves. Ideally the models and agents will improve to such a point that this clunky maintenance is automated away, but for the time being it is a requirement for optimal performance. (see #2 and #12 for related context management advice)
10. Actually study what’s it’s doing. This might seem obvious, but it’s tempting to just approve everything and worry about it later (and almost viable if you test rigorously) but this is the eventual path to confusion. The more you are an active participant engaging in a “mind meld” with the agent, thinking through problems with it, taking your time, envisioning architectures, locking-in plans, understanding options, the better the final outcome. Working with AI effectively is a very literate activity, there’s a lot of reading and writing and conversing and thinking involved. I view this kind of work as essentially a researcher’s activity—it primarily involves research skills and appropriate background knowledge.
11. Notice when you’ve turned your brain off. A side-effect of AI usage is cognitive offloading. Cognitive offloading is the phenomenon by which a subject delegates, hence offloads, some portion of their cognitive labor either onto another person or a facet of the environment such as a notepad or app. Cognitive offloading can actually be beneficial when done with intent. It can declutter the mind and free it up to work on high priority focused tasks. However, if you’re outsourcing all your thinking to an AI model it’s a serious liability. It might be tempting at times to go on autopilot and let the model drive, but this means you are “programming by permutation”, that is, engaging in the anti-pattern of semi-randomized development or “throwing stuff at a wall and seeing what sticks.” All kinds of code drift and mess flood in this way. One of your main jobs as a “agent wrangler” is to ensure they stick to architectural compliance, and ideally, you should have a clear understanding of that architecture. Everything works best when you’re “synced up” and “stuck in” with the agent, thinking alongside it step by step, following along closely, and making it an extension of your own ideas and goals. You should maintain a close synchronization between the what the bot’s doing and your intentions. It should be an amplification of your intelligence, not a replacement of it.
12. Be mindful of context pollution. One of the areas where the current generation of AI models shows glaring limitation is their dependency on ephemeral context over more fundamental memory constructs or true statefulness. That means you need to be strategic about how you frame context. You should learn how to set up whole “preloaded” context structures for the LLM to run with by building up an intentionally organized history with it, and avoid poorly contextualized requests. Know when to start a new session, or to set the groundwork for another session, and be aware of context loss and gaps. Ensure that the context you feed doesn’t contain conflicting signals or logical contradictions, as that will throw the model off.
Those are some of my tips for getting the most out of AI coding agents as they currently exist. Most of these skills interoperate, meaning that you might combine them. For example, you might want to Be mindful of context (#12) in order to play agents off each other (#7) better. Herding these agents smartly may well be the job of the future, so learning the best practices for doing so may become more important over time. The key insight is to not just ask for things: deeply engage and get “stuck in” with the agents.
The model providers want everyone to believe that you can just give these agents and set them on a task and forget about it. This is disingenuous. If you don’t approach it wisely, agentic coding can create big messes that you will spend more time undoing than if you just did things right in the first place. Agentic coding also requires sophisticated pipelines and systems that need to be maintained and monitored, either manually or programmatically, and most likely those systems can’t be maintained and monitored by another agentic pipeline system that also has to be maintained and monitored—the buck has to stop at a person.
So, can these new agentic tools build nontrivial projects autonomously? Not yet, but with developer supervision, they can get pretty far. It’s amazing to see where the state of the art will be a few years from now.