Sean's Notes

Managing AI Agents: Lessons Learned So Far (February 2026 Update)

Written by Sean | Feb 27, 2026 2:00:00 PM

How I get the most from my AI-powered "staff"

Ever since the “ChatGPT moment” in late 2022, I have forced myself to use AI daily in nearly every aspect of my professional life. At first, this was strictly experimentation, with no expectation that it could pass as an acceptable work product — but with a firm belief it could eventually get there. As time has gone on, my thinking has evolved. I now see the AI agents I collaborate with as an essential set of virtual staff members to delegate key tasks to. I also now use AI routinely in my daily personal life and for real work products. Choosing not to collaborate with AI when it is readily available, for me, feels like incompetence. A lot has changed.

You may feel differently. Perhaps you are an AI skeptic. Or maybe you use AI routinely but still don’t seem to be getting the quality of results you’re after. In most cases, the difference won’t come from technical wizardry or better prompting. It will come from changing management practices and how you structure the collaboration. Below are the systems and habits I’ve used to achieve better results.

My impressions of where things stand

This past month (February 2026) has felt like a pivot point. Early in the month, on the same day, OpenAI released Codex 5.3 and Anthropic released Opus 4.6 — two incredible models with big promises. OpenAI seems to be doubling down on coding and the needs of developers, while Anthropic is focusing more on other realms of professional work. On February 20th, Google released Gemini 3.1 Pro.

As I began to write this paragraph, I felt the urge to inject a bar chart, but really didn’t want to have to go through the hassle of building one in Google Sheets or Excel, and then flipping it into a PNG graphic. I also didn’t want to generate something cool in Gemini or ChatGPT that I couldn’t easily update a year or two later. So I paused, opened up a GitHub Issue in this website theme’s repo, delegated the design work to Replit, and then delegated the implementation to my coding agent so the chart lives as a real part of the site. You’ll find it down below.

I have dabbled with all three, but as someone who loves to develop for the web, I went all-in on Codex. Within a day or so I had multiple jaw-dropping moments. For the first time, I could delegate both the design and development of full page templates, with multiple sections and complex functionality, entirely to AI.

It was also catching edge-case bugs that I’m certain very few people would think of before they happened — e.g. “when a user does this, then this, then this, the screen becomes scroll-locked.”

Almost more impressive than their ability to do complex work rapidly with near-perfection is these two models’ ability to follow my instructions, processes, and preferred ways of collaborating. As the leader of a small business and a person who has mentored and managed staff for over a decade, I know this is an incredibly important part of effective and enjoyable work. There are few things more frustrating in a professional context than having to remind someone to follow a process or standard over and over again. In January, this was just as big a problem with managing AIs as it was with managing people. Now, AI follows my rules strictly, and often goes the extra mile to do so. It will also sometimes catch conflicting information and prompt me with ideas to improve my documentation and process.

So I’m impressed. A lot more in the digital world seems possible now. Have an idea? Go make it. It’s that simple. That said, much of professional work today has to be done in SaaS wrappers already developed by others, and AI still struggles to work well within those wrappers. The depth of integration is deeply inconsistent, and where AI does appear in software today, it often doesn’t give you access to the best frontier models. So the vast majority of my coworkers, peers, and friends might read this and think I’m crazy. I believe this will change soon. The makers of the frontier models seem to have prioritized mastering the needs of developers first because so much is solved today by code: solve for developers first, and you multiply your impact, because now developers are using your model to solve the professional needs of others. With that need essentially fulfilled, the big AI shops will shift their focus to other professional needs next.

I believe being a manager and small business owner has given me an advantage in working with AI. I’ve learned to have a (flexible) plan for everything, and to document as much as possible. I know how important it is to break complex work into smaller, more manageable tasks. I know how important it is to document your progress, to work in the open, and to seek feedback early and often. I know how good it is to have a system for capturing and setting aside the ideas that emerge mid-task, so no good idea is lost, but you also don’t constantly derail and delay yourself. I know how to ask good questions, and how to find the right problem to solve. I understand how important it is to have empathy for the end user, and how to help others develop that empathy. And I’ve learned how to coach and give feedback, and to verify that the recipient is learning from that feedback. These are the skills you need to masterfully leverage AI to have a big impact. These are the skills you need to be the “[[def:human in the loop|Human-in-the-Loop (HITL) as it pertains to AI agents is a design pattern where human intelligence is integrated into an agent's autonomous workflow to provide supervision, validation, or intervention at key decision points. Rather than allowing an agent to run entirely on autopilot, HITL ensures that high-stakes, ambiguous, or low-confidence tasks are paused for human review, combining AI's speed with human judgment.]].”

At the moment, technical skills like feeling comfortable in an IDE, working in a terminal, and living in GitHub will give you a big advantage. Long-term, they’ll remain helpful skills to have, but much less of a requirement. OpenAI’s release of the Codex desktop app feels like a step in that direction.

So that’s the background and context behind why I wanted to write this post. I feel like I have an unfair advantage, and want to share what I’ve learned. I hope to update this post as things evolve and accelerate. I imagine it will be funny to read some of this a few years from now. If you try any of my tips below, please share your experience by leaving a comment at the end of this page.

My tips for effectively managing AIs

This is a working list. As experimentation reveals new things, I’ll come back and update this.

Provide context

In project management, if a delegated task is assigned to a person without enough detail, or without a roadmap to learn more about the broader project a task falls within, there’s a very good chance the results will be rejected and the task will need to be reworked. If the feedback is vague, you can expect a second round of revision. The same idea applies when delegating to AIs. Context is everything. Take time to write detailed, thoughtful prompts. Provide background documentation and guidance on how to navigate it. Ensure context is being considered.

Provide rules

A lot of early AI experimentation amounted to prompting ChatGPT to “write me a blog post” without any editorial guidance, style guide, or examples to follow. So when the result came back well-written but not in the author’s voice, people decided it was cool but “could never work for us.” Back then, even if you gave it rules, the AI wasn’t very good at following them. This is no longer the case. Give clear, non-conflicting rules, and the AI will follow them with amazing precision. The risk here is being too explicit and not allowing the AI to use its own reasoning. That’s where you’ll need to find a balance. Be strict on the “how” where you need to be, but be flexible elsewhere. Always provide rules. Refine them routinely.

Tip: Learn about AGENTS.md files. They are a simple, open format for guiding coding agents.
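To make this concrete, here is a minimal sketch of creating an AGENTS.md file. The section names and every rule in it are illustrative placeholders, not a required schema — the format is just a markdown file the agent reads at the repo root.

```shell
# Write a minimal, illustrative AGENTS.md. All project details below
# (directory names, tooling, branch policy) are placeholder assumptions.
cat > AGENTS.md <<'EOF'
# Agent Guidelines

## Project context
- Static marketing site; source lives in `src/`, build output in `dist/`.

## Rules
- Follow the existing CSS utility classes; do not add new frameworks.
- Run the test suite and self-review before asking for human review.
- Never commit directly to `main`; open a pull request.

## When in doubt
- Ask before changing build tooling or dependencies.
EOF
```

Keeping the file short matters: a handful of clear, non-conflicting rules gets followed far more reliably than pages of guidance.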

Force planning

If a task is clearly going to take more than one step, I force a plan first. That includes building new pages and modules, but it also includes tricky debugging where it’s easy to go in circles. The key is not just having a plan — it’s storing it in the repo, so it survives context windows, machine switches, and “I’ll come back to this tomorrow.” I also make the agent stop after the plan and wait for approval before writing code. It’s a simple pause, but it prevents drift, and it gives you a moment to actually absorb what’s about to happen.

Tip: Learn about PLANS.md files. I have found them to be a fantastic tool for keeping a coding agent on track and focused for long, complex tasks.
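As a sketch of what this can look like, here is one way to seed a PLANS.md file. The headings and task names are my own illustrative assumptions, not a standard; the point is that the plan lives in the repo, so it survives context windows and machine switches, and the first checkbox forces the stop-and-wait-for-approval pause.

```shell
# Write an illustrative PLANS.md skeleton. Section names and the example
# feature ("related-posts module") are placeholders.
cat > PLANS.md <<'EOF'
# Plan: Add related-posts module

## Goal
Show three related posts at the bottom of each article page.

## Steps
- [ ] Draft the approach and wait for human approval before writing code
- [ ] Implement tag-based matching
- [ ] Add tests and self-review
- [ ] Request human review

## Observations & decisions
- (log constraints and discoveries here as work proceeds)
EOF
```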

Log observations and decisions

AI is great at speed. It’s not great at remembering the weird little rules and preferences that make a project coherent. So when something is discovered mid-task — a constraint, a gotcha, a “we tried this and it doesn’t work here,” or a preference that keeps coming up — log it. Otherwise you will relearn the same lesson over and over. Most of this can live inside the execution plan, but anything that should persist across the whole project should get promoted into AGENTS.md, a README, or whatever “source of truth” doc you actually rely on. The agent can write it up, but you should still review the wording, because language drift becomes process drift.
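In practice, “promoting” a discovery can be as small as appending a dated line to your source-of-truth doc. In this sketch, NOTES.md and the discovery itself are made-up placeholders — use whatever doc and wording fit your project.

```shell
# Promote a mid-task discovery into a persistent, dated log entry.
# NOTES.md and the lesson text are illustrative placeholders.
entry="- $(date +%Y-%m-%d): Lazy-loading breaks the gallery lightbox; keep eager loading on gallery pages."
echo "$entry" >> NOTES.md

# Show the most recent entry as a quick sanity check.
tail -n 1 NOTES.md
```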

Park scope-creeping ideas

“Anything is possible” is a blessing and a trap. You’ll be halfway through one feature and suddenly have five more ideas that would make it better. Don’t chase them. Capture them. For me that often looks like an inline TODO or a quick draft GitHub Issue — just enough detail that the idea won’t die, but not so much that you derail the work in front of you. This is less about the agent and more about the manager. Your job is to protect momentum while respecting good ideas.
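Here is one lightweight way to park an idea without breaking flow. TODO.md, the idea title, and the notes are illustrative placeholders; a quick draft GitHub Issue (e.g. via `gh issue create`) works just as well when you want the idea tracked alongside the repo.

```shell
# Park a mid-task idea with just enough detail to pick it up later.
# TODO.md and the idea content are placeholder assumptions.
idea_title="Add dark-mode toggle to the chart component"
cat >> TODO.md <<EOF
## Idea: ${idea_title}
Captured mid-task on $(date +%Y-%m-%d). Notes: sketch a toggle in the
site header; charts should re-render when the theme changes.
EOF
```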

Pause to clean up slop

Slop is anything that builds up and starts to slow you down later: inconsistencies, leftovers from abandoned paths, unclear rules, clutter that makes the “real” guidance harder to find. The best defense is prevention — clear conventions up front, plus requiring the agent to check context, reflect, and self-test before asking you to review. But slop still happens, and sometimes you need to pause and clean it up on purpose. If you don’t, you’ll eventually pay for it with confusion, rework, and agents making avoidable mistakes. Treat cleanup as part of the workflow, not a guilty chore you’ll magically do “someday.”

Take breaks

It will be tempting to watch the AI work. It’s fascinating. And since you’re eager to jump to the next thing, waiting the minute or two it takes to finish feels like the fastest path forward. I get it. But don’t do that. Take advantage of this new superpower and go for a walk. Go chat with someone. These breaks give your brain a chance to catch up, organize, and calibrate. You’ll find frequent breaks trigger some of your best ideas. You’ll find you can make better sense of all the noise. And you’ll feel less stressed, and happier.

Know when to say no / don’t work on everything

“Anything is possible” means you can easily try to conquer the entire world all at once. That leads to burnout, oversights, and mistakes. Keep your projects limited to the ones that matter most to you at any given moment. Strictly limit how long and how often you work. Don’t fill your free time with more screen time.

Now your turn: What works for you?

I’m still figuring this out, but these habits have made a huge difference for me. If you have your own rules, templates, or tips for managing AI (or you think I’m missing something), I’d love to hear it. Drop a comment — and if you have real examples you’re willing to share (repos, screenshots, short case studies), even better.