AI Changes the Practice. The Scrum Team Stays.
- Sebastian Sussmann

- May 7
How we think about AI in our delivery teams — and why the question that matters is how to solve problems
AI in software development is not optional anymore. We do not need to debate whether AI changes development — it does, and dramatically. The question we are interested in is a different one: how do we solve our clients' problems well in a world where AI is everywhere?
That question gets answered with experience, not opinion. We have spent the last eighteen months running early-adoption squads across fourteen teams, building enablement structures, and watching what actually works. This post is about what we have learned, what we have concluded, and where we have landed.
This position recently found broader resonance at DevDay 2026 in Da Nang, where a panel of technology executives and academics examined how AI is reshaping software engineering, team structures, and enterprise strategy. The panel’s conclusions — covered in Vietnam Economic Times (April 2026) — align closely with what we have arrived at independently through our own delivery work.
It also touches on a recently published document called The AI-Augmented Scrum Guide, which proposes treating AI agents as members of the Scrum Team. We have arrived at a different conclusion. Not because we are defending any particular framework, but because, based on what we have seen, the existing model — the Axon Model™ — extended with AI makes more sense to us. AI does not change who is on the team. It changes what each Developer on the team can do.
You can delegate work to AI. You can never delegate responsibility.

The key principle that anchors our entire AI adoption approach. From the Axon Active AI Adoption Leadership Handbook (2026).
The question we are actually asking
AI adoption is not the question. According to Stack Overflow's 2025 Developer Survey, 84% of developers now use or plan to use AI tools. Our clients are asking about it, our competitors are offering it, and the people not using it are falling behind. None of that is in dispute, and this post is not about whether to adopt AI.
After six months of structured measurement across our early-adoption squads, we have time-savings ranges for each tool on the work AI helps with — broadly, 20–25% with GitHub Copilot, 25–30% with Cursor, 30–50% with Claude Code at the time of measurement. Ranges, not point estimates, because the gains depend on far more than the tool: project size, codebase complexity, the model's context window at that moment, the kind of work being done, the developer's familiarity with the tool, and the generation of the tool itself. All three tools have evolved since we measured them. A 30% gain on a 5,000-line greenfield service tells you very little about what the same tool delivers on a 200,000-line legacy codebase.
The point we draw from this is structural, not numeric. Different tools deliver different gains on different parts of different work. Vendors quote single numbers because they sell tools. We measure ranges because we ship outcomes. These are also development-phase gains. End-to-end cycle time — the metric the client actually experiences — depends on whether code review, QA, deployment, and feedback loops can keep up. AI accelerates one link in the chain. The full chain has many. This is not only our observation — the published research consistently finds task-level AI gains in the 14–26% range, with company-level productivity gains an order of magnitude smaller, because verification, rework, integration, and measurement all sit between task speed-up and business value.
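The chain argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the 40% share of cycle time spent in development is a hypothetical figure, not one of our measurements, and the 30% saving is simply picked from the ranges quoted above. The function name is ours, introduced for this example.

```python
def end_to_end_saving(phase_share: float, phase_saving: float) -> float:
    """Fraction of total cycle time saved when only one phase speeds up.

    phase_share:  share of total cycle time spent in the accelerated phase (0..1)
    phase_saving: fraction of that phase's time saved by AI (0..1)
    """
    if not (0.0 <= phase_share <= 1.0 and 0.0 <= phase_saving <= 1.0):
        raise ValueError("both arguments must be between 0 and 1")
    return phase_share * phase_saving


# Hypothetical split: development is 40% of cycle time; review, QA,
# deployment and feedback make up the other 60% and are not accelerated.
saving = end_to_end_saving(phase_share=0.40, phase_saving=0.30)
print(f"end-to-end saving: {saving:.0%}")  # 0.40 * 0.30 = 0.12
```

Under these assumptions a 30% gain in the development phase becomes a 12% gain end to end, and less if the downstream links slow down under the extra volume.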
The question we want to answer is the next one down: how do we organize a team that uses AI intensively, in a way that solves real problems, creates real value, and gives our clients something they can trust?
That question has multiple defensible answers. Three of them are worth looking at, because they lead to very different organizations.
Reading 1 — Treat AI as a team member
The AI-Augmented Scrum Guide takes this approach. It treats AI agents as Developers alongside humans — up to half of the team's headcount as autonomous bots. It introduces token budgets, machine-enforced quality gates, and a renamed Scrum Master ('the Agentic Coach'). It applies the five Scrum Values to AI as if to people.
It is the most ambitious answer, and it is internally consistent. Its weakness, in our view, is that it treats agents as something they are not — collaborators with judgment and accountability. That creates problems we explain below. Working practitioners have independently identified the failure mode this approach generates: a situation where AI produces output faster than humans can review it, and review collapses into rubber-stamping. Engineers building production AI systems describe this as an anti-pattern of agent deployment, not a feature of it.
Reading 2 — Treat AI as if nothing has changed
The opposite approach. Adopt AI tools, but pretend they are no different from any other tool in the developer's stack. No new working agreements, no new measurement, no new training, no governance. Let each developer figure it out individually.
This is what most organizations have been doing by default since 2023. It does not work. AI usage grew organically inside our own company through 2024 and 2025, and we watched the results vary wildly across teams. That observation is what triggered our 2025 organizational decision: AI adoption requires active leadership, not passive permission.
Reading 3 — Treat AI as a capability that extends each Developer
This is where we have landed. AI is not a team member. It is a capability — a powerful one, the most powerful tool a developer has ever had — that extends what each individual Developer can do. A small team using AI is still a small team. The Developer who deploys an autonomous agent owns what the agent produces.
This approach does not slow AI adoption. Across our early-adoption squads we run agents at the highest level of autonomy where governance and quality allow — things like overnight refactoring, parallel test generation, and continuous code analysis. What this approach does is keep the human in the loop where the loop matters: in the judgment, in the accountability, in the relationship with the client.
Why this makes sense to us
The simplest case for our position is the engineer's case. AI is fast, literal, and confidently wrong. None of those properties go away with a better model. Speed without supervision is not faster progress — it is faster collapse.
Anyone who has watched an AI agent rename a failing test to '.disabled' so the dashboard turns green, or generate three slightly different versions of the same function under names like doX, doXNew, and doXNewModern, knows what we are talking about. The tool is producing output. It is not producing software. The difference is judgment, and judgment lives in the human.
That is the starting point for everything else we have concluded. The teams that get value from AI are the teams that build structure around it. The teams that treat AI as a faster typist get faster mistakes.
The human in the loop is where the value lives
Across eighteen months of structured AI work, the pattern has been consistent: AI accelerates the parts of development that are pattern-matching and execution. It does not accelerate the parts that are judgment. Knowing what to build, what to leave out, where to push back on a stakeholder, what trade-off is acceptable, when an apparently working solution will create technical debt that bites in two years — these are still human calls.
In our own teams, the squads that get the biggest gains from AI are the ones where senior developers are most engaged in the work. Not because seniors type more. Because seniors evaluate more, decide more, and reject more. The leverage AI gives is downstream of human judgment, not in place of it. The largest study to date — Cui et al. in Management Science 2026, 4,867 developers across Microsoft, Accenture, and a Fortune 100 firm — found that less experienced developers had higher adoption rates and greater raw throughput gains, but also that the value of senior judgment compounds at the team level. Both things are true and both shape how we design squads.

We are deliberately keeping the human in the loop. We expect AI to keep getting more capable, and we want to use those gains aggressively. But we want to use them to make our developers more powerful, not to remove them from the work. That is a choice. We are making it consciously.
Small teams of accountable people still solve complex problems best
Small teams of professionals who can hold each other accountable, who share context, who inspect their work together and adapt — these teams ship better software than any other organizational pattern we have seen. We did not arrive at this conclusion by reading a framework and applying it. We arrived at it through fifteen years of running teams for regulated clients, watching what worked and what did not. The fact that the 2020 Scrum Guide describes the same conclusion is a useful alignment, not a reason.
AI does not change this. If anything, AI makes it more important. The volume of code, decisions, and outputs goes up. Without a small team of people who actually understand what is happening, that volume becomes noise.
This is also why we do not split our teams into 'human Developers' and 'AI Developers.' That split would create a sub-team structure inside what should be one cohesive group, and it would dilute the shared accountability that makes the team work in the first place. A Developer using AI is still a Developer. That is the cleanest model, and the cleanest model usually wins.
Accountability has to remain a person
Our clients are Swiss banks, insurers, and regulated enterprises. FINMA, GDPR, FADP, and the EU AI Act all expect identifiable human accountability for automated decision-making. A model that distributes accountability across human-and-AI hybrid teams, or that puts machine learning in charge of what 'Done' means, creates governance friction that we cannot give to clients in those industries.
This is not just a regulatory point. It is a values point. We believe accountability is something only people can actually carry. An agent that produces broken code is not accountable. The Developer who deployed it is. That clarity is worth protecting, both for our clients and for our own team.
AI development will change dramatically — that is a different question
None of this means we expect the field to stand still. AI is going to change software development in ways we cannot fully predict. Tools, models, and capabilities will keep evolving fast. Some of the choices we make today will look quaint in three years. That is fine.
But the question of how AI changes development is different from the question of how to organize a team that uses AI well. Right now, the second question has a clearer answer than the first one. Small teams. Strong human accountability. AI as the most powerful tool any of us have had. Practices that evolve as the tools evolve, inside a way of working that does not need to be rebuilt every time the tools change.
If the field changes enough that this no longer makes sense, we will change with it. We are not religious about any of this. But for the question of how to solve our clients' problems well today, this is the model that works best for us.
AI changes how we work. The question we care about is how to keep solving real problems and creating real value while it does.
Holding the line — the structure around AI
There is a useful word for what makes AI work in production: the harness. The veteran Silicon Valley engineer Bill Cox, who has personally written more than 240,000 lines of production code under AI supervision and articulated the doctrine in his book AI at the Helm and on CodeRhapsody, uses the term to describe the structural discipline a human builds around an AI system to keep its speed useful.
A harness is not a tool you install. It is the rules, the rituals, the boundaries the team enforces. Cox puts it bluntly: 'The harness is not a script or a custom tool. It is me, forcing myself to hold the line.' Without the harness, the same AI that produces 35,000 lines of clean, shipping code in one of his projects produced 58,000 lines of unusable bloat in another. Same model. Same engineer. Different discipline.
Our internal AI governance describes the same idea in different words. AI Responsibility Levels define how tightly the human holds the rope at each stage. Working agreements at every level encode the rules. Senior review capacity is the human attention everything else depends on. A mature Definition of Done turns architectural discipline into a stable contract the AI has to satisfy.
What the structure consists of

Five elements, all of them recognizable in our own delivery model:
Design discipline. Clear architectural standards, modular boundaries, and interface contracts that the AI must respect. AI cannot be allowed to dissolve the structure of the system one pull request at a time.
Locked-down components. Some things the AI does not get to touch without human review. Our internal guidance locks security-critical and compliance-sensitive components — authentication, encryption, payments, audit logs — as written by humans, with explicit customer approval required before any AI involvement.
Fake-first integration testing. Build integration tests against fake implementations before the real ones exist. The real code has to conform to the fake — not the other way around. This is not a suggestion; it is a structural requirement. AI gravitates toward mocks because mocks make tests pass without exercising real behavior. Fakes survive resets. Mocks collapse. The contract is fixed by the fake; the AI fills it in.
Resets, not patches. When AI output drifts, the answer is not to patch it. The answer is to reset, prompt more tightly, and regenerate. Cox uses the rule that one dollar of mistakes is cheaper than ten dollars of debugging. This maps to our principle that if AI is not moving a problem forward after multiple attempts, the team escalates rather than wasting time.
Reusable prompts. Short, repeated reminders that the human is the expert, that simplicity rules, that the existing system already does what the AI is about to reinvent. These live in version-controlled prompt libraries shared across teams.
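To illustrate the fake-first element from the list above, here is a minimal sketch in Python. The `Inventory` interface, the `FakeInventory` class, and the `check_inventory_contract` function are hypothetical names invented for this example; they are not taken from our codebase or from Cox's work. The idea shown is that the fake has real behaviour (state, validation, errors), and one contract check is written against it before any real implementation exists.

```python
from typing import Protocol


class Inventory(Protocol):
    """Hypothetical interface contract, agreed before real code is written."""

    def reserve(self, sku: str, quantity: int) -> str: ...
    def available(self, sku: str) -> int: ...


class FakeInventory:
    """In-memory fake: it keeps state and enforces rules, unlike a mock."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = dict(stock)
        self._next_id = 1

    def reserve(self, sku: str, quantity: int) -> str:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        if self._stock.get(sku, 0) < quantity:
            raise ValueError("insufficient stock")
        self._stock[sku] -= quantity
        reservation_id = f"res-{self._next_id}"
        self._next_id += 1
        return reservation_id

    def available(self, sku: str) -> int:
        return self._stock.get(sku, 0)


def check_inventory_contract(inventory: Inventory) -> None:
    """Contract check. Assumes 'sku-1' starts with exactly 10 units."""
    reservation_id = inventory.reserve("sku-1", 4)
    assert reservation_id, "a successful reservation returns an id"
    assert inventory.available("sku-1") == 6
    try:
        inventory.reserve("sku-1", 7)  # more than the remaining 6
    except ValueError:
        pass
    else:
        raise AssertionError("over-reservation must be rejected")
    assert inventory.available("sku-1") == 6, "a failed call must not change state"


# The fake passes first; a real implementation must later pass the same check.
check_inventory_contract(FakeInventory({"sku-1": 10}))
```

A real implementation (not shown here) would be run through the same `check_inventory_contract` function. If it fails, the real code changes, not the contract.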
Why this structure is not optional
Five reasons that line up with what we have observed in our own early-adoption squads and that working engineers describe independently:
AI generates faster than humans can verify. Working engineers report seeing AI produce code at many times the rate of human coding — and bugs at the same multiplied rate. Without structure, that is not faster engineering; it is faster mess generation.
Approval fatigue is real. After two hundred 'Accept' clicks, attention degrades, and the AI is now unsupervised whether the human meant it or not. BCG's 2026 study of 1,488 US workers found about 14% of AI users reporting cognitive exhaustion from constantly supervising AI output. Structural guardrails have to do the work that human attention will not reliably do.
Green checks are not truth. The AI optimizes for the test passing, not the system working. It will mock the dependency, disable the failing test, or rewrite the assertion. Every safety signal can be silently corrupted. Only the structure keeps signals honest.
Long sessions drift silently. Working engineers call this 'summarization drift' — the AI compresses context as a session grows, and each compression drops detail. Constraints soften. Requirements blur. The design you agreed on at the start of the session has been replaced, three hours later, by a vague echo of itself. The output looks reasonable. It no longer matches the spec. The defense is to compress context only at clean boundaries, re-read the design doc each phase, and reset rather than carry corrupted context forward.
AI pulls toward training-data averages. Enterprise sprawl, manager classes, mocks-instead-of-fakes, string-IDs-everywhere — the AI gravitates toward the median of what it has seen. Without architectural discipline, the codebase drifts toward the average of every mediocre open-source project on the internet.
Same AI, same engineer, different structure. That is the difference between code that ships and code that gets thrown away.
Why this matches what practitioners are converging on

We are not alone in arriving at this picture. The convergence across independent practitioners is striking:
Practitioner engineers building production AI tools — Bill Cox at CodeRhapsody, Thomas De Vos in his April 2026 field manual Claude Code: Building Production Agents That Actually Work — describe the same five elements above and reach the same conclusion: the human stays the expert, the AI stays the tool.
AI tooling vendors build toward human-supervised agents. Anthropic's Claude Code is designed around visibility into agent actions and hard limits on tool use without confirmation.
Regulators in our clients' industries — FINMA, the EU AI Act, FADP, GDPR — push accountability toward identifiable humans, not away from them.
Our own delivery experience — fifteen years of Swiss-quality engineering for regulated clients, eighteen months of structured AI work across fourteen squads — points to the same model from a different starting point.
Industry panels and media coverage: at DevDay 2026 in Da Nang, executives from Axon Active, Open Web Technology, Kyanon Digital, and MGM Technology Partners, together with Professor Anand Nayyar of Duy Tan University, independently arrived at the same conclusion, namely that coding is becoming easier and faster, but delivering reliable business outcomes remains a human responsibility. The panel’s consensus, covered in Vietnam Economic Times (April 2026), was unambiguous: AI is not replacing developers, but the profession is entering a new phase in which human value shifts upward.
Five different vantage points, same conclusion. When practitioners working at very different scales and in different contexts arrive at the same answer, the answer deserves to be taken seriously. The conservative position is not actually conservative — it is the position the field is converging on.
Mike Cohn, founding member of the Agile Alliance and Scrum Alliance, recently made the same economic argument from a different angle. In a March 2026 analysis, he showed that AI is flattening Boehm’s cost-of-change curve even further than Agile did — making code feel less like construction and more like revision. His conclusion: the biggest risk is no longer changing too late. It is learning too late. “Stop trying to perfect requirements. Instead, perfect your feedback loop.” That is the Scrum argument, restated as an economic claim. The Sprint cadence is the feedback loop. AI makes the loop faster. The framework that governs the loop does not become less important — it becomes more important.
Where we see this going
The picture we are working toward has three patterns, in increasing scale. None of them require us to invite the agent into the standup.
Personal agent teams — one Developer, many specialized agents
Each Developer operates a personal team of agents. A coding agent for the boilerplate. A testing agent for the edge cases. A research agent for the documentation deep-dives. A reviewer agent for the first-pass critique. Maybe a documentation agent, a refactoring agent, a deployment agent. Not one assistant — a workshop of specialists running in parallel under the Developer's direction.
The Developer is the conductor. The agents are the orchestra. The team of humans stays small. But the capability of each human grows dramatically. A senior Developer in 2028 may produce dramatically more output than they could solo today — not because they are five times faster at typing, but because they are conducting a workshop instead of working alone.
This is not a hybrid team in the AI-Augmented Scrum Guide sense, where the agents are members of the Scrum Team in their own right. The agents belong to individual Developers, not to the team. The accountability lines stay clean: the Developer who directs the agent team owns the output. The Scrum Team owns the Increment.
Shared agent services — agents the whole team uses
Alongside personal agent teams, the team will increasingly use shared agent services. A code-review agent any Developer can ping. A security-scanning agent that runs on every pull request. A documentation agent that auto-updates the wiki. A compliance-check agent that flags issues before deployment.
These are not team members either. They are infrastructure. Like a build server, a linter, or a static analyzer — they provide output that humans evaluate, integrate, and take responsibility for. The shared agent does not accumulate trust the way a human teammate does. It does not get invited to retrospectives. It does not grow into a senior over time. It is a service the team uses.
The accountability rule is the same as for personal agents: whoever uses the output is accountable for it. The Developer who merges the code the shared agent reviewed is accountable for that code. The agent provided a draft, a check, a suggestion. The human took ownership.
Agent platforms — agents as enterprise infrastructure
At a larger scale, agents become infrastructure abstracted from individual developers. Developers do not write agents; they define workloads — input data, decision logic, output handling — and the platform provisions the agents, routes requests, handles scaling, and manages the lifecycle. This pattern is appropriate for thousands of operations per day, hundreds of agents across an organization, or 24/7 reliability requirements.
For most of our delivery work today, personal agent teams and shared agent services are where the action is. The platform pattern matters when our clients run agentic systems at enterprise scale — and at that point, the platform itself becomes part of the product, governed under the same architectural discipline as any other production infrastructure. The agents are still tools. The accountability still lives with the people who build, run, and use them.
Agents that one person directs are tools extending that person. Agents that many people use are infrastructure serving the team. An agent platform is infrastructure serving the organization. In none of these cases are the agents themselves on the team.
Why this distinction matters
The temptation, as agents get more capable, will be to elevate them to colleague status. A shared review agent that catches more bugs than a junior developer starts to feel like a teammate. It is not. It is a more capable tool. The fact that it can produce output that resembles a colleague's output does not make it one — it just means the tool got better.
Holding this line is not about being conservative for its own sake. It is about preserving what makes the team work: shared context, mutual accountability, professional judgment, and the relationship of trust between the team and the client. None of those things can be carried by an agent — personal, shared, or platform-scale, today or in three years. They live with people, in a small team, who understand what they are building and why.
That is the model we are investing in. AI Coaches, prompt libraries, senior development capacity, junior development pathways, working agreements at every level — all of it is aimed at helping our Developers become better conductors, with better workshops, while staying the people who own the work.
How the three approaches compare
The three positions described above lead to very different choices. Here is what each one looks like across the points where they diverge most. The 2020 Scrum Guide is included as a baseline because the AI-Augmented Scrum Guide proposes changes to it, and it is useful to see what those changes actually are.
| Aspect | 2020 Scrum Guide | AI-Augmented Scrum Guide | Axon Model™ + AI (our position) |
|---|---|---|---|
| Who is on the team | Humans only. No sub-teams. | Hybrid. Up to 50% of Developers may be autonomous bots. | Humans only. AI extends each Developer; it is not a team member. |
| Scrum Values | A human practice: Commitment, Focus, Openness, Respect, Courage. | Applied to AI: 'AI commits,' 'AI respects boundaries.' | Stay human. A Developer shows Courage by rejecting weak AI output. |
| Definition of Done | A team commitment, governed by professional judgment. | Machine-enforced via 'Dynamic Quality Gates' that adapt over time. | Human commitment with AI-specific clauses (review, security scan, provenance). |
| Accountability for AI output | Collective Scrum Team accountability for the Increment. | Individual human presenter takes 'full accountability' at Sprint Review. | The Developer who uses the tool owns the output. The Scrum Team owns the Increment. |
| Daily Scrum | 15-minute event for Developers to inspect progress toward the Sprint Goal. | Reframed as 'Deviation Management': humans parse logs and confidence scores. | Stays as it is. Add one question: 'What AI approach are you using, and is it working?' |
| Compatibility with 2020 Guide | n/a (baseline) | Conflicts with several immutable rules. | Fully compatible. The Scrum Team stays. The practice evolves. |
What this looks like in our delivery model today
Our internal AI Adoption Leadership Handbook (March 2026) and the AI Sidekicks operating model (April 2026) — both internal documents that govern how our teams work — put this position into practice. A few of the choices that follow from it:

The four AI Responsibility Levels. As autonomy increases left to right, human accountability does not disappear — it shifts from execution toward supervision and governance. The Developer who deploys an AI agent owns what the agent produces, at every level.
AI Responsibility Levels, not AI team members. We classify how responsibility is shared between human and AI on a given task — Assisted (Level 1), Augmented (Level 2), Supervised (Level 3), Orchestrated (Level 4). The human accountability shifts with the level. The team composition does not.
Working agreements at every level. From the first day a team uses any AI tool, baseline rules apply: AI-generated code is reviewed by a Developer who understands and owns it. Security-critical components are written by humans. Client data never enters public AI systems. Higher levels add prompt libraries, AI contribution tracking, and audit trails.
Architectural discipline scales review. As AI produces more code per Sprint, 'review every line' stops being realistic. Our Definition of Done and our architectural standards are written to make high-volume output reviewable: clear interface contracts, modular boundaries, integration tests against contracts before implementation. The human stays in control of the structure; the AI fills it in.
Senior Developers become more critical, not less. AI increases the volume of code that needs review. Internally we are explicit: senior review capacity is the most important leverage point in AI-augmented development, and squad leaders are accountable for protecting it. Working engineers measuring real production AI workloads have published numbers that make the same point: in a worked example of a compliance screening workload, monthly token cost was $25 and monthly human review cost was $29,000, a ratio of more than 1,000×. The line item that matters is human time, not compute.
Junior Developers are protected, not phased out. The industry has cut junior listings by 60% since 2022. We invest the opposite way. Our juniors are our future seniors, and we measure their skill development separately from their AI-augmented output speed. The empirical case for this approach is stronger than it looks: an Anthropic study of 52 developers learning a new library found that heavy AI use made them marginally faster but led to 17% worse results on a knowledge test, with the difference depending entirely on whether AI was used for explanations or for delegating the work. Speed today, weaker foundation tomorrow, unless you actively manage how juniors use AI to learn.
We measure cycle time, not feelings — and we are introducing survivorship as our most honest measure of AI's contribution. Multiple studies have measured the gap between developer perception and reality: METR's 2025 randomized trial found developers experienced a 19% slowdown despite forecasting a 24% speedup before the study — a 43-point gap (a small sample of 16 developers; METR's February 2026 follow-up across 57 developers showed smaller and noisier effects). The largest study to date — Cui et al. in Management Science, 4,867 developers across Microsoft, Accenture, and a Fortune 100 firm — found a 26% throughput gain. Google's enterprise RCT (Paradis et al., 96 developers) found about 21%. Across all of these, measured effects are smaller and noisier than developer self-reports. Volume is even more misleading: AI that produces 4,000 lines a day, of which only 200 survive into production three months later, is not a 20× senior engineer. It is the same engineer with extra cleanup work. Survivorship — the share of AI-assisted code that endures unchanged in production — is the metric we are building toward, because lines generated is a vanity metric and lines that ship and stay shipped is the real one.
Clients hold the AI tool licenses; we hold the responsibility for making them work. Clients control which tools touch their codebase. We provide the governance, training, prompt libraries, and coaching that turn licenses into outcomes.
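The survivorship measure described above reduces to a simple ratio. The sketch below shows only that ratio, using the 4,000-line and 200-line figures from the example in the text. How lines are attributed to AI assistance and how "unchanged in production" is detected are left open here; the function name and signature are assumptions made for illustration.

```python
def survivorship(lines_generated: int, lines_surviving: int) -> float:
    """Share of AI-assisted lines still unchanged in production after the window."""
    if lines_generated <= 0:
        raise ValueError("lines_generated must be positive")
    if not 0 <= lines_surviving <= lines_generated:
        raise ValueError("lines_surviving must be between 0 and lines_generated")
    return lines_surviving / lines_generated


# Figures from the example in the text: 4,000 lines generated, 200 surviving.
rate = survivorship(lines_generated=4000, lines_surviving=200)
print(f"survivorship: {rate:.0%}")  # 200 / 4000 = 0.05
```

Read this way, the 20× volume in the example corresponds to a 5% survivorship rate, which is the number worth tracking.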
Inside the Scrum events themselves, what changes is small but specific. Each event gets one new question or one new section — the events stay, the practice evolves:

AI changes HOW work gets done. Scrum protects WHY the team exists. Each ceremony adds one AI-specific question — nothing gets removed, nothing gets renamed.
The risks we accept
Holding this position is not the same as holding a risk-free one. The trade-offs of each approach are worth naming honestly:
| Approach | What you gain | What you risk |
|---|---|---|
| Stay silent on AI | Nothing to undo later. | Becomes insufficient as AI matures. Teams improvise inconsistently. |
| Treat AI as a team member | A complete answer for hybrid teams today. | Over-fits to current AI. Sub-teams entrench. Accountability and governance get harder, not easier. |
| Treat AI as a Developer extension (Axon Model™ + AI) | Clean accountability. Works under FINMA, GDPR, FADP, and the EU AI Act. | Depends on developers maintaining their core skills. If AI delegation goes too far, the very expertise needed to supervise AI erodes, which we call skill atrophy. We manage this risk actively (senior review capacity, junior development pathways, Definition of Done that requires explainability), but it is the long-run watch item. |
The skill atrophy risk is the one we take most seriously. If Developers delegate enough of the craft to agents, the judgment they need to evaluate AI output erodes. Our answer is structural: senior review time is protected by squad leader accountability, juniors learn fundamentals before AI output is judged, and we measure skill development separately from velocity. Internally we treat being able to explain AI-generated code in a code review as a precondition for accepting it.
When would we change our mind?
This position is not faith-based. We are willing to be proven wrong by what happens in the field. Our model would need to be rethought if any of the following became true:
Outcomes become predictable. If software development became deterministic — input requirements, output product, no surprises — then iterative inspection and adaptation would be overkill. We would need a manufacturing process, not iterative discovery. There is no evidence we are heading there. Complexity is increasing, not decreasing.
Human judgment becomes optional. If AI systems could be held legally and morally accountable for outcomes, the entire concept of a Developer who owns the work would need rethinking. We are nowhere near this, and the regulatory direction (EU AI Act, FINMA guidance, FADP) is explicitly the opposite — pushing accountability toward identifiable humans, not away from them.
Stakeholder negotiation disappears. If AI could perfectly understand and reconcile competing stakeholder needs, the Product Owner role would dissolve. This is not happening. If anything, AI makes stakeholder alignment harder by making it cheaper to generate features that were never asked for.
Teams stop being teams. If software were produced entirely by individuals plus their tools, with no collaboration required, the small-team model would become irrelevant, but so would most of how organizations work. We are not in this scenario either.
None of these conditions are met. The model keeps working because the conditions that make it useful keep holding. If they stop, we will adapt.
Why this matters for our clients
Most AI-and-development content on the market is selling a transformation. We are not. Our differentiation is restraint: we do not restructure your team, we do not put bots in your standup, and we do not let machines decide what 'Done' means. We adopt AI aggressively at the Developer level (Level 3 is our 2026 target across all squads), but we keep the team and the accountability that have made the Axon Model™ work for fifteen-plus years. AI is the extension. The model stays.
Trust, transparency, and deep product understanding are what our clients hire us for. AI amplifies that. It does not replace it. And the human in the loop — the developer who knows your domain, your codebase, and your business — stays exactly where they have always been: at the helm.
Sources and further reading
All sources used in this article, with direct links to the original research and primary documents.
Frameworks and standards
The Scrum Guide 2020. Schwaber and Sutherland. The current canonical Scrum framework.
The AI-Augmented Scrum Guide. The proposal that treats AI agents as members of the Scrum Team — the position this article disagrees with.
Axon Model™. Our delivery framework, based on Scrum, refined across fifteen-plus years of running dedicated teams for regulated clients. AI is the extension, not the replacement.
Productivity research and field studies
Stack Overflow — 2025 Developer Survey. 84% of developers use or plan to use AI tools, up from 76% in 2024.
Cui, Demirer, Jaffe, Musolff, Peng, Salz — The Effects of Generative AI on High-Skilled Work (Management Science, February 2026). Three field experiments with 4,867 software developers across Microsoft, Accenture, and a Fortune 100 firm; 26.08% increase in completed tasks; greater gains for less experienced developers.
Paradis et al. — How much does AI impact development speed? Google enterprise RCT with 96 developers; about 21% speedup on coding tasks.
METR — Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (July 2025). 16 experienced developers, 246 tasks; 19% slowdown observed despite developers expecting 24% speedup. METR's February 2026 follow-up with 57 developers shows smaller, less certain effects in newer cohorts.
Shen and Tamkin — How AI Impacts Skill Formation (Anthropic, January 2026). Randomized controlled trial with 52 developers; 17% worse knowledge-test results among AI users, with the gap depending on whether AI was used for explanation or delegation. Full paper on arXiv.
Bedard et al. — When Using AI Leads to 'Brain Fry' (Harvard Business Review, March 2026). BCG Henderson Institute survey of 1,488 US workers; 14% report cognitive exhaustion from AI oversight.
THE DECODER — Frontier Radar #2: Why AI productivity gets lost between benchmarks and the balance sheet (March 2026). Synthesis of 20+ academic and institutional sources on the gap between task-level AI gains and economy-level productivity.
Regulatory frameworks
FINMA. Swiss Financial Market Supervisory Authority. Regulator for our clients in the financial sector.
EU AI Act. The European Union's regulatory framework for artificial intelligence; in force across all EU member states.
GDPR. General Data Protection Regulation. The European data protection framework.
FADP — Swiss Federal Act on Data Protection. The Swiss data protection law, in force since September 2023. Particularly relevant for our Swiss client base.
Media coverage
Vietnam Economic Times — “Human touch” (Issue 454, April 27, 2026). Coverage of the DevDay 2026 panel on AI and software development. Features perspectives from Sebastian Sussmann (Axon Active), Talal Dib (Open Web Technology Vietnam), Tai Huynh (Kyanon Digital), Phan Van Binh (MGM Technology Partners Vietnam), and Professor Anand Nayyar (Duy Tan University). Panel consensus: coding is becoming easier, but delivering reliable business outcomes remains a human responsibility.
Agile thought leadership
Mike Cohn — The Cost of Change Curve Is Outdated (Mountain Goat Software, March 2026). AI is flattening Boehm’s cost-of-change curve; the bottleneck shifts from development effort to feedback delay. “Modern software development rewards adaptability more than accuracy”.
Practitioner sources
Bill Cox — AI at the Helm: The AI-Driven Revolution in Software Coding and CodeRhapsody. 240,000 lines of personal AI-supervised practice; the 35,000 vs 58,000 lines comparison; the 'harness' concept of structural discipline around AI.
Thomas De Vos — Claude Code: Building Production Agents That Actually Work (April 2026). Engineering field manual on production AI agents.
Anthropic — Claude Code. The agentic coding tool referenced in this article; designed around visibility into agent actions and explicit human approval for tool use.
