tl;dr: "The New SDLC With Vibe Coding" gets the big shift right: more software work starts from intent, context, and agent supervision instead of raw typing, but the real bottleneck moves to review, validation, and operational judgment.

I spent some time going through "The New SDLC With Vibe Coding," subtitled "From ad-hoc prompting to Agentic Engineering," by Addy Osmani, Shubham Saboo, and Sokratis Kartakis. It is a paper about how AI changes the software development life cycle. The short version is that I think it is mostly right about the direction of travel, but a bit too polished about the messy reality of working this way day to day.

The core idea is simple. The old SDLC assumed that the expensive part of software was producing code. The new one assumes that code generation is getting cheaper, while context, review, orchestration, and judgment become more important. That part tracks with what I see in real work. I can get code faster than before. What I cannot get automatically is confidence.

Why this paper matters now

There is already too much shallow AI commentary floating around, so I only pay attention when somebody tries to describe how the whole workflow changes instead of just saying developers will be "more productive." This paper at least tries to do the harder thing. It talks about planning, architecture, implementation, testing, review, deployment, maintenance, and the changing role of the developer across all of it.

That is the right frame. AI coding tools are not just autocomplete with better marketing. They change where effort goes. If code becomes cheaper to produce, the failure modes move upstream and downstream. Bad requirements get amplified faster. Weak architecture gets implemented faster. Sloppy review becomes more dangerous because there is now more output to review.

So even before getting into whether every claim lands perfectly, I think the paper asks the right question: if code is no longer the main constraint, what is?

The paper's main argument

The paper's argument, as I read it, has four parts.

First, the traditional SDLC is under pressure because AI can now participate in every phase, not just implementation. Requirements can be turned into drafts, architecture can be sketched, test cases can be proposed, refactors can be executed, and maintenance work can be accelerated.

Second, the developer's job shifts from syntax production toward intent specification. The paper comes back to this repeatedly. Instead of spending most of my time manually expressing every implementation detail, I spend more time describing what should happen, constraining the problem, correcting the model, and evaluating whether the result is actually acceptable.

Third, context engineering and harness engineering become first-class skills. That means the quality of the result depends less on whether I can type quickly and more on whether I can provide the right inputs, guardrails, tools, policies, and feedback loops around the model.

Fourth, the economics change. The paper contrasts low-effort vibe coding with more disciplined agentic engineering. That distinction is useful. One approach feels cheap at the start and expensive later. The other costs more to set up but scales better because it has structure, repeatability, and better controls.

I think that is the best part of the paper. It avoids the childish version of the conversation where AI either replaces everyone or changes nothing.

What I found useful

The section on the shift from syntax to intent is the strongest one. I do think "intent as interface" is a real idea, not just branding. A lot of modern engineering work is starting to look like defining a target state clearly enough that a machine can help move toward it. That still requires technical depth. In some cases it requires more of it, because vague intent produces plausible garbage at high speed.

The paper is also right to emphasize context engineering. In practice, the difference between a useful coding agent and an annoying one is usually not the model alone. It is whether the model has the right repository context, constraints, architecture hints, acceptance criteria, and access to tools. A strong engineer with a mediocre prompt is often less effective than a strong engineer with a good harness.
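To make that concrete, here is a minimal sketch of treating context as something I can check before handing work to an agent. It is my own illustration, not something from the paper: the `AgentContext` class, its field names, and the `missing_context` helper are all hypothetical, chosen to mirror the list in the paragraph above.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AgentContext:
    # Hypothetical fields mirroring the kinds of context named in the text.
    repository_notes: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    architecture_hints: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)
    allowed_tools: list[str] = field(default_factory=list)


def missing_context(ctx: AgentContext) -> list[str]:
    """Return the names of context fields that are still empty."""
    return [f.name for f in fields(ctx) if not getattr(ctx, f.name)]


ctx = AgentContext(
    constraints=["no new dependencies"],
    acceptance_criteria=["existing tests still pass"],
)
print(missing_context(ctx))
```

The point is not the data structure. It is that gaps in context become visible before the model starts guessing.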

I also liked the framing around developers acting as conductors and orchestrators. That matches what happens once you stop treating the model like a fancy text box and start treating it like a system component. You are not just asking for code. You are deciding what should be delegated, what needs synchronous oversight, what should be sandboxed, and what must never be trusted without verification.
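One way I think about those decisions is as an explicit policy rather than a gut call each time. The sketch below is only an illustration under my own assumptions: the `Task` fields, the `Oversight` levels, and the ordering of rules in `decide_oversight` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Oversight(Enum):
    DELEGATE = "delegate"            # agent works alone, reviewed afterwards
    SYNCHRONOUS = "synchronous"      # a human watches and steers in real time
    SANDBOX = "sandbox"              # agent runs only in an isolated environment
    VERIFY_ALWAYS = "verify_always"  # output is never trusted without verification


@dataclass(frozen=True)
class Task:
    # Hypothetical risk flags; a real team would pick its own.
    name: str
    touches_auth_or_secrets: bool = False
    runs_untrusted_commands: bool = False
    affects_production: bool = False


def decide_oversight(task: Task) -> Oversight:
    """Pick an oversight level; rules are checked strictest first."""
    if task.touches_auth_or_secrets:
        return Oversight.VERIFY_ALWAYS
    if task.runs_untrusted_commands:
        return Oversight.SANDBOX
    if task.affects_production:
        return Oversight.SYNCHRONOUS
    return Oversight.DELEGATE


print(decide_oversight(Task("rename a local variable")).value)
print(decide_oversight(Task("rotate API keys", touches_auth_or_secrets=True)).value)
```

The specific rules matter less than the fact that they are written down, so the decision is reviewable instead of improvised.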

The vibe coding spectrum is another useful inclusion. I am glad the paper does not pretend every AI-assisted workflow is the same. There is a big difference between hacking together a disposable prototype and producing something that has to survive security review, on-call rotations, and future maintenance.

Where I think the paper is too clean

The paper is strongest when it talks about direction and weakest when it makes the work sound frictionless.

My main issue is that AI does not remove engineering difficulty. It often redistributes it into areas that are harder to measure and easier to underestimate. If a model writes 500 lines in minutes, that does not mean I saved the time I would have spent writing 500 lines. It may mean I now need to review 500 lines, validate assumptions I did not make explicitly, and test behavior that looks reasonable but is subtly wrong.

That is where the "hidden debt of vibe coding" section matters, and I think it could have gone even harder. The debt is not just bad code style or some generic maintainability problem. It is unclear intent, shallow understanding, weak traceability, and fragile systems that nobody can confidently modify later because too much of the reasoning lived in a transient chat instead of durable project artifacts.

I also think papers like this tend to understate operational reality. Production systems are not judged by whether code appeared quickly. They are judged by whether alerts stay quiet, rollbacks are rare, incident handling is sane, and future changes remain safe. Faster implementation is useful, but only if the surrounding controls keep up.

Security is another area where optimism needs a leash. The paper clearly acknowledges security risks around AI-generated code, which is good. But the practical burden remains human. Someone still has to verify dependency choices, authentication logic, secret handling, authorization boundaries, unsafe defaults, and deployment blast radius. Models can help inspect those things. They do not remove the need for somebody accountable to care.

What changes in the SDLC, in plain English

The paper breaks the lifecycle into the usual phases and argues that AI touches all of them. I think that is correct, but the impact is uneven.

In requirements and planning, AI is useful for turning fuzzy ideas into candidate scopes, edge cases, and drafts. That is helpful, but it does not solve the hard problem of deciding what actually matters.

In design and architecture, it can generate options and tradeoff summaries quickly. That is great for exploration. It is much less great when teams confuse generated architecture prose with actual architectural clarity.

In implementation, the gains are obvious. This is where the tools feel magical because they remove the most visible manual work. But this is also where people get overconfident, because clean-looking code is not the same thing as a well-understood system.

In testing and QA, AI can help generate tests, suggest missing cases, and speed up coverage work. I find that useful. But the test plan still reflects the assumptions I choose to encode. A model can multiply good testing judgment or multiply shallow testing judgment.

In review, deployment, and maintenance, the real bill shows up. If output volume rises, review quality has to rise with it. If more changes make it to production faster, monitoring, rollback discipline, and operational ownership matter more, not less. Maintenance also becomes more dependent on good context because future engineers need to understand not just what the code does, but why it exists.

My take on the economics

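One of the more interesting parts of the paper is the contrast between low CapEx, high OpEx workflows and high CapEx, low OpEx workflows. As I read it, CapEx here is the up-front effort of setting up the workflow and OpEx is the ongoing cost of every change made with it. That sounds abstract, but it is actually practical.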

Throwing prompts at a model with no structure is cheap at the start. Anyone can do it. Sometimes that is exactly right. If I am building a throwaway script for myself, I do not need a whole system around it.

But once the work matters, the economics flip. Good repository structure, clear acceptance criteria, tool permissions, test scaffolding, review discipline, and reusable context all cost effort up front. They are the setup cost for using agents safely and repeatedly. That investment is boring compared to the demo, but it is where sustainable leverage comes from.
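The flip can be written as a simple break-even calculation. This is a back-of-the-envelope sketch, not a model from the paper: the function names are mine, and the hours in the example are made-up numbers purely for illustration. It assumes each workflow has a fixed setup cost plus a constant cost per change.

```python
import math
from typing import Optional


def total_cost(setup_hours: float, hours_per_change: float, changes: int) -> float:
    """Total effort: one-time setup plus a constant cost per change."""
    return setup_hours + hours_per_change * changes


def break_even_changes(
    setup_a: float, per_change_a: float,
    setup_b: float, per_change_b: float,
) -> Optional[int]:
    """Smallest number of changes at which workflow B is strictly cheaper than A.

    Returns None if B never becomes cheaper per change than A.
    """
    if per_change_a <= per_change_b:
        return None
    threshold = (setup_b - setup_a) / (per_change_a - per_change_b)
    return max(0, math.floor(threshold) + 1)


# Made-up numbers: unstructured prompting (A) vs. a prepared harness (B).
n = break_even_changes(setup_a=0, per_change_a=3, setup_b=40, per_change_b=1)
print(n, total_cost(0, 3, n), total_cost(40, 1, n))
```

With those invented inputs the two workflows tie at 20 changes and the harness is ahead from the 21st onward. Real numbers will differ, but the shape of the argument is the same: the setup cost only pays off if the work is repeated enough.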

This is why I think "harness engineering" is probably the most important phrase in the paper. The model is only one part of the system. The surrounding process decides whether the output is a liability or an advantage.

What I would actually take from it

If I had to reduce the paper to a few practical lessons, they would be these.

Write better requirements, because ambiguity compounds faster now.

Treat context as infrastructure. Good docs, clear boundaries, useful tests, and explicit constraints are no longer nice extras. They directly affect the quality of machine-generated work.

Invest in review and verification, not just generation. If output gets cheaper, validation becomes the scarce resource.

Use vibe coding where the risk is low and the time horizon is short. Do not pretend that prototype habits automatically scale into production habits.

Build harnesses, not just prompts. If a workflow matters, it deserves repeatable tools, guardrails, and acceptance checks.
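As a rough picture of that last lesson, here is a minimal sketch of an acceptance harness that a generated change has to pass before a human even looks at it. Everything in it is my own assumption for illustration: the `Check` type, the three example checks, and the `slugify` function being asked for are hypothetical, and the checks only inspect source text with Python's standard `ast` module rather than running the candidate.

```python
import ast
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Check:
    name: str
    passes: Callable[[str], bool]


def _parse(source: str) -> Optional[ast.Module]:
    """Parse the candidate, or return None if it is not valid Python."""
    try:
        return ast.parse(source)
    except SyntaxError:
        return None


def parses(source: str) -> bool:
    return _parse(source) is not None


def avoids_eval_and_exec(source: str) -> bool:
    """Guardrail: reject direct calls to eval() or exec()."""
    tree = _parse(source)
    if tree is None:
        return False
    return not any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in {"eval", "exec"}
        for node in ast.walk(tree)
    )


def defines_function(name: str) -> Callable[[str], bool]:
    """Acceptance check factory: the candidate must define `name`."""
    def check(source: str) -> bool:
        tree = _parse(source)
        if tree is None:
            return False
        return any(
            isinstance(node, ast.FunctionDef) and node.name == name
            for node in ast.walk(tree)
        )
    return check


def run_harness(candidate: str, checks: list[Check]) -> list[str]:
    """Return the names of failed checks; an empty list means accepted."""
    return [c.name for c in checks if not c.passes(candidate)]


CHECKS = [
    Check("parses as Python", parses),
    Check("avoids eval/exec", avoids_eval_and_exec),
    Check("defines slugify", defines_function("slugify")),
]

good = "def slugify(text):\n    return text.lower().replace(' ', '-')\n"
bad = "def slugify(text):\n    return eval(text)\n"
print(run_harness(good, CHECKS))
print(run_harness(bad, CHECKS))
```

These checks are deliberately shallow. A real harness would also run tests in a sandbox, but even this much turns "looks fine to me" into a repeatable gate.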

Final thought

I think "The New SDLC With Vibe Coding" gets the big picture right. The SDLC is changing, and the center of gravity really is moving from manual code production toward intent, context, orchestration, and judgment.

What I would add is that this shift does not make engineering less demanding. In many ways it raises the bar. The hard part is no longer proving that I can write code. The hard part is proving that I can shape systems, constrain automation, and keep quality from collapsing under the weight of faster output.

That is the version of the "new SDLC" I believe in. Not magic. Not replacement. Just a new distribution of effort, with new leverage and new ways to fail.