Chimp with a Machine Gun

In Better Call Saul, Chuck McGill tells his younger brother Jimmy:

"Slippin' Jimmy with a law degree is like a chimp with a machine gun."

It is a deliberately ugly line. Chuck uses it to dehumanise Jimmy and to justify the lengths he is willing to go to keep him out of the law. The metaphor is partly fair: Jimmy genuinely does take risks, bend rules, cut corners, and put real people at real risk. It is also partly unfair: it leaves out everything human about Jimmy, and treats "capable of doing damage" as the same thing as "guaranteed to do damage." That tension is the whole point of the show.

I keep coming back to that line, but for a different reason. I keep meeting people who fit it.

Today, in tech, the chimp is not a person. It is the situation. Generative AI (ChatGPT, Claude, Cursor, Lovable, Copilot, Antigravity, and whichever one lands next month) has put a working machine gun into the hands of anyone with an internet connection and a laptop. People who have never written software are now shipping software. I am not writing this to tear anyone down.

I am writing it because I have started to see how this ends. The bullet usually finds the person holding the gun — their savings, their time, their reputation, their next job. The people they were trying to build something for usually walk away unharmed. The cost lands on the builder.

What I am not saying

I am not saying do not try. People should try. Trying is the only way anything new ever happens.

I am not saying you have to hire experts before you do anything. Elon Musk did not study mechanical engineering before he invested in Tesla, and he did not study aerospace before he started SpaceX, though he did hire the experts. I am not asking that of everyone. Most ideas do not yet have the budget to hire anyone, and that is fine.

What I am asking is much smaller. Before you do anything serious — before you spend money you cannot get back, before you put a product in front of a customer, before you bet a year of your life on it — spend some time on preparation. That is the only ask in this whole post.

What I am saying

The LLM tooling pitch right now is roughly: use our product, and you can ship a SaaS, a PaaS, a web application, a workflow, a side business — without hiring a software engineer. Technically, that is not a lie. You can. The screen will load. The button will work. Stripe will probably accept the payment.

What the pitch does not say out loud is the same thing private financial institutions do not say loudly when they sell you stock-market investments: here is the actual risk. That part lives in the small print. It is not in the marketing video.

It is on us — not on them — to stay aware of the risks and make the decision. Not everyone wins in the stock market. Even experience does not guarantee a win. The people who consistently come out on top are the ones with expertise — the ones who know when to play, and when not to. Software is the same.

Why the LLM is not the engineer it looks like

There is a structural reason for this, and it has nothing to do with whether the model is "smart enough."

LLMs are trained on data. They are not trained on the process that generated the data. When ChatGPT or Claude is trained on decades of human-written code, what it sees is the code — committed to GitHub, posted on Stack Overflow, embedded in documentation. What it does not see, because most of it was never written down, is the considerations behind the code. The trade-offs. The bug that woke somebody up at 2 a.m. and quietly led to that one defensive if statement. The cost calculation that ruled out a different approach. The security review that removed a feature. The conversation that decided this is a queue, not a database write.

The code survived. The thinking did not.

So when you ask an LLM for "a full-stack application," what you get back is statistically excellent code — on the common case. The kind of pattern that has been written ten thousand times before. On niche libraries, unusual constraints, or anything genuinely new, the quality degrades faster than the demo will let you see. And in either case, what you do not get is any assurance that the code is safe, non-vulnerable, performant under load, cheap to maintain, or unlikely to turn into a six-figure monthly cloud bill in eighteen months. That is not the model's responsibility. The model is not on the hook for the outcome. You are.

Some readers will say: yes, but if your prompt is good, the output is good. They are not wrong. The prompt shapes the output. And modern agentic models go further — they can iterate on their own prompts, run the code, see what fails, refine, and try again. That has narrowed the gap between what the user asked for and what the model actually attempts.

But the initial framing still has to come from somewhere. Whether to build at all. What constraints actually matter. Which trade-offs are acceptable. Whose data is on the line. None of that lives in the model. So the question still stands: where is the first prompt supposed to come from? A trained developer with ten years of experience misses considerations all the time — that is exactly why production bugs exist. How is a beginner, or a non-technical founder, supposed to know what to put into the prompt that starts the loop? And even if they get that prompt right — can we honestly say the LLM will return exactly the right code with no hallucination? Hallucination rates have come down a lot with retrieval, tool use, and reasoning chains, but they are not zero. On the sharp edges that actually matter — specific API versions, auth flows, security corner cases, infrastructure pricing — they are still high enough to draw blood. You and I both know the answer.

The cooking analogy

I keep using this one because it lands.

After enough YouTube cooking videos and saved recipes, plenty of people start cooking real meals at home. Some of them get good. A few get very good. None of that, by itself, makes them ready to open a restaurant.

A restaurant is not a cooking problem. It is a kitchen problem. A supply-chain problem. A costing problem. A hygiene problem. A customer-flow problem. A staffing problem. A tax problem. Cooking well is one of maybe twenty skills you need, and on its own, it does not make a business.

Being good at one thing does not make you good at the things sitting next to it. AI does not change that.

What changes is who is hiding that gap from you. The AI itself does not hide it. The companies selling the AI do. When the tool feels like it is helping you, the people it is really built to help are the investors who funded the company building it. You are paying for the privilege. You are also being measured, profiled, and routed.

Your digital footprint — every post, every contribution, every conversation you have ever had on these platforms — has already been read and weighted. The system knows what to sell you before you know you want it. So the offer arrives perfectly shaped to you: first as a free tier, then as a small subscription that feels fair, then as a larger one you cannot easily walk back. Sometimes ending in a contract or a debt you did not quite realise you signed up for.

Be cautious about who is selling, what they are selling, and who they are really selling for.

What I keep seeing

This is not theoretical. Four things from the last few months alone.

1. The ex-colleague with the ML idea. A friend who moved from a consulting team had a genuine machine-learning product idea. He started building it in Lovable. When he got stuck on the payment gateway and the ML integration, he reached out to me. He told me, sincerely, that he believed he could run the whole thing on roughly ₹10,000 a month using free tiers, ChatGPT, and Gemini. I sat with him and worked out a rough infrastructure estimate against his actual workload: storage, compute, ML inference cost, payment gateway fees. The number was nowhere near ₹10,000. If the business did not pick up, there would be no way to recover that spend. He paused the project.

2. The Databricks workflow on my current engagement. We are building a workflow that pulls data from external sources like AWS, runs it through a medallion architecture, and lands the cleansed data in the gold layer. The design was generated cleanly in Cursor. The stories went into the sprint. Development is almost done with very little intervention from me. From the beginning I had a feeling the design was off, but I am not the deepest expert in this exact stack, so I did not have airtight arguments to push back. We shipped roughly that design. Then I started watching the run times, the resource utilisation, the actual cost. It is not great. The damage is done. That is exactly the case I am trying to make: an actual expert in this stack would have flagged this before a single line of pipeline code was written. Why would anyone on the team worry about it now? The client is paying.

3. The senior architect with the Copilot story. A senior solution architect in my current company, a Java expert by background, told me he uses Copilot to generate React code, and that because he can do that, he can train juniors to do the same. He may genuinely be able to ship React this way. The part I cannot reconcile is the second half. If your training plan for juniors is "learn to prompt Copilot," I have a quiet question about your architect and mentor skills too.

4. My nine-year-old son. With Google's Antigravity tool, he built a car game in Python, got it running, and pushed the code public — without yet knowing what a print statement does. I am not upset about this. I am proud. I am also extremely aware of what just happened. A child shipped code he cannot read. Now multiply that by every adult with an idea and a credit card.

None of these four people are doing anything wrong. They are trying. Trying is good. The worry is not the trying. The worry is everything that is not on the screen — all the parts of software that a working demo does not show.

Why I think I get to say this

I started as a UX designer. Then a frontend developer. Then a backend developer. Then a full-stack developer. Then a microservices architect. Now I work in data engineering.

I do not list this with pride. Every single time I stepped into a new layer, I did not know much. I learned the same way most honest engineers do — make it work first, make it better later, learn from people who already had the scars, and then, mostly, learn from my own mistakes. Production has been my best teacher, and it has charged me tuition every year.

That is the only credential behind this post. I have been the beginner in every one of these rooms, which means I know what beginners do not know — because I once did not know it either.

How I actually use AI

I should be clear about something before this post turns into a sermon. I use AI heavily. Every day. This is not a complaint about the tools. It is a complaint about using the tools without the layer underneath.

The way I think about it is simple: the vision is mine. The execution is faster, safer, and cheaper because of AI. I do not ask the model to tell me what to build. I ask it to help me build what I already know I want, and then I push back on what it gives me. Four examples from the last few months on real engagements:

Security and infrastructure review. I have used AI to surface security gaps and architecture improvements that genuinely reduced a client's cloud spend. Not by guessing — by pointing me at patterns I would have caught on my own with more time. I read the output, I argued with it, I kept what was right.
Compressed delivery. I shipped a complete UI application in a single sprint that was originally planned for an entire quarter. Full testing, performance budget, security checks — all of it. It was not magic. It was AI doing the rote parts at speed while I drove the design, the trade-offs, and the review.
End-to-end test automation. I built out Cucumber-based E2E automation with much better structure and far more reusability than I would have written by hand under that time pressure. The design choices were mine. The expansion was the model's.
Low-cost AI agent on Databricks. I built a working Databricks AI agent for a proof-of-concept at a cost that would not have been reachable on a hand-written timeline. The scope was kept deliberately small, because the goal was a PoC and not a production launch — and knowing where to draw that line is itself part of the discipline.

A beginner could attempt every one of those four things tomorrow. With AI, they could probably produce something that runs. But mine ran better, faster, and cheaper, and someone with more experience than me (there are plenty of them) would do all four better still. That is the entire point.

Experience is not just what you delivered. It is how you delivered it — what you noticed, what you skipped, what you refused to ship, what you knew to ask the model in the first place, and what you knew to throw away when the model got it wrong. The how is what the LLM cannot hand you. It is what production has handed me, year after year. It is what the rest of this post is trying to protect.

What "just build a full-stack app" actually contains

When the LLM marketing video says "build a full-stack app," here is what it is quietly skipping over:

UX design — usability, accessibility, engagement, business impact
Infrastructure and architecture design
Cloud and hosting selection
Database choice and schema design
Frontend development
Backend development
Observability, security, reliability, scalability
Automation and CI/CD
Documentation
Failure isolation
Versioning and backward compatibility
Microservice strategy, if you go that way
Running cost — and the ongoing work to keep it from eating you alive

Each of those is its own discipline with its own checklist that takes years to internalise. Below are six of those layers, in the smallest form I can put them, five points each. Every one of these points is the surface of a much longer list, but this is enough to show you what is actually under the marketing copy.

UX design. The screen people see is the result of decisions made before any tool was ever opened. The five things that matter most:

Understand your users — who they are, what they need, how they actually behave, not how you imagine they behave
Define the problem first — articulate what you are solving before you touch wireframes or actual code
Run a real process — empathy, definition, ideation, prototyping, testing, not a one-shot draft
Design for usability and accessibility together — easy to use, and usable by people regardless of ability, device, or connection
Balance the three pulls — user needs, business goals, technical constraints; designs that ignore any one of these die in production

Software design and architecture. Architecture is what is expensive to change later. Get the cheap things wrong and you fix them in an afternoon. Get the architectural things wrong and you live with them for years.

Design principles — separation of concerns; the rules that keep code readable, testable, and changeable as the team grows
Design patterns — the standard catalogue (Factory, Strategy, Observer, Adapter, Repository, and the rest); knowing what you are reaching for instead of reinventing it badly
Architectural styles and patterns — monolith vs microservices vs serverless; layered, hexagonal, clean, event-driven, CQRS; choosing the shape that fits the problem, not the trend
Domain-driven design — bounded contexts, aggregates, ubiquitous language; designing the code around the actual business, not around whatever framework is fashionable this year
Decision records and diagrams — documenting why the architecture is the way it is, so the next person does not undo it by accident
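
To make one item from the list above concrete, here is a minimal sketch of the Repository pattern in Python. Everything in it is hypothetical and chosen only for illustration: the Customer type, the CustomerRepository interface, the in-memory implementation, and the change_email function. The point it tries to show is separation of concerns: the business rule talks to an interface and never learns where the data actually lives.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Customer:
    """Hypothetical domain object, for illustration only."""
    customer_id: int
    email: str


class CustomerRepository(Protocol):
    """The only thing business code is allowed to know about storage."""

    def get(self, customer_id: int) -> Optional[Customer]: ...

    def add(self, customer: Customer) -> None: ...


class InMemoryCustomerRepository:
    """A stand-in for a real database-backed repository (assumed, not shown)."""

    def __init__(self) -> None:
        self._rows = {}

    def get(self, customer_id: int) -> Optional[Customer]:
        return self._rows.get(customer_id)

    def add(self, customer: Customer) -> None:
        # Adding a customer with an existing id overwrites the old record.
        self._rows[customer.customer_id] = customer


def change_email(repo: CustomerRepository, customer_id: int, new_email: str) -> Customer:
    """Business rule: depends on the interface, not on any database."""
    existing = repo.get(customer_id)
    if existing is None:
        raise LookupError(f"no customer with id {customer_id}")
    updated = Customer(existing.customer_id, new_email)
    repo.add(updated)
    return updated


# Usage: the same change_email would work against any repository
# that offers get() and add().
repo = InMemoryCustomerRepository()
repo.add(Customer(1, "old@example.com"))
updated = change_email(repo, 1, "new@example.com")
```

Swapping the in-memory class for one backed by a real database should not require touching change_email at all. That is the kind of decision that is cheap on day one and expensive to retrofit later.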

Database and schema planning. Code is rewritten every sprint. Schemas outlive the code that reads them — sometimes by decades. Five decisions worth making before you write the first migration:

Database choice — relational, document, key-value, wide-column, search, time-series, OLAP, graph, or some combination; pick the engine that matches the access pattern, not the one your team already knows
Schema and data modelling — normalisation vs denormalisation, surrogate vs composite keys, audit columns, soft delete, JSON columns; which tables are source-of-truth and which are derived
Indexes and query design — single-column, composite, covering, partial, full-text; reading EXPLAIN; eliminating full table scans; keyset over offset pagination at scale
Transactions and consistency — ACID, isolation levels, optimistic vs pessimistic locking, what must be atomic, what can tolerate eventual consistency, how conflicts are resolved
Backups, migrations, and retention — backup strategy, tested restore drills, online schema changes, rollback plans, retention windows, and a written RTO/RPO target before a disaster forces you to write one
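
The "keyset over offset pagination" point above is easier to see in code. The sketch below uses Python's built-in sqlite3 module and a made-up orders table; the table, its columns, and the page size are assumptions for illustration, not anything from a real system. Both functions return the same page, but they ask the database for it in different ways.

```python
import sqlite3

# Hypothetical table with 100 rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO orders (id, customer) VALUES (?, ?)",
    [(i, f"customer-{i}") for i in range(1, 101)],
)

PAGE_SIZE = 10


def page_by_offset(conn, page_number):
    """Offset pagination: the engine has to step past every skipped row."""
    offset = page_number * PAGE_SIZE
    return conn.execute(
        "SELECT id, customer FROM orders ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()


def page_by_keyset(conn, last_seen_id):
    """Keyset pagination: seek directly to the rows after the last id seen."""
    return conn.execute(
        "SELECT id, customer FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, PAGE_SIZE),
    ).fetchall()


offset_page = page_by_offset(conn, 3)   # zero-based page 3: ids 31 to 40
keyset_page = page_by_keyset(conn, 30)  # rows after id 30: ids 31 to 40
```

On a hundred rows the difference is invisible. On a large table, the offset version typically gets slower the deeper a user pages, because the skipped rows still have to be walked, while the keyset version can use the primary-key index to jump to its starting point. The trade-off is that keyset pagination cannot jump to an arbitrary page number.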

Frontend. The browser is the most hostile runtime in mainstream computing. Five layers that decide whether a frontend feels like a product or a toy:

Rendering and page architecture — CSR, SSR, SSG, ISR, streaming; SPA vs MPA; pick what matches the workload, not the trend
Core building blocks — app shell, router, component layer, state store, data-fetching layer, cache, persistence
Data and state management — server state vs client state vs ephemeral UI state; what survives a refresh, what lives in the URL
Mandatory UI states — loading, success, error, empty, stale, offline, unauthorized, optimistic-pending; every screen needs all of them, and most do not
Performance discipline — bundle splitting, lazy loading, virtualisation, image and font handling, watching the web vitals
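
The "mandatory UI states" item above can be turned into a checklist that a machine enforces. The sketch below is written in Python only to keep one language across this post; in a real frontend the same idea would more likely be a TypeScript union type. The UiState names mirror the list above, while the orders_screen mapping and the missing_states helper are hypothetical.

```python
from enum import Enum


class UiState(Enum):
    """The states every screen is expected to handle."""
    LOADING = "loading"
    SUCCESS = "success"
    ERROR = "error"
    EMPTY = "empty"
    STALE = "stale"
    OFFLINE = "offline"
    UNAUTHORIZED = "unauthorized"
    OPTIMISTIC_PENDING = "optimistic-pending"


def missing_states(handled):
    """Return the states a screen has no rendering for, in declaration order."""
    return [state for state in UiState if state not in handled]


# Hypothetical screen that covers the happy path and a generic error,
# which is roughly what a one-shot generated screen tends to include.
orders_screen = {
    UiState.LOADING: "spinner",
    UiState.SUCCESS: "orders table",
    UiState.ERROR: "generic error banner",
}

gaps = missing_states(orders_screen)
for state in gaps:
    print(f"orders screen does not handle: {state.value}")
```

A check like this could run in a test suite, so a screen that forgets the empty or offline case fails before a user finds it.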

Backend. The backend is the part the user never sees and never forgives when it breaks. Five concerns that separate a hobby backend from a production one:

API contracts and versioning — what the backend promises, what changes safely, what a retry does, what an empty response means
Authentication and authorization — identity, sessions, JWT/OAuth/API keys, tenant isolation, least privilege
Reliability primitives — timeouts (always), retries with backoff and jitter, idempotency, circuit breakers, dead-letter queues
Performance and scale levers — caching at multiple layers, async work via queues, read replicas, horizontal scaling, connection pool sizing
Observability and operations — structured logs with correlation IDs, RED/USE metrics, distributed tracing, runbooks, SLOs
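
As one illustration of the reliability primitives above, here is a minimal sketch of retries with exponential backoff and jitter. The TransientError type, the call_with_retries helper, and all the default values are assumptions made up for this example; a production system would usually lean on a well-tested library and would also put a timeout on the operation itself, which is not shown here.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def call_with_retries(operation, max_attempts=4, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep):
    """Run operation(), retrying on TransientError with backoff and full jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff, capped at max_delay, then "full jitter":
            # wait a random amount between 0 and the capped delay so that
            # many clients do not all retry at the same instant.
            capped = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, capped))


# Usage: a fake operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary failure")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky, sleep=delays.append)
```

Retrying is only safe when the operation is idempotent. If flaky were "charge the customer's card", each retry could charge again unless the request carried an idempotency key that the server honours. That pairing is why retries and idempotency sit next to each other in the list.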

Data engineering, with AI and ML on top. This is where a lot of the current LLM-in-production work actually lives, and it is the most invisible from the marketing video:

Source-to-store pipelines — extracting from operational databases, APIs, files, and streams; cleaning, joining, deduplicating; landing the result somewhere queryable
Data quality and governance — schemas, data contracts, lineage, PII classification, retention windows, audit trails
Layered storage — bronze/silver/gold (or raw/staging/curated/serving); keeping OLTP and OLAP separated; choosing the right file formats (Parquet, Delta, Iceberg)
Cost and performance — partitioning, clustering, compute right-sizing, query-cost monitoring; the single biggest invisible bill in modern stacks
AI/ML integration — feature stores, training-data versioning, model serving, evaluation, drift monitoring; for LLM systems, prompt/output guardrails, eval datasets, and human-in-the-loop review for anything sensitive

This last layer is where "the LLM ships it" hides the most cost and the most risk. The model can write the SQL. It cannot tell you whether the partition strategy will cost you ₹3 lakh next quarter, whether your training data is leaking PII, or whether your model has been silently drifting since last Tuesday.
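
The partition question can be roughed out with arithmetic before any pipeline exists. The sketch below assumes a simplified pay-per-data-scanned pricing model, which not every platform uses (some bill for compute time instead), and every number in it is a made-up placeholder rather than a real price or workload. What it shows is the shape of the calculation, not the answer for any actual system.

```python
# All values are hypothetical placeholders, not real prices or workloads.
PRICE_PER_TB_SCANNED = 5.0   # assumed currency units per TB scanned
TABLE_SIZE_TB = 20.0         # assumed total table size
DAYS_OF_DATA = 400           # assumed days of history in the table
QUERIES_PER_DAY = 200        # assumed dashboard and report queries
DAYS_TOUCHED_PER_QUERY = 7   # assume each query only needs the last week


def monthly_scan_cost(tb_scanned_per_query, queries_per_day, days_in_month=30):
    """Cost of one month of queries under the assumed per-TB price."""
    return (tb_scanned_per_query * queries_per_day * days_in_month
            * PRICE_PER_TB_SCANNED)


# No partitioning: every query reads the whole table.
full_scan_cost = monthly_scan_cost(TABLE_SIZE_TB, QUERIES_PER_DAY)

# Partitioned by date: a query reads only the days it needs.
tb_per_day = TABLE_SIZE_TB / DAYS_OF_DATA
pruned_cost = monthly_scan_cost(tb_per_day * DAYS_TOUCHED_PER_QUERY,
                                QUERIES_PER_DAY)

print(f"full scans:  {full_scan_cost:,.0f} per month")
print(f"with pruning: {pruned_cost:,.0f} per month")
```

With these placeholder inputs the two designs differ by a large multiple for the same queries. The model will happily generate either design; knowing to run this calculation first is the part that has to come from the person driving.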

Six layers. Six tiny lists. That gap, between the demo working and the product surviving in production, is the gap the LLM closes very poorly. It is excellent at the first part. It is almost useless at the second.

The actual takeaway

Build the idea. Try the new skill. Use the tools. Use AI — I use it every day, and the people telling you not to are either lying to you or selling you something else.

The line I want to leave you with is simpler than the rest of this post:

If you understand the fundamentals and you are driving the work, you are in control.

If you do not understand, and the thing is somehow working, you are not in control. The tool is.

When you are not in control, one of two things happens. Either you keep paying — directly in cloud bills, indirectly in lost time, lost customers, lost sleep — until the project either turns a corner or breaks you. Or you ship something dangerous in a domain where being wrong is expensive — money, data, reputation, somebody else's career, your own career — and you find out in public.

Jimmy McGill ends Better Call Saul serving 86 years in federal prison — not because the law degree was a trick, and not because he was incapable. The gap between what he could do and what he should have done finally caught up with him, in the most expensive way it could. The bullets found Jimmy too.

Anyone telling themselves "AI will solve my problem" should sit with that ending. AI will not solve the problem. It will quietly trade it for a different, more expensive one: the one the demo never showed you.

Do not let an LLM put a machine gun in your hands and tell you that you are now a marksman.

You can absolutely become one. Just give yourself the time first — time to learn how the gun works. Time to learn what the target actually is, where it is, and how to hit it with fewer bullets. Time to figure out which gun you actually need.

And, eventually, the question nobody asks early enough: do you need a gun at all? Or did you only ever want to hit the target?