Threat Model & Reference

Vibe coding security: a practical threat model.

By Gil Salu, Cyaxios

Generating an application by prompting an AI agent is faster than any human team can write one. The security exposure grows just as fast. This is a vendor-neutral reference: what new risks AI-generated code introduces, how those risks map to existing regulation, and the controls that contain them.

What is vibe coding?

Vibe coding describes building software by describing intent to a code-generating model and accepting the output, rather than writing each line by hand. The developer steers; the model produces functions, schemas, configuration, and infrastructure. The reviewing burden shifts from authoring code to inspecting code, and at the volume a model emits, that inspection is rarely complete.

The security problem is structural, not a matter of any one model being careless. A code generator optimizes for output that satisfies the prompt and passes whatever check it can see. It has no standing knowledge of your data classification, your regulatory perimeter, or which fields are sensitive. So it reaches for the convenient pattern: log the whole object, grant the broad scope, embed the literal value. Each choice is locally reasonable and globally wrong.

Speed changes the math in a second way. Traditional secure development assumes a human reads every line before it ships, so review scales with the team. Generated code arrives in bulk, often faster than anyone can read it, so the proportion that gets a careful human pass falls. The controls that used to be a backstop become the primary line of defense. A threat model that assumed thorough manual review no longer holds.

Is AI-generated code less secure?

Not categorically, but it fails in patterned ways that differ from hand-written code. Human developers accrete a codebase slowly and carry context about why a boundary exists. A model produces large, plausible blocks in one pass with no memory of prior decisions and no incentive to ask. The result is code that runs on the first try and quietly violates an invariant the model was never told about. The defect rate is not necessarily higher; the defects are more uniform, more confidently wrong, and more likely to ship because the output looks finished.

The uniformity matters for defenders. Because a generator reproduces the same convenient patterns across many files, a single class of flaw tends to appear in many places at once rather than as one isolated mistake. That cuts both ways. It means a single review finding often points to a systemic issue worth a codebase-wide check, and it means a control that catches the pattern once can catch it everywhere. The right response is not to distrust generation, but to assume the patterned failures will occur and to put automated controls where manual review used to sit.

A threat taxonomy for vibe-coded applications

The exposures below recur across AI-generated codebases regardless of language or framework. Treat them as a checklist for reviewing agent output and for scoping a threat model.

Secrets and PII in code and logs

Hardcoded API keys, connection strings, and tokens; full request objects, stack traces, and user records written to log streams in cleartext. The logging exposure maps to CWE-532 (insertion of sensitive information into log file); the hardcoded credentials map to CWE-798 (use of hard-coded credentials).

Insecure defaults

Permissive CORS, disabled TLS verification, debug mode in production, world-readable storage, and missing authentication on generated endpoints. The model picks whatever defaults make the example run.
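One way to make the safe configuration the default one is to have the settings loader fail closed. The sketch below is illustrative only: the `Settings` class, the `APP_*` variable names, and the accepted flag spellings are assumptions, not part of any framework. An unset or unrecognized value falls back to the safe default, and a wildcard CORS origin is dropped rather than honored.

```python
from dataclasses import dataclass

TRUE_VALUES = {"1", "true", "yes"}
FALSE_VALUES = {"0", "false", "no"}


@dataclass(frozen=True)
class Settings:
    # Every default is the safe choice, so an empty environment is secure.
    debug: bool = False
    require_auth: bool = True
    verify_tls: bool = True
    cors_origins: tuple = ()


def read_flag(env, name, safe_default):
    """Parse a boolean flag; anything unset or unrecognized keeps the safe default."""
    value = env.get(name, "").strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    return safe_default


def load_settings(env):
    """Build Settings from a mapping such as os.environ (variable names are hypothetical)."""
    origins = tuple(
        origin.strip()
        for origin in env.get("APP_CORS_ORIGINS", "").split(",")
        if origin.strip() not in ("", "*")  # a wildcard origin is never accepted
    )
    return Settings(
        debug=read_flag(env, "APP_DEBUG", False),
        require_auth=read_flag(env, "APP_REQUIRE_AUTH", True),
        verify_tls=read_flag(env, "APP_VERIFY_TLS", True),
        cors_origins=origins,
    )
```

In an application this would be called as `load_settings(os.environ)`. Generated code that reads its configuration through a loader like this inherits the safe values unless someone overrides them explicitly.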

Injection

String-concatenated SQL, shell commands built from user input, and unescaped output rendered to a page. Classic OWASP Top 10 categories that a generator reproduces from training-data patterns.
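The SQL case can be shown in a few lines using Python's standard `sqlite3` module. The table, the function names, and the payload are made up for the illustration; the point is the difference between concatenating input into the query text and binding it as a parameter.

```python
import sqlite3


def find_user_unsafe(conn, email):
    # Anti-pattern: user input becomes part of the SQL text.
    query = "SELECT id, email FROM users WHERE email = '" + email + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, email):
    # The driver binds the value separately; it is never parsed as SQL.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

payload = "x' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, payload)  # the injected OR clause matches every row
safe_rows = find_user_safe(conn, payload)      # the payload is compared as a literal string
```

The unsafe version returns both rows for the payload; the parameterized version returns none, because no user has that literal string as an email.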

Over-broad agent data access

An autonomous agent granted a wildcard database role, a long-lived admin token, or unrestricted filesystem access because that was the path of least resistance to a working demo. Violates least privilege at the identity layer.
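A grant can be checked mechanically before it is issued. The following is a minimal sketch under stated assumptions: the `Grant` shape, the `resource:action` scope format, and the one-hour lifetime ceiling are hypothetical policy choices, not a standard.

```python
from dataclasses import dataclass

MAX_TTL_SECONDS = 3600  # assumed policy ceiling for any agent credential


@dataclass(frozen=True)
class Grant:
    principal: str
    scopes: frozenset
    ttl_seconds: int


def grant_violations(grant):
    """Return a list of least-privilege problems; an empty list means the grant passes."""
    problems = []
    for scope in sorted(grant.scopes):
        if "*" in scope or scope.endswith(":admin"):
            problems.append(f"over-broad scope: {scope}")
    if grant.ttl_seconds > MAX_TTL_SECONDS:
        problems.append(f"lifetime too long: {grant.ttl_seconds}s")
    return problems


demo_grant = Grant("agent", frozenset({"db:*"}), 30 * 86400)
narrow_grant = Grant("agent", frozenset({"orders:read"}), 900)
```

Here `grant_violations(demo_grant)` reports both the wildcard scope and the thirty-day lifetime, while `grant_violations(narrow_grant)` returns an empty list.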

Supply-chain exposure

Dependencies pulled in by name without pinning, typosquatted or hallucinated package names, and transitive packages no one reviewed. The generator suggests imports faster than anyone vets them.
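A lightweight gate on a Python requirements file catches the two cheapest cases: entries that are not pinned to an exact version, and names that nobody has vetted. This is a sketch, not a replacement for a real dependency scanner. The `VETTED` set is a hypothetical team allowlist, and the pattern handles only simple `name==version` lines, not extras, markers, or hashes.

```python
import re

# Hypothetical allowlist of packages the team has reviewed.
VETTED = {"requests", "flask"}

# Matches simple "name==version" entries only.
PINNED = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([A-Za-z0-9.!+]+)$")


def audit_requirements(lines):
    """Return (entry, reason) pairs for entries that are unpinned or unvetted."""
    findings = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = PINNED.match(line)
        if match is None:
            findings.append((line, "not pinned to an exact version"))
            continue
        if match.group(1).lower() not in VETTED:
            findings.append((line, "package not on the vetted list"))
    return findings


findings = audit_requirements([
    "requests==2.31.0",
    "flask>=2.0",
    "reqeusts==1.0.0",  # a typosquat-style name, invented for the example
    "# tooling",
])
```

The allowlist is what stops a hallucinated or typosquatted name: a pinned version of the wrong package is still the wrong package.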

Prompt-injection data exfiltration

Untrusted content reaching an LLM in the application loop can redirect tool calls to leak data or take unintended actions. Covered by the OWASP Top 10 for LLM Applications as prompt injection and insecure output handling.
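One containment pattern is to treat every tool call the model requests as untrusted and authorize it outside the model. The sketch below assumes a hypothetical application with two tools and an egress allowlist; the tool names, hosts, and the `authorize_tool_call` function are all illustrative.

```python
from urllib.parse import urlparse

ALLOWED_TOOLS = {"search_docs", "fetch_url"}  # hypothetical tool names
ALLOWED_HOSTS = {"docs.example.com"}          # hypothetical egress allowlist


def authorize_tool_call(name, args):
    """Return (allowed, reason). The decision never depends on model output alone."""
    if name not in ALLOWED_TOOLS:
        return False, f"tool not allowed: {name}"
    if name == "fetch_url":
        host = urlparse(args.get("url", "")).hostname
        if host not in ALLOWED_HOSTS:
            return False, f"host not allowed: {host}"
    return True, "ok"
```

An injected instruction that asks the agent to fetch an attacker-controlled URL with data in the query string is refused at this gate, regardless of how persuasive the injected text was to the model.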

Two of these are amplified specifically by AI authorship. Secrets and PII in logs grow because a model that cannot tell which field is sensitive logs all of them. Over-broad access grows because the model resolves a permission error by widening the grant rather than narrowing the request.

How a log leak happens in practice

The most common exposure is also the most mundane. Asked to add diagnostics around a login, a generator writes a line that serializes the whole user object. That object holds an email, an IP address, a session token, and whatever else the record carries. The line runs, the test passes, and the code ships. From then on, every login writes personal data and a live credential in cleartext to a log stream that is often forwarded to a third-party aggregator outside the application's trust boundary.

import logging

logger = logging.getLogger(__name__)

# Generated diagnostic: serializes the entire object (user is the loaded account record)
logger.info(f"login: {user.__dict__}")

# Resulting log line, in cleartext, forwarded downstream:
# login: {'email': '...', 'session': 'sk_live_...', 'ip': '...'}

Nothing here is exotic. There is no exploit, no attacker, no clever payload. A reasonable-looking line of generated code turns routine telemetry into a standing disclosure. This is the category that field-level controls exist to close, and it is the one a partial review is most likely to miss because the code behaves correctly.
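The safer version of the same diagnostic builds the event from explicitly named fields instead of dumping the object. This is a minimal sketch: the `User` class and its field names are assumptions that mirror the log line above, and which fields count as safe to log is a decision for the application, not something the code can infer.

```python
import json
import logging
from dataclasses import dataclass

logger = logging.getLogger("auth")


@dataclass
class User:
    id: int
    email: str
    session: str
    ip: str


def login_event(user):
    # Only explicitly named, non-sensitive fields leave this function.
    return {"event": "login", "user_id": user.id}


def log_login(user):
    logger.info(json.dumps(login_event(user), sort_keys=True))


user = User(id=7, email="a@example.com", session="sk_live_example", ip="203.0.113.9")
line = json.dumps(login_event(user), sort_keys=True)
```

Adding a field to the `User` record later does not change what is logged, which is the property the whole-object version lacks.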

What regulations apply to vibe-coded apps?

Generated code is held to the same standards as hand-written code. No statute exempts software because a model produced it. The obligations below are the ones most often triggered by the threats above.

GDPR Article 25 and Article 32

Article 25 requires data protection by design and by default: technical measures that minimize personal data and limit processing must be built in, not bolted on. Default-permissive generated code is the opposite of by-default protection. Article 32 requires security of processing appropriate to the risk, naming encryption and pseudonymization as example measures. PII written to plaintext logs is a direct Article 32 concern.

PCI DSS

Any application that touches cardholder data inherits PCI DSS scope. Logging a primary account number or authentication value, even incidentally through a dumped object, is a control failure. Generated logging code is a common source of accidental cardholder-data capture.

SOC 2

SOC 2 evaluates controls against the Trust Services Criteria, with security as the baseline. An auditor looks for evidence that access is least-privilege, that sensitive data is protected at rest and in transit, and that change is reviewed. Unreviewed agent output and over-broad grants undercut exactly those assertions.

OWASP guidance

The OWASP Top 10 remains the reference enumeration for web application risk, and the OWASP Top 10 for Large Language Model Applications extends it to LLM-in-the-loop systems. Together they cover the injection, access, and prompt-injection categories in the taxonomy above.

NIST AI Risk Management Framework

The NIST AI RMF gives a voluntary structure for governing AI risk through its Govern, Map, Measure, and Manage functions. For teams shipping AI-generated code it offers a way to document where generated components sit, what could go wrong, and which controls apply, which is increasingly what auditors and customers ask to see.

These obligations are not mutually exclusive. A single mishandled field can trigger several at once: a logged card number is a PCI DSS failure and, if that number belongs to an EU resident, a GDPR Article 32 failure, and the absence of a control to prevent it weakens the SOC 2 security assertion. Mapping each threat in the taxonomy to the obligations it touches is the work a defensible AI-risk program documents, and it is the artifact regulators and enterprise buyers increasingly expect before they trust generated software.
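The mapping can be kept as data so that it is reviewable and testable. The sketch below encodes only the associations stated in this article; the key names and helper functions are hypothetical, and supply-chain exposure is left unmapped because the text above does not assign it an obligation. A real program would fill that gap from its own legal and compliance analysis.

```python
# Threat -> obligations, limited to the associations made in the text above.
THREAT_OBLIGATIONS = {
    "secrets_and_pii_in_logs": ["GDPR Art. 32", "PCI DSS", "SOC 2"],
    "insecure_defaults": ["GDPR Art. 25"],
    "injection": ["OWASP Top 10"],
    "over_broad_agent_access": ["SOC 2", "OWASP Top 10"],
    "supply_chain": [],  # not mapped in this article
    "prompt_injection_exfiltration": ["OWASP Top 10 for LLM Applications"],
}


def obligations_for(threats):
    """Return the sorted, de-duplicated obligations touched by the given threats."""
    touched = set()
    for threat in threats:
        touched.update(THREAT_OBLIGATIONS[threat])
    return sorted(touched)


def unmapped_threats():
    """Return threats with no documented obligation, i.e. gaps in the mapping."""
    return sorted(t for t, refs in THREAT_OBLIGATIONS.items() if not refs)
```

Calling `unmapped_threats()` surfaces the gaps, which is often the more useful output: a threat with no documented obligation is a question for review, not evidence that none applies.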

How do you secure AI-generated code?

No single control is sufficient; the practices below compose into defense in depth. They apply whether the code was written by a person or a model, but they matter more when authorship is automated and review is partial.

Baseline mitigations

Secure defaults: ship configurations that fail closed. Authentication on by default, TLS verification on, debug off, storage private. Generated code inherits whatever your templates and scaffolds establish.
Least privilege: give each agent, service, and token the narrowest scope that works, with short lifetimes. Resolve a permission error by tightening the request, never by widening the grant.
Secrets management: keep credentials out of source and out of logs. Use a managed secret store and injected environment, and scan commits for secrets. See the OWASP Secrets Management Cheat Sheet.
Log hygiene and field sealing: log structured events with named fields rather than whole objects, redact or encrypt sensitive fields at the point they are written, and never serialize raw request or user objects.
Review agent output: treat generated code as untrusted input. Gate it with static analysis, dependency scanning, and a human or automated check for the taxonomy categories before it merges.
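As a rough illustration of an automated check for the taxonomy categories, the sketch below scans source text line by line for a few of the patterns named in this article. The rule set is hypothetical and deliberately small; regular expressions like these produce false positives and negatives, so this complements static analysis and dependency scanning rather than replacing them.

```python
import re

# Hypothetical rule set; the patterns are illustrative, not exhaustive.
RULES = [
    ("hardcoded-secret",
     re.compile(r"""(?i)(api_key|secret|token|password)\s*=\s*["'][^"']{8,}["']""")),
    ("tls-verification-disabled", re.compile(r"verify\s*=\s*False")),
    ("debug-enabled", re.compile(r"debug\s*=\s*True")),
    ("whole-object-logged", re.compile(r"log(ger)?\.\w+\(.*(__dict__|vars\()")),
]


def scan(source):
    """Return (line_number, rule_id) pairs for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule_id, pattern in RULES:
            if pattern.search(line):
                findings.append((lineno, rule_id))
    return findings


sample = '''api_key = "sk_live_0123456789"
requests.get(url, verify=False)
app.run(debug=True)
logger.info(f"login: {user.__dict__}")
timeout = 30
'''
findings = scan(sample)
```

Run against the sample, the scan flags the first four lines with one rule each and leaves the last line alone. Because a generator repeats the same patterns, a rule written after one review finding tends to pay off across the codebase.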

Of these, log hygiene is the one most often skipped, because the leak is invisible until someone reads the logs. Logging named fields instead of whole objects, and sealing the sensitive ones before they reach disk, removes an entire class of exposure that generated logging code reliably introduces.
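Pattern-based redaction at the logger is a useful backstop for the lines that still slip through. The sketch below uses the standard `logging.Filter` hook; the two patterns (the `sk_live_` token shape from the example earlier and a simple email expression) are assumptions chosen for illustration, and pattern matching will miss sensitive values that have no recognizable shape.

```python
import logging
import re

SENSITIVE = [
    re.compile(r"sk_live_[A-Za-z0-9]+"),                            # assumed token shape
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email addresses
]


def redact(text):
    """Replace every match of a sensitive pattern with a fixed marker."""
    for pattern in SENSITIVE:
        text = pattern.sub("[REDACTED]", text)
    return text


class RedactingFilter(logging.Filter):
    """Scrub the fully formatted message before any handler sees it."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = None  # arguments are already merged into msg
        return True


logger = logging.getLogger("app")
logger.addFilter(RedactingFilter())
```

A filter attached to a logger applies only to records logged through that logger; attaching it to each handler instead covers records that propagate up from child loggers. This is a safety net, not a substitute for logging named fields in the first place.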

Order matters when applying them. Secure defaults and least privilege are preventive: they shrink the blast radius before any code is written, by making the convenient pattern also the safe one. Secrets management and log hygiene are containment: they limit what a mistake can expose. Review of agent output is detection: it catches what the first two layers missed. A program that invests only in detection will drown in findings, because every generated file becomes something to inspect. Putting the preventive and containment layers first means the generator inherits guardrails rather than producing problems for reviewers to find later.

Where sealing logs at the source fits

Field-level sealing is one mitigation in the list above, and it is the niche the TN protocol addresses. TN is an open attested-logging protocol: each entry is signed at the source so its origin can be proven, sensitive fields are encrypted per authorized reader before the bytes reach disk, and entries are chained so the record can be verified after the fact. The encryption happens in-process, so a logging call written by an agent seals its sensitive fields whether or not the agent understood they were sensitive. That covers the secrets-and-PII-in-logs category specifically; it does not address injection, supply chain, or access scope, which need the other controls above.
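To make the chaining idea concrete, here is a generic hash-chain sketch using only the Python standard library. It is not the TN protocol or its wire format: the entry layout and function names are invented for the illustration, a shared-key HMAC stands in for a real signature (so it shows integrity to key holders, not publicly provable origin), and per-reader field encryption is omitted entirely.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous digest" for the first entry


def _digest(prev, fields):
    body = json.dumps({"prev": prev, "fields": fields}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(chain, key, fields):
    """Append an entry bound to its predecessor's digest and tagged with the key."""
    prev = chain[-1]["digest"] if chain else GENESIS
    digest = _digest(prev, fields)
    tag = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    chain.append({"prev": prev, "fields": fields, "digest": digest, "tag": tag})


def verify_chain(chain, key):
    """Recompute every digest and tag; any edit, removal, or reorder fails."""
    prev = GENESIS
    for entry in chain:
        digest = _digest(prev, entry["fields"])
        expected = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
        if entry["prev"] != prev or entry["digest"] != digest:
            return False
        if not hmac.compare_digest(entry["tag"], expected):
            return False
        prev = digest
    return True
```

Because each digest covers the previous one, changing a field in an early entry invalidates that entry and breaks the link to every entry after it, which is what allows verification after the fact.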

For deeper background, see the explanation of DRM for logs and how attested logging works. The standards referenced here are linked below.

References

OWASP Top 10: owasp.org/www-project-top-ten
OWASP Top 10 for Large Language Model Applications: owasp.org/www-project-top-10-for-large-language-model-applications
NIST AI Risk Management Framework: nist.gov/itl/ai-risk-management-framework
GDPR Article 25, Data protection by design and by default: gdpr-info.eu/art-25-gdpr
GDPR Article 32, Security of processing: gdpr-info.eu/art-32-gdpr
CWE-532, Insertion of Sensitive Information into Log File: cwe.mitre.org/data/definitions/532.html
OWASP Secrets Management Cheat Sheet: cheatsheetseries.owasp.org
