
Claude-Generated Code Exposes Firefox Vulnerabilities: What Developers Must Know


In early 2025, security researchers discovered that a popular Firefox extension, built almost entirely with AI-generated code, contained critical vulnerabilities that could expose user browsing data and allow cross-site scripting attacks. The extension had passed Mozilla’s review process, accumulated thousands of users, and looked perfectly professional. The code was clean, well-structured, and deeply flawed in ways that only became apparent under targeted security analysis.

This wasn’t an isolated incident. It was a signal of something the security community has been warning about since AI coding assistants went mainstream: AI generates code that looks right, compiles correctly, and passes basic tests, while harboring security vulnerabilities that even experienced developers can miss during review. And as AI-generated code proliferates across production systems, the attack surface is expanding faster than our ability to audit it.


The Firefox Extension Incident: What Actually Happened

The vulnerable extension was a productivity tool, the kind of browser add-on that millions of people install without a second thought. Its developer, working solo, had used an AI assistant to generate the majority of the extension’s codebase, including the content scripts that interact with web pages and the background service worker that manages state.

The AI produced code that followed Firefox’s WebExtension APIs correctly. The manifest file was properly structured. The permissions requested were reasonable for the extension’s functionality. On the surface, everything looked right.

The problems were subtle and systemic:

  • Unsafe innerHTML usage in content scripts. The AI-generated code used innerHTML to inject UI elements into web pages. This is a classic XSS vector: if any part of the injected content is derived from untrusted data, an attacker can execute arbitrary JavaScript in the context of the page. The AI had chosen innerHTML over safer alternatives like createElement and textContent because it produces more concise, readable code.
  • Overly broad message passing. The extension used the browser’s message-passing API to communicate between content scripts and the background worker. But the AI hadn’t implemented origin checks on incoming messages, meaning any web page (or other extension) could send messages to the background worker and trigger privileged operations.
  • Stored data without sanitization. User preferences and cached data were stored using the browser’s storage API without sanitization. An attacker who could inject malicious data into the storage (via the message-passing vulnerability) could achieve persistent XSS, code that executes every time the extension loads.
  • No Content Security Policy. The extension’s manifest didn’t include a Content Security Policy (CSP), which would have mitigated several of the other vulnerabilities by restricting what scripts could execute.
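
The second flaw, missing origin checks on message passing, has a mechanical fix. Here's a sketch of what a hardened background-worker listener might look like; the extension ID and the action map are illustrative, not taken from the audited extension, and in a real extension this function would be registered with browser.runtime.onMessage.addListener.

```javascript
// Hypothetical extension ID and privileged-action map for illustration.
const EXTENSION_ID = "productivity-tool@example.com";

const PRIVILEGED_ACTIONS = {
  // Only actions listed here can be triggered by a message.
  getPreferences: () => ({ theme: "dark" }),
};

function handleMessage(message, sender) {
  // Reject messages that did not come from our own extension's scripts.
  // The WebExtension API passes a `sender` whose `id` identifies the origin.
  if (!sender || sender.id !== EXTENSION_ID) {
    return { ok: false, error: "unauthorized sender" };
  }
  const action = PRIVILEGED_ACTIONS[message.action];
  if (!action) {
    return { ok: false, error: "unknown action" };
  }
  return { ok: true, result: action() };
}
```

The allowlist of actions matters as much as the sender check: even a trusted sender can only reach operations that were explicitly exported.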

Each of these issues is well-known in the browser extension security space. None of them is exotic or novel. And that's precisely the point: the AI generated code that worked perfectly from a functional standpoint while ignoring security best practices that any extension security specialist would consider fundamental.

Why AI Generates Insecure Code Patterns

Understanding why AI produces insecure code requires understanding how these models learn. AI coding assistants are trained on massive datasets of existing code: open source repositories, documentation, tutorials, Stack Overflow answers, and more. This training data represents how code is actually written in the wild, not how it should be written from a security perspective.

The uncomfortable truth is that most code in the wild is insecure. The majority of tutorials skip security considerations for brevity. Stack Overflow answers optimize for solving the immediate problem, not for defense in depth. Open source projects vary wildly in security maturity. AI models absorb all of these patterns and reproduce them with supreme confidence.

The specific reasons AI favors insecure patterns

There are several structural reasons why AI coding assistants tend toward insecure code:

  1. Training data bias: Insecure code is far more common than secure code in training datasets. For every properly parameterized SQL query in the training data, there are dozens of examples using string concatenation. AI learns what’s common, not what’s correct.
  2. Optimization for readability: AI models are tuned to generate code that looks clean and is easy to understand. Security-hardened code is often more verbose and complex. Given the choice between innerHTML = data and a five-line safe DOM manipulation, AI will choose the shorter version because it scores higher on readability metrics.
  3. Context window limitations: Security often requires understanding the broader context, where data comes from, who can access an endpoint, what trust boundaries exist. AI generates code based on the immediate prompt context, which typically doesn’t include the full threat model.
  4. No adversarial thinking: Good security engineering requires thinking like an attacker. AI models don't think adversarially; they predict the most likely next token. They generate code that handles the happy path well and the attack path poorly.
  5. Deprecated patterns in training data: Security best practices evolve. A technique that was acceptable five years ago may be considered vulnerable today. Training data includes code from all eras, and AI doesn’t reliably distinguish between current and outdated security practices.

The Most Common Vulnerability Types in AI-Generated Code

Based on published security audits and vulnerability reports, certain categories of security flaws appear consistently in AI-generated code. Every developer using AI tools should understand these categories and actively audit for them.

1. Cross-Site Scripting (XSS)

XSS remains the most common vulnerability in AI-generated web code. AI models consistently choose innerHTML, document.write, and template literal injection over safer DOM manipulation methods. When generating React or Vue components, AI sometimes bypasses the framework’s built-in XSS protections by using dangerouslySetInnerHTML or v-html when simpler, safer alternatives exist.

What to watch for: Any AI-generated code that inserts dynamic content into the DOM. Check whether the content is sanitized, whether the insertion method is safe, and whether a Content Security Policy is in place as a defense layer.

2. SQL Injection

Despite decades of awareness, AI models still generate SQL queries using string concatenation or template literals instead of parameterized queries. This is especially common when the AI is generating quick scripts, admin panels, or internal tools: contexts where developers might lower their guard.

What to watch for: Any SQL query that includes variables directly in the query string. Even if the AI uses an ORM, check that it’s using the ORM’s parameterized methods rather than raw query features.
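
The difference is easy to see side by side. This sketch uses the query-object shape accepted by node-postgres (pg); the table and column names are illustrative.

```javascript
// Vulnerable: user input is spliced directly into the SQL text.
function findUserUnsafe(email) {
  return { text: `SELECT * FROM users WHERE email = '${email}'` };
}

// Safe: the SQL text is a constant; the input travels separately as a
// bound value that the driver never interprets as SQL.
function findUserSafe(email) {
  return {
    text: "SELECT * FROM users WHERE email = $1",
    values: [email],
  };
}
```

With the parameterized version, a payload like `x' OR '1'='1` is just an email string that matches no rows; with the concatenated version, it rewrites the query.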

3. Authentication and Authorization Bypass

AI-generated APIs frequently implement authentication at the route level but miss authorization checks for specific resources. A user might be authenticated (logged in) but not authorized (allowed to access that specific record). AI models tend to generate middleware that checks “is the user logged in?” without checking “does this user have permission to access this specific resource?”

What to watch for: API endpoints that accept resource IDs (like /api/users/:id or /api/documents/:id) without verifying that the authenticated user owns or has access to that resource. This is known as Insecure Direct Object Reference (IDOR).
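
The missing check is usually one line. Here's a sketch of the ownership verification as a plain handler function, so the logic is visible without framework wiring; the in-memory store and IDs are hypothetical.

```javascript
// Hypothetical data store for illustration.
const documents = new Map([
  ["doc-1", { id: "doc-1", ownerId: "user-a", body: "quarterly report" }],
]);

function getDocument(authedUserId, docId) {
  const doc = documents.get(docId);
  if (!doc) return { status: 404 };
  // Authentication told us who the caller is. This ownership comparison is
  // the authorization step AI-generated handlers tend to omit.
  if (doc.ownerId !== authedUserId) return { status: 403 };
  return { status: 200, body: doc };
}
```

Testing this is equally simple: request the same resource ID as two different users and confirm the second gets a 403, not the data.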

4. Hardcoded Secrets and Credentials

AI models frequently generate code with placeholder API keys, database credentials, or JWT secrets embedded directly in the source code. While these are obviously meant to be replaced, developers sometimes ship the code without updating them, or worse, replace them with real credentials in the same file rather than using environment variables.

What to watch for: Any string that looks like a credential, key, or secret in AI-generated code. Ensure all sensitive values are loaded from environment variables or a secrets manager, never from source files.
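
A useful pattern is to fail fast at startup when a required secret is missing, rather than letting a hardcoded placeholder silently ship. A minimal sketch (the variable name is illustrative):

```javascript
// Read a required secret from the environment; crash loudly if absent
// instead of falling back to a placeholder value.
function requireSecret(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// At application startup:
// const jwtSecret = requireSecret("JWT_SECRET");
```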

5. Insecure Deserialization

When AI generates code that processes incoming data (JSON payloads, form submissions, file uploads), it often trusts the data structure implicitly. Type checking is minimal. Validation is superficial (checking that a field exists, not that its value is safe). And in languages like Python or PHP, AI sometimes uses deserialization functions (like pickle.loads or unserialize) that can execute arbitrary code if given malicious input.

What to watch for: Any code that deserializes, parses, or processes data from external sources. Verify that input validation is thorough, type-safe, and defensive.
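
Here's what "thorough, type-safe, and defensive" can look like for a JSON payload. The field names and bounds are hypothetical; the point is that every field gets a type check and a value constraint, and unexpected fields are dropped.

```javascript
// Parse and validate an untrusted JSON payload. Returns null on any
// violation rather than partially trusting the input.
function parseProfileUpdate(json) {
  let data;
  try {
    data = JSON.parse(json);
  } catch {
    return null; // malformed JSON
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return null;
  }
  const { name, age } = data;
  if (typeof name !== "string" || name.length === 0 || name.length > 100) {
    return null;
  }
  if (!Number.isInteger(age) || age < 0 || age > 150) {
    return null;
  }
  // Return a fresh object so extra fields (e.g. "admin": true) are dropped.
  return { name, age };
}
```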

6. Path Traversal

AI-generated file handling code rarely includes path traversal protections. When building features that serve files, process uploads, or read from user-specified paths, AI typically constructs file paths using direct concatenation without validating that the resulting path stays within the intended directory.

What to watch for: Any file operation where the path includes user-supplied input. Verify that the code normalizes the path and confirms it falls within the expected directory before performing the operation.

| Vulnerability Type | AI Frequency | Typical Severity | Detection Difficulty |
| --- | --- | --- | --- |
| Cross-Site Scripting (XSS) | Very High | Medium to High | Low (automated tools catch most cases) |
| SQL Injection | High | Critical | Low (well-known patterns) |
| Auth/Authz Bypass (IDOR) | High | High to Critical | Medium (requires understanding business logic) |
| Hardcoded Secrets | Very High | Variable | Low (secret scanning tools are mature) |
| Insecure Deserialization | Medium | Critical | Medium (depends on language and context) |
| Path Traversal | Medium | High | Medium (requires code path analysis) |
| SSRF | Medium | High | High (subtle and context-dependent) |
| Race Conditions | Low-Medium | Variable | High (hard to detect statically) |

How to Audit AI-Generated Code for Security

The standard code review process isn’t sufficient for AI-generated code. Human reviewers tend to skim code that looks clean and well-structured, which AI code always does. You need a more systematic, security-focused review process.

Step 1: Identify All Trust Boundaries

Before reviewing a single line of code, map out where data enters and leaves the system. Every point where external data is received (HTTP requests, WebSocket messages, file uploads, browser messages, database reads from shared tables) is a trust boundary. AI-generated code at these boundaries needs the most scrutiny.

Step 2: Trace Data Flow

For each trust boundary, trace how external data flows through the application. Follow it from the point of entry through every transformation, storage, and output. At each step, ask: Is this data validated? Is it sanitized? Is it encoded appropriately for its destination? AI often validates data at the entry point but fails to sanitize it again before use in a different context (like going from a JSON payload to a SQL query to an HTML template).
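
"Encoded appropriately for its destination" is context-specific: the same value needs different encoding for HTML, SQL, and URLs. As one concrete instance, here's a minimal HTML-context escaper; real projects should prefer a vetted library or a template engine's built-in escaping.

```javascript
// Escape a value for insertion into HTML element content.
// Ampersand must be replaced first so earlier escapes aren't double-encoded.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

The key discipline is applying the encoding at the point of output, for that output's context, rather than once at the point of entry.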

Step 3: Check Authentication and Authorization Separately

Review every protected endpoint or function twice: once for authentication (is the user who they claim to be?) and once for authorization (is this user allowed to perform this specific action on this specific resource?). AI models almost always get authentication right and frequently get authorization wrong.

Step 4: Look for What’s Missing

The hardest part of reviewing AI code is identifying what it didn't do. AI generates what you asked for; it doesn't add things you didn't ask for, even if they're critical. Common omissions include:

  • Rate limiting on authentication endpoints
  • CSRF tokens on state-changing requests
  • Security headers (HSTS, X-Content-Type-Options, X-Frame-Options)
  • Input length limits
  • Logging and monitoring for security events
  • Account lockout after failed attempts
  • Secure cookie flags (HttpOnly, Secure, SameSite)
  • Error messages that don’t leak internal details

The most dangerous AI-generated code isn't the code with bugs; it's the security controls that were never generated in the first place.
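
Several of the omissions above can be captured as project-wide defaults. This sketch expresses a few of them as a header map and a cookie builder; the values are illustrative starting points, not a complete policy.

```javascript
// Baseline response headers to apply to every response.
const SECURITY_HEADERS = {
  "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
  "X-Content-Type-Options": "nosniff",
  "X-Frame-Options": "DENY",
  "Content-Security-Policy": "default-src 'self'",
};

// Build a Set-Cookie value with the secure flags from the list above.
function sessionCookie(name, value) {
  return `${name}=${value}; HttpOnly; Secure; SameSite=Lax; Path=/`;
}
```

Centralizing these defaults means a forgotten flag is a one-place fix, not a per-endpoint audit.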

Security Review Checklist for AI-Generated Code

Use this checklist every time you integrate AI-generated code into a project. Not every item applies to every piece of code, but scanning the full list ensures you don’t miss critical categories.

Input Handling

  • ☐ All user input is validated against an expected format (whitelist, not blacklist)
  • ☐ Input length is bounded
  • ☐ Input types are enforced (number, string, email, etc.)
  • ☐ File uploads validate file type, size, and content (not just extension)
  • ☐ No use of eval(), exec(), or dynamic code execution with user input

Output Encoding

  • ☐ HTML output is properly escaped
  • ☐ No use of innerHTML, document.write, or dangerouslySetInnerHTML with dynamic content
  • ☐ URLs in links and redirects are validated (no javascript: protocol)
  • ☐ JSON responses use proper Content-Type headers
  • ☐ Error messages don’t expose stack traces, file paths, or database details

Database Operations

  • ☐ All SQL queries use parameterized statements or prepared queries
  • ☐ ORM queries don’t use raw SQL with string interpolation
  • ☐ Database credentials are not in source code
  • ☐ Database user has minimal required permissions

Authentication and Session Management

  • ☐ Passwords are hashed with bcrypt, scrypt, or Argon2 (not MD5 or SHA-256)
  • ☐ Session tokens are cryptographically random and sufficiently long
  • ☐ Cookies use HttpOnly, Secure, and SameSite flags
  • ☐ JWT secrets are sufficiently long and stored securely
  • ☐ Token expiration is implemented
  • ☐ Rate limiting exists on login and registration endpoints

Authorization

  • ☐ Every endpoint that accesses a resource verifies the user owns or has access to that resource
  • ☐ Admin endpoints have role-based access control
  • ☐ IDOR vulnerabilities are tested by accessing resources with different user contexts
  • ☐ Horizontal privilege escalation is tested (user A accessing user B’s data)

Configuration and Infrastructure

  • ☐ No hardcoded API keys, passwords, or secrets in code
  • ☐ HTTPS is enforced
  • ☐ CORS policy is restrictive (not *)
  • ☐ Security headers are set (CSP, HSTS, X-Content-Type-Options)
  • ☐ Debug mode is disabled in production configuration
  • ☐ Dependencies are up to date and free of known vulnerabilities

Tools for Scanning AI-Generated Code

Manual review is essential but insufficient. You should also integrate automated security scanning into your workflow, especially when working with AI-generated code. Here are the categories of tools you need:

Static Application Security Testing (SAST)

SAST tools analyze source code without running it, identifying potential vulnerabilities based on code patterns. They're particularly effective at catching the kinds of issues AI generates (SQL injection, XSS, hardcoded secrets) because these follow recognizable patterns.

| Tool | Languages | Best For | Cost |
| --- | --- | --- | --- |
| Semgrep | 20+ languages | Custom rules, CI/CD integration | Free (OSS) / Paid (Team) |
| SonarQube | 30+ languages | Comprehensive analysis, quality gates | Free (Community) / Paid |
| CodeQL | C/C++, C#, Go, Java, JS, Python, Ruby | Deep semantic analysis, GitHub integration | Free for open source |
| Bandit | Python | Python-specific security issues | Free (OSS) |
| ESLint Security | JavaScript/TypeScript | JS-specific security patterns | Free (OSS) |
| Brakeman | Ruby on Rails | Rails-specific vulnerabilities | Free (OSS) |

Secret Detection

Because AI frequently generates code with placeholder credentials, secret detection should run on every commit. Tools like GitLeaks, TruffleHog, and detect-secrets can be integrated into pre-commit hooks to catch secrets before they enter your repository. GitHub’s built-in secret scanning also catches many common patterns.
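
Under the hood, these scanners mostly match known secret signatures against source text. A toy sketch of the approach, with two example rules (real scanners ship hundreds of patterns plus entropy analysis, so don't use this in place of one):

```javascript
// Two illustrative signatures: AWS access key IDs, and assignments of
// long hex strings to secret-looking variable names.
const SECRET_PATTERNS = [
  { name: "aws-access-key-id", regex: /AKIA[0-9A-Z]{16}/ },
  { name: "hex-secret-assignment", regex: /(secret|token|api_key)\s*[:=]\s*["'][0-9a-f]{32,}["']/i },
];

// Return the names of all patterns that match the given source text.
function scanForSecrets(source) {
  return SECRET_PATTERNS
    .filter(({ regex }) => regex.test(source))
    .map(({ name }) => name);
}
```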

Dependency Scanning

AI often suggests dependencies (npm packages, Python libraries, Go modules) without checking whether those dependencies have known vulnerabilities. Tools like npm audit, Snyk, Dependabot, and OWASP Dependency-Check should run automatically to flag vulnerable dependencies before they reach production.

Dynamic Application Security Testing (DAST)

DAST tools test running applications by simulating attacks. They’re valuable for catching issues that SAST misses, particularly authentication/authorization flaws and configuration issues. OWASP ZAP and Burp Suite are the most widely used options. Running DAST against AI-generated APIs is especially important because AI tends to miss authorization controls that only manifest at runtime.

Browser Extension-Specific Tools

If you’re building browser extensions with AI assistance, as in the Firefox incident, use specialized tools. Extension Analyzer checks manifest permissions against actual usage. CRXcavator scores extension risk. And manual review of the manifest.json permissions against the minimum required set is a must.

The Responsibility Question

When AI-generated code causes a security breach, who is responsible? This question doesn’t have a settled legal answer yet, but the practical answer is clear: the developer who shipped the code is responsible.

AI coding assistants are tools, not authors. They don’t sign off on pull requests. They don’t deploy to production. They don’t make architectural decisions. The developer who uses AI to generate code, reviews it (or doesn’t), and ships it to production bears the same responsibility they would if they’d written the code themselves.

This has significant implications:

  • “The AI wrote it” is not a defense. In a security incident, saying your coding assistant generated the vulnerable code is like saying your calculator gave you the wrong answer. You’re the engineer. You’re responsible for verifying the output.
  • Review standards should be higher, not lower, for AI code. Because AI generates confident-looking code with potential hidden flaws, the review bar should be elevated. This is the opposite of what most teams do: they review AI code less carefully because it looks clean.
  • Security training is more important than ever. Developers who use AI to generate code they don’t fully understand need deeper security knowledge to identify what the AI got wrong. The ability to spot security issues in code you didn’t write is now a critical skill.

AI can generate the code, but it cannot accept the consequences when that code fails. That responsibility remains entirely human.


Building Secure-by-Default Prompts

One of the most effective ways to improve AI-generated code security is to improve how you prompt the AI. Most developers prompt for functionality: “Build me a login endpoint.” Security-conscious developers prompt for secure functionality: “Build me a login endpoint with rate limiting, bcrypt password hashing, secure session management, and CSRF protection.”

Here’s a framework for building security-aware prompts:

The Secure Prompt Template

When asking AI to generate any code that handles user data, network communication, or authentication, include these elements in your prompt:

  1. Functional requirement: What the code should do.
  2. Security requirements: Explicitly list the security controls you expect (input validation, output encoding, parameterized queries, etc.).
  3. Threat context: Tell the AI what threats to defend against. “This endpoint will be exposed to the public internet and should be hardened against injection attacks and brute force attempts.”
  4. Negative constraints: Tell the AI what NOT to do. “Do not use innerHTML. Do not use string concatenation for SQL. Do not hardcode any credentials.”
  5. Standards reference: Point the AI to a security standard. “Follow OWASP Top 10 recommendations” or “Conform to the project’s existing security middleware patterns.”

Example: Secure prompt vs. standard prompt

Standard prompt (insecure):

“Create a Node.js Express endpoint that lets users update their profile with a name and email.”

Secure prompt (much better):

“Create a Node.js Express endpoint for authenticated users to update their own profile (name and email only). Requirements: validate that the authenticated user can only update their own profile (not other users’ profiles), validate email format and name length (max 100 chars), use parameterized queries for any database operations, return generic error messages without internal details, include rate limiting of 10 requests per minute per user, sanitize all output, and use the existing auth middleware from our project. Do not use any raw SQL string concatenation.”

The second prompt produces dramatically more secure code because it explicitly tells the AI what security controls to implement. Without these instructions, AI will generate the fastest functional solution, which is rarely the most secure one.

Integrating Security into the AI-Assisted Workflow

Security can’t be an afterthought that you bolt on after the AI generates code. It needs to be woven into the entire workflow. Here’s how to structure an AI-assisted development process that takes security seriously:

Before Generation

  • Define the threat model for the feature you’re building
  • Identify trust boundaries and data flows
  • Write security requirements into the prompt
  • Specify negative constraints (what the code must NOT do)

During Generation

  • Review code in small chunks, not large dumps
  • Ask the AI to explain its security decisions
  • Challenge any pattern that looks too simple for a security-sensitive context
  • Regenerate with more specific security constraints if the first output is inadequate

After Generation

  • Run SAST tools on the generated code
  • Run secret detection
  • Apply the security review checklist (above)
  • Write targeted security tests (auth bypass, injection, XSS)
  • Have a human with security expertise review the code separately from the functional review

In Production

  • Monitor for anomalous behavior that could indicate exploitation
  • Run periodic DAST scans against live endpoints
  • Keep dependencies updated and respond promptly to security advisories
  • Maintain an incident response plan that accounts for AI-generated code vulnerabilities

The Path Forward

AI-generated code isn’t going away. Its volume will only increase. The Firefox extension incident is an early warning, not a one-off. As AI tools become more capable and more developers rely on them for larger portions of their codebase, the security implications will grow in proportion.

The solution isn’t to stop using AI coding assistants. They provide genuine productivity benefits and can even improve code quality when used thoughtfully. The solution is to treat AI-generated code with appropriate skepticism: the same skepticism you’d apply to code from a talented but security-unaware junior developer. The code will work. It will be well-formatted. It might even be elegant. But you should never assume it’s secure.

The developers and teams that will navigate this transition successfully are the ones building security into their AI workflows from the start: writing security-aware prompts, running automated scans, following rigorous checklists, and never letting the cleanliness of AI-generated code lull them into a false sense of security.

Because in the end, the attackers targeting your application don’t care whether a human or an AI wrote the vulnerable code. They only care that it’s there.

Last modified: March 8, 2026
