Table of Contents
Related Blogs
Introduction: AI Code Security and Its Emerging Risks
Large Language Models (LLMs) and AI-assisted coding tools offer immense potential in accelerating development cycles, reducing costs, and improving productivity. However, this acceleration comes at a cost: AI-generated code introduces significant security risks, many of which remain poorly understood, inadequately mitigated, and largely unregulated. The vulnerabilities inherent in AI-generated code are difficult to identify and remediate with limited frameworks and methodologies designed to evaluate, secure, and govern AI-driven software development.
Artificial intelligence (AI) code generation models are susceptible to producing insecure code; however, studies indicate that users perceive AI-generated code as being more trustworthy than human-generated code and there are limited frameworks detailing how to identify and address issues in AI-generated code. This causes organizations to have a significant blind spot where they incur substantial vulnerabilities and risks.
For example, the following studies explore the extent of security concerns with AI-generated code:
- Out of 130 code samples generated using InCoder and Github Copilot, 68% and 73% of the code samples contained vulnerabilities when checked manually.
- ChatGPT was used to generate 21 programs in five different programming languages and tested for CWEs, showing that only five out of 21 were initially secure. Only after specific prompting to correct the code did an additional seven cases generate secure code.
- An average of 48% of the code produced by five different LLMs contains at least one bug that could potentially lead to malicious exploitation.
Despite these results, there are early indications that users perceive AI-generated code to be more secure than human-written code. This “automation bias” towards AI-generated code means that users may overlook careful code review and accept insecure code as it is. For instance, in a 2023 industry survey of 537 technology and IT workers and managers, 76% responded that AI code is more secure than human produced code.
What Makes AI Code Vulnerable
Generative AI systems have known vulnerabilities to several types of adversarial attacks. These include data poisoning attacks, in which an attacker contaminates a model’s training data to elicit a desired behavior, and backdoor attacks, in which an attacker attempts to produce a specific output by prompting the model with a predetermined trigger phrase. In the code generation context, a data poisoning attack may look like an attacker manipulating a model’s training data to increase its likelihood of producing code that imports a malicious package or library.
A backdoor attack on the model itself could dramatically change a model’s behavior with a single trigger that may persist even if developers try to remove it. This changed behavior can result in an output that violates restrictions placed on the model by its developers (such as “don’t suggest code patterns associated with malware”) or that may reveal unwanted or sensitive information. Researchers have pointed out that because code generation models are trained on large amounts of data from a finite number of unsanitized code repositories, attackers could easily infiltrate these repositories with files containing malicious code or purposefully introduce new repositories containing vulnerable code.
Depending on the code generation model’s interface or scaffolding, other forms of adversarial attacks may come into play such as indirect prompt injection, in which an attacker attempts to instruct a model to behave a certain way while hiding these instructions from a legitimate user. Compared to direct prompt injection (otherwise known as “jailbreaking”), in which a user attacks a generative model by prompting it in a certain way, indirect prompt injection requires the model to retrieve compromised data—containing hidden instructions—from a third-party source such as a website.
In the code generation context, an AI model that can reference external webpages or documentation may not have a way of distinguishing between legitimate and malicious prompts, which could hypothetically instruct it to generate code that calls a specific package or adheres to an insecure coding pattern.
Finally, code generation models may be more effective and useful if they are given broad permissions, but that in turn makes them potential vectors for attack that must then be further secured. Most AI-generated code in professional contexts is likely flowing through a development pipeline that includes built-in testing and security evaluation, but AI companies are actively working on strategies to give models—including code-writing models—more autonomy and ability to interact with their environment.
Generative AI systems, particularly those used for code generation, are inherently vulnerable to various adversarial attacks, including data poisoning, backdoor manipulation, and prompt injection. These attacks exploit weaknesses in training data, model behavior, and external dependencies, enabling attackers to introduce malicious code or bypass safeguards. The inability of AI models to consistently differentiate between legitimate and malicious inputs further exacerbates these risks, especially in contexts where models interact with external resources or operate with broad permissions.
Best Practices Framework for AI-Generated Code
Implementing a set of best practices enables organizations to mitigate risks, ensure compliance, and maintain effective security measures. Below is a proposed compliance framework for maintaining secure AI-generated code.
| Category | Practice | Objective | Action Steps | Outcome |
| Testing and Validation | Testing of LLMs and their outputs | Identify vulnerabilities and ensure consistent, secure AI-generated code | Conduct adversarial testing, simulate diverse real-world prompts, and validateedge cases | Minimized risk of insecure outputs and improved code consistency |
| Tool Integration | Use SAST and SCA tools | Detect vulnerabilities in source code, dependencies, and runtime environments | Embed tools in CI/CD pipelines, automate scans, and remediate identified issues | Comprehensive security coverage throughout the development lifecycle |
| Access Control | Implement Role-Based Access Control (RBAC) | Restrict unauthorized access to AI systems and sensitive functionalities | Define granular roles and permissions, enforce segmentation, and maintain detailed access logs | Minimized risk of misuse and enhanced accountability |
| Policy and Compliance | Automate policy enforcement with compliance templates | Align with organizational and regulatory requirements | Deploy templates for GDPR, HIPAA, and PCI DSS compliance, and validate outputs against security standards | Consistent adherence to policies and reduced compliance violations |
| Traceability | Maintain a software chain of custody | Ensure traceability and accountability for AI-generated code | Track code origin, modifications, and deployments, and use governance tools to flag deviations | Enhanced ability to trace vulnerabilities and maintainauditability |
| Resiliency Measures | Incorporate progressive delivery and rollback strategies | Detect and mitigate vulnerabilities in production environments | Perform canary testing, blue-green deployments, and enable automated rollbacks for identified issues | Reduced impact of vulnerabilities on production systems |
| Centralized Management | Use a unified platform to manage tools and processes | Streamline security workflows, monitor risks, and prioritize remediation | Aggregate SAST, DAST, and SCA data, automate workflows, and provide visibility through dashboards | Improved operational efficiency, risk mitigation, and collaboration between teams |
Reviewing Tools and Processes with Digital.ai
Digital.ai Release and Deploy integrates a comprehensive suite of tools, alerts, and policies to uphold best practices for securing AI generated code. Here is a list of Digital.ai’s relevant out-of-the-box integrations and native functionality, which are used to enforce security for AI-generated code.
Digital.ai’s Integrations for AI-Generated Code Security
- Application Security Testing – Checkmarx facilitates SAST scans, and Black Duck executes SCA scanning.
- Policy Enforcement – Open Policy Agent (OPA) implements policy-as-code across the CI/CD pipeline.
- Continuous Delivery – ArgoCD and Argo Rollouts facilitate continuous and progressive delivery as well as GitOps workflows.
Digital.ai’s Native Functionality for AI-Generated Code Security
- Role Based Access Control – Enforces least privilege among LLMs and users while enforcing compliance mandates.
- Auditing and Compliance Tracking – Reviews all activities across tools, users, and environments while assessing how it impacts compliance attainment.
- Analytics and Workflows – Assesses environmental and security trends and delegates tasks and facilitates workflows to address issues as they arise and enforce compliance mandates.
Comprehensive Use Case: Enforcing AI Code Security
By integrating the aforementioned tools, configuring alerts, and enforcing policies, Digital.ai Release and Deploy ensures that AI-generated code is secure, compliant, and operationally efficient. This is further examined across all potential use cases for Release and Deploy, which are listed below.
Application Security Testing
- Checkmarx’s SAST scans AI-generated code for insecure patterns in applications. These scans are configured to trigger automatically after code commits in CI/CD pipelines, providing immediate feedback to developers and minimizing the risk of downstream vulnerabilities.
- Black Duck SCA identifies vulnerabilities in third-party dependencies recommended by AI models. These dependencies, often libraries with known CVEs or typosquatted malicious packages, are scanned during the build phase of the pipeline. By flagging and remediating risky dependencies early, the bank ensured that insecure libraries were excluded from production environments.
Compliance Enforcement
- Open Policy Agent (OPA) provides policy-as-code capabilities, integrating with the CI/CD pipeline. Policies are written in Rego to automate enforcement of critical standards like GDPR, PCI DSS, and internal governance frameworks. This is based on the severity of risk and prescribed investigative and recovery measures and conformation to best practices and compliance requirements. For example, OPA policies may block deployments of AI-generated APIs unless they enforce HTTPS and implement robust access controls to adhere to compliance requirements. Additionally, OPA validates runtime configurations to ensure sensitive data was encrypted both in transit and at rest. Policies also restricted the inclusion of high-risk libraries, defined as those with CVE scores above 7.0, ensuring adherence to best practices for secure development.
- RBAC mitigates risks by defining granular permissions for users interacting with AI systems and deployment pipelines. Developers are limited to generating and testing code in isolated environments, while security teams were granted access to review, audit, and approve outputs. Administrators have the ability to modify policies, manage tools, and oversee deployments. This segmentation minimized the potential for accidental or malicious misuse of AI-generated code, with detailed access logs providing accountability for every action taken within the system.
- Digital.ai’s monitoring dashboards provide a centralized view of an organization’s security posture, aggregating data from SAST, SCA, and observability tools. Alerts are configured to notify teams of high-severity vulnerabilities, policy violations, or runtime issues. Compliance templates provide repeatable workflows to enforce compliance standards. Additionally, the software chain of custody tracks every change, from AI-generated outputs to deployment, ensuring traceability and compliance. For instance, if an AI-generated script introduced a vulnerability, the chain of custody identified the specific prompt and dependencies involved, allowing the bank to address the root cause efficiently.
Summarizing the Use Case
This integrated procedure identifies vulnerabilities at every stage of the software development lifecycle, enforcing compliance, and maintaining operational stability. Policies ensure alignment with regulatory standards, while RBAC, governance tools, and chain of custody provided accountability and auditability. Automated workflows reduce manual effort, streamline operations, and accelerate remediation efforts. By leveraging these tools, AI-generated code is secured, risks are minimized, and compliance mandates are followed.
Discover how Digital.ai Release can secure your AI-generated code from development to deployment.
Explore
What's New In The World of Digital.ai
Securing AI-Generated Code with Digital.ai Release
Introduction: AI Code Security and Its Emerging Risks Large Language…
The Return to Bare Metal: Why We’re Done Pretending
For the better part of two decades, we’ve been sold…
Deliver with Evidence: Safer Orchestration, Smarter Rollouts, and Scalable Processes (Release 25.3)
According to recent surveys, 31% of DevOps leaders said a…