Thu, May 30, 2024

AI Security Risks and Recommendations: Demystifying the AI Box of Magic


It is easy to assume that large language models (LLMs) and generative AI (GenAI) products are a mysterious box of magic. Interactions with these models are generally abstract: you make an API call to a remote endpoint and receive a response, with little visibility into the security controls around the model. Even so, there are AI security risks to consider when using them.

This is particularly true when managing and hosting models on your own infrastructure. While it would be hard to self-host or manage a complex LLM, many specialized models trained for specific tasks are widely available.

Designing an AI system brings both novel areas of concern and well-known security challenges that manifest in new components. Designing a secure system goes beyond finding a vulnerability in a specific model or component; it requires thoughtful consideration of the AI components within the overall system architecture.

In this article, we introduce the AI security risks that security practitioners should understand in order to mitigate the security gaps that come with machine learning (ML), LLM and AI products.

AI Security Risks

The problem with AI from a security perspective is that it is relatively new, and there is not a lot of good guidance on how to implement it securely. As a security professional, what are the concerns you need to consider?

There are a handful of specific and niche security issues to consider when implementing such models. While many of the security issues associated with ML models are “common” from a general security perspective, they manifest in entirely new ways in the context of AI software and systems and consequently present new AI security risks.

1. Model Provenance

In an open-source model marketplace such as the Hugging Face ecosystem, it is extremely easy to pull down and select any model. Most of the models there are derived from a handful of baseline models, but there is nothing to stop a malicious model from being uploaded to the platform. While the marketplaces perform malware scans and other security checks on uploads, the safety of a given model cannot be guaranteed.

An organization may have policies or procedures in place to control the types of third-party libraries or software a developer can leverage within an application. There might even be an entire process for managing vulnerabilities in such software.

If you are using a model in an application that will accept and process sensitive data, it is worth doing what some companies already do and implementing guardrails around where it is acceptable to pull and use a model from.

This can involve hard controls that prevent access to certain endpoints as well as more general policy rules.
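One way to turn such a policy into a hard control is to wrap model loading in an allowlist check. Below is a minimal sketch, assuming the Hugging Face Transformers library; the approved-model list and the helper function are hypothetical examples, not a prescribed implementation.

```python
from transformers import AutoModel

# Hypothetical internal allowlist of vetted model identifiers. In practice,
# this could live in a policy service or an internal artifact registry.
APPROVED_MODELS = {
    "openai/whisper-tiny",
    "distilbert-base-uncased",
}

def load_approved_model(model_id: str):
    """Refuse to load any model that has not been through the approval process."""
    if model_id not in APPROVED_MODELS:
        raise PermissionError(f"Model '{model_id}' is not on the approved list")
    return AutoModel.from_pretrained(model_id)

model = load_approved_model("openai/whisper-tiny")
```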

A few general questions to consider include:

  • Are developers currently allowed to use models as part of application design?
    • If yes, is there an approval process?
  • Are developers allowed to run inference on a model running on a remote server?
    • If yes, are there guidelines and best practices around sending data to a company like OpenAI?
  • Are developers allowed to import and use a model to perform a task within an application?
    • Is there an approval process in place for a given model?
    • Is there any process or technical limitation in place for a given model?
  • Are developers allowed to use a baseline model to train from?
  • If developers are using a model from a marketplace, are there legal issues in terms of licensing?

2. Model Filetypes

Open-source machine learning frameworks each have their own preferred model file formats. Just as you might exercise caution when evaluating external files (such as when processing images), you should be conscious of the underlying file format of the model that is in use. You might see this referred to as the “model weights” or a “serialized model file,” for example. Here are the files that are provided with the “Whisper Tiny” model from OpenAI:

[Screenshot: file listing of the “Whisper Tiny” model repository on Hugging Face]

The three highlighted files are all the same model but in different file formats.
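As a quick due-diligence step, you can enumerate a repository’s files before downloading anything. Below is a minimal sketch using the huggingface_hub client; the file-extension check is an illustrative heuristic, not an exhaustive safety test.

```python
from huggingface_hub import list_repo_files

# List the files a model repository ships before pulling anything down.
files = list_repo_files("openai/whisper-tiny")
for name in files:
    print(name)

# Flag formats that can execute code when loaded (pickle-based checkpoints).
risky = [f for f in files if f.endswith((".bin", ".pt", ".pkl", ".ckpt"))]
print("Pickle-based files:", risky)
```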

For example, “safetensors” is a file format developed and maintained by Hugging Face and used primarily by its own libraries, such as Transformers. The format’s selling point is that it is simple and does not execute any code while the model is being loaded. Many (but not all) of the models on the Hugging Face platform use this format.

Another file format, generated by the PyTorch library (probably the most used of the ML frameworks), is the Python pickle. A “pickle” is a serialized file format that executes Python operations when it is loaded. In practice, loading a pickle file is the same as running arbitrary Python code. As PyTorch is the most used model framework, many of the models shared and used in the space are pickle files.
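To make that concrete, here is a minimal, harmless sketch of why deserializing an untrusted pickle amounts to running arbitrary code; the payload only echoes a string, but it could just as easily run anything else.

```python
import os
import pickle

class MaliciousPayload:
    # __reduce__ tells pickle how to rebuild the object; it returns a callable
    # plus arguments, which pickle invokes during deserialization.
    def __reduce__(self):
        return (os.system, ("echo code execution on unpickle",))

payload = pickle.dumps(MaliciousPayload())

# Simply deserializing the bytes runs the command; no inference is required.
pickle.loads(payload)
```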

Loading an arbitrary pickle file from anywhere, even an otherwise trustworthy source, is dangerous and can be harmful. Both Hugging Face and PyTorch warn users about this in their documentation. When importing models and model weights from any source, one security consideration is the format of the incoming model and whether that format is dangerous.
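If you must consume third-party weights, prefer formats and loaders that do not execute code. Below is a minimal sketch; the file names are placeholders for whatever the repository actually ships, and the weights_only option assumes PyTorch 1.13 or later.

```python
import torch
from safetensors.torch import load_file

# Prefer a safetensors copy of the weights when one exists: loading it only
# reads tensors and never executes embedded code.
weights = load_file("model.safetensors")

# If only a pickle-based checkpoint is available, restrict what the loader
# will deserialize. weights_only=True refuses arbitrary objects and only
# rebuilds tensors and primitive containers.
state_dict = torch.load("pytorch_model.bin", weights_only=True)
```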

Pickles are the most obvious point of concern, but other model formats may also have point-in-time vulnerabilities in specific parsers and handlers.

Even beyond the concept of pickles, the actual model file, even in a safe model format, can contain malicious content.

3. Inference Servers and Model Hosting

Model marketplaces allow users to perform inference directly against hosted models via a server API. They often also offer a paid product that provides a more production-grade inference experience.


If your organization is deploying a bespoke ML model to perform a task, you will face an AI security risk around where that model is hosted and how it is interacted with.

Consider the following:

  • The model files and weights may be sensitive, proprietary data. If they are hosted on the same server as a web application, the local files would be only one web exploit away from being exposed.
  • If multiple systems require the use of the same models, the model files will need to be propagated across multiple hosts, which increases costs and security exposure.
  • Running both the model and the web application architecture on the same host will cause a significant drain on performance.
  • A large inference request, which triggers a spike in CPU/GPU usage, could take down both the model inference and application servers.
  • A GPU-backed server may be required to perform inference, which adds significant cost on top of a normal application architecture.
  • Multiple models may need to be chained together, potentially running in parallel. This is much easier to do with a well-designed inference server.

In this instance, you might consider a dedicated inference server, hosted remotely or locally. In doing so, you will face additional security risks, such as the following:

4. Pre- and Post-Processing of Multimodal Content

Your input needs to be in the right format (i.e., a tensor) before you can communicate with the model. In the case of image or audio processing models, this means that you will need to reshape or modify the files.

Consider that you have an application that accepts an image and that the image is then meant to be passed through a specific image-classification model. You could use a tool like the Transformers library from Hugging Face to employ a high-level API to accomplish the task. However, in more complex scenarios that use custom models, you will need to do the heavy lifting yourself. This heavy lifting, which includes editing and resizing an image, would likely use an image library (e.g., ImageMagick or PIL for Python). A custom implementation that is used for image processing might be affected by a vulnerability.
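For illustration, here is a minimal sketch of that kind of pre-processing step, assuming PIL and NumPy and a hypothetical 224x224 input size. The Image.open call on untrusted bytes is the point where an image-library vulnerability would be reached.

```python
import numpy as np
from PIL import Image

def preprocess(path: str, size=(224, 224)) -> np.ndarray:
    # Parsing the untrusted image is the step that exercises the image
    # library (and any vulnerability in it).
    with Image.open(path) as img:
        img = img.convert("RGB").resize(size)
    # HWC uint8 -> CHW float32 in [0, 1], plus a leading batch dimension,
    # which is the shape many classification models expect.
    tensor = np.asarray(img, dtype=np.float32).transpose(2, 0, 1) / 255.0
    return tensor[np.newaxis, ...]
```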

The best way to identify these types of issues is to understand the steps required for processing and manually review the steps taken by the application architecture. Manual and automated code review helps, though it is not a magic wand.

5. Service Exposure

The act of exposing a model via an API introduces a lot of security risks. You need to consider the network exposure of this new service in the same way you would for any new API. The current crop of inference servers is quite generous in the amount of network exposure it allows: TorchServe and Triton, for example, each expose multiple HTTP services and multiple gRPC services.

6. Lack of Authentication

It is very common for inference servers to offer no authentication mechanisms of their own.

This lack of authentication makes some sense in context: model execution needs to be fast, and requiring the inference server to check an authentication token on every request could significantly impact what is likely the most performance-critical component in the system.

At the same time, a developer running an inference server is likely to expose these APIs more widely than necessary just to get things working. In fact, the combination of no built-in authentication and the need to run on specific, expensive hardware makes it very likely that the inference server will be invoked from a host other than the local one.

Consequently, it is extremely important to make sure that these APIs are not bound to external interfaces.

With APIs this powerful, you must ensure that the service’s network exposure is well-restricted and not accessible to components that don't need to call them.
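A simple way to verify this is to test reachability of the default service ports from hosts that should not have access. Below is a minimal sketch; the hostname is a placeholder, and the port list assumes stock configurations of TorchServe and Triton.

```python
import socket

# Assumed defaults: TorchServe uses 8080 (inference), 8081 (management) and
# 8082 (metrics); Triton uses 8000 (HTTP), 8001 (gRPC) and 8002 (metrics).
INFERENCE_PORTS = [8000, 8001, 8002, 8080, 8081, 8082]

def reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run this from a host that should NOT be able to reach the inference server;
# any True result indicates unwanted network exposure.
for port in INFERENCE_PORTS:
    print(port, reachable("inference.internal.example", port))
```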

7. Remote Code Execution as a Feature

Most popular inference servers today expose two primary functions via their APIs: inference and model management (e.g., importing new models). As we discussed earlier, a model file is an arbitrary file that can contain effectively arbitrary code. Being able to upload a new model to an inference server is therefore a direct path to code execution.
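To illustrate how direct that path is, here is a minimal sketch of registering a new model over a TorchServe-style management API. Both hostnames are placeholders; the point is that this same single call, made by anyone who can reach the endpoint, delivers their handler code to the server.

```python
import requests

# Assumption: a TorchServe-style management API on its default port (8081)
# with no authentication in front of it.
MANAGEMENT_API = "http://inference.internal.example:8081"

# Registering a model is a single HTTP call. Because a model archive can
# contain arbitrary Python handler code, anyone who can reach this endpoint
# can effectively run code on the inference host.
resp = requests.post(
    f"{MANAGEMENT_API}/models",
    params={"url": "https://models.internal.example/new-model.mar"},
)
print(resp.status_code, resp.text)
```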

Given the two prior points, service exposure and the lack of authentication in these APIs by default, a malicious user leveraging an inference server for code execution is a real threat and something you need to consider.

Mitigating AI Security Risks

At first glance, the security risks presented by AI might appear overwhelming. It is worth noting, however, that many of these issues are already touched on by existing security policies, and many of the risks are, to a certain degree, known risks.

What’s more, while worthy of security consideration, remote code execution and a lack of authentication are hard to classify as “vulnerabilities”; that is simply how these applications are designed. If an organization needs to use them, it must find a way to work within those constraints in the most security-conscious way possible.

Find out more about Kroll’s AI Security Testing services, which can give you insight into your vulnerabilities and your resilience to AI security risks.


