Leading AI companies continue to face significant cybersecurity challenges, particularly in protecting sensitive information, as highlighted in recent research from Wiz. The study examined the Forbes top 50 AI firms and found that 65% of them had leaked verified secrets—such as API keys, tokens, and credentials—in public GitHub repositories.
These leaks often sat in places that standard security scanners cannot easily reach—deleted forks, developer repositories, and GitHub gists—indicating a deeper and more persistent problem than surface-level exposure. Wiz's approach to uncovering them relied on a framework called "Depth, Perimeter, and Coverage." Depth let researchers look beyond the main repositories into less visible parts of the codebase.
Perimeter expanded the search to contributors and organisation members, recognising that individuals can inadvertently publish company-related secrets in their own public spaces. Coverage ensured that newer secret types—such as those used by AI-specific platforms like Tavily, Langchain, Cohere, and Pinecone—were included in the scan, as many traditional tools overlook them.
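The Coverage idea can be sketched as a small set of detection patterns for newer key types. This is a minimal illustration, not Wiz's tooling: the vendor prefixes below (`tvly-`, `lsv2_`) are assumptions modelled on common key-naming conventions, and real rules should come from each vendor's documentation.

```python
import re

# Illustrative secret-detection patterns. The vendor prefixes are
# ASSUMPTIONS for demonstration, not documented key formats.
SECRET_PATTERNS = {
    "tavily":    re.compile(r"\btvly-[A-Za-z0-9\-]{20,}\b"),       # assumed prefix
    "langsmith": re.compile(r"\blsv2_[A-Za-z0-9_]{20,}\b"),        # assumed prefix
    # Generic fallback: key-like assignments with long opaque values.
    "generic":   re.compile(
        r"\b(api[_-]?key|token|secret)\s*[:=]\s*['\"][A-Za-z0-9_\-]{16,}['\"]",
        re.IGNORECASE,
    ),
}

def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, matched_string) pairs found in text."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits
```

The point of the "generic" fallback is the gap the research highlights: a scanner limited to well-known formats silently misses keys issued by newer AI platforms.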
The findings show that, despite leading in cutting-edge technology, these AI companies have not adequately addressed basic security hygiene. The researchers disclosed the discovered leaks to the affected organisations, but nearly half of the notifications failed to reach the intended recipients, were ignored, or drew no actionable response, underscoring the lack of dedicated channels for vulnerability disclosure.
Security Tips
Wiz recommends several essential security measures for organisations of all sizes. First, robust secret scanning should be mandatory, proactively identifying and removing sensitive information from codebases. Second, companies should prioritise detection of their own proprietary secret formats, especially those that are new or specific to their operations, and engage vendors and the open-source community to support detecting those formats.
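A scan for a company's own secret format can be sketched as a walk over a repository checkout. The `acme_` format below is hypothetical, standing in for whatever organisation-specific pattern a company would register with its scanner:

```python
import re
from pathlib import Path

# Hypothetical custom secret format: "acme_" followed by 32 hex chars.
# A real deployment would load organisation-specific rules from config.
CUSTOM_SECRET = re.compile(r"\bacme_[0-9a-f]{32}\b")

def scan_tree(root: str) -> list[tuple[str, int, str]]:
    """Return (file path, line number, matched secret) for every hit under root."""
    findings = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in CUSTOM_SECRET.finditer(line):
                findings.append((str(path), lineno, match.group(0)))
    return findings
```

Wired into a pre-commit hook or CI job, a check like this blocks a leak before it ever reaches a public repository, which is far cheaper than rotating a credential after exposure.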
Finally, establishing a clear and accessible disclosure protocol is crucial. Having a dedicated channel for reporting vulnerabilities and leaks enables faster remediation and better coordination between researchers and organisations, minimising potential damage from exposure. The research serves as a stark reminder that even the most advanced companies must not overlook fundamental cybersecurity practices to safeguard sensitive data and maintain trust in the rapidly evolving AI landscape.
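One widely adopted way to publish such a disclosure channel is a security.txt file (RFC 9116), served at /.well-known/security.txt on the organisation's domain. The contact details below are placeholders:

```
Contact: mailto:security@example.com
Expires: 2026-12-31T23:59:59Z
Preferred-Languages: en
Policy: https://example.com/security-policy
```

A machine-readable file like this gives researchers an unambiguous reporting path, addressing exactly the failed-notification problem the study observed.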
