Finally, someone is tracking AI mishaps: the FLARE-AI risk reporting platform is now online.

A group of AI researchers has launched the open-source platform FLARE-AI, modeled after the outage reporting site Downdetector, allowing anyone to report and track harm caused by AI.

When a chatbot teaches someone how to make a bomb, leaks personal data, or makes users increasingly paranoid, there is currently no universally recognized place to report it. The software security community has long had a mature "coordinated vulnerability disclosure" mechanism, but reporting of AI failures has relied on journalists covering them one article at a time, with the public looking on, which leaves no systematic record.

To address this, a group of AI researchers has launched the open-source platform FLARE-AI (Flaw Reporting for AI), which allows anyone to report and track harm caused by AI and then hands cases over to model developers and to MITRE, the non-profit organization that tracks technical system issues long-term. The overall concept is similar to the outage reporting site Downdetector, except that this time the target is not website outages but the black-box behavior of AI models.

From Cross-national Coalitions to Cross-party Bills

The driving force behind FLARE-AI is Hugging Face AI policy researcher Avijit Ghosh, who co-led development with computer scientists Elaine Zhu and Shayne Longpre. The three did not act on a whim; they have been researching AI reporting mechanisms since last year, and this time they connected 49 AI experts from 32 different organizations to jointly write a research paper, arguing that as AI becomes more widely adopted and agentic AI gains greater permissions, the lack of a consistent reporting channel will become a major hazard.

"There is currently no centralized, accountable way to report flaws in AI systems," Ghosh said. This statement highlights the core contradiction: the whole world is talking about AI risks, but there is no consensus on "who to notify when something bad happens."

Why the Fragmented Reporting Mechanism Is a Real Problem

Jessica Ji, a researcher at the Center for Security and Emerging Technology, considers this "a great initiative." She pointed out that existing reporting mechanisms are indeed fragmented, and AI models themselves are black boxes. "Any approach that makes AI more transparent, I support."

Ghosh added that issues with AI systems are not just security vulnerabilities, but also psychological harm, discriminatory bias, and misinformation. Different companies have different standards for identifying these problems, resulting in some issues never being acknowledged as having occurred. "Without a coordinated disclosure mechanism, there is no external way to force transparency," he said.

Several recent incidents illustrate how real this problem is. This week, security firm LayerX disclosed a technique that can trick AI-integrated browsers (including OpenAI's Atlas and Perplexity's Comet) into bypassing their own guardrails: once the AI is made to believe it is playing a game, the browser may go out of control and attempt to hack into websites (the relevant vendors have fixed this issue).

Further reading: "2 + 2 = 5" tricks AI browsers: ChatGPT Atlas, Claude, Perplexity Comet... All 6 obediently hand over passwords

In April this year, security researcher Johann Rehberger also discovered that images generated by ChatGPT could be used to induce Claude to leak personal data.

Congressional Bills Aim to Take Over, Crowdsourced Reporting Still Has Concerns

Rumman Chowdhury, CEO of Humane Intelligence PBC, believes that FLARE-AI could be a practical way for many AI developers to implement reporting mechanisms. However, she also cautioned that such initiatives typically come with real challenges: first, how to handle a large influx of reports that may not be serious; second, whether the reporting mechanism itself can gain endorsement from credible, authoritative organizations.

This is why the recent U.S. congressional bill is especially critical. The bill, introduced by Representatives Deborah Ross, Jeff Hurd, and Don Beyer, would require the National Institute of Standards and Technology (NIST) to establish standards for reporting AI flaws and maintain a centralized database of AI flaw reports. Ghosh and other leaders believe this would incentivize AI developers to take system issues seriously and fix them, while also allowing users to review the security of various systems based on different use scenarios.
