OpenAI Open-Sources Privacy Filter, a Model That Automatically Detects and Masks Private Information in Text Locally


ME News Report, April 23 (UTC+8): According to monitoring by Dongcha Beating, OpenAI has open-sourced Privacy Filter, a locally deployable text de-identification model, under the Apache 2.0 license. Given input text, the model automatically recognizes eight categories of personally identifiable information (PII): names, emails, phone numbers, addresses, accounts, URLs, dates, and keys. It then labels or masks them. The entire process runs locally, so no data needs to be sent to the cloud. The model has 1.5 billion parameters in total but uses a sparse mixture-of-experts architecture that activates only 50 million parameters during inference, which allows it to run on laptops or even in browsers. The context window is 128K tokens, and a single forward pass labels all private information in the input. Users can adjust the precision-recall trade-off through preset operating points, or fine-tune the model on their own data to adapt it to specific scenarios. The model primarily supports English, with limited multilingual capability. (Source: BlockBeats)
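
To illustrate the masking step and the precision-recall trade-off described above, here is a minimal sketch in Python. It does not call Privacy Filter or reproduce its actual interface: the Span type, the confidence scores, the find_span helper, and the threshold parameter are all hypothetical stand-ins for whatever a detector returns. The sketch only shows the general idea that a higher threshold masks fewer, more certain spans (favoring precision), while a lower one masks more (favoring recall).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A detected PII span (hypothetical detector output, not the real model's format)."""
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    label: str    # category name, e.g. "EMAIL"
    score: float  # assumed confidence in [0, 1]


def find_span(text: str, needle: str, label: str, score: float) -> Span:
    """Helper for the demo: build a Span by locating a literal substring."""
    start = text.index(needle)
    return Span(start, start + len(needle), label, score)


def mask_text(text: str, spans: list[Span], threshold: float = 0.5) -> str:
    """Replace every span scoring at least `threshold` with a [LABEL] placeholder.

    Assumes spans do not overlap. Replacement runs right to left so that
    the offsets of spans earlier in the text stay valid.
    """
    kept = [s for s in spans if s.score >= threshold]
    for s in sorted(kept, key=lambda s: s.start, reverse=True):
        text = text[:s.start] + f"[{s.label}]" + text[s.end:]
    return text


text = "Contact Jane Doe at jane@example.com on 2024-05-01."
spans = [
    find_span(text, "Jane Doe", "NAME", 0.97),
    find_span(text, "jane@example.com", "EMAIL", 0.99),
    find_span(text, "2024-05-01", "DATE", 0.55),  # low-confidence detection
]

# Lower threshold: more spans masked (higher recall).
print(mask_text(text, spans, threshold=0.5))
# Higher threshold: only confident spans masked (higher precision).
print(mask_text(text, spans, threshold=0.9))
```

With the lower threshold all three spans are masked; with the higher one the low-confidence date is left in place. A preset operating point can be thought of as a named choice of such a threshold, though how the model actually exposes its presets is not described in this report.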
