Tools · Apr 27, 2026

Three open-source web apps demonstrate OpenAI's Privacy Filter for PII detection and redaction

Hugging Face engineers built three reference applications using OpenAI's newly released Privacy Filter model and Gradio Server, showing how to detect and redact personally identifiable information in documents, images, and pasted text at scale.

Trust: 70
Hype: Low

1 source · cross-referenced

TL;DR
  • OpenAI released Privacy Filter, a 1.5B-parameter, Apache 2.0-licensed model that detects eight categories of personally identifiable information (PII) in a single pass over a 128k-token context window and achieves state-of-the-art performance on the PII-Masking-300k benchmark.
  • Hugging Face engineers built three reference applications demonstrating the model: a document privacy explorer that highlights PII spans in PDF and DOCX files, an image anonymizer that covers OCR-detected PII with draggable canvas overlays, and a pastebin tool that generates two URLs, one for a public redacted view and one for a private unredacted view.
  • All three apps use Gradio Server, a FastAPI-based backend framework that provides queuing, GPU allocation, and unified endpoint handling for both browser and programmatic clients, eliminating the need to duplicate business logic across different interface layers.

OpenAI released Privacy Filter this week as an open-source PII detector, now available on Hugging Face's model hub. The model has 1.5 billion total parameters, of which 50 million are active, and is released under the permissive Apache 2.0 license. It identifies personally identifiable information across eight categories (private person, address, email, phone number, URL, date, account number, and secret) in a single forward pass over a 128,000-token context window. According to Hugging Face, the model achieves state-of-the-art performance on the PII-Masking-300k benchmark.

Hugging Face engineers demonstrated three distinct applications built around Privacy Filter and Gradio Server, a backend framework for pairing custom frontends with queued inference. The Document Privacy Explorer accepts PDF or DOCX uploads, extracts the text, runs it through Privacy Filter in a single pass, and renders the document in a styled HTML reader. Detected PII spans are highlighted by category, and per-category toggles filter the highlights client-side. Because the full document is processed in one 128k-token context window, the character offsets of each span map directly to positions in the rendered text, with no chunking artifacts to reconcile.
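
The offset-to-highlight step can be sketched in a few lines of Python. This is a minimal illustration rather than the app's actual code: the `(start, end, label)` span format, the label names, and the `data-category` attribute are assumptions made for the example.

```python
from html import escape


def render_highlights(text, spans):
    """Wrap each (start, end, label) span in a <mark> tag and escape the rest."""
    parts = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            continue  # this simple sketch skips overlapping spans
        parts.append(escape(text[cursor:start]))
        parts.append(
            f'<mark data-category="{escape(label)}">{escape(text[start:end])}</mark>'
        )
        cursor = end
    parts.append(escape(text[cursor:]))
    return "".join(parts)


# Hypothetical detector output: character offsets into the extracted text.
doc = "Contact Ana at ana@example.com <today>."
spans = [(8, 11, "private_person"), (15, 30, "private_email")]
html_out = render_highlights(doc, spans)
```

Because every span carries its category in an attribute, a frontend could show or hide one category with a single CSS selector, which is one plausible way to implement client-side toggles.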

The Image Anonymizer accepts screenshots or images, applies optical character recognition to extract bounding boxes for each word, reconstructs full text with a character-to-box mapping, runs Privacy Filter over the reconstructed text, and returns pixel-aligned rectangles for detected PII. The frontend renders these as draggable black bars on a canvas, allowing users to manually adjust positions, add new bars, and toggle entire categories on and off. Image export happens client-side without server round-trips.
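
The character-to-box mapping described above can be sketched as follows. The OCR output format (a word paired with a left, top, right, bottom pixel box) and all function names are assumptions for illustration; the article does not specify which OCR engine or data layout the app uses.

```python
def build_text_and_index(words):
    """Join OCR words with spaces and record each word's character range and box."""
    pieces, index, pos = [], [], 0
    for word, box in words:
        if pieces:
            pos += 1  # account for the joining space
        index.append((pos, pos + len(word), box))
        pieces.append(word)
        pos += len(word)
    return " ".join(pieces), index


def boxes_for_span(index, start, end):
    """Return the boxes of all words that overlap the character span [start, end)."""
    return [box for w_start, w_end, box in index if w_start < end and start < w_end]


def union_box(boxes):
    """Merge (left, top, right, bottom) boxes into one enclosing rectangle."""
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


# Hypothetical OCR output: (word, (left, top, right, bottom)) in pixels.
ocr_words = [
    ("Call", (10, 10, 50, 30)),
    ("Jane", (60, 10, 100, 30)),
    ("Doe", (110, 10, 140, 30)),
    ("today", (150, 10, 200, 30)),
]
text, index = build_text_and_index(ocr_words)

# Pretend the detector flagged "Jane Doe" as a private person.
start = text.index("Jane Doe")
bar = union_box(boxes_for_span(index, start, start + len("Jane Doe")))
```

Merging boxes into one rectangle only makes sense when the words sit on the same line. A span that wraps across lines would need one bar per line, which this sketch does not handle.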

SmartRedact Paste is a pastebin tool that applies Privacy Filter to submitted text, replaces detected PII spans with category placeholders (e.g., <PRIVATE_EMAIL>), and generates two URLs: a public one serving the redacted version and a token-gated private one showing the original with highlighted spans. The app serves both as FastAPI endpoints within Gradio Server, demonstrating how queued model endpoints and plain HTTP routes can coexist in a single application. All three apps show how Gradio Server's unified queueing and client libraries eliminate code duplication across browser-based and programmatic access patterns.
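
The placeholder substitution and the two-URL scheme can be sketched with the standard library alone. Everything here is an assumption for illustration: the in-memory `PasteStore`, the URL paths, and the span format are hypothetical, and the real app serves these views as HTTP routes rather than method calls.

```python
import secrets


def redact(text, spans):
    """Replace each (start, end, label) span with a <LABEL> placeholder."""
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        out = out[:start] + f"<{label.upper()}>" + out[end:]
    return out


class PasteStore:
    """In-memory stand-in for the paste backend (hypothetical)."""

    def __init__(self):
        self._pastes = {}

    def create(self, text, spans):
        """Store a paste and return (public_url, private_url)."""
        paste_id = secrets.token_urlsafe(8)
        token = secrets.token_urlsafe(32)
        self._pastes[paste_id] = {
            "original": text,
            "redacted": redact(text, spans),
            "token": token,
        }
        # Hypothetical URL layout; the real routes are not given in the article.
        return f"/p/{paste_id}", f"/p/{paste_id}/private?token={token}"

    def public_view(self, paste_id):
        """Anyone with the public URL sees only the redacted text."""
        return self._pastes[paste_id]["redacted"]

    def private_view(self, paste_id, token):
        """The original text is returned only for the matching token."""
        paste = self._pastes[paste_id]
        if not secrets.compare_digest(paste["token"], token):
            raise PermissionError("invalid token")
        return paste["original"]


store = PasteStore()
original = "Mail bob@example.org now"
public_url, private_url = store.create(original, [(5, 20, "private_email")])
```

Comparing tokens with `secrets.compare_digest` avoids leaking information through timing differences, a sensible default whenever a URL token is the only thing guarding the unredacted text.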

Sources
  1. Hugging Face, "How to build scalable web apps with OpenAI's Privacy Filter"

Stories may contain errors. Dispatch is assembled with AI assistance and curated by human editors; despite the trust-score filter, mistakes happen. We correct publicly — every article links to its revision history. Nothing here is financial, legal, or medical advice. Verify before relying on any claim.

© 2026 Dispatch. No ads. No sponsorships. No paid placement. Reader-supported via Ko-fi.

Built by a person who cares about honest AI news.