Does PII Guard work with all AI models?

Yes. PII Guard is model-agnostic. It processes your prompt before it reaches the model, so it works with all 300+ models available on AI·Collab — including GPT-5, Claude, Gemini, Mistral, and more.

Can I enable PII Guard for all models at once?

PII Guard is configured per-model in your account settings, giving you granular control. Organisation admins can enforce it centrally across all team members.
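The relationship between the per-model toggle and central enforcement can be pictured with a small sketch. This is an illustration only, not AI·Collab's actual settings schema: `UserSettings`, `pii_guard_by_model`, `pii_guard_active`, and the model IDs are hypothetical names, and the rule that admin enforcement overrides the user's toggle is an assumption based on the description above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UserSettings:
    # Hypothetical structure: one on/off flag per model ID.
    pii_guard_by_model: Dict[str, bool] = field(default_factory=dict)


def pii_guard_active(settings: UserSettings, model_id: str,
                     org_enforced: bool = False) -> bool:
    """Return True if prompts for this model should be redacted.

    Assumption: organisation-level enforcement wins over the user's
    own toggle; models without an explicit toggle default to off.
    """
    return org_enforced or settings.pii_guard_by_model.get(model_id, False)


# Example: the user enabled PII Guard for one model only.
settings = UserSettings(pii_guard_by_model={"gpt-5": True})
enabled_for_gpt5 = pii_guard_active(settings, "gpt-5")
enabled_for_claude = pii_guard_active(settings, "claude")
enforced_for_claude = pii_guard_active(settings, "claude", org_enforced=True)
```

In this sketch, an admin switching on enforcement turns redaction on for every model, whatever the individual toggles say.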

What happens to the original PDF after redaction?

The original PDF is not stored. Only the clean, redacted text is forwarded to the knowledge base for chunking and embedding. The source file is discarded after processing.

Does PII Guard affect response quality?

In most cases, no. The AI model receives a coherent prompt with [REDACTED] placeholders. It responds to the structure and intent of your message. For tasks that require the actual value (e.g. addressing a specific person by name), you would need to disable PII Guard for that session.

Is Presidio running on EU servers?

Yes. Presidio runs on AI·Collab's own European infrastructure in Germany. PII analysis never leaves our network and is covered by our Zero Data Retention Policy.

How is this different from the existing Privacy Guardrails?

The existing Privacy Guardrails blog post describes AI·Collab's general infrastructure-level protections (EU routing, ZDR, Azure Content Safety). PII Guard is a new, active redaction layer: it strips personal data from your input before it reaches the model. It is opt-in and configured per model.

Does it support German-language PII?

Yes. AI·Collab runs a custom Presidio analyzer configured with German NLP models (spaCy de_core_news_sm). German names, addresses, IBANs, and phone formats are correctly detected.
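For readers who want to see what such a setup could look like, here is a minimal sketch of a Presidio analyzer configured with a German spaCy model. It is not AI·Collab's actual configuration: only `de_core_news_sm` comes from the answer above, while the English model name, the `NLP_CONFIGURATION` variable, and the `build_analyzer` helper are assumptions made for illustration. Presidio is imported inside the function, so the configuration can be inspected without the library installed.

```python
# Language-to-model mapping in the format Presidio's NlpEngineProvider expects.
NLP_CONFIGURATION = {
    "nlp_engine_name": "spacy",
    "models": [
        {"lang_code": "de", "model_name": "de_core_news_sm"},
        # Assumption: the English model is not named in the article.
        {"lang_code": "en", "model_name": "en_core_web_sm"},
    ],
}


def build_analyzer():
    """Create an AnalyzerEngine for German and English text.

    Requires presidio-analyzer and both spaCy models to be installed.
    """
    from presidio_analyzer import AnalyzerEngine
    from presidio_analyzer.nlp_engine import NlpEngineProvider

    provider = NlpEngineProvider(nlp_configuration=NLP_CONFIGURATION)
    nlp_engine = provider.create_engine()
    return AnalyzerEngine(nlp_engine=nlp_engine, supported_languages=["de", "en"])
```

With the dependencies installed, `build_analyzer().analyze(text=..., language="de")` returns a list of detected entities with their types, positions, and confidence scores.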

Privacy & Security

PII & PHI Redaction: Protect Sensitive Data in Chat and PDF Documents

AI·Collab automatically removes names, email addresses, phone numbers, IBANs, and medical identifiers — before your data ever reaches an AI model.

Your Data. Your Rules.

Every day, professionals share sensitive information with AI tools without thinking twice — employee names in a support ticket, patient data in a medical summary, client IBANs in a financial report. AI·Collab PII Guard stops that. It automatically detects and redacts Personally Identifiable Information (PII) and Protected Health Information (PHI) before your content reaches any AI model — for both live chat and uploaded PDF documents. Powered by Microsoft Presidio on European infrastructure, with Zero Data Retention.

Watch: PII Redaction in Chat & PDF

See PII Guard in action: chat with and without the toggle enabled, then upload a PDF with sensitive data and watch the redaction in real time.

What is PII Guard?

PII Guard is a privacy layer built directly into AI·Collab. It sits between you and the AI model: it intercepts your input, runs entity recognition on the text, and replaces sensitive values with neutral placeholders before the request is forwarded. The AI model never sees the raw data. Redacted values are never stored. In most cases, you get the same quality of AI response without the privacy risk.

Why It Matters

Built for teams where privacy is non-negotiable.

Chat PII Guard

Enable a per-model toggle in Account Settings. Every message you send is redacted by Presidio before it reaches the AI.

PDF PII Redaction

Upload a PDF to your knowledge base with PII filtering enabled. Marker (self-hosted OCR) extracts the text on-premise and Presidio redacts PII, so only clean text enters the knowledge base. No raw document content reaches any external service.

Named Entity Recognition

Presidio uses spaCy NLP models to detect names, locations, organisations, and more — not just regex patterns.

Transparent Billing

PII-protected requests are marked separately in your usage dashboard. Chat: +30% credit uplift. PDF: +2 credits per page.

GDPR & ZDRP Compliant

Presidio runs on our European servers. PII analysis never leaves the EU. Covered by our Zero Data Retention Policy.

Per-Model Control

Enable PII Guard only for the models where you need it. No blanket settings — granular, per-model control per user.

Layer 1: Chat PII Guard

Redact sensitive data from every prompt before it reaches the model.

Enable the PII Guard toggle in your Account Settings for any model in your library. Once active, every message you send is processed by Presidio before being forwarded to the AI. Names, email addresses, phone numbers, IBANs, and other identifiers are replaced with [REDACTED] placeholders. The model responds to the cleaned prompt — and you get a full, useful answer without the privacy risk.

How Chat PII Guard works:

1. You type a message

2. Presidio detects PII entities

3. PII replaced with [REDACTED]

4. Clean prompt sent to AI model
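The four steps above can be sketched in a few lines of Python. This is a simplified illustration, not AI·Collab's implementation: the regex patterns stand in for Presidio's recognizers, and `PATTERNS`, `detect`, and `redact` are hypothetical names. Regexes cannot catch person names or organisations; that is the part Presidio covers with spaCy-based entity recognition.

```python
import re

PLACEHOLDER = "[REDACTED]"

# Simplified stand-ins for Presidio recognizers (assumption, for illustration).
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IBAN_CODE": re.compile(
        r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b"
    ),
    "PHONE_NUMBER": re.compile(r"\+?\d[\d /-]{7,}\d"),
}


def detect(text):
    """Step 2: return sorted (start, end, entity_type) spans found in text."""
    spans = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), entity_type))
    return sorted(spans)


def redact(text):
    """Step 3: replace every detected span with the placeholder."""
    merged = []
    for start, end, _ in detect(text):
        if merged and start < merged[-1][1]:
            # Overlapping detections (e.g. digits inside an IBAN) become one span.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Replace from right to left so earlier offsets stay valid.
    for start, end in reversed(merged):
        text = text[:start] + PLACEHOLDER + text[end:]
    return text


# Step 1: the user's message. Step 4 would send clean_prompt to the model.
message = ("Contact anna.schmidt@example.com or +49 30 1234567, "
           "IBAN DE89 3704 0044 0532 0130 00.")
clean_prompt = redact(message)
print(clean_prompt)
```

The sentence structure survives redaction, which is why the model can still respond to the intent of the message.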

Layer 2: PDF PII Redaction

Clean text-only ingestion — PII never enters your knowledge base.

When you upload a PDF to a knowledge base with PII filtering enabled, AI·Collab intercepts the file before it reaches OpenWebUI. Instead of sending the document to an external OCR API, AI·Collab uses Marker, a self-hosted, GPU-accelerated OCR engine running on our own infrastructure. The raw document text never leaves our network. Presidio then scans the extracted text and removes all detected PII. Only the clean, redacted text is forwarded for chunking and embedding into your knowledge base.

This is the key difference from the standard upload path. In the normal flow, Mistral OCR processes the document (EU-hosted, but external). In the PII-protected path, Marker handles OCR entirely on-premise, so no raw document content ever reaches an external service.

How PDF PII Redaction works:

1. Upload PDF to knowledge base (PII filter enabled)

2. Middleware intercepts the file

3. Marker (self-hosted GPU OCR) extracts text, with no external API call

4. Presidio scans and redacts all PII from extracted text

5. Clean text forwarded to knowledge base; original PDF discarded
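As a rough mental model, the pipeline above is a chain of three stages in which only redacted text is ever handed on. The sketch below is an assumption-laden illustration, not the actual middleware: `ingest_pdf`, `IngestResult`, and the `ocr`, `redact`, and `forward` callables are hypothetical names, and the OCR and redaction stages are replaced by fakes so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class IngestResult:
    clean_text: str
    pages: int


def ingest_pdf(pdf_bytes: bytes,
               ocr: Callable[[bytes], List[str]],
               redact: Callable[[str], str],
               forward: Callable[[str], None]) -> IngestResult:
    """Run OCR, redact each page, and forward only the clean text.

    The raw bytes are never passed to `forward` and never written
    anywhere in this function; they go out of scope on return.
    """
    page_texts = ocr(pdf_bytes)                       # step 3: local OCR
    clean_pages = [redact(page) for page in page_texts]  # step 4: redaction
    clean_text = "\n\n".join(clean_pages)
    forward(clean_text)                               # step 5: knowledge base
    return IngestResult(clean_text=clean_text, pages=len(page_texts))


# Fakes standing in for Marker and Presidio (illustration only).
def fake_ocr(pdf_bytes: bytes) -> List[str]:
    return ["Patient: Max Mustermann", "Diagnosis: none"]


def fake_redact(text: str) -> str:
    return text.replace("Max Mustermann", "[REDACTED]")


knowledge_base: List[str] = []
result = ingest_pdf(b"%PDF-1.7 example", fake_ocr, fake_redact,
                    knowledge_base.append)
```

Passing the stages in as callables keeps the sketch independent of any particular OCR or redaction library.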

What PII is Detected

Presidio recognises a broad range of entity types in German and English text.

Full names

Email addresses

Phone numbers

IBAN / credit card numbers

Dates of birth

Passport / ID numbers

IP addresses

Medical record numbers

Locations & addresses

Organisation names

Social security numbers

URLs

Credits & Cost

PII Guard has a small credit overhead to cover the Presidio NLP processing:

• Chat PII Guard: a 30% credit uplift applies to each protected request (e.g. a 10-credit response costs 13 credits with PII Guard enabled).
• PDF PII Redaction: an additional 2 credits per page on top of the standard OCR cost.

All PII-protected usage is clearly labelled in your account dashboard under Usage Statistics.
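The arithmetic can be written out as a short worked example. The function names are hypothetical, the standard OCR cost in the PDF example is a made-up figure (the article does not state it), and no rounding rule is applied because none is documented here.

```python
from decimal import Decimal

CHAT_UPLIFT = Decimal("0.30")        # +30% per protected chat request
PDF_CREDITS_PER_PAGE = Decimal("2")  # +2 credits per redacted PDF page


def chat_cost(base_credits: Decimal) -> Decimal:
    """Credits charged for a chat response with PII Guard enabled."""
    return base_credits * (Decimal("1") + CHAT_UPLIFT)


def pdf_cost(pages: int, standard_ocr_credits: Decimal) -> Decimal:
    """Credits charged for a PDF upload with PII redaction enabled."""
    return standard_ocr_credits + PDF_CREDITS_PER_PAGE * pages


# The article's example: a 10-credit response costs 13 credits.
protected_chat = chat_cost(Decimal("10"))

# Hypothetical: a 5-page PDF whose standard OCR cost is 20 credits.
protected_pdf = pdf_cost(5, Decimal("20"))
```

Here `protected_chat` equals 13 credits, and `protected_pdf` equals 30 credits (20 for OCR plus 2 × 5 for redaction).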

Security & Compliance

Presidio runs on AI·Collab's own European infrastructure in Germany, not on a third-party cloud. PII analysis happens on-premise, inside the same network as the rest of the platform. No personal data leaves our servers unredacted. Redacted values are never stored or logged.

All processing is covered by our Zero Data Retention Policy (ZDRP) and GDPR compliance framework. PII Guard is explicitly documented in our Data Processing Agreement (DPA / AVV under Art. 28 GDPR) as a technical and organisational measure. If your legal or compliance team requires a signed DPA, you can download it and request a countersignature at aicollab.app/dpa/.

For organisations: PII Guard can be enforced centrally by admins, ensuring all team members process sensitive documents correctly, with a full audit trail in centralised billing.

Download the DPA / AVV (Art. 28 GDPR)

Frequently Asked Questions

PII redaction powered by Microsoft Presidio, running on AI·Collab's European infrastructure. Covered by our Zero Data Retention Policy (ZDRP).

Related Articles

PII Guard Update: One Toggle, Fully Sealed

PII Guard is now one account-wide toggle: chat text is redacted for all models and PDF attachments in chat are blocked — sensitive documents go through the on-premise Knowledge Base pipeline.

Privacy-First AI: Automatic PII Protection & Content Filtering

How AI·Collab protects your privacy with automatic PII redaction and content filtering — GDPR-compliant AI that puts security first.

New: EU-based Routing & EU-hosted AI Models

European routing for EU-hosted model inference (Azure Sweden Central) — why it matters for GDPR, audits, and procurement.

Ready to Experience 300+ AI Models?

Get started today. Access models from OpenAI, Google, Anthropic, Grok and more.

GDPR compliant · Zero data retention · Cancel anytime
