Running PhilterScope

PhilterScope is a standalone CLI tool for PII redaction auditing and policy optimization. This document explains the available commands and flags.

Note that PhilterScope is intended to be run locally rather than over a network. If you do run it over a network, use TLS for connections to both MongoDB and the PhilterScope UI.

1. Installation

PhilterScope is written in Go. You can build the binary for your platform using the provided Makefile:

make build

This creates the philterscope-audit and philterscope-serve binaries in the project root.

2. Commands

PhilterScope provides two primary commands: philterscope-audit for performing audits and philterscope-serve for viewing results.

`philterscope-audit`

The philterscope-audit command compares raw text files against a "golden dataset" to evaluate redaction quality.

Usage:

PHILTERSCOPE_MONGODB_CONNECTION_STRING=mongodb://localhost:27017/philterscope ./philterscope-audit [flags]

Or without MongoDB:

./philterscope-audit [flags]

Commonly Used Flags:

| Flag | Default | Description |
| --- | --- | --- |
| `--url` | `http://localhost:8080` | The Philter API URL to use for redacting the raw text files. |
| `--token` | (none) | The Philter API token, if required by your Philter server. |
| `--policy` | `default` | The name of the Philter policy to use for redaction. |
| `--input` | `./raw` | The directory containing the raw text files or Philter `explain` JSON files. |
| `--golden` | `golden.json` | The path to the golden dataset file or directory. |
| `--output` | `.` | The directory where `report.html` and `report.json` will be saved. |
| `--threshold` | `0.5` | The default recall threshold for policy suggestions (0.0 to 1.0). |
| `--thresholds` | (none) | Per-entity recall thresholds (e.g., `NAME=0.9,SSN=1.0`). |
| `--group` | `default` | Assign a group name to the audit for history tracking. |
| `--ai` | `false` | Enable AI-driven policy recommendations (requires Ollama). |

Example:

./philterscope-audit --input ./examples/raw --golden ./examples/golden --output ./examples/ --threshold 0.75 --ai

Thresholds can also be set individually for each entity type:

./philterscope-audit --golden ./examples/golden/ --input ./examples/raw/ --output ./examples/ --threshold 0.75 --thresholds "NAME=0.9,SSN=1.0"
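To make the relationship between the two flags concrete, here is a minimal Python sketch of how a `--thresholds` value could be read and combined with `--threshold`. It is an illustration only, not PhilterScope's own parser: the function names are hypothetical, and the fallback to the default for unlisted entity types is an assumption based on `--threshold` being described as the default.

```python
def parse_thresholds(spec):
    """Parse a spec such as "NAME=0.9,SSN=1.0" into {"NAME": 0.9, "SSN": 1.0}."""
    thresholds = {}
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma
        entity, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"expected ENTITY=VALUE, got {part!r}")
        number = float(value)
        if not 0.0 <= number <= 1.0:
            raise ValueError(f"threshold for {entity!r} must be in [0.0, 1.0]")
        thresholds[entity.strip()] = number
    return thresholds


def threshold_for(entity, thresholds, default=0.5):
    """Assumption: entity types without their own entry use the default."""
    return thresholds.get(entity, default)
```

With `--threshold 0.75 --thresholds "NAME=0.9,SSN=1.0"`, this reading gives NAME a threshold of 0.9, SSN 1.0, and every other entity type 0.75.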

`philterscope-serve`

The philterscope-serve command launches the Evaluation UI, allowing you to view and interact with the results of a previous audit.

Usage:

./philterscope-serve [flags]

Flags:

| Flag | Default | Description |
| --- | --- | --- |
| `--report` | `report.json` | The JSON report file generated by `philterscope-audit`. |
| `--port` | `5000` | The port on which the UI will be served. |
| `--privacy` | `false` | Enable privacy mode (obfuscates PII in the UI). |
| `--id` | (none) | The ID of a specific audit result to view from history. |

Example:

PHILTERSCOPE_MONGODB_CONNECTION_STRING=mongodb://localhost:27017/philterscope ./philterscope-serve --privacy

Or without MongoDB:

./philterscope-serve --report ./examples/report.json --port 5000 --privacy

History and Audit Management

PhilterScope maintains a history of your audit runs. When using philterscope-serve, you can browse previous audits and their results.

Local Storage: By default, audits are stored in the .philterscope directory in your home folder.
Shared Storage: Set PHILTERSCOPE_MONGODB_CONNECTION_STRING to store audits in MongoDB, which gives your team a centralized audit repository.
Privacy Mode: When --privacy is enabled, all PII found in the audit results (both expected and actual) is replaced with a cryptographic hash in the UI.
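As a rough illustration of what replacing PII with a cryptographic hash can look like, here is a short Python sketch. The hash function PhilterScope uses is not stated in this document, so SHA-256, the truncation length, and the function name `obfuscate` are all assumptions made for the example.

```python
import hashlib


def obfuscate(value, length=12):
    """Replace a PII string with a short hex digest.

    Assumptions: SHA-256 and a 12-character prefix are illustrative choices,
    not necessarily what PhilterScope does.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return digest[:length]
```

Because a hash is deterministic, the same value always maps to the same digest, so repeated occurrences of one entity can still be recognized as identical in the UI without showing the original text.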

3. Data Formats

PhilterScope is designed to be flexible with your data. It supports multiple formats for both input files and golden datasets.

Input Files

The input directory (--input) can contain:

Raw Text Files: Simple .txt files that will be sent to the Philter API for redaction.
Philter Explain JSON: If you have already redacted text using Philter's explain API, you can provide the JSON response directly. This allows you to audit pre-redacted data without calling the Philter API again.

Golden Datasets

The golden dataset (--golden) defines the expected redactions. PhilterScope accepts it in either of two formats:

Tagged Text: Wrap PII in your raw text files with tags like <NAME>John Doe</NAME>. PhilterScope can parse these directly.
JSON Spans: A JSON file that defines the text and the character offsets for each PII entity. This is the recommended format for large datasets.

Example JSON Span format:

{
  "text": "My name is John Doe and I live at 123 Main St.",
  "labels": [
    {
      "text": "John Doe",
      "start": 11,
      "end": 19,
      "label": "NAME"
    }
  ]
}
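The two golden formats carry the same information. As a minimal sketch (not PhilterScope's own parser), the following Python converts tagged text into the JSON span layout shown above and checks that each label's offsets select its text. It assumes uppercase tag names, no nested tags, and an exclusive `end` offset, which is consistent with the example (characters 11 up to 19 are `John Doe`). The function names are hypothetical.

```python
import re

# Assumption: tags are uppercase entity names and are never nested.
TAG_PATTERN = re.compile(r"<([A-Z_]+)>(.*?)</\1>", re.DOTALL)


def tagged_to_spans(tagged):
    """Convert '<NAME>John Doe</NAME>'-style text to {"text", "labels"}."""
    parts, labels = [], []
    cursor = 0  # read position in the tagged input
    offset = 0  # length of the plain text produced so far
    for match in TAG_PATTERN.finditer(tagged):
        before = tagged[cursor:match.start()]
        parts.append(before)
        offset += len(before)
        label, value = match.group(1), match.group(2)
        labels.append({"text": value, "start": offset,
                       "end": offset + len(value), "label": label})
        parts.append(value)
        offset += len(value)
        cursor = match.end()
    parts.append(tagged[cursor:])
    return {"text": "".join(parts), "labels": labels}


def bad_spans(record):
    """Return the labels whose offsets do not select their stated text."""
    text = record["text"]
    return [lab for lab in record["labels"]
            if text[lab["start"]:lab["end"]] != lab["text"]]
```

Running `bad_spans` over a hand-written span file before an audit is a cheap way to catch off-by-one offsets, which would otherwise surface as missed entities.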

Matching Logic

When you run philterscope-audit, PhilterScope searches for the golden data in this order:

The path provided by the --golden flag (if it's a file).
Matching filenames in the directory provided by --golden (if it's a directory).
<filename>.golden in the input directory.
A golden/ subdirectory within or next to your input directory.
Inline tags within the input file itself.
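The search order above can be sketched in Python as a chain of fallbacks. This is one reading of the list, not PhilterScope's implementation, and it makes several assumptions that the document does not settle: "matching filenames" is taken to mean the same name ignoring the extension, `<filename>.golden` is taken to mean the suffix appended to the full input filename, and the function name and return labels are hypothetical.

```python
from pathlib import Path


def _match_in(directory, input_file):
    # Assumption: a golden file "matches" when its stem equals the input's stem.
    for candidate in sorted(directory.iterdir()):
        if candidate.is_file() and candidate.stem == input_file.stem:
            return candidate
    return None


def find_golden(input_file, golden=None):
    """Return (source, path) for the first location that has golden data."""
    input_file = Path(input_file)
    golden = Path(golden) if golden is not None else None

    # 1. --golden names a file: use it directly.
    if golden is not None and golden.is_file():
        return ("file", golden)

    # 2. --golden names a directory: look for a matching filename in it.
    if golden is not None and golden.is_dir():
        found = _match_in(golden, input_file)
        if found is not None:
            return ("directory", found)

    # 3. <filename>.golden beside the input file
    #    (assumption: the suffix is appended, e.g. note.txt.golden).
    sibling = input_file.with_name(input_file.name + ".golden")
    if sibling.is_file():
        return ("sibling", sibling)

    # 4. A golden/ directory inside, or next to, the input directory.
    input_dir = input_file.parent
    for golden_dir in (input_dir / "golden", input_dir.parent / "golden"):
        if golden_dir.is_dir():
            found = _match_in(golden_dir, input_file)
            if found is not None:
                return ("subdirectory", found)

    # 5. Nothing external was found: fall back to inline tags in the input.
    return ("inline", None)
```

The practical consequence of the ordering is that an explicit `--golden` value always wins over files discovered near the input, and inline tags are used only when nothing else is found.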

4. Understanding the Report

After an audit completes, PhilterScope generates an HTML report and a JSON report containing overall metrics (precision, recall, F1-score), per-entity recall, a confusion matrix, per-document results, and recommended policy changes.
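For reference, the overall metrics follow the standard definitions, sketched here in Python. The counts are true positives (expected and detected), false positives (detected but not expected), and false negatives (expected but not detected). How PhilterScope decides that a detection matches an expected span (for example, exact versus overlapping offsets) is not stated in this document, and returning 0.0 for an empty denominator is a convention chosen for this example.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

For a redaction audit, recall is usually the metric to watch first, since a false negative means PII was left in the text; this is why the `--threshold` flags are expressed in terms of recall.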

Confusion Matrix

The confusion matrix shows how each expected entity type was classified by Philter. It is displayed immediately below the PII Recall Performance table in the report.

Each row represents an expected entity type from the golden dataset, and each column represents what Philter detected it as. The cells contain the count of occurrences.

There are two special labels in the matrix:

(missed) (column): The entity was present in the golden dataset but Philter did not detect it at all. These are false negatives.
(none) (row): Philter detected an entity that was not present in the golden dataset. These are false positives (spurious detections).

Cells are color-coded:

Green: Correct classifications (expected and detected types match).
Orange: Misclassifications (Philter detected the entity but assigned the wrong type, e.g., a NAME detected as an ADDRESS).
Red: Missed entities or spurious detections.
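The layout and coloring rules above can be expressed compactly. The following Python sketch builds a matrix from (expected, detected) pairs and classifies each cell; it is an illustration of the rules as described, not the report's actual code, and the function names and the use of `None` for an absent side are assumptions.

```python
from collections import defaultdict

MISSED = "(missed)"  # column: expected entity that Philter did not detect
NONE = "(none)"      # row: detection with no expected entity


def build_matrix(pairs):
    """Count (expected, detected) pairs; None marks the absent side."""
    matrix = defaultdict(lambda: defaultdict(int))
    for expected, detected in pairs:
        row = expected if expected is not None else NONE
        col = detected if detected is not None else MISSED
        matrix[row][col] += 1
    return {row: dict(cols) for row, cols in matrix.items()}


def cell_color(row, col):
    """Apply the color rules: red for misses and spurious hits,
    green for a correct type, orange for a wrong type."""
    if row == NONE or col == MISSED:
        return "red"
    return "green" if row == col else "orange"
```

For example, the pairs `("NAME", "ADDRESS")`, `("SSN", None)`, and `(None, "NAME")` produce one orange cell, one red cell in the (missed) column, and one red cell in the (none) row.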

Using the Confusion Matrix for Policy Tuning

The confusion matrix helps identify specific weaknesses in your Philter policy:

High counts in the (missed) column for a given entity type indicate that Philter is failing to detect that type. Consider adjusting the policy to add or tune the relevant filter.
Off-diagonal orange cells reveal type confusion. For example, if LOCATION entities are frequently detected as ADDRESS, the policy may need more specific patterns to distinguish the two.
High counts in the (none) row indicate false positives. Philter is flagging text that is not PII, which may require tightening filter rules or adjusting confidence thresholds.

The confusion matrix data is also included in the JSON report under the confusion_matrix field for programmatic analysis.
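A short Python sketch of such programmatic analysis follows. The document names the `confusion_matrix` field but not its internal layout, so the nested mapping assumed here (expected type to detected type to count, using the `(missed)` and `(none)` labels) is a guess that should be checked against a real `report.json`. The function names are hypothetical.

```python
import json


def load_matrix(path):
    """Read the confusion_matrix field from a JSON report.

    Assumption: the field is shaped like {expected: {detected: count}}.
    """
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)["confusion_matrix"]


def miss_rates(matrix):
    """Fraction of each expected type that landed in the (missed) column."""
    rates = {}
    for expected, row in matrix.items():
        if expected == "(none)":
            continue  # false positives have no expected type
        total = sum(row.values())
        if total:
            rates[expected] = row.get("(missed)", 0) / total
    return rates


def type_confusions(matrix):
    """List (expected, detected, count) for wrong-type cells, largest first."""
    cells = [
        (expected, detected, count)
        for expected, row in matrix.items() if expected != "(none)"
        for detected, count in row.items()
        if detected not in (expected, "(missed)") and count > 0
    ]
    return sorted(cells, key=lambda cell: cell[2], reverse=True)
```

Sorting entity types by miss rate gives a quick priority list for the policy tuning steps described above.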