CLI

go-phileas includes a command-line tool, phileas, that redacts sensitive information from text files using a JSON or YAML policy.

Installation

Build the binary from the repository root:

make build-cli

Or directly with go build:

go build -o phileas ./cmd/phileas

Or install it directly with go install:

go install github.com/philterd/go-phileas/cmd/phileas@latest

Usage

phileas --policy <policy.json|policy.yaml> --input <input.txt> [--context <context>]
phileas --policy <policy.json|policy.yaml> --input <input.txt> --evaluate --spans <spans.json> [--context <context>]
phileas --policy <policy.json|policy.yaml> --validate

Flags

Flag | Required | Description
--policy | Yes | Path to the JSON or YAML policy file.
--input | Yes (unless --validate is set) | Path to the input text file to redact.
--context | No | Context name to associate with the filter operation. If omitted, context checks are skipped.
--evaluate | No | Enable evaluation mode. Compares the spans identified by Phileas against a set of ground-truth spans and prints precision, recall, and F1.
--spans | When --evaluate is set | Path to a JSON file containing ground-truth spans. Required when --evaluate is set.
--validate | No | Validate the policy file and exit. Requires --policy; --input is not needed.

In standard mode, the redacted text is written to standard output. In evaluation mode, the precision, recall, and F1 metrics are written to standard output instead. If an error occurs, the error message is written to standard error and the process exits with a non-zero status code.

Example

Create a policy file policy.json:

{
  "identifiers": {
    "ssn": {
      "ssnFilterStrategies": [{"strategy": "REDACT", "redactionFormat": "{{{REDACTED-%t}}}"}]
    },
    "emailAddress": {
      "emailAddressFilterStrategies": [{"strategy": "STATIC_REPLACE", "staticReplacement": "[EMAIL]"}]
    }
  }
}

Create an input file input.txt:

My SSN is 123-45-6789 and my email is john@example.com.

Run the CLI:

phileas --policy policy.json --input input.txt

Output:

My SSN is {{{REDACTED-ssn}}} and my email is [EMAIL].

Using a context name

Passing --context associates the filter operation with a named context. This is useful when the same pieces of PII should be treated consistently across multiple calls within the same logical group (e.g., all records in a case file).

phileas --policy policy.json --input input.txt --context case-4821

When --context is omitted the CLI passes an empty string, effectively skipping any context-based grouping.

Redirecting output

Because the redacted text goes to stdout you can pipe it to other tools or redirect it to a file:

# Save redacted output to a file
phileas --policy policy.json --input input.txt > redacted.txt

# Pipe into another command
phileas --policy policy.json --input input.txt | wc -c

Policy format

The --policy flag accepts any valid go-phileas JSON or YAML policy. Files with a .yaml or .yml extension are parsed as YAML; all other files are parsed as JSON. See Policies for the full schema and all available identifier options.
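The extension rule above can be sketched in a few lines of Go. This is only an illustration of the rule as described, not the CLI's actual code: policyFormat is a hypothetical helper, and the comparison is case-sensitive because this page does not say how the CLI treats extensions such as .YAML.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// policyFormat is a hypothetical helper (not part of go-phileas). It mirrors
// the documented rule: .yaml and .yml select YAML, everything else is JSON.
// Assumption: the match is case-sensitive.
func policyFormat(path string) string {
	switch filepath.Ext(path) {
	case ".yaml", ".yml":
		return "yaml"
	default:
		return "json"
	}
}

func main() {
	// A file with no extension falls through to JSON, like any other name.
	for _, p := range []string{"policy.json", "policy.yaml", "policy.yml", "policy"} {
		fmt.Printf("%s -> %s\n", p, policyFormat(p))
	}
}
```

One practical consequence of the rule: a YAML policy saved under a name without a .yaml or .yml extension would be parsed as JSON, so the file name matters.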

Validating a policy

Use the --validate flag to check whether a policy file is well-formed without redacting any text. The format (JSON or YAML) is detected automatically from the file extension.

phileas --policy policy.json --validate

If the policy is valid, the CLI prints "Policy is valid." and exits with status 0. If the policy contains formatting errors, the CLI prints the error to standard error and exits with a non-zero status code.

# Valid policy
$ phileas --policy policy.json --validate
Policy is valid.

# Invalid policy
$ phileas --policy bad-policy.json --validate
policy is not valid: invalid character 'b' looking for beginning of object key string

Evaluating filter performance

The --evaluate flag switches the CLI into evaluation mode. Instead of printing redacted text, Phileas compares the spans it finds against a set of human-labeled (ground-truth) spans and prints precision, recall, and F1.

This is useful for:

  • Benchmarking how well a policy detects a specific type of sensitive information.
  • Tuning policy parameters and measuring the impact on detection quality.
  • Regression testing after policy changes.

Ground-truth spans file format

The --spans file must be a JSON array of span objects. Only characterStart and characterEnd are used for matching; all other fields are optional.

[
  {"characterStart": 10, "characterEnd": 21},
  {"characterStart": 35, "characterEnd": 51}
]

A span in the predicted output is counted as a true positive when a ground-truth span with the same characterStart and characterEnd exists. Predicted spans with no matching ground-truth span are false positives; ground-truth spans that were not predicted are false negatives.
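The matching rule above can be sketched as a small Go program. It is a minimal illustration under stated assumptions, not the implementation used by the CLI: the Metrics type and the evaluate function are hypothetical names, spans within each list are assumed to be unique, and a ratio with a zero denominator is reported as 0.

```go
package main

import "fmt"

// Span carries the two fields used for matching.
type Span struct {
	CharacterStart int `json:"characterStart"`
	CharacterEnd   int `json:"characterEnd"`
}

// Metrics is a hypothetical result type for this sketch.
type Metrics struct {
	TP, FP, FN            int
	Precision, Recall, F1 float64
}

// evaluate counts exact (start, end) matches between predicted and truth.
// Assumption: each list contains no duplicate spans.
func evaluate(predicted, truth []Span) Metrics {
	truthSet := make(map[Span]bool, len(truth))
	for _, s := range truth {
		truthSet[s] = true
	}
	var m Metrics
	for _, s := range predicted {
		if truthSet[s] {
			m.TP++ // same start and end as a ground-truth span
		} else {
			m.FP++ // predicted, but not in the ground truth
		}
	}
	m.FN = len(truthSet) - m.TP // ground-truth spans that were not predicted
	if m.TP+m.FP > 0 {
		m.Precision = float64(m.TP) / float64(m.TP+m.FP)
	}
	if m.TP+m.FN > 0 {
		m.Recall = float64(m.TP) / float64(m.TP+m.FN)
	}
	if m.Precision+m.Recall > 0 {
		m.F1 = 2 * m.Precision * m.Recall / (m.Precision + m.Recall)
	}
	return m
}

func main() {
	truth := []Span{{10, 21}, {38, 54}}
	predicted := []Span{{10, 21}, {35, 51}} // second span is off by three
	m := evaluate(predicted, truth)
	fmt.Printf("TP=%d FP=%d FN=%d P=%.4f R=%.4f F1=%.4f\n",
		m.TP, m.FP, m.FN, m.Precision, m.Recall, m.F1)
}
```

Because matching is exact, a predicted span that overlaps a ground-truth span but differs in either offset counts as one false positive and one false negative.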

Evaluation example

Given input.txt:

My SSN is 123-45-6789 and my email is john@example.com.

And spans.json, which marks the SSN at offsets 10–21 and the email address at offsets 38–54 (zero-based, with the end offset pointing one past the last character):

[
  {"characterStart": 10, "characterEnd": 21},
  {"characterStart": 38, "characterEnd": 54}
]

And policy.json:

{
  "identifiers": {
    "ssn": {},
    "emailAddress": {}
  }
}

Run:

phileas --policy policy.json --input input.txt --evaluate --spans spans.json

Output:

True Positives:  2
False Positives: 0
False Negatives: 0
Precision:       1.0000
Recall:          1.0000
F1:              1.0000
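Counting offsets by hand is error-prone, so the entries in spans.json for this example can be derived from the input text instead. The sketch below assumes zero-based offsets with an exclusive end, which matches the example values, and counts in characters (runes) rather than bytes; whether Phileas counts runes or bytes for non-ASCII text is not stated on this page. spanOf is a hypothetical helper, not part of go-phileas.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"unicode/utf8"
)

// Span matches the shape of an entry in the ground-truth spans file.
type Span struct {
	CharacterStart int `json:"characterStart"`
	CharacterEnd   int `json:"characterEnd"`
}

// spanOf is a hypothetical helper. It returns the span of the first occurrence
// of needle in text, counted in runes, with an exclusive end offset.
// ok is false when needle does not occur in text.
func spanOf(text, needle string) (span Span, ok bool) {
	i := strings.Index(text, needle) // byte index of the first occurrence
	if i < 0 {
		return Span{}, false
	}
	start := utf8.RuneCountInString(text[:i])
	return Span{start, start + utf8.RuneCountInString(needle)}, true
}

func main() {
	text := "My SSN is 123-45-6789 and my email is john@example.com."
	var spans []Span
	for _, needle := range []string{"123-45-6789", "john@example.com"} {
		if s, ok := spanOf(text, needle); ok {
			spans = append(spans, s)
		}
	}
	// Print the spans as a JSON array suitable for the --spans flag.
	out, err := json.MarshalIndent(spans, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Note that spanOf only finds the first occurrence of each string; an input that repeats the same value would need a loop over all occurrences.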