How It Works
This module plugs into Phileas through a small service-provider interface (SPI) and reimplements the GLiNER inference pipeline that PhEye runs through the Python gliner library.
The SPI and service discovery
Phileas core defines a small SPI in the package ai.philterd.phileas.services.filters.ai.pheye:
- PhEyeDetector: produces raw detections (PhEyeSpan) for a piece of text.
- PhEyeDetectorProvider: builds a detector. It is discovered at runtime via java.util.ServiceLoader.
When a PhEye filter is configured with a modelPath (the policy field, or PhiSQL's DETECT PHEYE ... MODEL '<path>'), PhEyeFilter looks up a PhEyeDetectorProvider on the classpath and asks it to build a local detector. With no modelPath, Phileas uses the remote HTTP detector instead.
This module ships LocalPhEyeDetectorProvider and registers it in META-INF/services/ai.philterd.phileas.services.filters.ai.pheye.PhEyeDetectorProvider. Because the provider is registered for ServiceLoader, simply adding this module as a dependency makes local inference available. No extra wiring is required.
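As a rough illustration of how ServiceLoader discovery of this kind tends to look, here is a minimal sketch. The Detector and DetectorProvider interfaces, their method signatures, and the chooseMode helper are hypothetical stand-ins and are not the actual Phileas SPI. The "no-provider" outcome is also an assumption, since what PhEyeFilter does when a modelPath is set but no provider is found is not described here.

```java
import java.util.Optional;
import java.util.ServiceLoader;

class DetectorLookup {

    // Hypothetical stand-ins for the SPI types; the real Phileas signatures may differ.
    interface Detector {
        String describe();
    }

    interface DetectorProvider {
        Detector create(String modelPath);
    }

    // Returns the first provider registered under META-INF/services, if any.
    static Optional<DetectorProvider> findProvider() {
        return ServiceLoader.load(DetectorProvider.class).findFirst();
    }

    // Assumed selection logic: no model path means remote detection;
    // a model path plus a discovered provider means local detection.
    static String chooseMode(String modelPath) {
        if (modelPath == null || modelPath.isBlank()) {
            return "remote";
        }
        return findProvider().isPresent() ? "local" : "no-provider";
    }

    public static void main(String[] args) {
        System.out.println(chooseMode(null));
        System.out.println(chooseMode("/path/to/model"));
    }
}
```

ServiceLoader reads one fully qualified implementation class name per line from the registration file named after the service interface, which is why putting the module's jar on the classpath is enough for the provider to be found.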
The detector returns PhEyeSpan results. The PhEyeFilter then applies its per-label thresholds and replacement strategies on top, exactly as it does for remote detections, so the rest of the redaction policy behaves identically whether detection ran locally or remotely.
The GLiNER pipeline
LocalPhEyeDetector reimplements the pipeline PhEye runs through the Python gliner library (GLiNER 0.2.25, uni-encoder span model, markerV0):
- Split text into words with a Unicode-aware whitespace splitter (regex \w+(?:[-_]\w+)*|\S), keeping each word's character offsets.
- Build the prompt as [<<ENT>> label]* <<SEP>> <text words>.
- Tokenize the prompt and text pre-split, and build a words_mask marking the first subtoken of each text word.
- Enumerate candidate spans [i, i+width] for width in 0..max_width-1 (max_width is 12 for the PII model).
- Run the ONNX model with inputs input_ids, attention_mask, words_mask, text_lengths, span_idx, and span_mask, and read the logits output of shape [words, width, classes].
- Decode: apply sigmoid(logits) > threshold, then perform greedy flat (non-overlapping, highest score first) selection.
- Map the selected word spans back to character offsets and return them as PhEyeSpan results.
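The preprocessing steps above (word splitting, prompt layout, span enumeration, and mapping word spans back to characters) can be sketched roughly as follows. This is an illustrative approximation, not the module's code: the class, record, and method names are hypothetical, subword tokenization is omitted, and treating spans that run past the last word as "invalid" is an assumption about how span_mask is filled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GlinerPreprocessing {

    // A word plus its character offsets in the original text (end is exclusive).
    record Word(String text, int start, int end) {}

    // A candidate span over word indices (both inclusive) and whether it fits in the text.
    record Span(int startWord, int endWord, boolean valid) {}

    // Regex from the list above; UNICODE_CHARACTER_CLASS lets \w match non-ASCII letters.
    private static final Pattern WORD =
            Pattern.compile("\\w+(?:[-_]\\w+)*|\\S", Pattern.UNICODE_CHARACTER_CLASS);

    static List<Word> splitWords(String text) {
        List<Word> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(new Word(m.group(), m.start(), m.end()));
        }
        return words;
    }

    // Prompt layout: "<<ENT>> label" per label, then "<<SEP>>", then the text words.
    static List<String> buildPrompt(List<String> labels, List<Word> words) {
        List<String> prompt = new ArrayList<>();
        for (String label : labels) {
            prompt.add("<<ENT>>");
            prompt.add(label);
        }
        prompt.add("<<SEP>>");
        for (Word w : words) {
            prompt.add(w.text());
        }
        return prompt;
    }

    // One span per (start word, width) pair, for width in 0..maxWidth-1.
    static List<Span> enumerateSpans(int numWords, int maxWidth) {
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < numWords; i++) {
            for (int width = 0; width < maxWidth; width++) {
                int end = i + width;
                spans.add(new Span(i, end, end < numWords));
            }
        }
        return spans;
    }

    // Maps a valid word span back to the covered substring of the original text.
    static String spanText(String text, List<Word> words, Span span) {
        return text.substring(words.get(span.startWord()).start(),
                              words.get(span.endWord()).end());
    }

    public static void main(String[] args) {
        String text = "Dr. Jean-Luc lives here";
        List<Word> words = splitWords(text);
        System.out.println(buildPrompt(List.of("person"), words));
        System.out.println(spanText(text, words, new Span(0, 2, true)));
    }
}
```

Enumerating every (start, width) pair, including the ones that overrun the text, keeps the span list aligned with the [words, width, classes] shape of the logits, so a span's position in the list identifies its logit row directly.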
The span width, maximum length, and prompt tokens come from the model's gliner_config.json. See Model Directory for the files this requires, and Limitations and Accuracy for the parity status of this port.
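The decode step of the pipeline (sigmoid, threshold, greedy flat selection) could look something like the sketch below. It assumes the logits have already been copied into a plain float[words][width][classes] array. The Candidate record and decode method are hypothetical names, and details such as tie-breaking between equal scores may differ from the Python gliner library.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class GlinerDecoding {

    // A scored span over word indices (both inclusive) for one label index.
    record Candidate(int startWord, int endWord, int labelIndex, double score) {}

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // logits is indexed as [start word][width][class].
    static List<Candidate> decode(float[][][] logits, double threshold) {
        int numWords = logits.length;
        List<Candidate> candidates = new ArrayList<>();
        for (int i = 0; i < numWords; i++) {
            for (int w = 0; w < logits[i].length; w++) {
                int end = i + w;
                if (end >= numWords) {
                    continue; // span runs past the last word
                }
                for (int c = 0; c < logits[i][w].length; c++) {
                    double score = sigmoid(logits[i][w][c]);
                    if (score > threshold) {
                        candidates.add(new Candidate(i, end, c, score));
                    }
                }
            }
        }
        // Greedy flat selection: highest score first, skipping anything that overlaps a kept span.
        candidates.sort((a, b) -> Double.compare(b.score(), a.score()));
        List<Candidate> selected = new ArrayList<>();
        for (Candidate cand : candidates) {
            boolean overlaps = false;
            for (Candidate kept : selected) {
                if (cand.startWord() <= kept.endWord() && kept.startWord() <= cand.endWord()) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                selected.add(cand);
            }
        }
        // Return spans in reading order.
        selected.sort(Comparator.comparingInt(Candidate::startWord));
        return selected;
    }
}
```

Because selection is greedy, a short high-scoring span can block a longer overlapping span with a slightly lower score. That is the expected behavior of flat (non-nested) decoding and not a bug in the sketch.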