Audit log

Every state-changing action in Arbiter is recorded in the audit_log collection in MongoDB. There are two ways to view and export it:

  • Admin → Audit log (/admin/audit) — filtered slices across the whole system, exported as JSON or CSV. Admins only.
  • Audit Log popup on the Document Queue — the full history for a single document (document-level events plus all of its span events), shown inline in a modal. Admin only — the underlying GET /api/v1/documents/{id}/history endpoint requires ROLE_ADMIN. Admins can also download the history as CSV from the popup.

What gets logged

Action | Resource | Notes
LOGIN (success / failure) | User | Authentication outcome. Every form-login attempt produces one row regardless of result. A failure row carries a reason detail naming the underlying exception (e.g. BadCredentialsException, LockedException, UsernameNotFoundException) so an operator can tell wrong-password from unknown-user from rate-limited at a glance. The attempted email is always recorded, even when the email doesn't match a known account.
LOGOUT | User | Manual sign-out
BATCH_CREATE | Batch | Includes name, group, thresholds
BATCH_GROUP_CHANGE | Batch | Old → new group
BATCH_THRESHOLDS_CHANGE | Batch | Both PII and Document threshold deltas
BATCH_WEIGHTS_CHANGE | Batch | Override map
BATCH_WEIGHTS_RESET | Batch | All overrides cleared
BATCH_CLOSE | Batch | Records who closed it and when
DOCUMENT_UPLOAD | Document | Web upload
DOCUMENT_INGEST | Document | API ingest (POST /api/v1/ingest)
DOCUMENT_IMPORT | Document | One row per document pulled in by a data-import background job. Outcome is SUCCESS for newly-imported documents and SKIPPED for placeholders written when the source row already had a Document. Details include the import jobId and the source attribution (sourceSystem, sourceUrl, sourceIndex, sourceDocId) so the document can be traced back to the OpenSearch / Elasticsearch hit, S3 object, or local file it came from.
DATA_IMPORT_STARTED | BackgroundJob | Fired when a data-import job is promoted from PENDING to RUNNING by the dispatcher. Details include the job's type, sourceId, batchId, and friendly names.
DATA_IMPORT_COMPLETED | BackgroundJob | Terminal SUCCESS row for a data-import job. Details include processed, failed, skipped counters.
DATA_IMPORT_FAILED | BackgroundJob | Terminal FAILURE row for a data-import job (validation failure at queue time, or runtime error during the run). Carries the same counters plus an error string.
DOCUMENT_STATUS_CHANGE | Document | Generic status transition; also fires on Approve / Unapprove / Unreject
DOCUMENT_APPROVAL | Document | Reviewer approved a document; payload includes approvedBy, acquired, and required approval counts
DOCUMENT_REJECT | Document | Reviewer rejected a document; payload includes previous status and rejectedBy
DOCUMENT_UNAPPROVE | Document | Approval rescinded and document returned to review
DOCUMENT_UNREJECT | Document | Rejection rescinded and document returned to review
DOCUMENT_FINALIZE | Document | Document finalized; includes certificateId and documentHash
DOCUMENT_DOWNLOAD | Document | Reviewer downloaded a finalized redacted document
DOCUMENT_AUDIT_EXPORT | Document | Admin downloaded the per-document audit log as CSV
DOCUMENT_PII_SENT_TO_LLM | Document | Fired before each outbound Ollama LLM call (Explain or Second Opinion). Records the Ollama instance, model, and span count. Written before the HTTP request so the entry exists even if Ollama is unreachable or returns an error. See LLM-as-a-Judge.
DOCUMENT_LOCK_BROKEN | Document | Admin forcibly cleared a review lock held by another user
FINALIZATION_POLICY_APPLIED | Document | A finalization policy ran after document finalize; payload includes policyId, policyName, option, and action
SPAN_UPDATE | Span | Status and/or type change
SPAN_REDACT_LIKE | Span | Counts of new-and-approved spans
USER_CREATE / _UPDATE / _DELETE | User | Admin user management
GROUP_CREATE / _UPDATE / _DELETE | Group | Admin group management
API_KEY_GENERATE / _REVOKE | User | Per-user API key lifecycle
PASSWORD_CHANGE | User | Self-service password change
NOTIFICATION_SETTINGS_CHANGE | Settings | SMTP settings save (excluding the password value)
INBOX_MESSAGE_READ | InboxMessage | User marked an inbox notification as read
INBOX_MESSAGE_ARCHIVED | InboxMessage | User archived an inbox notification
OPENSEARCH_DATASOURCE_CREATE / _UPDATE / _DELETE | OpenSearchDataSource | Data source registered, edited (with passwordChanged boolean), or removed (see Data sources)
ELASTICSEARCH_DATASOURCE_CREATE / _UPDATE / _DELETE | ElasticsearchDataSource | Same shape as OpenSearch, but with a separate collection and a separate audit lineage
S3_DATASOURCE_CREATE / _DELETE | S3DataSource | S3 data source registered or removed
RDB_DATASOURCE_CREATE / _DELETE | RelationalDbDataSource | Relational database data source registered or removed
RDB_DANGEROUS_SQL_BLOCKED | RelationalDbDataSource | An RDB save was rejected because the SQL contains a disallowed keyword (DELETE, TRUNCATE, DROP). Payload includes the matched keywords and offending SQL; entityId is null because nothing was saved.
LOCAL_DATASOURCE_CREATE / _DELETE | LocalDirectoryDataSource | Local directory data source registered or removed
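
As a rough illustration of how the LOGIN failure reason might be consumed by a reporting script, the sketch below maps the exception names from the table to short categories. It assumes the exception name is stored under a reason key inside details and that the row's action is LOGIN with outcome FAILURE. The category labels, and the pairing of LockedException with "rate-limited", are inferred from the table's wording and are not confirmed behaviour.

```python
# Hypothetical mapping from the exception name on a LOGIN failure row
# to a short operator-facing category. Labels are illustrative only.
FAILURE_CATEGORIES = {
    "BadCredentialsException": "wrong-password",
    "UsernameNotFoundException": "unknown-user",
    "LockedException": "rate-limited",
}


def classify_login_failure(entry: dict) -> str:
    """Return a category for a failed LOGIN row, or "n/a" for any other row."""
    if entry.get("action") != "LOGIN" or entry.get("outcome") != "FAILURE":
        return "n/a"
    # Assumption: the exception name lives under details["reason"].
    reason = (entry.get("details") or {}).get("reason", "")
    return FAILURE_CATEGORIES.get(reason, "other")
```

Unrecognised exception names fall through to "other", so a new failure type shows up in a report instead of being silently dropped.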

Each entry stores:

  • timestamp (UTC Instant)
  • userEmail and userId (when the actor is signed in)
  • action, resourceType, resourceId
  • outcome (SUCCESS / FAILURE; DOCUMENT_IMPORT rows can also be SKIPPED)
  • ipAddress (honors X-Forwarded-For first hop)
  • details (per-action contextual map; never includes secrets)
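
For readers who process exported entries in scripts, the field list above could be modelled roughly as follows. This is a sketch only: the Python names and types are assumptions, not Arbiter's own classes, and first_hop shows one plausible reading of "honors X-Forwarded-For first hop".

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass
class AuditEntry:
    """Illustrative client-side mirror of one audit_log entry."""
    timestamp: datetime                  # UTC instant
    action: str                          # e.g. "SPAN_UPDATE"
    resource_type: str                   # e.g. "Document"
    outcome: str                         # e.g. "SUCCESS" or "FAILURE"
    resource_id: Optional[str] = None
    user_email: Optional[str] = None     # absent when no actor is signed in
    user_id: Optional[str] = None
    ip_address: Optional[str] = None
    details: dict[str, Any] = field(default_factory=dict)


def first_hop(x_forwarded_for: Optional[str], remote_addr: str) -> str:
    """Return the first address in X-Forwarded-For, else the socket address."""
    if x_forwarded_for:
        first = x_forwarded_for.split(",")[0].strip()
        if first:
            return first
    return remote_addr
```

For example, first_hop("203.0.113.7, 10.0.0.1", "10.0.0.2") yields "203.0.113.7", while a missing header falls back to the socket address.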

Export

The export form on Admin → Audit log (/admin/audit) has:

  • Start time / End time — required, interpreted in the server's local timezone. The page pre-fills the past 24 hours.
  • User email — optional exact match.
  • Resource type — optional, one of User / Group / Batch / Document / Span (or "All").
  • Resource ID — optional exact match.
  • Preview — runs the same query as the export and shows the first 10 matching entries inline so you can sanity-check filters before committing to a download. If the preview is empty, the download will be empty too — widen the time range or relax filters.
  • Download JSON / Download CSV — same filters, two formats.

Both formats include every field listed under "Each entry stores" above. CSV embeds the per-action details map as a JSON-encoded string in a single quoted column so spreadsheet readers preserve it.
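
Because details arrives as a JSON string inside one CSV column, a script reading the export has to decode it separately. The sketch below shows one way to do that with the standard library. The header names (timestamp, action, details) and the sample row, including its oldGroup / newGroup keys, are made up for illustration; check them against a real export.

```python
import csv
import io
import json


def read_audit_csv(text: str) -> list[dict]:
    """Parse an exported CSV and decode the JSON-encoded details column."""
    rows = []
    for row in csv.DictReader(io.StringIO(text, newline="")):
        raw = row.get("details") or ""
        # An empty cell is treated as an empty details map.
        row["details"] = json.loads(raw) if raw else {}
        rows.append(row)
    return rows


# Made-up sample: the details cell is quoted, with inner quotes doubled.
sample = (
    "timestamp,action,details\r\n"
    '2024-01-01T00:00:00Z,BATCH_GROUP_CHANGE,"{""oldGroup"": ""A"", ""newGroup"": ""B""}"\r\n'
)
rows = read_audit_csv(sample)
```

After parsing, rows[0]["details"] is a regular dict, so individual keys can be filtered or aggregated without further string handling.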

The export is capped at 100,000 rows per request. Narrow the time range or filters if you hit the cap.
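
One way to stay under the cap for a long period is to export it as several shorter windows. The helper below is a minimal sketch that splits a range into consecutive windows; how each window is then submitted (through the form or otherwise) is left open, and whether Arbiter treats the end time as inclusive is not stated here, so rows on a boundary may need de-duplicating.

```python
from datetime import datetime, timedelta
from typing import Iterator


def split_range(start: datetime, end: datetime,
                step: timedelta) -> Iterator[tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)  # the last window may be shorter
        yield cursor, upper
        cursor = upper
```

If a single window still hits the cap, halve the step for that window and retry.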

Per-document audit log

Every row in the Document Queue (/queue) has an Audit Log button that opens a modal showing the full history for that one document. This includes both document-level events (ingest, status changes, finalization) and all events on spans that belong to the document (status changes, type changes, manual creation, deletion, second-opinion requests). Entries are sorted newest first and paginated 10 per page.

Admin only. The underlying GET /api/v1/documents/{id}/history endpoint requires ROLE_ADMIN because the history includes raw PII span text. Non-admin reviewers receive 403 if they call the endpoint directly; the popup button is hidden for non-admins in the UI.

Download CSV

The popup has a Download button that exports the document's full audit history (not just the page currently shown) as a CSV file, sorted newest to oldest. The file is named audit-log-<documentId>.csv.

Only administrators can download the audit log. For non-admin users the Download button is rendered disabled with a tooltip explaining the restriction; the underlying API endpoint (GET /api/v1/documents/{id}/history.csv) also rejects non-admin requests with HTTP 403.
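
For scripted access, the two per-document endpoints differ only in the .csv suffix. The sketch below builds their URLs, plus the resolveActors query parameter that the JSON history endpoint accepts. The base URL is a placeholder, and authentication is deliberately left out because the required header or cookie is not described on this page.

```python
from urllib.parse import quote, urlencode


def history_url(base_url: str, document_id: str, *,
                as_csv: bool = False, resolve_actors: bool = False) -> str:
    """Build the URL of the per-document history endpoint (JSON or CSV)."""
    path = f"/api/v1/documents/{quote(document_id, safe='')}/history"
    if as_csv:
        path += ".csv"
    # resolveActors applies to the JSON endpoint; the CSV always uses emails.
    query = ""
    if resolve_actors and not as_csv:
        query = "?" + urlencode({"resolveActors": "true"})
    return base_url.rstrip("/") + path + query
```

A caller without ROLE_ADMIN should expect HTTP 403 from either URL, so client code needs to handle that status explicitly.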

The export itself is audited. Before generating the CSV, Arbiter writes a DOCUMENT_AUDIT_EXPORT entry capturing the admin who initiated the download and the timestamp. Because the entry is recorded before the audit log is queried, and the CSV is sorted newest-first, the export event appears as the top row of every downloaded file — so each export is self-attesting.
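
A recipient of the file can check this property mechanically. The following sketch reads only the first data row and confirms that its action is DOCUMENT_AUDIT_EXPORT. It assumes the CSV header uses the column name action, as listed under Columns.

```python
import csv
import io


def export_is_self_attesting(csv_text: str) -> bool:
    """True if the first data row records the export itself."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    first = next(reader, None)  # None when the file has no data rows
    return first is not None and first.get("action") == "DOCUMENT_AUDIT_EXPORT"
```

The check says nothing about tampering further down the file; it only confirms that the export event sits where the page says it should.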

The CSV is intended as a chain-of-custody artifact and is redacted on export: the PII text of each span is omitted. Instead, span entries include the span's location so you can correlate the entry back to the original document without leaking the PII value itself.

Columns:

Column | Notes
timestamp | ISO-8601 UTC instant
actor | Email address of the user who performed the action (blank if system). The CSV always uses the email address; the JSON history endpoints (GET …/history and GET …/spans/{id}/history) use the MongoDB user ID by default. Pass ?resolveActors=true (admin only) to receive email addresses there instead.
action | The action code (e.g. SPAN_UPDATE, DOCUMENT_INGEST)
resourceType | Document or Span
resourceId | The document or span ID
spanType | PII type, populated only for span entries
spanCharacterStart | Inclusive start offset in the document text (span entries)
spanCharacterEnd | Exclusive end offset in the document text (span entries)
spanPage | 1-based page number for PDF documents (span entries)
details | The per-action details map, JSON-encoded in a quoted field

spanText is never present in the CSV. If you need to correlate an entry back to the actual PII value, do so against your secured copy of the original document using the character offsets and page number.
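
Since the start offset is inclusive and the end offset exclusive, the pair maps directly onto a Python slice. The sketch below recovers a span's text from a secured copy of the document. It assumes the offsets count characters the same way Python strings do (Unicode code points); this page does not state the unit, so verify it on a document containing non-ASCII text before relying on it.

```python
def span_text(document_text: str, start: int, end: int) -> str:
    """Return the text covered by [start, end) in the original document."""
    if not 0 <= start <= end <= len(document_text):
        raise ValueError("offsets fall outside the document text")
    return document_text[start:end]
```

For instance, span_text("Contact Jane Doe today", 8, 16) returns "Jane Doe".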

LLM-as-a-Judge events

Every call that sends document content to an Ollama instance produces a DOCUMENT_PII_SENT_TO_LLM entry. The event is written before the HTTP request leaves Arbiter so the record exists even when Ollama is unreachable or returns an error.

Field in details | Content
instanceId | MongoDB ID of the Ollama instance
instanceName | Display name of the Ollama instance
model | Model name sent to Ollama (e.g. llama3)
spanCount | Number of PII spans included in the prompt (Explain calls only)
spanId | ID of the span being evaluated (Second Opinion calls only)
spanType | PII type of that span (Second Opinion calls only)

The resource type is always Document and the resource ID is the parent document's ID — even for Second Opinion calls, which originate from a span — because the full document text is always included in the prompt.
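
Because both call types share one action code and one resource type, the only way to tell them apart in an export is by which details keys are present. The sketch below makes that distinction, assuming each entry has already been parsed into a dict with action and details keys. The returned labels are illustrative.

```python
def llm_call_kind(entry: dict) -> str:
    """Classify a DOCUMENT_PII_SENT_TO_LLM entry by its details keys."""
    if entry.get("action") != "DOCUMENT_PII_SENT_TO_LLM":
        raise ValueError("not an LLM event")
    details = entry.get("details") or {}
    if "spanId" in details:       # present on Second Opinion calls only
        return "second-opinion"
    if "spanCount" in details:    # present on Explain calls only
        return "explain"
    return "unknown"
```

Counting entries per kind and per model gives a quick view of how much document content has left Arbiter for each Ollama instance.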

Ollama logging

Arbiter sends the full unredacted document text and all PII span values to Ollama in the request body. Ollama must be deployed with request-body logging disabled. If OLLAMA_DEBUG=1 is set or your Ollama deployment forwards request bodies to an external logging sink, PII will appear in logs outside Arbiter's security boundary. See Admin → LLM-as-a-Judge for configuration requirements.

Storage and retention

Audit entries are stored in MongoDB and are indexed by timestamp, the actor's email, action, resource type, and resource ID so the Audit Log search remains fast as the collection grows. Arbiter does not automatically expire entries. If you need bounded retention, create a MongoDB TTL index on the timestamp field or run a periodic deletion job.
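
For the periodic-deletion option, the core of such a job is a filter that selects entries older than the retention window. The sketch below builds that filter with the standard library only. The retention period is your choice, and the pymongo call in the trailing comment is an assumption about how you might run it; the connection URI and database name are placeholders.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retention_filter(retention_days: int,
                     now: Optional[datetime] = None) -> dict:
    """Build a MongoDB filter matching entries older than the retention window."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return {"timestamp": {"$lt": cutoff}}


# With pymongo installed, a scheduled job could run something like:
#   MongoClient("<uri>")["<database>"]["audit_log"].delete_many(retention_filter(365))
```

Run the filter through a count first to see how many entries a given retention period would remove before deleting anything.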

Failure modes

If writing an audit entry fails (e.g., MongoDB is unavailable for a moment), the action it describes still succeeds — the audit service swallows the exception and logs a warning. This trades audit completeness for application availability; review logs from the Arbiter process for any "Failed to write audit log entry" warnings.
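
A simple way to spot dropped entries is to scan the process logs for that warning text. The sketch below filters an iterable of log lines for the message quoted above; where the lines come from (a file, a log shipper, journald) depends on your deployment and is not assumed here.

```python
from typing import Iterable

# Warning text quoted in the paragraph above.
MARKER = "Failed to write audit log entry"


def dropped_audit_writes(log_lines: Iterable[str]) -> list[str]:
    """Return the log lines that report a failed audit write."""
    return [line.rstrip("\n") for line in log_lines if MARKER in line]
```

Passing an open file object works directly, for example dropped_audit_writes(open(path, encoding="utf-8")), with path pointing at wherever the Arbiter logs are written.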