Here is where it almost fails.
Here is why it doesn't.

A live stress test on a complex forensic passage. Adjust the confidence threshold. Click any redaction to inspect the rule. Download the audit log.

Threshold Decision Logic

The system does not guess. It applies a documented decision rule to every identifier.

Confidence scores are not metrics. They are decision inputs. Each score maps to exactly one of four outcomes.

95–100%: AUTO-ELIDE. Direct identifiers and near-certain indirect identifiers. No human review required.

80–94%: AUTO-ELIDE + FLAG. High-confidence identifiers with contextual or combination risk. Logged with flag for audit.

70–79%: HUMAN REVIEW. Ambiguous identifiers. System halts automatic elision. Reviewer receives context and rule.

< 70%: RETAIN + NOTE. Insufficient confidence for elision. Field retained in output. Decision documented in audit log.

Default threshold: 85% (Balanced). Configurable per protocol. Threshold and decision table version logged in every audit artifact.
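
The four bands can be read as a single lookup. The sketch below is a minimal Python illustration, not Elider's implementation: the names `Outcome` and `decide` are hypothetical, scores are assumed to be fractions in [0, 1], and fractional values between bands (for example 94.5%) are assumed to fall into the lower band.

```python
from enum import Enum


class Outcome(Enum):
    AUTO_ELIDE = "AUTO-ELIDE"
    AUTO_ELIDE_FLAG = "AUTO-ELIDE + FLAG"
    HUMAN_REVIEW = "HUMAN REVIEW"
    RETAIN_NOTE = "RETAIN + NOTE"


def decide(confidence: float) -> Outcome:
    """Map a confidence score in [0, 1] to exactly one of the four outcomes."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= 0.95:   # 95-100%
        return Outcome.AUTO_ELIDE
    if confidence >= 0.80:   # 80-94%
        return Outcome.AUTO_ELIDE_FLAG
    if confidence >= 0.70:   # 70-79%
        return Outcome.HUMAN_REVIEW
    return Outcome.RETAIN_NOTE  # below 70%
```

Because the checks run from the highest band down, every valid score lands in exactly one band and no score is left unclassified.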
Full Decision Walkthrough

The full lifecycle of a decision — from raw input to documented output.

This is not a clean case. One identifier is genuinely uncertain. Watch how the system handles it.

STEP 01: Input (RAW)
Event

"The evaluee is a 41-year-old bilingual male of Puerto Rican descent, currently residing in the Eastside neighborhood of Austin. He was referred by his public defender, Ms. Gabriela Reyes, following a competency evaluation ordered in Cause No. 2024-CR-04471 in Travis County District Court."

Pipeline Note

Raw forensic passage ingested. Document received in memory. Not written to disk. Parser isolates text from any metadata structure.

STEP 02: Detection (PASS 1)
Event

Named Entity Recognition pass identifies candidates: "41-year-old" (AGE), "bilingual male" (DEMOGRAPHIC), "Puerto Rican descent" (ETHNICITY), "Eastside neighborhood of Austin" (GEOGRAPHIC), "Gabriela Reyes" (PERSON), "2024-CR-04471" (CASE_NUMBER), "Travis County District Court" (JURISDICTION).

Pipeline Note

First pass: pattern recognition + NER model. Seven candidate identifiers flagged. Each assigned preliminary type classification.
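
The pattern-recognition half of this first pass can be sketched with regular expressions. This is an illustration under stated assumptions, not the actual detector: the two patterns, the `Candidate` tuple, and `pattern_pass` are hypothetical, and only AGE and CASE_NUMBER are covered. Types such as PERSON, ETHNICITY, and GEOGRAPHIC would come from the NER model, which is not shown.

```python
import re
from typing import NamedTuple


class Candidate(NamedTuple):
    text: str
    kind: str
    start: int
    end: int


# Hypothetical patterns for two structured identifier types only.
PATTERNS = {
    "AGE": re.compile(r"\b\d{1,3}-year-old\b"),
    "CASE_NUMBER": re.compile(r"\b\d{4}-CR-\d{5}\b"),
}


def pattern_pass(text: str) -> list:
    """Return pattern-matched candidates in document order."""
    found = []
    for kind, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            found.append(Candidate(m.group(), kind, m.start(), m.end()))
    return sorted(found, key=lambda c: c.start)
```

Keeping character offsets on each candidate lets later stages elide by position rather than by searching for the string again.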

STEP 03: Ambiguity (UNCERTAIN)
Event

"Eastside neighborhood of Austin" — geographic subdivision raises combination risk question. Austin population: ~978,000. Eastside: ~85,000. Combined with age, ethnicity, and legal proceeding: population estimate narrows significantly. Is this a direct identifier? No. Is it a combination risk? Possibly.

Pipeline Note

Second pass: contextual evaluation. System identifies that geographic subdivision + demographic combination + specific court proceeding creates potential re-identification pathway. Confidence: 76%. Below auto-elide threshold.

STEP 04: Scoring (COMPUTED)
Event

AGE (41): 0.94 → auto-elide + flag. DEMOGRAPHIC (bilingual male): 0.83 → auto-elide + flag. ETHNICITY (Puerto Rican): 0.91 → auto-elide + flag. GEOGRAPHIC (Eastside, Austin): 0.76 → human review. PERSON (Gabriela Reyes): 0.99 → auto-elide. CASE_NUMBER: 1.00 → auto-elide. JURISDICTION (Travis County): 0.88 → auto-elide + flag.

Pipeline Note

Each candidate scored against protocol threshold. Decision table applied. One field (GEOGRAPHIC) falls in the human review band. All others auto-resolved.

STEP 05: System Action (RESOLVED)
Event

Six identifiers auto-elided per threshold. One identifier ("Eastside neighborhood of Austin", confidence 0.76) routed to human review queue with full context: field value, type classification, confidence score, governing rule, and combination risk flag.

Pipeline Note

Pipeline halts automatic elision for the flagged field. All other decisions finalized. Output held pending human review decision on the geographic field.
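
The routing in this step can be sketched as a split between finalized decisions and a review queue that holds the output. The class and function names are hypothetical. The scores are the ones listed in Step 04, and the review band (70% up to but not including 80%) is taken from the decision table.

```python
from dataclasses import dataclass, field

REVIEW_LOW, REVIEW_HIGH = 0.70, 0.80  # human review band from the decision table


@dataclass(frozen=True)
class Scored:
    value: str
    kind: str
    confidence: float
    rule: str = ""   # governing rule shown to the reviewer
    flag: str = ""   # e.g. COMBINATION_RISK


@dataclass
class PipelineState:
    finalized: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)

    @property
    def output_held(self) -> bool:
        # Output is released only once the review queue is empty.
        return len(self.review_queue) > 0


def route(candidates) -> PipelineState:
    state = PipelineState()
    for c in candidates:
        if REVIEW_LOW <= c.confidence < REVIEW_HIGH:
            state.review_queue.append(c)
        else:
            state.finalized.append(c)
    return state


WALKTHROUGH = [
    Scored("41-year-old", "AGE", 0.94),
    Scored("bilingual male", "DEMOGRAPHIC", 0.83),
    Scored("Puerto Rican descent", "ETHNICITY", 0.91),
    Scored("Eastside neighborhood of Austin", "GEOGRAPHIC", 0.76,
           rule="Expert Determination", flag="COMBINATION_RISK"),
    Scored("Gabriela Reyes", "PERSON", 0.99),
    Scored("2024-CR-04471", "CASE_NUMBER", 1.00),
    Scored("Travis County District Court", "JURISDICTION", 0.88),
]
```

Calling `route(WALKTHROUGH)` leaves six entries finalized and one in the queue, with `output_held` true until a reviewer resolves the geographic field.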

STEP 06: Human Review (REVIEWER)
Event

Reviewer receives: field value, risk classification (COMBINATION_RISK), confidence score (76%), governing rule (Expert Determination — geographic subdivision combined with demographic and jurisdictional data), and the three contributing fields creating the combination risk. Reviewer decision: ELIDE. Reasoning: combination with ethnicity and specific court creates population below acceptable threshold.

Pipeline Note

Human reviewer sees exactly what the system saw. Decision documented. Reviewer identity and timestamp logged. Reasoning captured in audit trail.

STEP 07: Output (CLEAN)
Event

"The evaluee is a [AGE REDACTED] [DEMOGRAPHIC REDACTED] of [ETHNICITY REDACTED] descent, currently residing in [GEOGRAPHIC REDACTED]. He was referred by his public defender, [NAME REDACTED], following a competency evaluation ordered in [CASE NUMBER REDACTED] in [JURISDICTION REDACTED]."

Pipeline Note

De-identified output generated. All seven identifiers elided. Clinical content fully preserved. Document structure intact. Re-identification risk: below Expert Determination threshold per reviewer attestation.
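
One way to generate placeholders like these while leaving the surrounding text untouched is to elide by character span. The sketch below is an assumption about how such a step could work, not the system's code, and `redact` and `span_of` are hypothetical names. Spans are applied right to left so that earlier offsets remain valid after each substitution.

```python
def span_of(text: str, value: str, label: str) -> tuple:
    """Locate the first occurrence of value; raises ValueError if absent."""
    start = text.index(value)
    return (start, start + len(value), label)


def redact(text: str, spans) -> str:
    """Replace each (start, end, label) span with a [LABEL REDACTED] marker."""
    out = text
    previous_start = len(text)
    for start, end, label in sorted(spans, reverse=True):
        if not (0 <= start < end <= previous_start):
            raise ValueError("spans must be in range and non-overlapping")
        out = out[:start] + f"[{label} REDACTED]" + out[end:]
        previous_start = start
    return out
```

Working from offsets rather than from string replacement avoids accidentally eliding the same word where it appears as clinical content elsewhere in the document.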

STEP 08: Audit Log (DOCUMENTED)
Event

7 elision events logged. 6 auto-resolved by system. 1 escalated to human review. Reviewer decision, identity, timestamp, and reasoning captured. Decision source noted for each entry: RULE_ENGINE or REVIEWER. Reproducibility statement appended. Log available for IRB submission, deposition, or peer review.

Pipeline Note

Complete decision chain documented. Every elision traceable to its source: RULE_ENGINE or REVIEWER. Audit log signed with protocol version and threshold applied.
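
An audit entry of this kind could be modeled as a small record in which reviewer identity, timestamp, and reasoning are mandatory whenever the source is REVIEWER. The field names, `AuditEntry`, and `export_log` below are hypothetical, and the JSON layout is an assumption rather than the actual log format.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    entry_id: str
    kind: str
    confidence: float
    source: str                       # "RULE_ENGINE" or "REVIEWER"
    rule: str
    reviewer_id: Optional[str] = None
    timestamp: Optional[str] = None
    reasoning: Optional[str] = None

    def __post_init__(self):
        if self.source not in ("RULE_ENGINE", "REVIEWER"):
            raise ValueError(f"unknown decision source: {self.source}")
        if self.source == "REVIEWER" and not (
            self.reviewer_id and self.timestamp and self.reasoning
        ):
            raise ValueError("REVIEWER entries need identity, timestamp, reasoning")


def export_log(entries, protocol_version: str, threshold: float) -> str:
    """Serialize entries together with the protocol version and threshold."""
    return json.dumps(
        {
            "protocol_version": protocol_version,
            "threshold": threshold,
            "by_source": dict(Counter(e.source for e in entries)),
            "entries": [asdict(e) for e in entries],
        },
        indent=2,
        sort_keys=True,
    )
```

For the walkthrough above, the `by_source` summary would read six RULE_ENGINE entries and one REVIEWER entry.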

Identifiers in passage: 7
Auto-resolved by system: 6
Escalated to human review: 1
Design Philosophy

On Human Judgment

This system is explicitly designed to hand uncertainty to humans.

Not as a safety net. As a structural decision. The pipeline does not claim omniscience. It surfaces uncertainty with full context and stops. A human clinician, researcher, or reviewer makes the call. That decision is documented as a first-class event in the audit trail — with identity, timestamp, reasoning, and the information the system provided at the point of escalation.

What the system decides
Direct identifiers. High-confidence indirect identifiers. Cases above threshold with no ambiguity flags.
What the system escalates
Combination risk. Contextual identifiers. Relational references. Any field where confidence falls in the review band.
What the human decides
Everything the system is not certain about. The reviewer sees exactly what the system saw — no more, no less.
What gets documented either way
Every decision. Source: RULE_ENGINE or REVIEWER. Rule applied. Confidence score. Timestamp. Audit trail is complete regardless of escalation path.
Stress Test
Conservative (70% confidence): All flagged items elided. Maximum protection, some clinical detail removed.
Balanced (85% confidence): High-confidence identifiers elided. Contextual and relational flags sent to human review.
Permissive (95% confidence): Only direct identifiers elided. Indirect and combination risks flagged for review.
Identifiers Detected: 14
Elided at Threshold: 12
Sent to Human Review: 2
Confidence Range: 72–100%
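
The effect of moving the threshold can be sketched from the confidence values listed on this page: the 12 audit log entries plus the 2 review queue items. The rule used here (elide at or above the threshold, send everything below it to review) is an assumption inferred from the preset descriptions, and `split_at` is a hypothetical helper.

```python
PRESETS = {"Conservative": 0.70, "Balanced": 0.85, "Permissive": 0.95}

# Confidences from the audit log (entries 001-012) and the review queue (81%, 72%).
CONFIDENCES = [
    0.97, 0.99, 0.98, 0.96, 0.99, 0.94, 0.88,
    0.91, 0.97, 1.00, 1.00, 0.99, 0.81, 0.72,
]


def split_at(threshold: float, confidences=CONFIDENCES) -> tuple:
    """Return (elided_count, review_count) for a given threshold."""
    elided = sum(1 for c in confidences if c >= threshold)
    return elided, len(confidences) - elided


for name, threshold in PRESETS.items():
    elided, review = split_at(threshold)
    print(f"{name:<12} elided={elided:>2} review={review}")
```

Under this assumption the Balanced preset gives 12 elided and 2 sent to review, which matches the counts shown above.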
Input Passage

Client is a ████████ Hispanic female referred by her attorney, ████████ of ████████ in ████████. She presents with reported symptoms of PTSD following a motor vehicle accident on ████████, on ████████. Client reports that her ████████, who works at ████████, drove her to the initial ER visit at ████████. She currently resides at ████████, and can be reached at ████████. Her treating therapist, ████████, has documented twelve sessions of EMDR treatment since the incident.

Flag types: COMBINATION RISK, CONTEXTUAL, RELATIONAL REF, REVIEW REQUIRED
Human Review Queue

The system does not claim omniscience. It surfaces uncertainty. Human judgment determines the final boundary.

Type: Demographic
Confidence: 81%
Rule: HIPAA Expert Determination (demographic combination with geographic data creates re-identification risk above threshold)
Flag: COMBINATION RISK

Type: Treatment Detail
Confidence: 72%
Rule: Treatment frequency flagged as indirect identifier under Expert Determination; retained for review
Flag: REVIEW REQUIRED
Audit Log

Every elision decision logged with its rule, confidence, and flag. The audit trail is the methodology. Exportable for IRB review or deposition.

Document: ELIDER AUDIT LOG
Protocol: HIPAA Expert Determination
Reproducibility: Deterministic (same input yields identical output)
Operator Attestation: Each elision decision is documented with the governing rule, confidence score, and flag classification. This log may be submitted as methodology documentation in IRB review, research publication, or legal proceeding.
# | Type | Conf. | Flag | Source | Rule
001 | Age | 97.0% | - | RULE_ENGINE | HIPAA Safe Harbor §164.514(b)(2)(i): ages over 89 are direct identifiers; all other ages treated as potential indirect identifiers under Expert Determination
002 | Person Name | 99.0% | - | RULE_ENGINE | Named Entity Recognition: third-party person (attorney name)
003 | Organization | 98.0% | - | RULE_ENGINE | Named Entity Recognition: organization (law firm)
004 | Geographic | 96.0% | - | RULE_ENGINE | HIPAA Safe Harbor §164.514(b)(2)(i): geographic subdivision smaller than state
005 | Date | 99.0% | - | RULE_ENGINE | HIPAA Safe Harbor §164.514(b)(2)(i): dates directly related to individual
006 | Geographic | 94.0% | CONTEXTUAL | RULE_ENGINE + FLAG | Contextual geographic identifier: specific location of incident
007 | Person Name | 88.0% | RELATIONAL_REF | RULE_ENGINE + FLAG | Named Entity Recognition: relational reference; 'John' identified as proper name via relational clause 'her brother'
008 | Org + Location | 91.0% | COMBINATION_RISK | RULE_ENGINE + FLAG | Combination identifier: employer + location creates indirect re-identification pathway
009 | Organization | 97.0% | - | RULE_ENGINE | Named Entity Recognition: healthcare facility
010 | Street Address | 100.0% | - | RULE_ENGINE | HIPAA Safe Harbor §164.514(b)(2)(i): street address is direct identifier
011 | Phone Number | 100.0% | - | RULE_ENGINE | HIPAA Safe Harbor §164.514(b)(2)(i): telephone number is direct identifier
012 | Person Name | 99.0% | - | RULE_ENGINE | Named Entity Recognition: third-party person (treating clinician)
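
The reproducibility claim in the log header ("same input yields identical output") is the kind of property that can be checked by hashing a canonical form of each run. The sketch below is one possible approach, not the log's actual signing scheme: `run_digest` is a hypothetical helper, SHA-256 over canonical JSON is an assumed choice, and run-specific values such as timestamps are deliberately left out of the hashed payload.

```python
import hashlib
import json


def run_digest(passage: str, protocol_version: str, threshold: float,
               decisions: list) -> str:
    """Hash the inputs and decisions of one run in a canonical JSON form."""
    payload = json.dumps(
        {
            "passage": passage,
            "protocol_version": protocol_version,
            "threshold": threshold,
            "decisions": decisions,
        },
        sort_keys=True,             # key order must not affect the digest
        separators=(",", ":"),      # no whitespace variation
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two runs over the same passage, protocol version, threshold, and decisions produce the same digest, while changing any of them (for example the threshold) produces a different one.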
Reliability Boundary

Elider flags and scores. It does not decide uncertain cases. The pipeline is sufficient for direct identifiers and high-confidence indirect identifiers. Contextual, relational, and combination-risk identifiers require human clinical judgment at the review boundary. This is not a limitation. This is the design.

Re-Identification Risk Taxonomy

The governing question: is the re-identification risk acceptably low in context?

Elider evaluates four distinct risk categories. Each requires different detection logic. Each requires different documentation. All four appear in the audit log.

DIRECT: Direct Identifier Risk

A single field that identifies an individual without additional information. Names, addresses, phone numbers, SSNs, dates of birth.

Governing Standard
HIPAA Safe Harbor §164.514(b)(2)(i) — 18 enumerated identifiers
Example
"4821 Mockingbird Lane, Austin, TX 78745"
COMBINATION: Combination Risk

Two or more fields that, individually, would not identify an individual but, in combination, create a re-identification pathway. The canonical failure mode of Safe Harbor-only analysis.

Governing Standard
HIPAA Expert Determination §164.514(b)(1) — statistical re-identification analysis
Example
"Hispanic female" + "San Antonio" + "motor vehicle accident" + "2024"
CONTEXTUAL: Contextual Risk

A field that is not an identifier in isolation but becomes one given the context of the document, the population, or the proceeding. Geographic specifics, institutional references, incident descriptions.

Governing Standard
Expert Determination — contextual population analysis
Example
"I-35 near New Braunfels" in an accident report narrows population to witnesses and parties of a specific incident
LINKAGE: Linkage Risk

Re-identification achieved by linking the document to external datasets. Occupation, institutional affiliation, and unique family structures can create a population of one when linked to publicly available records.

Governing Standard
Expert Determination — external dataset linkage analysis
Example
"only senior electrical engineer at company" + "rural county <8,000" + "identical twin" = population of 1
Linkage Risk Case Study

The Linkage Risk Case

No single field is a direct identifier. The combination is.

Input Passage

Evaluee is a 52-year-old male, employed as a senior electrical engineer, who sustained a traumatic brain injury in a workplace accident. He resides in a rural county in Central Texas with a population under 8,000. He is the only employee at his company with his job title and tenure. His treating neurologist noted that he has an identical twin sibling, also a professional in the same field, which complicates baseline cognitive comparison.

Field-by-Field Risk Analysis
52-year-old male: LOW
Age + sex alone: insufficient for re-identification.
senior electrical engineer: LOW
Occupation alone: common enough in general population.
rural county, Central Texas, pop. <8,000: MEDIUM
Geographic specificity narrows population significantly.
only employee with this title and tenure: HIGH
Occupation + organizational position: near-unique within employer context.
identical twin, same profession: CRITICAL
Combination of age, geography, occupation, family structure, and professional field creates a likely population of one. This is a direct re-identification vector even without any name, address, or date.
Expert Determination Finding

No HIPAA Safe Harbor identifier appears in this passage. All 18 categories are technically absent. Under Safe Harbor alone, this document would pass. Under Expert Determination, it fails. The linkage of occupation, geography, organizational uniqueness, and family structure reduces the candidate population to, in all likelihood, a single individual. This is what Elider's two-pass architecture is designed to catch.
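
The field-by-field analysis can be illustrated with a toy count: each attribute alone matches several records, while the full combination matches one. The records below are entirely hypothetical and exist only to show the mechanics. The `matches` helper is likewise illustrative and is not a statistical re-identification analysis of the kind Expert Determination requires.

```python
FIELDS = ["age_band", "sex", "occupation", "county", "twin"]

# Hypothetical toy population; not real data.
RECORDS = [
    {"age_band": "50-54", "sex": "M", "occupation": "electrical engineer", "county": "A", "twin": True},
    {"age_band": "50-54", "sex": "M", "occupation": "teacher", "county": "B", "twin": False},
    {"age_band": "30-34", "sex": "F", "occupation": "electrical engineer", "county": "B", "twin": False},
    {"age_band": "40-44", "sex": "M", "occupation": "nurse", "county": "A", "twin": False},
    {"age_band": "50-54", "sex": "F", "occupation": "electrical engineer", "county": "C", "twin": True},
    {"age_band": "30-34", "sex": "M", "occupation": "electrical engineer", "county": "C", "twin": True},
]

TARGET = RECORDS[0]


def matches(records, target, fields) -> int:
    """Count records that agree with the target on every listed field."""
    return sum(all(r[f] == target[f] for f in fields) for r in records)


for f in FIELDS:
    print(f"{f:<11} alone matches {matches(RECORDS, TARGET, [f])} records")
print(f"all fields together match {matches(RECORDS, TARGET, FIELDS)} record")
```

No single field isolates the target in this toy set, yet the combination does, which is the pattern a Safe Harbor checklist alone would not surface.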