Adversarial Methodology
A live stress test on a complex forensic passage. Adjust the confidence threshold. Click any redaction to inspect the rule. Download the audit log.
How Confidence Scores Drive Decisions
Confidence scores are not metrics. They are decision inputs. Each score maps to exactly one of four outcomes.
Auto-Elide. Direct identifiers and near-certain indirect identifiers. No human review required.
Auto-Elide + Flag. High-confidence identifiers with contextual or combination risk. Elided, then logged with a flag for audit.
Human Review. Ambiguous identifiers. System halts automatic elision. Reviewer receives context and the governing rule.
Retain. Insufficient confidence for elision. Field retained in output. Decision documented in audit log.
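In code, the four-outcome decision table reduces to a single routing function. The numeric thresholds below are illustrative placeholders, not Elider's published values; only the band ordering and outcome names come from the table above.

```python
from enum import Enum

class Outcome(Enum):
    AUTO_ELIDE = "auto_elide"      # elide with no human review
    ELIDE_AND_FLAG = "elide_flag"  # elide, log with an audit flag
    HUMAN_REVIEW = "human_review"  # halt; route to reviewer with context
    RETAIN = "retain"              # keep field; document the decision

def route(score: float, combination_risk: bool = False,
          elide_threshold: float = 0.80,   # illustrative, not Elider's value
          review_threshold: float = 0.60) -> Outcome:
    """Map a confidence score to exactly one of the four outcomes."""
    if score >= elide_threshold:
        return Outcome.ELIDE_AND_FLAG if combination_risk else Outcome.AUTO_ELIDE
    if score >= review_threshold:
        return Outcome.HUMAN_REVIEW
    return Outcome.RETAIN
```

With these placeholder thresholds, a 0.83 score carrying a combination flag elides-and-flags, while a 0.76 score halts for human review.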
One Messy Example. Eight Steps. Nothing Skipped.
This is not a clean case. One identifier is genuinely uncertain. Watch how the system handles it.
"The evaluee is a 41-year-old bilingual male of Puerto Rican descent, currently residing in the Eastside neighborhood of Austin. He was referred by his public defender, Ms. Gabriela Reyes, following a competency evaluation ordered in Cause No. 2024-CR-04471 in Travis County District Court."
Raw forensic passage ingested. Document held in memory; never written to disk. Parser isolates text from any surrounding metadata structure.
Named Entity Recognition pass identifies candidates: "41-year-old" (AGE), "bilingual male" (DEMOGRAPHIC), "Puerto Rican descent" (ETHNICITY), "Eastside neighborhood of Austin" (GEOGRAPHIC), "Gabriela Reyes" (PERSON), "2024-CR-04471" (CASE_NUMBER), "Travis County District Court" (JURISDICTION).
First pass: pattern recognition + NER model. Seven candidate identifiers flagged. Each assigned preliminary type classification.
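The pattern-recognition half of the first pass can be sketched with regular expressions for the structured identifiers. The cause-number format below is inferred from this single example and would be configurable per jurisdiction in practice; names and demographics come from the NER model, not from patterns.

```python
import re

# Structured-identifier patterns for the first pass. The cause-number
# format is inferred from the one example "2024-CR-04471"; a real
# deployment would carry per-jurisdiction pattern sets.
PATTERNS = {
    "CASE_NUMBER": re.compile(r"\b\d{4}-CR-\d{5}\b"),
    "AGE": re.compile(r"\b\d{1,3}-year-old\b"),
}

text = ("The evaluee is a 41-year-old bilingual male ... following a "
        "competency evaluation ordered in Cause No. 2024-CR-04471.")

candidates = [(label, m.group())
              for label, rx in PATTERNS.items()
              for m in rx.finditer(text)]
# Pattern hits here: ("CASE_NUMBER", "2024-CR-04471"), ("AGE", "41-year-old")
```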
"Eastside neighborhood of Austin" — geographic subdivision raises combination risk question. Austin population: ~978,000. Eastside: ~85,000. Combined with age, ethnicity, and legal proceeding: population estimate narrows significantly. Is this a direct identifier? No. Is it a combination risk? Possibly.
Second pass: contextual evaluation. System identifies that geographic subdivision + demographic combination + specific court proceeding creates potential re-identification pathway. Confidence: 76%. Below auto-elide threshold.
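The combination-risk narrowing is simple multiplication of population fractions. The 85,000 Eastside figure comes from the passage above; every fraction below is an assumed placeholder, included only to show how quickly the candidate population collapses once fields are combined.

```python
eastside_pop = 85_000                 # from the passage
frac_male = 0.49                      # assumed
frac_age_41 = 0.015                   # assumed share of a single year of age
frac_puerto_rican = 0.02              # assumed
frac_active_criminal_case = 0.01      # assumed

candidates = eastside_pop
for frac in (frac_male, frac_age_41,
             frac_puerto_rican, frac_active_criminal_case):
    candidates *= frac

# With these placeholder fractions the expected candidate population is
# well under ten people, which is why the field draws a combination flag.
```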
AGE (41): 0.94 → auto-elide.
DEMOGRAPHIC (bilingual male): 0.83 → auto-elide + flag.
ETHNICITY (Puerto Rican): 0.91 → auto-elide.
GEOGRAPHIC (Eastside, Austin): 0.76 → human review.
PERSON (Gabriela Reyes): 0.99 → auto-elide.
CASE_NUMBER (2024-CR-04471): 1.00 → auto-elide.
JURISDICTION (Travis County): 0.88 → auto-elide.
Each candidate scored against protocol threshold. Decision table applied. One field (GEOGRAPHIC) falls in the human review band. All others auto-resolved.
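Applying an illustrative auto-elide cut of 0.80 (the actual threshold is configurable and not stated here) to the scores above reproduces the routing: six fields auto-resolve, one drops into the review band.

```python
scores = {
    "AGE": 0.94, "DEMOGRAPHIC": 0.83, "ETHNICITY": 0.91,
    "GEOGRAPHIC": 0.76, "PERSON": 0.99, "CASE_NUMBER": 1.00,
    "JURISDICTION": 0.88,
}
AUTO_ELIDE = 0.80  # illustrative threshold, not Elider's published value

auto_resolved = sorted(k for k, s in scores.items() if s >= AUTO_ELIDE)
review_queue = sorted(k for k, s in scores.items() if s < AUTO_ELIDE)
# Six fields auto-resolve; only GEOGRAPHIC lands in the review queue.
```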
Six identifiers auto-elided per threshold. One identifier ("Eastside neighborhood of Austin", confidence 0.76) routed to human review queue with full context: field value, type classification, confidence score, governing rule, and combination risk flag.
Pipeline halts automatic elision for the flagged field. All other decisions finalized. Output held pending human review decision on the geographic field.
Reviewer receives: field value, risk classification (COMBINATION_RISK), confidence score (76%), governing rule (Expert Determination — geographic subdivision combined with demographic and jurisdictional data), and the three contributing fields creating the combination risk. Reviewer decision: ELIDE. Reasoning: combination with ethnicity and specific court creates population below acceptable threshold.
Human reviewer sees exactly what the system saw. Decision documented. Reviewer identity and timestamp logged. Reasoning captured in audit trail.
"The evaluee is a [AGE REDACTED] bilingual male of [ETHNICITY REDACTED] descent, currently residing in [GEOGRAPHIC REDACTED]. He was referred by his public defender, [NAME REDACTED], following a competency evaluation ordered in [CASE NUMBER REDACTED] in [JURISDICTION REDACTED]."
De-identified output generated. All seven identifiers elided. Clinical content fully preserved. Document structure intact. Re-identification risk: below Expert Determination threshold per reviewer attestation.
7 elision events logged. 6 auto-resolved by system. 1 escalated to human review. Reviewer decision, identity, timestamp, and reasoning captured. Decision source noted for each entry: RULE_ENGINE or REVIEWER. Reproducibility statement appended. Log available for IRB submission, deposition, or peer review.
Complete decision chain documented. Every elision traceable to its source: rule-based, model-based, or human reviewer. Audit log signed with protocol version and threshold applied.
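A single audit entry might serialize like the sketch below. The field names, reviewer ID, and timestamp are hypothetical; the decision sources (RULE_ENGINE, REVIEWER) and the captured reviewer metadata follow the log description above.

```python
import json

# Hypothetical audit-entry shape; key names are illustrative.
entry = {
    "field_type": "GEOGRAPHIC",
    "confidence": 0.76,
    "decision": "ELIDE",
    "decision_source": "REVIEWER",        # vs. "RULE_ENGINE"
    "governing_rule": "Expert Determination: combination risk",
    "reviewer_id": "reviewer-17",         # illustrative
    "timestamp": "2024-06-01T14:32:07Z",  # illustrative
    "reasoning": "Combination with ethnicity and specific court "
                 "narrows population below acceptable threshold.",
}
record = json.dumps(entry, sort_keys=True)  # one line per decision in the log
```

Keeping each entry as a single JSON line makes the trail machine-readable and diff-friendly for IRB or deposition export.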
On Human Judgment
Not as a safety net. As a structural decision. The pipeline does not claim omniscience. It surfaces uncertainty with full context and stops. A human clinician, researcher, or reviewer makes the call. That decision is documented as a first-class event in the audit trail — with identity, timestamp, reasoning, and the information the system provided at the point of escalation.
Confidence Threshold
Client is a ████████ Hispanic female referred by her attorney, ████████ of ████████ in ████████. She presents with reported symptoms of PTSD following a motor vehicle accident on ████████, on ████████. Client reports that her ████████, who works at ████████, drove her to the initial ER visit at ████████. She currently resides at ████████, and can be reached at ████████. Her treating therapist, ████████, has documented twelve sessions of EMDR treatment since the incident.
Items Below Threshold — Require Clinical Judgment
The system does not claim omniscience. It surfaces uncertainty. Human judgment determines the final boundary.
Machine-Readable Audit Trail
Every elision decision logged with its rule, confidence, and flag. The audit trail is the methodology. Exportable for IRB review or deposition.
Elider flags and scores. It does not decide. The pipeline is sufficient for direct identifiers and high-confidence indirect identifiers. Contextual, relational, and combination-risk identifiers require human clinical judgment at the review boundary. This is not a limitation. This is the design.
The Standard Is Not “Did We Remove Identifiers”
Elider evaluates four distinct risk categories. Each requires different detection logic. Each requires different documentation. All four appear in the audit log.
Direct identifiers. A single field that identifies an individual without additional information. Names, addresses, phone numbers, SSNs, dates of birth.
Combination risk. Two or more fields that, individually, would not identify an individual but, in combination, create a re-identification pathway. The canonical failure mode of Safe Harbor-only analysis.
Contextual identifiers. A field that is not an identifier in isolation but becomes one given the context of the document, the population, or the proceeding. Geographic specifics, institutional references, incident descriptions.
Linkage risk. Re-identification achieved by linking the document to external datasets. Occupation, institutional affiliation, and unique family structures can create a population of one when linked to publicly available records.
The Case Safe Harbor Alone Would Pass
No single field is a direct identifier. The combination is.
Evaluee is a 52-year-old male, employed as a senior electrical engineer, who sustained a traumatic brain injury in a workplace accident. He resides in a rural county in Central Texas with a population under 8,000. He is the only employee at his company with his job title and tenure. His treating neurologist noted that he has an identical twin sibling, also a professional in the same field, which complicates baseline cognitive comparison.
No HIPAA Safe Harbor identifier appears in this passage. All 18 categories are technically absent. Under Safe Harbor alone, this document would pass. Under Expert Determination, it fails. The linkage of occupation, geography, organizational uniqueness, and family structure reduces the re-identification population to a near-certain individual. This is what Elider's two-pass architecture is designed to catch.