Lawnise AI Trust Index Methodology v1.0
The approved, versioned public methodology for converting scan observations into Lawnise AI Trust Index scores. It underpins Lawnise's public-facing AI governance and evidence infrastructure and grounds the Enterprise Public AI TRiSM category that Lawnise owns.
- Status
- v1.0 — Approved for Publication
- Effective Date
- 2026-05-11
- Canonical URL
- https://www.lawnise.com/trust-index/methodology/v1
Lawnise AI Trust Index — Methodology Specification v1.0
- Version: 1.0
- Status: v1.0 — Approved for Publication
- Created: 2026-03-31
- Effective date: 2026-05-11
- Authored by: Lawnise Authority Core team
- Approved by: Lawmence (CEO)
- Canonical URL: https://www.lawnise.com/trust-index/methodology/v1
- Predecessor: P0D frozen benchmark (117 observations, 2 streams, 3 AI providers, audited prompt set)
Purpose
This document is the normative specification for the Lawnise AI Trust Index v1.0. It defines how raw scan observations become publishable scores. Every Trust Index score references this document version. Every formula, threshold, and rule stated here is a binding constraint on the Index Calculator implementation.
This is not a brainstorm, roadmap, or aspirational document. Every section ends in a concrete rule or decision.
Document Structure
| Part | Covers |
|---|---|
| Part 1: Measurement | Inputs, scoring formula, aggregation, confidence intervals |
| Part 2: Publication | Score status classification, publication criteria, anonymisation |
| Part 3: Governance | Versioning, benchmark integrity, correction/retraction, anti-gaming |
Part 1: Measurement
1.1 Input Data
The Trust Index is calculated from Index Scan results produced by the Lawnise Index Scan Engine. Each scan result is one observation: one SPL prompt × one AI provider × one RKB fact set → one TruthGuard verdict.
1.1.1 Two Streams
| Stream | Name | Inputs | Output |
|---|---|---|---|
| Stream 1 | Industry Trust Index | SPL non-branded prompts + RKB-Industry facts | Public industry-level accuracy score |
| Stream 2 | Market Accuracy Benchmark | SPL branded prompt templates + RKB-Entity facts | Anonymised entity-level benchmark |
Rule 1.1.1-A: Stream 1 and Stream 2 are scored independently. They are never blended into a single number.
Rule 1.1.1-B: Stream 2 entity scores are stored privately. Public output contains only anonymised distributions (ranges, quartiles, aggregates). Entity names never appear in public Trust Index output.
1.1.2 Observation Unit
One observation = one row in ac_index_scan_results:
- `prompt_id` — which SPL prompt was asked
- `ai_model` — which AI provider answered
- `stream` — industry or entity
- `jurisdiction` — e.g., "MY"
- `sector` — e.g., "banking", "insurance", "capital_markets"
- `findings` — TruthGuard verification output (verdict, risk_type, severity)
- `accuracy_score` — per-observation accuracy (see §1.2)
Rule 1.1.2-A: Only observations from completed scan runs (ac_index_scan_runs.status = 'completed') enter the calculation. Failed/cancelled runs are excluded.
Rule 1.1.2-B: Observations with verdict scan_error are excluded from scoring but counted in sample quality reporting (see §1.4). They indicate scraper or provider infrastructure failures, not AI knowledge failures.
1.1.3 Data Independence
The Trust Index is produced entirely from Lawnise-owned data:
- SPL prompts: curated by Lawnise, stored in `ac_standard_prompts`
- RKB facts: curated by Lawnise from public sources, stored in `ac_reference_facts`
- Scans: initiated by Lawnise, not by any tenant
Rule 1.1.3-A (Benchmark Integrity): The Trust Index calculation must NOT inherit, reference, or be influenced by tenant-specific TCC policies, tenant BKB facts, tenant prompts, or tenant configuration. Running under the Lawnise operational tenant is an operational convenience (JWT/DB context), not a policy source.
Rule 1.1.3-B: Methodology parameters are stored in ac_config (global, non-tenant-scoped), not in tenant_policies (TCC). This separation is a hard constraint for InATJI credibility.
1.2 Per-Observation Scoring
Each observation receives a binary accuracy classification based on the TruthGuard engine's stored finding structure.
1.2.1 Outcome-to-Accuracy Mapping
The TruthGuard engine stores results in ac_index_scan_results.findings as a JSONB array of finding objects. Each finding has a verdict field (no_risk or risk_detected) and, for risk_detected findings, a risk_type classification. The Index Calculator maps these stored outcomes to binary accuracy as follows:
| Stored Outcome | Accuracy Value | Rationale |
|---|---|---|
verdict = 'no_risk' | 1 (accurate) | AI response is consistent with RKB facts |
verdict = 'risk_detected', any risk_type except refusal_non_answer | 0 (inaccurate) | AI response contradicts, fabricates, or materially omits verified facts |
verdict = 'risk_detected', risk_type = 'refusal_non_answer' | 0 (inaccurate) | AI refused to answer using publicly available information (benchmark mode finding) |
verdict = 'scan_error' | excluded | Infrastructure failure, not an AI knowledge assessment |
verdict = 'no_bkb_facts' | excluded | Insufficient RKB coverage to assess; does not count for or against accuracy |
Rule 1.2.1-A: Scoring is binary (0 or 1). There is no partial credit. An observation is either accurate or inaccurate.
Rule 1.2.1-B: A risk_detected finding with risk_type = 'refusal_non_answer' counts as inaccurate for Trust Index purposes. This risk type is emitted by the benchmark code path when an AI refuses to answer a question about publicly available information. It is a risk_type classification within the risk_detected verdict, not a separate top-level verdict.
Rule 1.2.1-C: If an observation produces multiple findings, the observation is scored as inaccurate (0) if ANY finding has verdict = 'risk_detected'. A single risk finding is sufficient to classify the observation as inaccurate.
Rule 1.2.1-D: Future methodology versions may introduce graded scoring (e.g., severity-weighted). v1.0 uses binary classification only, which is the simplest defensible approach given current sample sizes.
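Rules 1.2.1-A through 1.2.1-C reduce to a small deterministic mapping. A minimal sketch, assuming each stored row carries a top-level verdict plus the findings JSONB array described in §1.1.2 (the real Index Calculator interface is not shown here):

```python
from typing import Optional

def score_observation(row: dict) -> Optional[int]:
    """Binary accuracy per §1.2.1: 1 = accurate, 0 = inaccurate,
    None = excluded from scoring (still counted in sample quality)."""
    # Row-level exclusions (Rule 1.1.2-B and the §1.2.1 mapping table).
    if row.get("verdict") in ("scan_error", "no_bkb_facts"):
        return None
    findings = row.get("findings") or []
    # Rule 1.2.1-C: ANY risk_detected finding scores the observation 0.
    # Rule 1.2.1-B: refusal_non_answer is a risk_type under risk_detected,
    # so it falls through this same branch and also scores 0.
    if any(f.get("verdict") == "risk_detected" for f in findings):
        return 0
    # All findings are no_risk: consistent with RKB facts.
    return 1
```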
1.2.2 Residual Engine Precision Acknowledgment
The TruthGuard engine that produces verdicts has known residual precision limitations:
- Combined accuracy of ~86-88% on the frozen P0D benchmark (117 observations)
- FP rate ~9% (engine over-flags some correct answers)
- FN rate ~4% (engine misses some incorrect answers)
Rule 1.2.2-A: Trust Index scores are presented as engine-assessed accuracy, not ground truth. All published scores carry a standard disclosure: _"Scores reflect automated verification against curated reference facts. Engine accuracy on the development benchmark is approximately 87%. Results should be interpreted as directional indicators, not absolute measures."_
Rule 1.2.2-B: As the engine improves (FP/FN rates decrease), the disclosure is updated to reflect current benchmark accuracy. Methodology version does NOT change for engine improvements alone — only for formula/threshold/rule changes.
1.3 Score Aggregation
1.3.1 Stream 1: Industry Trust Index Score
The Industry Trust Index score for a given jurisdiction and period is calculated as:
Industry_Score(j, p) = (Σ accurate_observations) / (Σ scored_observations) × 100

Where:
- `j` = jurisdiction (e.g., "MY")
- `p` = period (calendar month)
- `accurate_observations` = observations with accuracy value = 1
- `scored_observations` = total observations minus excluded observations (scan_error, no_bkb_facts)
This is a simple unweighted accuracy percentage across all scored observations in the period.
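In code terms the aggregation is a single pass over per-observation scores. A minimal sketch, assuming the hypothetical `score_observation()` helper from the §1.2.1 sketch and rows already filtered to completed scan runs (Rule 1.1.2-A):

```python
def industry_score(rows: list[dict]) -> tuple[float, int]:
    """Unweighted Industry Trust Index per §1.3.1.

    Returns (score_percent, n_scored) for one jurisdiction and period.
    """
    scored = [s for s in (score_observation(r) for r in rows) if s is not None]
    if not scored:
        raise ValueError("no scored observations for this jurisdiction/period")
    # Rule 1.3.1-A: every scored observation counts equally (no weights).
    return 100.0 * sum(scored) / len(scored), len(scored)
```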
Rule 1.3.1-A (No sector weighting in v1.0): v1.0 does NOT apply differential weights to sectors (banking vs insurance vs capital markets) or prompt categories (consumer vs regulatory vs process). All scored observations count equally.
Rationale: Sector weighting requires empirical calibration data that does not yet exist. Premature weighting would be arbitrary and would embed unjustified assumptions. v1.0 establishes the baseline with equal weighting. Sector weighting may be introduced in v2.0 after ≥6 months of data provides a basis for calibration.
Rule 1.3.1-B: The headline number is the jurisdiction-level Industry Trust Index score. Breakdowns by provider, sector, and prompt category are provided as supplementary detail in the breakdown field, but they are not separately scored or ranked in v1.0.
1.3.2 Breakdown Structure
The breakdown JSONB field in ac_trust_index_scores contains:
{
"by_provider": {
"chatgpt": { "scored": 20, "accurate": 14, "accuracy": 70.0 },
"gemini": { "scored": 20, "accurate": 17, "accuracy": 85.0 },
"copilot": { "scored": 20, "accurate": 15, "accuracy": 75.0 }
},
"by_sector": {
"banking": { "scored": 45, "accurate": 35, "accuracy": 77.8 },
"insurance": { "scored": 10, "accurate": 8, "accuracy": 80.0 },
"capital_markets": { "scored": 5, "accurate": 4, "accuracy": 80.0 }
},
"by_prompt_category": {
"consumer": { "scored": 12, "accurate": 10, "accuracy": 83.3 },
"regulatory": { "scored": 8, "accurate": 6, "accuracy": 75.0 },
"process": { "scored": 10, "accurate": 7, "accuracy": 70.0 },
"consumer_protection": { "scored": 8, "accurate": 7, "accuracy": 87.5 },
"comparative": { "scored": 6, "accurate": 5, "accuracy": 83.3 },
"current_affairs": { "scored": 4, "accurate": 2, "accuracy": 50.0 },
"misconception": { "scored": 6, "accurate": 5, "accuracy": 83.3 },
"cross_sector": { "scored": 6, "accurate": 4, "accuracy": 66.7 }
},
"excluded": {
"scan_error": 3,
"no_bkb_facts": 1
}
}

Rule 1.3.2-A: Breakdowns are informational. They enable drill-down analysis but do not feed into the headline score calculation.
1.3.3 Stream 2: Market Accuracy Benchmark
Stream 2 produces per-entity accuracy scores using the same binary classification:
Entity_Score(e, j, p) = (Σ accurate_observations for entity e) / (Σ scored_observations for entity e) × 100

Per-entity scores are private (stored in ac_index_scan_results, accessible only to the entity's own tenant via the commercial product).
The public Market Accuracy Benchmark output is an anonymised distribution (example values are illustrative only and do not represent any real jurisdiction's reported figures):
{
"jurisdiction": "MY",
"sector": "banking",
"period": "2026-03",
"entity_count": 12,
"accuracy_distribution": {
"min": 52.0,
"q1": 65.0,
"median": 73.5,
"q3": 82.0,
"max": 95.0,
"mean": 74.2,
"std_dev": 12.1
}
}

Rule 1.3.3-A: The public anonymised output must contain ≥5 entities per sector per jurisdiction. If fewer than 5 entities have sufficient data, the sector-level distribution is suppressed (not published) to prevent re-identification.
Rule 1.3.3-B: Entity names never appear in any public Trust Index output, API response, or report.
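A minimal sketch of the distribution-plus-suppression step, assuming per-entity scores have already been computed; the standard-deviation convention (population vs sample) is not fixed by this specification, so the choice below is illustrative:

```python
import statistics
from typing import Optional

def public_distribution(entity_scores: list[float]) -> Optional[dict]:
    """Anonymised Stream 2 output per §1.3.3; returns None when suppressed."""
    # Rule 1.3.3-A: fewer than 5 entities => suppress to prevent re-identification.
    if len(entity_scores) < 5:
        return None
    q1, median, q3 = statistics.quantiles(entity_scores, n=4)
    return {
        "entity_count": len(entity_scores),
        "accuracy_distribution": {
            "min": min(entity_scores),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(entity_scores),
            "mean": statistics.mean(entity_scores),
            # Population std dev used here; sample std dev is equally plausible.
            "std_dev": statistics.pstdev(entity_scores),
        },
    }
```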
1.4 Confidence Intervals
Every Trust Index score includes a confidence interval that quantifies uncertainty due to finite sample size.
1.4.1 Statistical Method
The confidence interval is computed using the Wilson score interval at the 95% confidence level. Wilson is used at all sample sizes because it provides reliable coverage even for small n and near-boundary proportions, unlike the Wald interval, which can produce degenerate intervals when p̂ is close to 0 or 1 or when n is small.
Given:
p̂ = accurate_observations / n (observed accuracy proportion)
n = number of scored observations
z = 1.96 (95% confidence z-score)
Wilson centre:
p̃ = (p̂ + z²/(2n)) / (1 + z²/n)
Wilson half-width:
w = (z / (1 + z²/n)) × √(p̂(1 - p̂)/n + z²/(4n²))
Wilson interval:
[p̃ - w, p̃ + w]

The stored confidence_interval value in ac_trust_index_scores is the half-width w × 100 (expressed in percentage points). The published centre point is the Wilson centre p̃ × 100, not the raw observed proportion p̂ × 100.
Rule 1.4.1-A: The Wilson score interval is the sole CI method in v1.0. No Wald interval, no sample-size branching.
Example: If p̂ = 0.87, n = 117:
- z²/n = 3.8416/117 = 0.03284
- p̃ = (0.87 + 1.9208/117) / (1 + 0.03284) = (0.87 + 0.01642) / 1.03284 = 0.8584
- w = (1.96 / 1.03284) × √(0.87 × 0.13/117 + 3.8416/54756)
- w = 1.8978 × √(0.000967 + 0.0000702) = 1.8978 × 0.03221 = 0.06114
- Published as: 85.8% ± 6.1%
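The formulas transcribe directly into code. A minimal sketch (function name and shape are illustrative, not the Index Calculator's actual interface) that reproduces the worked example above:

```python
import math

def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval per §1.4.1.

    Returns (centre_pct, half_width_pp): the published centre point
    (Wilson centre × 100) and the stored confidence_interval (w × 100).
    """
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)
    )
    return 100.0 * centre, 100.0 * half_width

centre, hw = wilson_interval(p_hat=0.87, n=117)
print(f"{centre:.1f}% ± {hw:.1f}%")  # 85.8% ± 6.1%
```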
1.4.2 CI Suppression
Rule 1.4.2-A: If the calculated CI half-width exceeds 15 percentage points, the score is marked indicative regardless of other sample quality criteria. A score of "73% ± 18%" is too imprecise to be useful.
Rule 1.4.2-B: The CI is always reported alongside the headline score. A score without a CI is never published.
1.5 Sample Quality
Sample quality determines whether a score is meaningful enough to classify as definitive, preliminary, or indicative.
1.5.1 Sample Quality Dimensions
| Dimension | Field in sample_quality JSONB | Measures |
|---|---|---|
| Observation count | scored_observations | Total scored observations (excluding scan_error, no_bkb_facts) |
| Provider diversity | distinct_providers | Number of distinct AI providers assessed |
| Sector coverage | distinct_sectors | Number of distinct sectors covered |
| Session diversity | distinct_scan_sessions | Number of distinct scan run sessions |
| Prompt coverage | distinct_prompts | Number of distinct SPL prompts used |
| Excluded ratio | excluded_ratio | (scan_error + no_bkb_facts) / total_observations |
1.5.2 Score Status Thresholds
| Status | Criteria (ALL must be met) | Meaning |
|---|---|---|
| `definitive` | scored_observations ≥ 50 AND distinct_providers ≥ 3 AND distinct_sectors ≥ 2 AND distinct_scan_sessions ≥ 5 AND distinct_prompts ≥ 15 AND CI half-width ≤ 10pp AND excluded_ratio ≤ 0.15 | High-confidence score suitable for public citation |
| `preliminary` | scored_observations ≥ 20 AND distinct_providers ≥ 2 AND distinct_sectors ≥ 1 AND distinct_scan_sessions ≥ 2 AND CI half-width ≤ 15pp | Directionally meaningful but insufficient for definitive claims |
| `indicative` | Does not meet preliminary criteria, OR CI half-width > 15pp | Insufficient data; internal use only |
Rule 1.5.2-A: Score status is computed deterministically from sample quality metrics. It is never manually overridden.
Rule 1.5.2-B: An indicative score is calculated and stored (for trend tracking) but never published externally.
Rule 1.5.2-C: A preliminary score may be published with an explicit caveat: _"This score is preliminary. Sample size and provider coverage do not yet meet definitive thresholds."_
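Because status is fully deterministic (Rule 1.5.2-A), the threshold table reads directly as a decision function. A sketch, assuming the sample_quality JSONB has been loaded as a dict with the §1.5.1 field names:

```python
def classify_status(q: dict, ci_half_width_pp: float) -> str:
    """Deterministic score status per §1.5.2; never manually overridden."""
    # Rule 1.4.2-A: CI half-width > 15pp forces indicative regardless.
    if ci_half_width_pp > 15:
        return "indicative"
    if (q["scored_observations"] >= 50
            and q["distinct_providers"] >= 3
            and q["distinct_sectors"] >= 2
            and q["distinct_scan_sessions"] >= 5
            and q["distinct_prompts"] >= 15
            and ci_half_width_pp <= 10
            and q["excluded_ratio"] <= 0.15):
        return "definitive"
    if (q["scored_observations"] >= 20
            and q["distinct_providers"] >= 2
            and q["distinct_sectors"] >= 1
            and q["distinct_scan_sessions"] >= 2):
        return "preliminary"
    return "indicative"
```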
1.5.3 Validation Against P0D Benchmark
Applying these thresholds to the frozen P0D pilot dataset (for reference only — P0D is a development benchmark, not a production Index Scan):
| Dimension | Industry (60) | Entity (57) | Combined (117) |
|---|---|---|---|
| Scored observations | 60 | 57 | 117 |
| Distinct providers | 3 | 3 | 3 |
| Distinct sectors | 1 (banking) | 1 (banking) | 1 (banking) |
| Distinct scan sessions | Multiple | Multiple | Multiple |
| Distinct prompts | 20 | 19 | 39 |
P0D assessment: Would achieve preliminary status. It meets the observation-count and provider thresholds, but covers only one sector (banking), so the ≥2-sector requirement for definitive status is not met. This is expected: the pilot intentionally focused on banking only.
1.5.4 Intra-Session Scan-Error Repair
A scan session may produce rows with verdict = 'scan_error' (e.g., scraper timeout, provider rate-limit, malformed or unusable provider response, or null ai_response) due to scraper or AI-provider infrastructure failure. These rows are excluded from scoring per Rule 1.1.2-B but reduce the evaluable sample for the session. To preserve sample size without compromising session identity, intra-session repair is permitted under the following rules.
Rule 1.5.4-A — Repair scope (logical tuples, not scan_run_ids). A repair pass MAY re-dispatch only those logical session tuples (original_scan_run_id, prompt_id, ai_model) whose row landed with verdict = 'scan_error' and has not yet been successfully replaced. Repair-pass scan runs are dispatched under new scan_run_ids, and each repair-pass row MUST carry an explicit reference back to its original_scan_run_id (in metadata or via the lock memo's session manifest) so the evidence-set deduplicator can map repair rows to the logical tuple they replace. Repair MUST NOT re-dispatch tuples that already produced a usable non-error response, MUST NOT change the prompt set, and MUST NOT change the dataset_version pins.
Rule 1.5.4-B — Original panel only, partitioning permitted, same window. A repair pass MUST use only scanners drawn from the original panel; no replacement or substitute scanner may be introduced for the same session. A repair pass MAY be partitioned into single-scanner repair runs (one new scan_run_id per failed scanner per scope), or run as a multi-scanner repair — both shapes are allowed as long as the union of repair scanners is a subset of the original panel. The repair pass MUST complete within 24 hours of the original session's first dispatch. Beyond 24 hours, the repaired rows are considered a new sampling event and MUST be treated as a new session per Rule 1.5.2.
Rule 1.5.4-C — Maximum two repair passes. A given session permits at most two repair passes. After the second pass, any logical tuple whose latest row is still verdict = 'scan_error' is recorded as a permanent scan-error for the session and contributes to excluded_ratio per Rule 1.1.2-B and to the per-scope scanner-availability disclosure required at publication time.
Rule 1.5.4-D — Session identity preserved. Repair-pass scan runs share the same dataset_version pin and the same logical session identity as the original dispatch. The session evidence set is the union of original-dispatch rows and repair-pass rows for the session, deduplicated per Rule 1.5.4-G. The distinct_scan_sessions sample-quality metric counts the original dispatch + all its repair passes as one session, regardless of how many scan_run_ids the repair was partitioned into.
Rule 1.5.4-E — Disclosure. The lock memo and any external publication MUST disclose the per-scanner availability after the final repair pass. If any single scanner's verdict = 'scan_error' rate after the final repair pass exceeds 40% of its dispatched logical tuples for any scope, the lock memo MUST explicitly call this out and consider whether to (a) include the scanner with reduced weight, (b) exclude the scanner from this session's score with disclosure, or (c) escalate to the human reviewer for sign-off.
Rule 1.5.4-F — Hash discipline. The session evidence-set SHA-256 hash referenced by the row-by-row SOP is computed over the post-repair, deduplicated evidence set per Rule 1.5.4-G. Any pre-repair hash is operational telemetry only and MUST NOT be cited as the canonical session hash in audit artifacts.
Rule 1.5.4-G — Repair precedence (deduplication). When constructing the post-repair evidence set, for each logical tuple (original_scan_run_id, prompt_id, ai_model) apply the following precedence in order:
1. Original dispatch row is the starting point.
2. Repair pass 1 row overrides the original for that logical tuple only if the repair pass 1 row has `verdict ≠ 'scan_error'` (a usable response). If repair pass 1 also produced `verdict = 'scan_error'`, retain the original row.
3. Repair pass 2 row overrides the result of step 2 for that logical tuple only if the repair pass 2 row has `verdict ≠ 'scan_error'`. If repair pass 2 also produced `verdict = 'scan_error'`, retain the result of step 2.
4. If all passes produced `verdict = 'scan_error'`, the latest scan-error row is retained as a permanent scan-error placeholder for the tuple and contributes to excluded_ratio per Rule 1.1.2-B.
The lock memo MUST enumerate the full list of scan_run_ids comprising the session (original + all repair passes) so any auditor can reconstruct the post-repair evidence set deterministically.
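The precedence rules amount to a fold over passes in dispatch order. A sketch, assuming each row dict exposes the logical-tuple keys and its verdict; per step 4, when every pass errored, the latest scan-error row is the retained placeholder:

```python
def post_repair_evidence(passes: list[list[dict]]) -> dict[tuple, dict]:
    """Deduplicated post-repair evidence set per Rule 1.5.4-G.

    passes = [original_rows, repair_1_rows, repair_2_rows], in dispatch order.
    """
    evidence: dict[tuple, dict] = {}
    for rows in passes:
        for row in rows:
            key = (row["original_scan_run_id"], row["prompt_id"], row["ai_model"])
            current = evidence.get(key)
            if current is None:
                evidence[key] = row   # step 1: original dispatch row
            elif row["verdict"] != "scan_error":
                evidence[key] = row   # steps 2-3: usable repair row overrides
            elif current["verdict"] == "scan_error":
                evidence[key] = row   # step 4: keep the latest scan-error row
    # Remaining scan-error placeholders feed excluded_ratio (Rule 1.1.2-B).
    return evidence
```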
Part 2: Publication
2.1 Publication Criteria
2.1.1 Eligibility
A Trust Index score is eligible for publication when:
1. Score status is definitive or preliminary (never indicative)
2. The score has been reviewed by at least one human operator (Lawmence or designated reviewer)
3. The published_at timestamp is set (marks the moment of publication)
Rule 2.1.1-A: Publication is an explicit human-approved action, not an automatic result of calculation. The Index Calculator computes and stores scores; a human sets published_at.
Rule 2.1.1-B: Once published_at is set, the score row becomes immutable (except for retraction — see §3.3). The evidence_hash and hash-chain link are frozen at publication time.
2.1.2 Publication Format
Every published Trust Index score includes:
| Field | Example | Required |
|---|---|---|
| Jurisdiction | Malaysia (MY) | Yes |
| Period | March 2026 | Yes |
| Overall score | 73.2% | Yes |
| Confidence interval | ± 6.1% | Yes |
| Score status | Definitive | Yes |
| Methodology version | v1.0 | Yes |
| Provider breakdown | ChatGPT: 68%, Gemini: 81%, Copilot: 71% | Yes |
| Sector breakdown | Banking: 73%, Insurance: 78% | Yes, if ≥2 sectors |
| Sample size | 180 scored observations | Yes |
| Engine accuracy disclosure | ~87% benchmark accuracy | Yes |
| Excluded count | 5 scan errors | Yes, if > 0 |
Rule 2.1.2-A: The engine accuracy disclosure (Rule 1.2.2-A) is mandatory in every publication. It is never omitted or buried in footnotes.
2.1.3 Stream 2 Publication
Stream 2 (Market Accuracy Benchmark) is published as anonymised distributions only:
- Sector-level accuracy ranges, quartiles, and means
- No entity names, no per-entity scores, no data that enables re-identification
- Suppressed if < 5 entities per sector (Rule 1.3.3-A)
Rule 2.1.3-A: Stream 2 public reports carry the label "Market Accuracy Benchmark" (not "Trust Index") to distinguish them from the headline Industry Trust Index.
2.1.4 Publication Cleanup Governance
In some governed benchmark runs, a stored risk_detected row may contain one or more findings that are not publication-grade contradictions even though the raw verifier emitted them. v1.0 allows a narrow publication-layer cleanup mechanism so public reporting can exclude clearly weak findings without overwriting raw evidence.
This is a publication governance primitive, not a score-formula change.
Rule 2.1.4-A (Raw evidence immutability): Raw benchmark evidence is never overwritten during publication cleanup. The stored top-level row verdict remains unchanged. If the verifier emitted risk_detected, the row remains risk_detected in raw storage.
Rule 2.1.4-B (Metadata-only exclusion): Publication cleanup is implemented only through finding-level metadata, using metadata.publication_exclusion on the specific finding(s) being excluded. The exclusion must not be represented by deleting findings, mutating their substantive text, or rewriting the row verdict.
Rule 2.1.4-C (Finding-level granularity): Exclusions are applied at the individual finding level, not the whole-row level. If a row contains both excluded and non-excluded findings, the remaining valid finding(s) still count that row as inaccurate for publication purposes.
Rule 2.1.4-D (Locked categories in v1.0): Publication cleanup is restricted to explicit, reviewable categories. The currently approved v1.0 categories are:
- `self_admitting_fp`
- `off_topic_retrieval`
Rule 2.1.4-E (Current locked criteria): The first governed production publication used the following locked criteria:
- `self_admitting_fp`: explanation text explicitly self-negates the contradiction (for example, phrases equivalent to "not a direct contradiction", "not a true contradiction", "consistent, not contradictory", "rephrasing, not a contradiction", or clearly borderline-only wording)
- `off_topic_retrieval`: the finding relies on a known retrieval mismatch pattern rather than the prompt's actual issue. Current locked examples:
  - investment-linked / unit-portion facts attached to prompts that are not about investment-linked coverage
  - foreign-currency-deposit / Ringgit-aggregation facts attached to prompts that are not about currency coverage
Rule 2.1.4-F (Governed workflow): Publication cleanup requires a governed workflow:
1. dry-run proposed exclusions
2. produce an artifact listing rows/findings, reasons, and before/after counts
3. independent cross-check / review
4. explicit approval
5. metadata-only apply
Rule 2.1.4-G (Audit trail): Every applied exclusion must record enough audit detail to reconstruct what happened, including:
- exclusion reason/category
- matched phrase or matched pattern when applicable
- applied timestamp
- applied-by identifier / workflow
Rule 2.1.4-H (Methodology boundary): Publication cleanup under these locked categories does not by itself change the methodology version. It is treated as governed benchmark adjudication over raw stored findings. Expanding categories, changing category semantics materially, or altering how published rows are counted may require a methodology update.
Rule 2.1.4-I (Future benchmark lines): The same metadata-only cleanup pattern applies to future benchmark lines and jurisdictions unless explicitly superseded by a later methodology version. Future sessions must not invent broader ad hoc exclusion categories without recording the decision in methodology or a linked governance memo.
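In code terms, publication counting simply filters findings by the exclusion marker and re-applies Rule 1.2.1-C; the raw row is never written. A sketch with assumed field shapes (the edge case of a row whose findings are all excluded is not settled by this specification; the sketch treats it as accurate for publication):

```python
def publication_findings(row: dict) -> list[dict]:
    """Findings that survive cleanup (Rules 2.1.4-B/C). The raw row,
    including its top-level verdict, is never mutated (Rule 2.1.4-A)."""
    return [
        f for f in row.get("findings", [])
        if not (f.get("metadata") or {}).get("publication_exclusion")
    ]

def publication_accuracy(row: dict) -> int:
    """Rule 2.1.4-C: any surviving risk_detected finding still counts
    the whole row as inaccurate (0) for publication purposes."""
    if any(f.get("verdict") == "risk_detected" for f in publication_findings(row)):
        return 0
    return 1
```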
2.2 Jurisdiction Normalisation
2.2.1 Principle
Trust Index scores are relative to each jurisdiction's own baseline. A score of 73% in Malaysia and 68% in the UK does NOT mean Malaysia's AI ecosystem is "better" — the jurisdictions have different regulatory complexity, AI maturity, RKB depth, and SPL coverage.
2.2.2 Rules
Rule 2.2.2-A: Trust Index scores are never ranked across jurisdictions in v1.0 publications. No "Country X ranks #1" claims.
Rule 2.2.2-B: Cross-jurisdiction comparison is only valid at the trend level. _Example only:_ "Both Malaysia and Singapore showed improving accuracy over the past 6 months." Absolute score comparisons are explicitly disclaimed.
Rule 2.2.2-C: A "Global Composite Score" is deferred until ≥3 jurisdictions each have ≥6 months of definitive scores. The methodology for global aggregation will be specified in a future version.
Rule 2.2.2-D: When publishing multi-jurisdiction reports, each jurisdiction's score is presented independently with its own CI, sample quality, and sector coverage. No combined cross-jurisdiction headline number in v1.0.
Part 3: Governance
3.1 Versioning
3.1.1 Version Numbering
The methodology uses a two-part MAJOR.MINOR version scheme.
| Change Type | Version Impact | Example |
|---|---|---|
| Formula change (scoring, aggregation, weighting) | Major | v1.0 → v2.0 |
| Threshold change (sample quality, CI suppression) | Major | v1.0 → v2.0 |
| New score status tier | Major | v1.0 → v2.0 |
| New breakdown dimension | Minor | v1.0 → v1.1 |
| Disclosure wording update | Minor | v1.0 → v1.1 |
| Engine accuracy improvement (no formula change) | No version change | — |
| RKB/SPL content refresh | No version change | — |
Rule 3.1.1-A: Every ac_trust_index_scores row records the methodology_version used to calculate it. Historical scores are never restated under a new version.
Rule 3.1.1-B: When a major version change occurs, a bridge document is published explaining: (1) what changed, (2) why, (3) how old scores compare to new scores, and (4) whether trend continuity is preserved.
3.1.2 Backward Compatibility
Rule 3.1.2-A: Old scores calculated under v1.0 remain valid and citable as "v1.0 scores." They are never deleted or overwritten.
Rule 3.1.2-B: If a new version produces materially different scores for the same inputs, both the old-version and new-version scores are stored (via the methodology_version dimension in the unique constraint). This allows parallel display during transition periods.
3.2 Benchmark Integrity
3.2.1 Independence Constraints
Rule 3.2.1-A: The Trust Index calculation reads from ac_config for methodology parameters, never from tenant_policies (TCC).
Rule 3.2.1-B: The TCC-controlled TruthGuard enhancements (rollout toggles) apply to tenant AI monitoring scans only. Index Scan Engine runs use the benchmark-gated code path (discovery_metadata.stream), which is independent of per-tenant TCC toggles.
Rule 3.2.1-C: Adding, removing, or modifying TCC enhancement toggles for tenant scans does NOT affect Trust Index scores. The benchmark code path and the tenant code path are architecturally separate.
3.2.2 Input Pinning
Every Index Scan run pins its inputs via ac_index_scan_runs:
- `rkb_version_id` — exact RKB snapshot used
- `spl_version_id` — exact SPL snapshot used
- `ai_models` — exact list of AI providers scanned
- `input_snapshot_hash` — SHA-256 of the combined input set
Rule 3.2.2-A: A score's source_scan_ids links to specific scan runs, which link to specific pinned input versions. The full chain from score → scan run → inputs → individual results is traceable and auditable.
Rule 3.2.2-B: If RKB facts are updated mid-month, the Index Calculator uses the pinned version from the scan run, not the current live version. Scores reflect the inputs that were actually used, not the inputs that exist now.
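The specification fixes what is hashed (the combined input set) but not the serialisation. The sketch below assumes one plausible canonicalisation (sorted JSON keys, sorted model list) so the hash is deterministic regardless of field or list ordering at call time:

```python
import hashlib
import json

def input_snapshot_hash(rkb_version_id: str, spl_version_id: str,
                        ai_models: list[str]) -> str:
    """Illustrative SHA-256 over the pinned input set per §3.2.2."""
    snapshot = {
        "rkb_version_id": rkb_version_id,
        "spl_version_id": spl_version_id,
        "ai_models": sorted(ai_models),  # order-insensitive on input
    }
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```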
3.2.3 Reproducibility
Rule 3.2.3-A (Input reproducibility): Lawnise-controlled inputs (RKB facts, SPL prompts, methodology parameters) are fully reproducible. Given the same rkb_version_id + spl_version_id + methodology_version, the same calculation logic will be applied. The input_snapshot_hash makes this verifiable.
Rule 3.2.3-B (Output variance acknowledgment): End-to-end output reproducibility is NOT guaranteed, even with identical Lawnise inputs and the same nominal AI provider model. Three sources of variance are outside Lawnise's control:
1. Provider-side nondeterminism: LLM sampling (temperature, top-p) means the same prompt can produce different responses across runs, even on the same model build.
2. Provider-side model drift: Providers may silently update model weights behind a stable model name (e.g., "gpt-5-3" today vs "gpt-5-3" next month).
3. LLM verification variance: The TruthGuard engine uses an LLM verifier step whose output can vary across runs for borderline cases.
provider_response_metadata in ac_index_scan_results captures whatever version/fingerprint info the provider returns, making provider-side drift auditable after the fact.
Rule 3.2.3-C: Reproducibility claim in publications: _"Lawnise-controlled inputs (reference facts, prompts, methodology) are pinned, versioned, and hash-verified. End-to-end output may vary due to AI provider nondeterminism and model drift, which Lawnise tracks but does not control."_
Rule 3.2.3-D: Published scores are based on the actual captured responses and findings, not on a claim that those responses are the only possible outputs for the given inputs. The score reflects what was observed in that specific scan run.
3.2.4 Opaque Public Identifiers
Rule 3.2.4-A: All Trust Index identifiers that appear in public-facing API responses, verification surfaces, citation outputs, or printed publications use HMAC-derived opaque public identifiers in the format {kid}_{uuid-like}, where {kid} is the active key identifier. The underlying internal score UUID is never exposed in any public output.
Rule 3.2.4-B: The opaque public identifier is deterministic per active key: the same internal UUID under the same active key always produces the same public identifier. External citations therefore remain stable for the lifetime of the active key.
Rule 3.2.4-C: Key rotation is a governance event. When the active key rotates, both the previous and current opaque identifiers remain resolvable so citations published under the prior key continue to verify. The rotation event is recorded in the audit log.
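The rules fix the properties (HMAC-derived, deterministic per active key, {kid}_{uuid-like} format) but not the exact derivation. One plausible construction, for illustration only:

```python
import hashlib
import hmac
import uuid

def opaque_public_id(internal_uuid: str, key: bytes, kid: str) -> str:
    """Illustrative HMAC-derived opaque public identifier per §3.2.4.

    Deterministic per key (Rule 3.2.4-B): the same internal UUID under the
    same active key always yields the same public identifier.
    """
    digest = hmac.new(key, internal_uuid.encode("utf-8"), hashlib.sha256).digest()
    # Fold the first 16 HMAC bytes into a UUID-shaped suffix.
    return f"{kid}_{uuid.UUID(bytes=digest[:16])}"
```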
3.3 Correction and Retraction
3.3.1 Error Types
| Error Type | Response | Trigger |
|---|---|---|
| Calculation error (bug in calculator logic) | Correction: publish corrected score, mark original as corrected | Internal QA or external report |
| Input error (incorrect RKB fact, misclassified prompt) | Correction: re-run scan with corrected inputs, publish new score for the period | RKB audit or external report |
| Methodology error (flaw in formula or threshold) | Retraction: retract affected scores, publish corrected scores under new methodology version | Internal review |
| Presentation error (wrong number in report, typo) | Erratum: correct the published content, note the change | Internal QA or external report |
3.3.2 Correction Process
Rule 3.3.2-A: A corrected score is a new row in ac_trust_index_scores with the same period_start + jurisdiction but an updated calculation_run_id. The original row remains (immutable) with its published_at unchanged.
Rule 3.3.2-B: Corrections are disclosed publicly. The correction notice includes: what was wrong, what the corrected value is, and why the error occurred.
Rule 3.3.2-C: The hash chain incorporates corrections. A corrected score's previous_score_hash links to the original score, maintaining chain integrity.
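Rule 3.3.2-C implies each score row commits to the score it corrects or follows. A deliberately simplified illustration of such a link (the actual hash-chain layout is not specified in this document):

```python
import hashlib

def chain_link(evidence_hash: str, previous_score_hash: str) -> str:
    """Illustrative chain link: committing to both the score's own evidence
    hash and its predecessor means tampering with any historical score
    breaks every later link in the chain."""
    payload = f"{previous_score_hash}:{evidence_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```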
3.3.3 Retraction Process
Rule 3.3.3-A: Retraction is reserved for cases where the score is fundamentally unsound (e.g., discovered that RKB facts were systematically wrong, or the engine had a bug that produced meaningless verdicts).
Rule 3.3.3-B: Retracted scores are marked in the database (retraction flag + reason) but never deleted. The hash chain is preserved.
Rule 3.3.3-C: Retraction notice is published prominently and remains permanently alongside the retracted score.
3.4 Anti-Gaming
3.4.1 Threat Model
The primary gaming risk is AI providers tuning their models to perform well on known benchmark prompts ("teaching to the test"). Secondary risk is external parties attempting to influence RKB facts.
3.4.2 Mitigations
Rule 3.4.2-A (Prompt opacity): The methodology publishes scoring categories and weights (the "what") but NOT the specific SPL prompt text or fact-matching thresholds (the "how"). SPL prompts are confidential Lawnise IP.
Rule 3.4.2-B (Prompt rotation): SPL prompts are refreshed quarterly. New prompts are added, stale prompts are retired. This prevents "teaching to the test" even if specific prompts are leaked.
Rule 3.4.2-C (Prompt variants): Each SPL prompt has 3-5 paraphrased variants. The variant used per scan run is randomly selected. This prevents gaming by memorising exact prompt text.
Rule 3.4.2-D (Anomaly detection): Flag sudden score improvements (>10pp month-over-month for the same provider) that do not correlate with observable AI model changes (new model version, public announcement). Flagged anomalies are investigated before publication.
Rule 3.4.2-E (RKB integrity): RKB facts are curated from official public sources (regulator websites, annual reports, legislation). RKB changes are tracked in ac_dataset_versions with hash-chain integrity. External parties cannot influence RKB content — it is Lawnise-controlled.
Rule 3.4.2-F (Methodology refresh): The methodology is reviewed annually for gaming-resistance. If evidence of systematic gaming is detected, an out-of-cycle methodology update is triggered.
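As an illustration of the Rule 3.4.2-D check, the anomaly flag is a simple threshold on consecutive month-over-month deltas (the data shape below is assumed):

```python
def flag_anomalies(monthly: dict[str, list[float]],
                   threshold_pp: float = 10.0) -> list[str]:
    """Flag providers whose accuracy improves by more than threshold_pp
    percentage points month-over-month (Rule 3.4.2-D).

    monthly maps provider -> chronologically ordered accuracy percentages.
    Flagged providers are investigated before publication, not auto-excluded.
    """
    flagged = []
    for provider, series in monthly.items():
        for prev, curr in zip(series, series[1:]):
            if curr - prev > threshold_pp:
                flagged.append(provider)
                break
    return flagged
```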
3.5 Entity Opt-Out
Rule 3.5.1-A (Future exclusion): If a financial institution requests removal from Stream 2 (Market Accuracy Benchmark), Lawnise will exclude that entity from all future scan runs within 30 days. The entity's RKB-Entity facts are set to status = 'retired' to prevent future use in scans. No new branded prompts will be run against that entity.
Rule 3.5.1-B (Historical preservation): Historical RKB-Entity facts and scan results for the opted-out entity are NOT scrubbed, deleted, or anonymised. They remain in the database with full content intact. This is required for:
- Audit integrity: Published scores that included the entity's data must remain verifiable against the original inputs (Rule 3.2.2-A).
- Reproducibility: Pinned input snapshots (`rkb_version_id`) must remain resolvable to their original content (Rule 3.2.3-A).
- Hash-chain validity: Evidence hashes computed over the original data would become unverifiable if the underlying data were scrubbed.
Rule 3.5.1-C (Public suppression): After opt-out, the entity's name and per-entity scores are suppressed from all future public output, including anonymised distributions. Historical published reports that included the entity (in anonymised aggregate form only — entity names were never in public output per Rule 1.3.3-B) remain unchanged.
Rule 3.5.1-D: Entity opt-out does not affect Stream 1 (Industry Trust Index), which uses non-branded prompts and industry-level facts only.
Appendix A: Glossary
| Term | Definition |
|---|---|
| Observation | One SPL prompt × one AI provider × one RKB fact set → one TruthGuard verdict |
| Scored observation | An observation with verdict no_risk or risk_detected (including risk_type = 'refusal_non_answer'). Excludes scan_error and no_bkb_facts. |
| RKB | Reference Knowledge Base — Lawnise-curated verified facts from public sources |
| SPL | Standard Prompt Library — Lawnise-curated prompts used for Index Scans |
| CI | Confidence Interval — statistical uncertainty range at 95% confidence |
| pp | Percentage points |
| FP | False Positive — engine incorrectly flags an accurate AI response |
| FN | False Negative — engine fails to flag an inaccurate AI response |
| TCC | Tenant Configuration Center — per-tenant policy system (NOT used for Trust Index) |
| InATJI | Independent AI Truth Jurisdiction Infrastructure — Lawnise's long-term vision |
Appendix B: Frozen P0D Benchmark Reference
The following data is from the P0D development benchmark (frozen 2026-03-31). It is included for calibration context only — it is not a Trust Index score.
| Metric | Industry (60 obs) | Entity (57 obs) | Combined (117 obs) |
|---|---|---|---|
| Accuracy | 88.3% | 84-89% | ~86-88% |
| FP rate | 10.0% | 5-9% | ~9% |
| FN rate | 1.7% | 5-7% | ~4% |
| Providers | ChatGPT, Gemini, Copilot | ChatGPT, Gemini, Copilot | 3 |
| Sectors | Banking | Banking | 1 |
| Gate | ✅ Passed | ✅ Passed | ✅ Passed |
Note: Two independent reviewers produced slightly different Entity labels on 2-3 borderline Copilot rows. Both agree the combined gate passes comfortably.
Appendix C: Version History
| Version | Effective | Summary |
|---|---|---|
| v1.0 | 2026-05-11 | Inaugural methodology specification. Formalises per-observation scoring (binary), aggregation (unweighted arithmetic mean), Wilson confidence interval at 95%, score-status thresholds (definitive / preliminary / indicative), Stream 2 anonymisation with k≥5 entities per sector, publication discipline (raw evidence immutability + governed metadata-only cleanup), and InATJI primitives (hash-chain integrity, opaque public identifiers, input pinning, audit log). Validates implicitly against the P0D pilot frozen benchmark; no prior versioned methodology to migrate from. |
Future major or minor version changes append rows here per the versioning rules in §3.1.1.
End of Methodology Specification v1.0