Lawnise AI Trust Index Methodology v1.0
The approved, versioned public methodology for converting scan observations into Lawnise AI Trust Index scores. It underpins Lawnise's public-facing AI governance and evidence infrastructure and grounds the Enterprise Public AI TRiSM category that Lawnise owns.
- Status
- v1.0 — Approved for Publication
- Effective Date
- 2026-05-11
- Canonical URL
- https://www.lawnise.com/trust-index/methodology/v1
Lawnise AI Trust Index — Methodology Specification v1.0
- Version: 1.0
- Status: v1.0 — Approved for Publication
- Created: 2026-03-31
- Effective date: 2026-05-11
- Authored by: Lawnise Authority Core team
- Approved by: Lawmence (CEO)
- Canonical URL: https://www.lawnise.com/trust-index/methodology/v1
- Predecessor: P0D frozen benchmark (117 observations, 2 streams, 3 AI providers, audited prompt set)
Purpose
This document is the normative specification for the Lawnise AI Trust Index v1.0. It defines how raw scan observations become publishable scores. Every Trust Index score references this document version. Every formula, threshold, and rule stated here is a binding constraint on the Index Calculator implementation.
This is not a brainstorm, roadmap, or aspirational document. Every section ends in a concrete rule or decision.
Document Structure
| Part | Covers |
|---|---|
| Part 1: Measurement | Inputs, scoring formula, aggregation, confidence intervals |
| Part 2: Publication | Score status classification, publication criteria, anonymisation |
| Part 3: Governance | Versioning, benchmark integrity, correction/retraction, anti-gaming |
Part 1: Measurement
1.1 Input Data
The Trust Index is calculated from Index Scan results produced by the Lawnise Index Scan Engine. Each scan result is one observation: one SPL prompt × one AI provider × one RKB fact set → one TruthGuard verdict.
1.1.1 Two Streams
| Stream | Name | Inputs | Output |
|---|---|---|---|
| Stream 1 | Industry Trust Index | SPL non-branded prompts + RKB-Industry facts | Public industry-level accuracy score |
| Stream 2 | Market Accuracy Benchmark | SPL branded prompt templates + RKB-Entity facts | Anonymised entity-level benchmark |
Rule 1.1.1-A: Stream 1 and Stream 2 are scored independently. They are never blended into a single number.
Rule 1.1.1-B: Stream 2 entity scores are stored privately. Public output contains only anonymised distributions (ranges, quartiles, aggregates). Entity names never appear in public Trust Index output.
1.1.2 Observation Unit
One observation = one row in ac_index_scan_results:
- `prompt_id` — which SPL prompt was asked
- `ai_model` — which AI provider answered
- `stream` — industry or entity
- `jurisdiction` — e.g., "MY"
- `sector` — e.g., "banking", "insurance", "capital_markets"
- `findings` — TruthGuard verification output (verdict, risk_type, severity)
- `accuracy_score` — per-observation accuracy (see §1.2)
Rule 1.1.2-A: Only observations from completed scan runs (ac_index_scan_runs.status = 'completed') enter the calculation. Failed/cancelled runs are excluded.
Rule 1.1.2-B: Observations with verdict scan_error are excluded from scoring but counted in sample quality reporting (see §1.4). They indicate scraper or provider infrastructure failures, not AI knowledge failures.
1.1.3 Data Independence
The Trust Index is produced entirely from Lawnise-owned data:
- SPL prompts: curated by Lawnise, stored in `ac_standard_prompts`
- RKB facts: curated by Lawnise from public sources, stored in `ac_reference_facts`
- Scans: initiated by Lawnise, not by any tenant
Rule 1.1.3-A (Benchmark Integrity): The Trust Index calculation must NOT inherit, reference, or be influenced by tenant-specific TCC policies, tenant BKB facts, tenant prompts, or tenant configuration. Running under the Lawnise operational tenant is an operational convenience (JWT/DB context), not a policy source.
Rule 1.1.3-B: Methodology parameters are stored in ac_config (global, non-tenant-scoped), not in tenant_policies (TCC). This separation is a hard constraint for InATJI credibility.
1.2 Per-Observation Scoring
Each observation receives a binary accuracy classification based on the TruthGuard engine's stored finding structure.
1.2.1 Outcome-to-Accuracy Mapping
The TruthGuard engine stores results in ac_index_scan_results.findings as a JSONB array of finding objects. Each finding has a verdict field (no_risk or risk_detected) and, for risk_detected findings, a risk_type classification. The Index Calculator maps these stored outcomes to binary accuracy as follows:
| Stored Outcome | Accuracy Value | Rationale |
|---|---|---|
verdict = 'no_risk' | 1 (accurate) | AI response is consistent with RKB facts |
verdict = 'risk_detected', any risk_type except refusal_non_answer | 0 (inaccurate) | AI response contradicts, fabricates, or materially omits verified facts |
verdict = 'risk_detected', risk_type = 'refusal_non_answer' | 0 (inaccurate) | AI refused to answer using publicly available information (benchmark mode finding) |
verdict = 'scan_error' | excluded | Infrastructure failure, not an AI knowledge assessment |
verdict = 'no_bkb_facts' | excluded | Insufficient RKB coverage to assess; does not count for or against accuracy |
Rule 1.2.1-A: Scoring is binary (0 or 1). There is no partial credit. An observation is either accurate or inaccurate.
Rule 1.2.1-B: A risk_detected finding with risk_type = 'refusal_non_answer' counts as inaccurate for Trust Index purposes. This risk type is emitted by the benchmark code path when an AI refuses to answer a question about publicly available information. It is a risk_type classification within the risk_detected verdict, not a separate top-level verdict.
Rule 1.2.1-C: If an observation produces multiple findings, the observation is scored as inaccurate (0) if ANY finding has verdict = 'risk_detected'. A single risk finding is sufficient to classify the observation as inaccurate.
Rule 1.2.1-D: Future methodology versions may introduce graded scoring (e.g., severity-weighted). v1.0 uses binary classification only, which is the simplest defensible approach given current sample sizes.
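Rules 1.2.1-A through 1.2.1-C reduce to a small deterministic mapping. A minimal sketch, assuming each stored row carries a top-level verdict plus the findings JSONB array described in §1.1.2 (the real Index Calculator interface is not shown here):

```python
from typing import Optional

def score_observation(row: dict) -> Optional[int]:
    """Binary accuracy per §1.2.1: 1 = accurate, 0 = inaccurate,
    None = excluded from scoring (still counted in sample quality)."""
    # Row-level exclusions (Rule 1.1.2-B and the §1.2.1 mapping table).
    if row.get("verdict") in ("scan_error", "no_bkb_facts"):
        return None
    findings = row.get("findings") or []
    # Rule 1.2.1-C: ANY risk_detected finding scores the observation 0.
    # Rule 1.2.1-B: refusal_non_answer is a risk_type under risk_detected,
    # so it falls through this same branch and also scores 0.
    if any(f.get("verdict") == "risk_detected" for f in findings):
        return 0
    # All findings are no_risk: consistent with RKB facts.
    return 1
```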
1.2.2 Residual Engine Precision Acknowledgment
The TruthGuard engine that produces verdicts has known residual precision limitations:
- Combined accuracy of ~86-88% on the frozen P0D benchmark (117 observations)
- FP rate ~9% (engine over-flags some correct answers)
- FN rate ~4% (engine misses some incorrect answers)
Rule 1.2.2-A: Trust Index scores are presented as engine-assessed accuracy, not ground truth. All published scores carry a standard disclosure: _"Scores reflect automated verification against curated reference facts. Engine accuracy on the development benchmark is approximately 87%. Results should be interpreted as directional indicators, not absolute measures."_
Rule 1.2.2-B: As the engine improves (FP/FN rates decrease), the disclosure is updated to reflect current benchmark accuracy. Methodology version does NOT change for engine improvements alone — only for formula/threshold/rule changes.
1.3 Score Aggregation
1.3.1 Stream 1: Industry Trust Index Score
The Industry Trust Index score for a given jurisdiction and period is calculated as:
Industry_Score(j, p) = (Σ accurate_observations) / (Σ scored_observations) × 100

Where:
- `j` = jurisdiction (e.g., "MY")
- `p` = period (calendar month)
- `accurate_observations` = observations with accuracy value = 1
- `scored_observations` = total observations minus excluded observations (scan_error, no_bkb_facts)
This is a simple unweighted accuracy percentage across all scored observations in the period.
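In code terms the aggregation is a single pass over per-observation scores. A minimal sketch, assuming the hypothetical `score_observation()` helper from the §1.2.1 sketch and rows already filtered to completed scan runs (Rule 1.1.2-A):

```python
def industry_score(rows: list[dict]) -> tuple[float, int]:
    """Unweighted Industry Trust Index per §1.3.1.

    Returns (score_percent, n_scored) for one jurisdiction and period.
    """
    scored = [s for s in (score_observation(r) for r in rows) if s is not None]
    if not scored:
        raise ValueError("no scored observations for this jurisdiction/period")
    # Rule 1.3.1-A: every scored observation counts equally (no weights).
    return 100.0 * sum(scored) / len(scored), len(scored)
```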
Rule 1.3.1-A (No sector weighting in v1.0): v1.0 does NOT apply differential weights to sectors (banking vs insurance vs capital markets) or prompt categories (consumer vs regulatory vs process). All scored observations count equally.
Rationale: Sector weighting requires empirical calibration data that does not yet exist. Premature weighting would be arbitrary and would embed unjustified assumptions. v1.0 establishes the baseline with equal weighting. Sector weighting may be introduced in v2.0 after ≥6 months of data provides a basis for calibration.
Rule 1.3.1-B: The headline number is the jurisdiction-level Industry Trust Index score. Breakdowns by provider, sector, and prompt category are provided as supplementary detail in the breakdown field, but they are not separately scored or ranked in v1.0.
1.3.2 Breakdown Structure
The breakdown JSONB field in ac_trust_index_scores contains:
{
"by_provider": {
"chatgpt": { "scored": 20, "accurate": 14, "accuracy": 70.0 },
"gemini": { "scored": 20, "accurate": 17, "accuracy": 85.0 },
"copilot": { "scored": 20, "accurate": 15, "accuracy": 75.0 }
},
"by_sector": {
"banking": { "scored": 45, "accurate": 35, "accuracy": 77.8 },
"insurance": { "scored": 10, "accurate": 8, "accuracy": 80.0 },
"capital_markets": { "scored": 5, "accurate": 4, "accuracy": 80.0 }
},
"by_prompt_category": {
"consumer": { "scored": 12, "accurate": 10, "accuracy": 83.3 },
"regulatory": { "scored": 8, "accurate": 6, "accuracy": 75.0 },
"process": { "scored": 10, "accurate": 7, "accuracy": 70.0 },
"consumer_protection": { "scored": 8, "accurate": 7, "accuracy": 87.5 },
"comparative": { "scored": 6, "accurate": 5, "accuracy": 83.3 },
"current_affairs": { "scored": 4, "accurate": 2, "accuracy": 50.0 },
"misconception": { "scored": 6, "accurate": 5, "accuracy": 83.3 },
"cross_sector": { "scored": 6, "accurate": 4, "accuracy": 66.7 }
},
"excluded": {
"scan_error": 3,
"no_bkb_facts": 1
}
}

Rule 1.3.2-A: Breakdowns are informational. They enable drill-down analysis but do not feed into the headline score calculation.
1.3.3 Stream 2: Market Accuracy Benchmark
Stream 2 produces per-entity accuracy scores using the same binary classification:
Entity_Score(e, j, p) = (Σ accurate_observations for entity e) / (Σ scored_observations for entity e) × 100

Per-entity scores are private (stored in ac_index_scan_results, accessible only to the entity's own tenant via the commercial product).
The public Market Accuracy Benchmark output is an anonymised distribution (example values are illustrative only and do not represent any real jurisdiction's reported figures):
{
"jurisdiction": "MY",
"sector": "banking",
"period": "2026-03",
"entity_count": 12,
"accuracy_distribution": {
"min": 52.0,
"q1": 65.0,
"median": 73.5,
"q3": 82.0,
"max": 95.0,
"mean": 74.2,
"std_dev": 12.1
}
}

Rule 1.3.3-A: The public anonymised output must contain ≥5 entities per sector per jurisdiction. If fewer than 5 entities have sufficient data, the sector-level distribution is suppressed (not published) to prevent re-identification.
Rule 1.3.3-B: Entity names never appear in any public Trust Index output, API response, or report.
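A minimal sketch of the distribution-plus-suppression step, assuming per-entity scores have already been computed; the standard-deviation convention (population vs sample) is not fixed by this specification, so the choice below is illustrative:

```python
import statistics
from typing import Optional

def public_distribution(entity_scores: list[float]) -> Optional[dict]:
    """Anonymised Stream 2 output per §1.3.3; returns None when suppressed."""
    # Rule 1.3.3-A: fewer than 5 entities => suppress to prevent re-identification.
    if len(entity_scores) < 5:
        return None
    q1, median, q3 = statistics.quantiles(entity_scores, n=4)
    return {
        "entity_count": len(entity_scores),
        "accuracy_distribution": {
            "min": min(entity_scores),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(entity_scores),
            "mean": statistics.mean(entity_scores),
            # Population std dev used here; sample std dev is equally plausible.
            "std_dev": statistics.pstdev(entity_scores),
        },
    }
```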
1.4 Confidence Intervals
Every Trust Index score includes a confidence interval that quantifies uncertainty due to finite sample size.
1.4.1 Statistical Method
The confidence interval is computed using the Wilson score interval at the 95% confidence level. Wilson is used at all sample sizes because it provides reliable coverage even for small n and near-boundary proportions, unlike the Wald interval, which can produce degenerate intervals when p̂ is close to 0 or 1 or when n is small.
Given:
p̂ = accurate_observations / n (observed accuracy proportion)
n = number of scored observations
z = 1.96 (95% confidence z-score)
Wilson centre:
p̃ = (p̂ + z²/(2n)) / (1 + z²/n)
Wilson half-width:
w = (z / (1 + z²/n)) × √(p̂(1 - p̂)/n + z²/(4n²))
Wilson interval:
[p̃ - w, p̃ + w]

The stored confidence_interval value in ac_trust_index_scores is the half-width w × 100 (expressed in percentage points). The published centre point is the Wilson centre p̃ × 100, not the raw observed proportion p̂ × 100.
Rule 1.4.1-A: The Wilson score interval is the sole CI method in v1.0. No Wald interval, no sample-size branching.
Example: If p̂ = 0.87, n = 117:
- z²/n = 3.8416/117 = 0.03284
- p̃ = (0.87 + 1.9208/117) / (1 + 0.03284) = (0.87 + 0.01642) / 1.03284 = 0.8584
- w = (1.96 / 1.03284) × √(0.87 × 0.13/117 + 3.8416/54756)
- w = 1.8978 × √(0.000967 + 0.0000702) = 1.8978 × 0.03221 = 0.06114
- Published as: 85.8% ± 6.1%
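The formulas transcribe directly into code. A minimal sketch (function name and shape are illustrative, not the Index Calculator's actual interface) that reproduces the worked example above:

```python
import math

def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval per §1.4.1.

    Returns (centre_pct, half_width_pp): the published centre point
    (Wilson centre × 100) and the stored confidence_interval (w × 100).
    """
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)
    )
    return 100.0 * centre, 100.0 * half_width

centre, hw = wilson_interval(p_hat=0.87, n=117)
print(f"{centre:.1f}% ± {hw:.1f}%")  # 85.8% ± 6.1%
```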
1.4.2 CI Suppression
Rule 1.4.2-A: If the calculated CI half-width exceeds 15 percentage points, the score is marked indicative regardless of other sample quality criteria. A score of "73% ± 18%" is too imprecise to be useful.
Rule 1.4.2-B: The CI is always reported alongside the headline score. A score without a CI is never published.
1.5 Sample Quality
Sample quality determines whether a score is meaningful enough to classify as definitive, preliminary, or indicative.
1.5.1 Sample Quality Dimensions
| Dimension | Field in sample_quality JSONB | Measures |
|---|---|---|
| Observation count | scored_observations | Total scored observations (excluding scan_error, no_bkb_facts) |
| Provider diversity | distinct_providers | Number of distinct AI providers assessed |
| Sector coverage | distinct_sectors | Number of distinct sectors covered |
| Session diversity | distinct_scan_sessions | Number of distinct scan run sessions |
| Prompt coverage | distinct_prompts | Number of distinct SPL prompts used |
| Excluded ratio | excluded_ratio | (scan_error + no_bkb_facts) / total_observations |
1.5.2 Score Status Thresholds
| Status | Criteria (ALL must be met) | Meaning |
|---|---|---|
| `definitive` | scored_observations ≥ 50 AND distinct_providers ≥ 3 AND distinct_sectors ≥ 2 AND distinct_scan_sessions ≥ 5 AND distinct_prompts ≥ 15 AND CI half-width ≤ 10pp AND excluded_ratio ≤ 0.15 | High-confidence score suitable for public citation |
| `preliminary` | scored_observations ≥ 20 AND distinct_providers ≥ 2 AND distinct_sectors ≥ 1 AND distinct_scan_sessions ≥ 2 AND CI half-width ≤ 15pp | Directionally meaningful but insufficient for definitive claims |
| `indicative` | Does not meet preliminary criteria, OR CI half-width > 15pp | Insufficient data; internal use only |
Rule 1.5.2-A: Score status is computed deterministically from sample quality metrics. It is never manually overridden.
Rule 1.5.2-B: An indicative score is calculated and stored (for trend tracking) but never published externally.
Rule 1.5.2-C: A preliminary score may be published with an explicit caveat: _"This score is preliminary. Sample size and provider coverage do not yet meet definitive thresholds."_
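Because status is fully deterministic (Rule 1.5.2-A), the threshold table reads directly as a decision function. A sketch, assuming the sample_quality JSONB has been loaded as a dict with the §1.5.1 field names:

```python
def classify_status(q: dict, ci_half_width_pp: float) -> str:
    """Deterministic score status per §1.5.2; never manually overridden."""
    # Rule 1.4.2-A: CI half-width > 15pp forces indicative regardless.
    if ci_half_width_pp > 15:
        return "indicative"
    if (q["scored_observations"] >= 50
            and q["distinct_providers"] >= 3
            and q["distinct_sectors"] >= 2
            and q["distinct_scan_sessions"] >= 5
            and q["distinct_prompts"] >= 15
            and ci_half_width_pp <= 10
            and q["excluded_ratio"] <= 0.15):
        return "definitive"
    if (q["scored_observations"] >= 20
            and q["distinct_providers"] >= 2
            and q["distinct_sectors"] >= 1
            and q["distinct_scan_sessions"] >= 2):
        return "preliminary"
    return "indicative"
```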
1.5.3 Validation Against P0D Benchmark
Applying these thresholds to the frozen P0D pilot dataset (for reference only — P0D is a development benchmark, not a production Index Scan):
| Dimension | Industry (60) | Entity (57) | Combined (117) |
|---|---|---|---|
| Scored observations | 60 | 57 | 117 |
| Distinct providers | 3 | 3 | 3 |
| Distinct sectors | 1 (banking) | 1 (banking) | 1 (banking) |
| Distinct scan sessions | Multiple | Multiple | Multiple |
| Distinct prompts | 20 | 19 | 39 |
P0D assessment: Would achieve preliminary status. It meets the observation-count and provider thresholds, but covers only one sector (banking), so the ≥2-sector requirement for definitive status is not met. This is expected: the pilot intentionally focused on banking only.
1.5.4 Intra-Session Scan-Error Repair
A scan session may produce rows with verdict = 'scan_error' (e.g., scraper timeout, provider rate-limit, malformed or unusable provider response, or null ai_response) due to scraper or AI-provider infrastructure failure. These rows are excluded from scoring per Rule 1.1.2-B but reduce the evaluable sample for the session. To preserve sample size without compromising session identity, intra-session repair is permitted under the following rules.
Rule 1.5.4-A — Repair scope (logical tuples, not scan_run_ids). A repair pass MAY re-dispatch only those logical session tuples (original_scan_run_id, prompt_id, ai_model) whose row landed with verdict = 'scan_error' and has not yet been successfully replaced. Repair-pass scan runs are dispatched under new scan_run_ids, and each repair-pass row MUST carry an explicit reference back to its original_scan_run_id (in metadata or via the lock memo's session manifest) so the evidence-set deduplicator can map repair rows to the logical tuple they replace. Repair MUST NOT re-dispatch tuples that already produced a usable non-error response, MUST NOT change the prompt set, and MUST NOT change the dataset_version pins.
Rule 1.5.4-B — Original panel only, partitioning permitted, same window. A repair pass MUST use only scanners drawn from the original panel; no replacement or substitute scanner may be introduced for the same session. A repair pass MAY be partitioned into single-scanner repair runs (one new scan_run_id per failed scanner per scope), or run as a multi-scanner repair — both shapes are allowed as long as the union of repair scanners is a subset of the original panel. The repair pass MUST complete within 24 hours of the original session's first dispatch. Beyond 24 hours, the repaired rows are considered a new sampling event and MUST be treated as a new session per Rule 1.5.2.
Rule 1.5.4-C — Maximum two repair passes. A given session permits at most two repair passes. After the second pass, any logical tuple whose latest row is still verdict = 'scan_error' is recorded as a permanent scan-error for the session and contributes to excluded_ratio per Rule 1.1.2-B and to the per-scope scanner-availability disclosure required at publication time.
Rule 1.5.4-D — Session identity preserved. Repair-pass scan runs share the same dataset_version pin and the same logical session identity as the original dispatch. The session evidence set is the union of original-dispatch rows and repair-pass rows for the session, deduplicated per Rule 1.5.4-G. The distinct_scan_sessions sample-quality metric counts the original dispatch + all its repair passes as one session, regardless of how many scan_run_ids the repair was partitioned into.
Rule 1.5.4-E — Disclosure. The lock memo and any external publication MUST disclose the per-scanner availability after the final repair pass. If any single scanner's verdict = 'scan_error' rate after the final repair pass exceeds 40% of its dispatched logical tuples for any scope, the lock memo MUST explicitly call this out and consider whether to (a) include the scanner with reduced weight, (b) exclude the scanner from this session's score with disclosure, or (c) escalate to the human reviewer for sign-off.
Rule 1.5.4-F — Hash discipline. The session evidence-set SHA-256 hash referenced by the row-by-row SOP is computed over the post-repair, deduplicated evidence set per Rule 1.5.4-G. Any pre-repair hash is operational telemetry only and MUST NOT be cited as the canonical session hash in audit artifacts.
Rule 1.5.4-G — Repair precedence (deduplication). When constructing the post-repair evidence set, for each logical tuple (original_scan_run_id, prompt_id, ai_model) apply the following precedence in order:
1. Original dispatch row is the starting point.
2. Repair pass 1 row overrides the original for that logical tuple only if the repair pass 1 row has `verdict ≠ 'scan_error'` (a usable response). If repair pass 1 also produced `verdict = 'scan_error'`, retain the original row.
3. Repair pass 2 row overrides the result of step 2 for that logical tuple only if the repair pass 2 row has `verdict ≠ 'scan_error'`. If repair pass 2 also produced `verdict = 'scan_error'`, retain the result of step 2.
4. If all passes produced `verdict = 'scan_error'`, the latest scan-error row is retained as a permanent scan-error placeholder for the tuple and contributes to excluded_ratio per Rule 1.1.2-B.
The lock memo MUST enumerate the full list of scan_run_ids comprising the session (original + all repair passes) so any auditor can reconstruct the post-repair evidence set deterministically.
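The precedence rules amount to a fold over passes in dispatch order. A sketch, assuming each row dict exposes the logical-tuple keys and its verdict; per step 4, when every pass errored, the latest scan-error row is the retained placeholder:

```python
def post_repair_evidence(passes: list[list[dict]]) -> dict[tuple, dict]:
    """Deduplicated post-repair evidence set per Rule 1.5.4-G.

    passes = [original_rows, repair_1_rows, repair_2_rows], in dispatch order.
    """
    evidence: dict[tuple, dict] = {}
    for rows in passes:
        for row in rows:
            key = (row["original_scan_run_id"], row["prompt_id"], row["ai_model"])
            current = evidence.get(key)
            if current is None:
                evidence[key] = row   # step 1: original dispatch row
            elif row["verdict"] != "scan_error":
                evidence[key] = row   # steps 2-3: usable repair row overrides
            elif current["verdict"] == "scan_error":
                evidence[key] = row   # step 4: keep the latest scan-error row
    # Remaining scan-error placeholders feed excluded_ratio (Rule 1.1.2-B).
    return evidence
```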
Part 2: Publication
2.1 Publication Criteria
2.1.1 Eligibility
A Trust Index score is eligible for publication when:
1. Score status is definitive or preliminary (never indicative)
2. The score has been reviewed by at least one human operator (Lawmence or designated reviewer)
3. The published_at timestamp is set (marks the moment of publication)
Rule 2.1.1-A: Publication is an explicit human-approved action, not an automatic result of calculation. The Index Calculator computes and stores scores; a human sets published_at.
Rule 2.1.1-B: Once published_at is set, the score row becomes immutable (except for retraction — see §3.3). The evidence_hash and hash-chain link are frozen at publication time.
2.1.2 Publication Format
Every published Trust Index score includes:
| Field | Example | Required |
|---|---|---|
| Jurisdiction | Malaysia (MY) | Yes |
| Period | March 2026 | Yes |
| Overall score | 73.2% | Yes |
| Confidence interval | ± 6.1% | Yes |
| Score status | Definitive | Yes |
| Methodology version | v1.0 | Yes |
| Provider breakdown | ChatGPT: 68%, Gemini: 81%, Copilot: 71% | Yes |
| Sector breakdown | Banking: 73%, Insurance: 78% | Yes, if ≥2 sectors |
| Sample size | 180 scored observations | Yes |
| Engine accuracy disclosure | ~87% benchmark accuracy | Yes |
| Excluded count | 5 scan errors | Yes, if > 0 |
Rule 2.1.2-A: The engine accuracy disclosure (Rule 1.2.2-A) is mandatory in every publication. It is never omitted or buried in footnotes.
2.1.3 Stream 2 Publication
Stream 2 (Market Accuracy Benchmark) is published as anonymised distributions only:
- Sector-level accuracy ranges, quartiles, and means
- No entity names, no per-entity scores, no data that enables re-identification
- Suppressed if < 5 entities per sector (Rule 1.3.3-A)
Rule 2.1.3-A: Stream 2 public reports carry the label "Market Accuracy Benchmark" (not "Trust Index") to distinguish them from the headline Industry Trust Index.
2.1.4 Publication Cleanup Governance
In some governed benchmark runs, a stored risk_detected row may contain one or more findings that are not publication-grade contradictions even though the raw verifier emitted them. v1.0 allows a narrow publication-layer cleanup mechanism so public reporting can exclude clearly weak findings without overwriting raw evidence.
This is a publication governance primitive, not a score-formula change.
Rule 2.1.4-A (Raw evidence immutability): Raw benchmark evidence is never overwritten during publication cleanup. The stored top-level row verdict remains unchanged. If the verifier emitted risk_detected, the row remains risk_detected in raw storage.
Rule 2.1.4-B (Metadata-only exclusion): Publication cleanup is implemented only through finding-level metadata, using metadata.publication_exclusion on the specific finding(s) being excluded. The exclusion must not be represented by deleting findings, mutating their substantive text, or rewriting the row verdict.
Rule 2.1.4-C (Finding-level granularity): Exclusions are applied at the individual finding level, not the whole-row level. If a row contains both excluded and non-excluded findings, the remaining valid finding(s) still count that row as inaccurate for publication purposes.
Rule 2.1.4-D (Locked categories in v1.0): Publication cleanup is restricted to explicit, reviewable categories. The currently approved v1.0 categories are:
- `self_admitting_fp`
- `off_topic_retrieval`
Rule 2.1.4-E (Current locked criteria): The first governed production publication used the following locked criteria:
- `self_admitting_fp`: explanation text explicitly self-negates the contradiction (for example, phrases equivalent to "not a direct contradiction", "not a true contradiction", "consistent, not contradictory", "rephrasing, not a contradiction", or clearly borderline-only wording)
- `off_topic_retrieval`: the finding relies on a known retrieval mismatch pattern rather than the prompt's actual issue. Current locked examples:
  - investment-linked / unit-portion facts attached to prompts that are not about investment-linked coverage
  - foreign-currency-deposit / Ringgit-aggregation facts attached to prompts that are not about currency coverage
Rule 2.1.4-F (Governed workflow): Publication cleanup requires a governed workflow:
1. dry-run proposed exclusions
2. produce an artifact listing rows/findings, reasons, and before/after counts
3. independent cross-check / review
4. explicit approval
5. metadata-only apply
Rule 2.1.4-G (Audit trail): Every applied exclusion must record enough audit detail to reconstruct what happened, including:
- exclusion reason/category
- matched phrase or matched pattern when applicable
- applied timestamp
- applied-by identifier / workflow
Rule 2.1.4-H (Methodology boundary): Publication cleanup under these locked categories does not by itself change the methodology version. It is treated as governed benchmark adjudication over raw stored findings. Expanding categories, changing category semantics materially, or altering how published rows are counted may require a methodology update.
Rule 2.1.4-I (Future benchmark lines): The same metadata-only cleanup pattern applies to future benchmark lines and jurisdictions unless explicitly superseded by a later methodology version. Future sessions must not invent broader ad hoc exclusion categories without recording the decision in methodology or a linked governance memo.
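In code terms, publication counting simply filters findings by the exclusion marker and re-applies Rule 1.2.1-C; the raw row is never written. A sketch with assumed field shapes (the edge case of a row whose findings are all excluded is not settled by this specification; the sketch treats it as accurate for publication):

```python
def publication_findings(row: dict) -> list[dict]:
    """Findings that survive cleanup (Rules 2.1.4-B/C). The raw row,
    including its top-level verdict, is never mutated (Rule 2.1.4-A)."""
    return [
        f for f in row.get("findings", [])
        if not (f.get("metadata") or {}).get("publication_exclusion")
    ]

def publication_accuracy(row: dict) -> int:
    """Rule 2.1.4-C: any surviving risk_detected finding still counts
    the whole row as inaccurate (0) for publication purposes."""
    if any(f.get("verdict") == "risk_detected" for f in publication_findings(row)):
        return 0
    return 1
```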
2.2 Jurisdiction Normalisation
2.2.1 Principle
Trust Index scores are relative to each jurisdiction's own baseline. A score of 73% in Malaysia and 68% in the UK does NOT mean Malaysia's AI ecosystem is "better" — the jurisdictions have different regulatory complexity, AI maturity, RKB depth, and SPL coverage.
2.2.2 Rules
Rule 2.2.2-A: Trust Index scores are never ranked across jurisdictions in v1.0 publications. No "Country X ranks #1" claims.
Rule 2.2.2-B: Cross-jurisdiction comparison is only valid at the trend level. _Example only:_ "Both Malaysia and Singapore showed improving accuracy over the past 6 months." Absolute score comparisons are explicitly disclaimed.
Rule 2.2.2-C: A "Global Composite Score" is deferred until ≥3 jurisdictions each have ≥6 months of definitive scores. The methodology for global aggregation will be specified in a future version.
Rule 2.2.2-D: When publishing multi-jurisdiction reports, each jurisdiction's score is presented independently with its own CI, sample quality, and sector coverage. No combined cross-jurisdiction headline number in v1.0.
Part 3: Governance
3.1 Versioning
3.1.1 Version Numbering
The methodology uses a two-part MAJOR.MINOR version scheme.
| Change Type | Version Impact | Example |
|---|---|---|
| Formula change (scoring, aggregation, weighting) | Major | v1.0 → v2.0 |
| Threshold change (sample quality, CI suppression) | Major | v1.0 → v2.0 |
| New score status tier | Major | v1.0 → v2.0 |
| New breakdown dimension | Minor | v1.0 → v1.1 |
| Disclosure wording update | Minor | v1.0 → v1.1 |
| Engine accuracy improvement (no formula change) | No version change | — |
| RKB/SPL content refresh | No version change | — |
Rule 3.1.1-A: Every ac_trust_index_scores row records the methodology_version used to calculate it. Historical scores are never restated under a new version.
Rule 3.1.1-B: When a major version change occurs, a bridge document is published explaining: (1) what changed, (2) why, (3) how old scores compare to new scores, and (4) whether trend continuity is preserved.
3.1.2 Backward Compatibility
Rule 3.1.2-A: Old scores calculated under v1.0 remain valid and citable as "v1.0 scores." They are never deleted or overwritten.
Rule 3.1.2-B: If a new version produces materially different scores for the same inputs, both the old-version and new-version scores are stored (via the methodology_version dimension in the unique constraint). This allows parallel display during transition periods.
3.2 Benchmark Integrity
3.2.1 Independence Constraints
Rule 3.2.1-A: The Trust Index calculation reads from ac_config for methodology parameters, never from tenant_policies (TCC).
Rule 3.2.1-B: The TCC-controlled TruthGuard enhancements (rollout toggles) apply to tenant AI monitoring scans only. Index Scan Engine runs use the benchmark-gated code path (discovery_metadata.stream), which is independent of per-tenant TCC toggles.
Rule 3.2.1-C: Adding, removing, or modifying TCC enhancement toggles for tenant scans does NOT affect Trust Index scores. The benchmark code path and the tenant code path are architecturally separate.
3.2.2 Input Pinning
Every Index Scan run pins its inputs via ac_index_scan_runs:
- `rkb_version_id` — exact RKB snapshot used
- `spl_version_id` — exact SPL snapshot used
- `ai_models` — exact list of AI providers scanned
- `input_snapshot_hash` — SHA-256 of the combined input set
Rule 3.2.2-A: A score's source_scan_ids links to specific scan runs, which link to specific pinned input versions. The full chain from score → scan run → inputs → individual results is traceable and auditable.
Rule 3.2.2-B: If RKB facts are updated mid-month, the Index Calculator uses the pinned version from the scan run, not the current live version. Scores reflect the inputs that were actually used, not the inputs that exist now.
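The specification fixes what is hashed (the combined input set) but not the serialisation. The sketch below assumes one plausible canonicalisation (sorted JSON keys, sorted model list) so the hash is deterministic regardless of field or list ordering at call time:

```python
import hashlib
import json

def input_snapshot_hash(rkb_version_id: str, spl_version_id: str,
                        ai_models: list[str]) -> str:
    """Illustrative SHA-256 over the pinned input set per §3.2.2."""
    snapshot = {
        "rkb_version_id": rkb_version_id,
        "spl_version_id": spl_version_id,
        "ai_models": sorted(ai_models),  # order-insensitive on input
    }
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```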
3.2.3 Reproducibility
Rule 3.2.3-A (Input reproducibility): Lawnise-controlled inputs (RKB facts, SPL prompts, methodology parameters) are fully reproducible. Given the same rkb_version_id + spl_version_id + methodology_version, the same calculation logic will be applied. The input_snapshot_hash makes this verifiable.
Rule 3.2.3-B (Output variance acknowledgment): End-to-end output reproducibility is NOT guaranteed, even with identical Lawnise inputs and the same nominal AI provider model. Three sources of variance are outside Lawnise's control:
1. Provider-side nondeterminism: LLM sampling (temperature, top-p) means the same prompt can produce different responses across runs, even on the same model build.
2. Provider-side model drift: Providers may silently update model weights behind a stable model name (e.g., "gpt-5-3" today vs "gpt-5-3" next month).
3. LLM verification variance: The TruthGuard engine uses an LLM verifier step whose output can vary across runs for borderline cases.
provider_response_metadata in ac_index_scan_results captures whatever version/fingerprint info the provider returns, making provider-side drift auditable after the fact.
Rule 3.2.3-C: Reproducibility claim in publications: _"Lawnise-controlled inputs (reference facts, prompts, methodology) are pinned, versioned, and hash-verified. End-to-end output may vary due to AI provider nondeterminism and model drift, which Lawnise tracks but does not control."_
Rule 3.2.3-D: Published scores are based on the actual captured responses and findings, not on a claim that those responses are the only possible outputs for the given inputs. The score reflects what was observed in that specific scan run.
3.2.4 Opaque Public Identifiers
Rule 3.2.4-A: All Trust Index identifiers that appear in public-facing API responses, verification surfaces, citation outputs, or printed publications use HMAC-derived opaque public identifiers in the format {kid}_{uuid-like}, where {kid} is the active key identifier. The underlying internal score UUID is never exposed in any public output.
Rule 3.2.4-B: The opaque public identifier is deterministic per active key: the same internal UUID under the same active key always produces the same public identifier. External citations therefore remain stable for the lifetime of the active key.
Rule 3.2.4-C: Key rotation is a governance event. When the active key rotates, both the previous and current opaque identifiers remain resolvable so citations published under the prior key continue to verify. The rotation event is recorded in the audit log.
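The rules fix the properties (HMAC-derived, deterministic per active key, {kid}_{uuid-like} format) but not the exact derivation. One plausible construction, for illustration only:

```python
import hashlib
import hmac
import uuid

def opaque_public_id(internal_uuid: str, key: bytes, kid: str) -> str:
    """Illustrative HMAC-derived opaque public identifier per §3.2.4.

    Deterministic per key (Rule 3.2.4-B): the same internal UUID under the
    same active key always yields the same public identifier.
    """
    digest = hmac.new(key, internal_uuid.encode("utf-8"), hashlib.sha256).digest()
    # Fold the first 16 HMAC bytes into a UUID-shaped suffix.
    return f"{kid}_{uuid.UUID(bytes=digest[:16])}"
```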
3.3 Correction and Retraction
3.3.1 Error Types
| Error Type | Response | Trigger |
|---|---|---|
| Calculation error (bug in calculator logic) | Correction: publish corrected score, mark original as corrected | Internal QA or external report |
| Input error (incorrect RKB fact, misclassified prompt) | Correction: re-run scan with corrected inputs, publish new score for the period | RKB audit or external report |
| Methodology error (flaw in formula or threshold) | Retraction: retract affected scores, publish corrected scores under new methodology version | Internal review |
| Presentation error (wrong number in report, typo) | Erratum: correct the published content, note the change | Internal QA or external report |
3.3.2 Correction Process
Rule 3.3.2-A: A corrected score is a new row in ac_trust_index_scores with the same period_start + jurisdiction but an updated calculation_run_id. The original row remains (immutable) with its published_at unchanged.
Rule 3.3.2-B: Corrections are disclosed publicly. The correction notice includes: what was wrong, what the corrected value is, and why the error occurred.
Rule 3.3.2-C: The hash chain incorporates corrections. A corrected score's previous_score_hash links to the original score, maintaining chain integrity.
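Rule 3.3.2-C implies each score row commits to the score it corrects or follows. A deliberately simplified illustration of such a link (the actual hash-chain layout is not specified in this document):

```python
import hashlib

def chain_link(evidence_hash: str, previous_score_hash: str) -> str:
    """Illustrative chain link: committing to both the score's own evidence
    hash and its predecessor means tampering with any historical score
    breaks every later link in the chain."""
    payload = f"{previous_score_hash}:{evidence_hash}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```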
3.3.3 Retraction Process
Rule 3.3.3-A: Retraction is reserved for cases where the score is fundamentally unsound (e.g., discovered that RKB facts were systematically wrong, or the engine had a bug that produced meaningless verdicts).
Rule 3.3.3-B: Retracted scores are marked in the database (retraction flag + reason) but never deleted. The hash chain is preserved.
Rule 3.3.3-C: Retraction notice is published prominently and remains permanently alongside the retracted score.
3.4 Anti-Gaming
3.4.1 Threat Model
The primary gaming risk is AI providers tuning their models to perform well on known benchmark prompts ("teaching to the test"). Secondary risk is external parties attempting to influence RKB facts.
3.4.2 Mitigations
Rule 3.4.2-A (Prompt opacity): The methodology publishes scoring categories and weights (the "what") but NOT the specific SPL prompt text or fact-matching thresholds (the "how"). SPL prompts are confidential Lawnise IP.
Rule 3.4.2-B (Prompt rotation): SPL prompts are refreshed quarterly. New prompts are added, stale prompts are retired. This prevents "teaching to the test" even if specific prompts are leaked.
Rule 3.4.2-C (Prompt variants): Each SPL prompt has 3-5 paraphrased variants. The variant used per scan run is randomly selected. This prevents gaming by memorising exact prompt text.
Rule 3.4.2-D (Anomaly detection): Flag sudden score improvements (>10pp month-over-month for the same provider) that do not correlate with observable AI model changes (new model version, public announcement). Flagged anomalies are investigated before publication.
Rule 3.4.2-E (RKB integrity): RKB facts are curated from official public sources (regulator websites, annual reports, legislation). RKB changes are tracked in ac_dataset_versions with hash-chain integrity. External parties cannot influence RKB content — it is Lawnise-controlled.
Rule 3.4.2-F (Methodology refresh): The methodology is reviewed annually for gaming-resistance. If evidence of systematic gaming is detected, an out-of-cycle methodology update is triggered.
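As an illustration of the Rule 3.4.2-D check, the anomaly flag is a simple threshold on consecutive month-over-month deltas (the data shape below is assumed):

```python
def flag_anomalies(monthly: dict[str, list[float]],
                   threshold_pp: float = 10.0) -> list[str]:
    """Flag providers whose accuracy improves by more than threshold_pp
    percentage points month-over-month (Rule 3.4.2-D).

    monthly maps provider -> chronologically ordered accuracy percentages.
    Flagged providers are investigated before publication, not auto-excluded.
    """
    flagged = []
    for provider, series in monthly.items():
        for prev, curr in zip(series, series[1:]):
            if curr - prev > threshold_pp:
                flagged.append(provider)
                break
    return flagged
```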
3.5 Entity Opt-Out
Rule 3.5.1-A (Future exclusion): If a financial institution requests removal from Stream 2 (Market Accuracy Benchmark), Lawnise will exclude that entity from all future scan runs within 30 days. The entity's RKB-Entity facts are set to status = 'retired' to prevent future use in scans. No new branded prompts will be run against that entity.
Rule 3.5.1-B (Historical preservation): Historical RKB-Entity facts and scan results for the opted-out entity are NOT scrubbed, deleted, or anonymised. They remain in the database with full content intact. This is required for:
- Audit integrity: Published scores that included the entity's data must remain verifiable against the original inputs (Rule 3.2.2-A).
- Reproducibility: Pinned input snapshots (`rkb_version_id`) must remain resolvable to their original content (Rule 3.2.3-A).
- Hash-chain validity: Evidence hashes computed over the original data would become unverifiable if the underlying data were scrubbed.
Rule 3.5.1-C (Public suppression): After opt-out, the entity's name and per-entity scores are suppressed from all future public output, including anonymised distributions. Historical published reports that included the entity (in anonymised aggregate form only — entity names were never in public output per Rule 1.3.3-B) remain unchanged.
Rule 3.5.1-D: Entity opt-out does not affect Stream 1 (Industry Trust Index), which uses non-branded prompts and industry-level facts only.
Appendix A: Glossary
| Term | Definition |
|---|---|
| Observation | One SPL prompt × one AI provider × one RKB fact set → one TruthGuard verdict |
| Scored observation | An observation with verdict no_risk or risk_detected (including risk_type = 'refusal_non_answer'). Excludes scan_error and no_bkb_facts. |
| RKB | Reference Knowledge Base — Lawnise-curated verified facts from public sources |
| SPL | Standard Prompt Library — Lawnise-curated prompts used for Index Scans |
| CI | Confidence Interval — statistical uncertainty range at 95% confidence |
| pp | Percentage points |
| FP | False Positive — engine incorrectly flags an accurate AI response |
| FN | False Negative — engine fails to flag an inaccurate AI response |
| TCC | Tenant Configuration Center — per-tenant policy system (NOT used for Trust Index) |
| InATJI | Independent AI Truth Jurisdiction Infrastructure — Lawnise's long-term vision |
Appendix B: Frozen P0D Benchmark Reference
The following data is from the P0D development benchmark (frozen 2026-03-31). It is included for calibration context only — it is not a Trust Index score.
| Metric | Industry (60 obs) | Entity (57 obs) | Combined (117 obs) |
|---|---|---|---|
| Accuracy | 88.3% | 84-89% | ~86-88% |
| FP rate | 10.0% | 5-9% | ~9% |
| FN rate | 1.7% | 5-7% | ~4% |
| Providers | ChatGPT, Gemini, Copilot | ChatGPT, Gemini, Copilot | 3 |
| Sectors | Banking | Banking | 1 |
| Gate | ✅ Passed | ✅ Passed | ✅ Passed |
Note: Two independent reviewers produced slightly different Entity labels on 2-3 borderline Copilot rows. Both agree the combined gate passes comfortably.
Appendix C: Version History
| Version | Effective | Summary |
|---|---|---|
| v1.0 | 2026-05-11 | Inaugural methodology specification. Formalises per-observation scoring (binary), aggregation (unweighted arithmetic mean), Wilson confidence interval at 95%, score-status thresholds (definitive / preliminary / indicative), Stream 2 anonymisation with k≥5 entities per sector, publication discipline (raw evidence immutability + governed metadata-only cleanup), and InATJI primitives (hash-chain integrity, opaque public identifiers, input pinning, audit log). Validates implicitly against the P0D pilot frozen benchmark; no prior versioned methodology to migrate from. |
Future major or minor version changes append rows here per the versioning rules in §3.1.1.
End of Methodology Specification v1.0