AI Contextual Accuracy: Why AI Can Be Right in General and Wrong About You

Contextual accuracy is whether an AI answer is correct about your specific case, not merely correct in general. Lawnise explains why that distinction matters.

Lawnise Research & Editorial team

Institutional byline · published by Lawnise

Published 2026-06-21 · ~8 min read · Methodology v1.1

A relationship manager at an insurer is preparing for a renewal call. Before she dials, she does what her customers increasingly do too: she asks a public AI assistant to refresh her on the product. "What's the surrender value treatment on this kind of participating policy, and how does the cooling-off period work?" The answer is excellent. It explains the mechanics cleanly, gets the general principle exactly right, and would pass an exam. She could hand it to a trainee as a textbook definition.

It is also useless for the call she is about to make, because it describes the category and not the contract. Her customer's policy has a specific surrender schedule, a specific cooling-off window the insurer publishes, and a specific clause the customer actually signed. On every one of those, the assistant either guessed or generalised. The answer was right about insurance. It was wrong about this policy.

That distinction — right in general, wrong about you — is the whole subject of this piece. It is more common than the dramatic hallucination, harder to catch, and for a regulated institution it is the kind of error that actually reaches a customer. It deserves a name.

Two kinds of accuracy

We tend to talk about AI accuracy as if it were one thing: the answer is either right or it isn't. That framing hides the distinction that matters most to a bank or an insurer.

There is general accuracy — whether an answer is correct as a statement about the world. What is a structured deposit? How does a participating policy work? What does "cooling-off period" mean? Public AI assistants are now genuinely good at this. They have read more textbooks, circulars, and explainers than any single employee will, and on the generic version of almost any financial-services question they will give a clean, well-organised, broadly correct answer.

Then there is contextual accuracy — whether the answer is correct about the specific thing the person is actually asking about. Not "what is a structured deposit" but "what does this institution's structured deposit cost today." Not "how does cooling-off work" but "how long is the window on this policy, this week." This is a different question, and being excellent at the first kind of accuracy does not make a system good at the second. Often it makes the failure worse, because the fluent generic answer arrives wearing all the confidence of the correct one.

So, a definition, stated plainly so it can be quoted:

Contextual accuracy is whether an AI answer is correct about your specific case — your current price, your live policy, your actual process, your latest facts — not merely correct in general.

General accuracy is about the category. Contextual accuracy is about you. The gap between them is where the damage lives.

Why general correctness doesn't carry over

It would be reasonable to assume that a system this good at the general question must be at least decent at the specific one. The assumption is wrong, and the reason it's wrong is structural rather than a flaw that better models will quietly fix.

General knowledge is stable and widely repeated. The definition of a participating policy is the same this year as last, written down in thousands of places, so a model trained on the public web absorbs it thoroughly. Your specifics are the opposite. Your current tariff, your live policy wording, your latest process change, the product you retired last quarter — these are narrow, they change, and many of them were never in the training data in a current form to begin with. The model has no reliable, up-to-date picture of your particular case, so when asked about it the system does the thing it was built to do: it produces the most plausible-sounding answer, which is the generic answer dressed in specific-sounding detail.

That is the trap. A wholly wrong answer often signals its own wrongness — it reads oddly, or the assistant hedges. A contextually wrong answer does not, because the scaffolding is genuinely correct. The mechanism is described accurately; only the figure, the date, the clause, or the channel that applies to you is off. It is right enough to be trusted and wrong exactly where trusting it costs something.

What contextual drift actually looks like

The failures aren't exotic, which is precisely why they slip past. Across its work in the sector, Lawnise has observed the same few shapes recurring — each one a case where the general answer was fine and the contextual one was not.

A product that the institution discontinued is still described as available, with the assistant cheerfully explaining the general way such products work and how to apply: accurate about the category, wrong that this firm still offers it.

A current product is explained correctly in principle but quoted at a stale price, an old fee, or a lock-in that no longer matches the live tariff: the mechanics right, the number that applies to this customer wrong.

A process is described the way the process generally works rather than the way this institution actually runs it: a cooling-off window stated at the textbook length rather than the one the firm publishes, a complaint timeline more generous than the firm's own written commitment, a fraud-reporting route given as the generic best practice rather than this firm's live channel.

And the quiet tell beneath all of them: ask several assistants the same specific question and you often get several different specifics back, none of them quite the institution's own. When the general framing is shared but the particulars diverge, the divergence is the contextual error surfacing.
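The "quiet tell" described above can be illustrated with a small sketch. It is not Lawnise's tooling: the function name, the assistant labels, and every value below are hypothetical, and it assumes the specific (here a cooling-off window) has already been extracted from each assistant's answer as a short string.

```python
from collections import Counter


def divergence_report(extracted, approved):
    """Summarise how several assistants answered one specific question.

    `extracted` maps a (hypothetical) assistant label to the specific it gave;
    `approved` is the institution's own current value for the same specific.
    """
    # Normalise case and whitespace so "14 Days " and "14 days" compare equal.
    normalised = {name: value.strip().lower() for name, value in extracted.items()}
    target = approved.strip().lower()
    counts = Counter(normalised.values())
    return {
        "distinct_values": sorted(counts),
        "diverges": len(counts) > 1,
        "matches_approved": sorted(
            name for name, value in normalised.items() if value == target
        ),
    }


# Illustrative values only; none of these figures come from a real policy.
answers = {
    "assistant_a": "14 days",
    "assistant_b": "30 days",
    "assistant_c": "14 Days ",
}
report = divergence_report(answers, approved="21 days")
print(report)
```

In this made-up case the assistants give two different windows and none matches the approved one, which is the pattern the paragraph describes: a shared general framing with particulars that diverge.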

None of these are the system failing at AI. They are the system succeeding at the general question and being asked to answer a contextual one it has no current basis to answer.

Where this is documented

This is not a thought experiment, and the place it stops being abstract is the barometer. Lawnise runs public barometer pages that ask public AI assistants the real, high-intent questions a customer would ask about a specific scope — and then check each answer against the institution's own approved, current facts. The pattern this piece names is what those pages document, scope by scope.

The ASEAN AI accuracy barometer hub is the entry point. Beneath it, four scope barometers record the same contextual-accuracy gap in four different regulated settings: Singapore banking, Singapore insurance, Malaysia banking, and Malaysia insurance. Read them as the evidence that contextual drift is real and recurring rather than occasional — they are where the pattern is observed and logged, scope by scope, against current institutional facts.

What they show, in the language of this piece, is consistent: the general answers tend to be fine. The contextual ones are where the assistants and the institution's own truth keep failing to line up.
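The check this section describes, comparing the specifics in an assistant's answer against an institution's approved, current facts, might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the barometer methodology: the `Finding` record, the field names, and all values are hypothetical, and the claims are assumed to have been extracted from the answer beforehand.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    field: str
    approved: str
    claimed: Optional[str]
    status: str  # "match", "mismatch", or "not_stated"


def check_against_approved(claims: dict, approved: dict) -> list:
    """Compare each approved fact with what the assistant claimed for it."""
    findings = []
    for field, truth in approved.items():
        claimed = claims.get(field)
        if claimed is None:
            status = "not_stated"
        elif claimed.strip().lower() == truth.strip().lower():
            status = "match"
        else:
            status = "mismatch"
        findings.append(Finding(field, truth, claimed, status))
    return findings


# Hypothetical approved facts and extracted claims, for illustration only.
approved_facts = {
    "product_status": "discontinued",
    "cooling_off_window": "21 days",
    "complaint_response_time": "10 business days",
}
assistant_claims = {
    "product_status": "available",
    "cooling_off_window": "21 days",
}

findings = check_against_approved(assistant_claims, approved_facts)
statuses = {f.field: f.status for f in findings}
print(statuses)
```

Keeping the approved value alongside the claimed one in each record means the result doubles as a simple evidence entry: it shows what was said, what was true at the time, and where the two failed to line up.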

Why this is the accuracy that matters

For a regulated institution, contextual accuracy is not the interesting kind of accuracy. It is the only kind that touches a customer.

No one is harmed by a correct textbook definition. People are harmed by a confidently wrong specific — a price they budget around, a deadline they plan to, an eligibility rule they rely on, a safety step they follow in a hurry. Every one of those is a contextual claim, and every one of them travels in the institution's name even though the institution didn't write it and can't see it. The customer asking does not parse the distinction between general and contextual accuracy; they asked about their bank or their policy and they received an answer that sounded like the institution speaking. When the specific is wrong, the institution is who they hold to it.

This is also why the reassuring industry line — that the models keep getting better — misses the point. They are getting better, mostly at general accuracy, the kind that was already strong. Contextual accuracy doesn't improve the same way, because no general-purpose model can stay current on every product, price, clause, and process of every regulated firm in every market. The gap isn't a temporary roughness on the way to perfect. It is a structural feature of asking a system built on broad public knowledge a narrow, current, institution-specific question.

This connects directly to the governance argument we've made elsewhere: that AI answer accuracy is becoming a governance issue for financial institutions, because responsibility for what's said in your name sits with you even when control does not. Contextual accuracy is the precise place where that gap opens. It is the part of the answer whose truth the institution actually owns, and the part a general-purpose assistant is least equipped to get right.

The shape of the exposure

It is worth being careful about what is and isn't being claimed, because overstating it helps no one. No regulator has ruled that an institution is answerable for every word a third-party model generates about it, and this piece doesn't claim one has. The point is narrower and more durable.

It is simply that the accuracy worth governing is contextual accuracy, not general accuracy, and that the two are easy to confuse precisely because public assistants are now so fluent at the general kind. An institution that reassures itself by noting how good these systems have become is measuring the wrong accuracy. The question is never whether an assistant understands insurance or banking in general. It is whether, asked about this institution, this product, this policy, this week, the answer holds. That is the accuracy a customer acts on, and it is the accuracy about which a board or supervisor may ask whether anyone was watching.

The institutions that handle this well won't be the ones waiting for general-purpose models to somehow become current on their specifics. That isn't on offer. They'll be the ones that learned to tell the two accuracies apart — and started treating contextual accuracy as something they are responsible for knowing, checking, and recording.

Right in general, wrong about you. Naming that gap is the first step to governing it. If contextual accuracy is something your institution wants to understand for its own scope, we're glad to talk.

How to cite this

Short form
Lawnise Research & Editorial team. (2026). AI Contextual Accuracy: Why AI Can Be Right in General and Wrong About You. Lawnise. https://www.lawnise.com/research/ai-contextual-accuracy
Long form (APA)
Lawnise Research & Editorial team. (2026, June 21). AI Contextual Accuracy: Why AI Can Be Right in General and Wrong About You (Methodology v1.1). Lawnise. https://www.lawnise.com/research/ai-contextual-accuracy
BibTeX
@misc{lawnise2026aicontextualaccuracy,
  author = {Lawnise Research and Editorial team},
  title = {AI Contextual Accuracy: Why AI Can Be Right in General and Wrong About You},
  year = {2026},
  publisher = {Lawnise},
  url = {https://www.lawnise.com/research/ai-contextual-accuracy}
}

References

  1. Lawnise Methodology (v1.1). Contextual accuracy is the gap between answers that are correct in general and answers that are correct about a specific institution, product, policy, price, or process. Lawnise uses this methodology to explain why public AI answers must be checked against current institutional facts. https://www.lawnise.com/trust-index/methodology/v1#main

About Lawnise

Lawnise is an independent AI verification platform for regulated financial institutions. We monitor and verify what public AI systems say about banks, insurers and other regulated brands, preserving the evidence trail needed to manage AI accuracy risk as a governance discipline.
