Context Clashes
How phrasing and formatting unintentionally shift LLM behavior
Large Language Models often behave differently depending on how a user phrases, frames, or structures a message — even when the underlying intent is identical.
These unintended shifts, called context clashes, occur when seemingly minor changes to the input push the model into a different internal reasoning trajectory.
What Are Context Clashes?
A context clash occurs when:
- the task stays the same,
- but the model’s behavior changes, because the surrounding linguistic or conversational context shifts the model into a different internal interpretation.
These shifts can be caused not only by differences in the prompt itself, but also by the preceding conversation, latent assumptions, and contextual cues the model has accumulated.
This phenomenon lies at the intersection of ambiguity, context grounding, and robustness, and can arise from a broad range of variations, including the following (a minimal probing sketch follows the two lists):
Prompt-related sources
- removing or reshuffling constraints
- stylistic changes (formal ↔ informal, neutral ↔ directive)
- paraphrasing or shortening instructions
- changes in question structure
- switching from explicit to implicit intent
- over-explaining or under-explaining details
- adding or removing reasoning cues (“step-by-step”, “briefly”, “explain like…”)
Context-related sources
- differences in conversation history
- earlier turns that shift the model’s stance or expectations
- subtle framing from user tone or prior messages
- implicit assumptions carried over from previous interactions
- conflicting or weak grounding signals
- accumulation of unintended context drift over multiple turns
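To make these variation sources concrete, here is a minimal sketch, in Python, of how semantically equivalent probes could be organized: several phrasings of one hypothetical task crossed with different conversation histories. The `ProbeCase` class, the summarization task, and all variant texts are illustrative assumptions, not part of any established benchmark.

```python
from dataclasses import dataclass, field


@dataclass
class ProbeCase:
    """One task expressed through several surface forms and contexts."""
    task_id: str
    # Prompt-level variants: same intent, different phrasing and structure.
    prompt_variants: dict[str, str] = field(default_factory=dict)
    # Context-level variants: same final user turn, different preceding history.
    history_variants: dict[str, list] = field(default_factory=dict)


case = ProbeCase(
    task_id="summarize_report",
    prompt_variants={
        "directive":     "Summarize the attached report in three bullet points.",
        "polite":        "Could you briefly summarize the attached report for me?",
        "reasoning_cue": "Think step by step, then summarize the attached report.",
    },
    history_variants={
        "neutral": [],
        "primed": [
            {"role": "user", "content": "I prefer very short, blunt answers."},
            {"role": "assistant", "content": "Understood, I will keep my answers brief."},
        ],
    },
)

# Every (prompt variant, history variant) pair expresses the same task, so any
# systematic difference in model behavior across pairs is a candidate context clash.
probes = [
    (prompt_name, history_name, history + [{"role": "user", "content": prompt}])
    for prompt_name, prompt in case.prompt_variants.items()
    for history_name, history in case.history_variants.items()
]
```

Running every probe through the same model under identical decoding settings keeps the task fixed, so any remaining behavioral differences can be attributed to phrasing or accumulated context rather than to the task itself.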
Research Questions
- When do phrasing changes lead to unintended shifts in model behavior?
- How do different formulations occupy different regions of the model’s latent space?
- How can we detect context clashes through logits, entropy, and temperature comparisons? (A minimal detection sketch follows this list.)
- How can models be made more robust to these unintended contextual drifts?
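For the detection question in particular, the sketch below is one possible starting signal rather than a validated detector. It assumes a HuggingFace causal LM (`gpt2` is only a stand-in) and compares the next-token distributions produced by two phrasings of the same task via entropy and a symmetric KL divergence; the variant texts are hypothetical.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM with a compatible tokenizer works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()


def next_token_dist(prompt: str, temperature: float = 1.0) -> torch.Tensor:
    """Probability distribution over the next token for a given prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1] / temperature
    return torch.softmax(logits, dim=-1)


def entropy(p: torch.Tensor) -> float:
    """Shannon entropy of a distribution (in nats)."""
    return float(-(p * torch.log(p + 1e-12)).sum())


def symmetric_kl(p: torch.Tensor, q: torch.Tensor) -> float:
    """KL(p||q) + KL(q||p), a simple divergence between two variants."""
    log_p, log_q = torch.log(p + 1e-12), torch.log(q + 1e-12)
    return float((p * (log_p - log_q)).sum() + (q * (log_q - log_p)).sum())


# Two phrasings of the same task; a large divergence or entropy gap between
# them is a candidate signal that the phrasing, not the task, drives behavior.
variant_a = "Summarize the report in three bullet points:"
variant_b = "Could you briefly summarize the report?"

p, q = next_token_dist(variant_a), next_token_dist(variant_b)
print(f"entropy A: {entropy(p):.3f} nats, entropy B: {entropy(q):.3f} nats")
print(f"symmetric KL between variants: {symmetric_kl(p, q):.3f}")
```

Varying the `temperature` argument sharpens or flattens both distributions before comparison, which gives a rough sense of how sensitive the divergence is to decoding settings.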