Context Clashes
How phrasing and formatting unintentionally shift LLM behavior
Large Language Models often behave differently depending on how a user phrases, frames, or structures a message — even when the underlying intent is identical.
These unintended shifts, called context clashes, occur when seemingly minor changes to the input push the model into a different internal reasoning trajectory.
What Are Context Clashes?
A context clash occurs when:
- the task stays the same,
- but the model’s behavior changes, because the surrounding linguistic or conversational context shifts the model into a different internal interpretation.
These shifts can be caused not only by differences in the prompt itself, but also by the preceding conversation, latent assumptions, and contextual cues the model has accumulated.
This phenomenon lies at the intersection of ambiguity, context grounding, and robustness, and can arise from a broad range of variations, including the following (a minimal probing sketch follows the two lists):
Prompt-related sources
- removing or reshuffling constraints
- stylistic changes (formal ↔ informal, neutral ↔ directive)
- paraphrasing or shortening instructions
- changes in question structure
- switching from explicit to implicit intent
- over-explaining or under-explaining details
- adding or removing reasoning cues (“step-by-step”, “briefly”, “explain like…”)
Context-related sources
- differences in conversation history
- earlier turns that shift the model’s stance or expectations
- subtle framing from user tone or prior messages
- implicit assumptions carried over from previous interactions
- conflicting or weak grounding signals
- accumulation of unintended context drift over multiple turns
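To make these variation sources concrete, here is a minimal sketch, in Python, of how semantically equivalent probes could be organized: several phrasings of one hypothetical task crossed with different conversation histories. The `ProbeCase` class, the summarization task, and all variant texts are illustrative assumptions, not part of any established benchmark.

```python
from dataclasses import dataclass, field


@dataclass
class ProbeCase:
    """One task expressed through several surface forms and contexts."""
    task_id: str
    # Prompt-level variants: same intent, different phrasing and structure.
    prompt_variants: dict[str, str] = field(default_factory=dict)
    # Context-level variants: same final user turn, different preceding history.
    history_variants: dict[str, list] = field(default_factory=dict)


case = ProbeCase(
    task_id="summarize_report",
    prompt_variants={
        "directive":     "Summarize the attached report in three bullet points.",
        "polite":        "Could you briefly summarize the attached report for me?",
        "reasoning_cue": "Think step by step, then summarize the attached report.",
    },
    history_variants={
        "neutral": [],
        "primed": [
            {"role": "user", "content": "I prefer very short, blunt answers."},
            {"role": "assistant", "content": "Understood, I will keep my answers brief."},
        ],
    },
)

# Every (prompt variant, history variant) pair expresses the same task, so any
# systematic difference in model behavior across pairs is a candidate context clash.
probes = [
    (prompt_name, history_name, history + [{"role": "user", "content": prompt}])
    for prompt_name, prompt in case.prompt_variants.items()
    for history_name, history in case.history_variants.items()
]
```

Running every probe through the same model under identical decoding settings keeps the task fixed, so any remaining behavioral differences can be attributed to phrasing or accumulated context rather than to the task itself.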
Research Questions
- When do phrasing changes lead to unintended shifts in model behavior?
- How do different formulations occupy different regions of the model’s latent space?
- How can we detect context clashes through logits, entropy, and temperature comparisons? (A minimal detection sketch follows this list.)
- How can models be made more robust to these unintended contextual drifts?
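For the detection question in particular, the sketch below is one possible starting signal rather than a validated detector. It assumes a HuggingFace causal LM (`gpt2` is only a stand-in) and compares the next-token distributions produced by two phrasings of the same task via entropy and a symmetric KL divergence; the variant texts are hypothetical.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM with a compatible tokenizer works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()


def next_token_dist(prompt: str, temperature: float = 1.0) -> torch.Tensor:
    """Probability distribution over the next token for a given prompt."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1] / temperature
    return torch.softmax(logits, dim=-1)


def entropy(p: torch.Tensor) -> float:
    """Shannon entropy of a distribution (in nats)."""
    return float(-(p * torch.log(p + 1e-12)).sum())


def symmetric_kl(p: torch.Tensor, q: torch.Tensor) -> float:
    """KL(p||q) + KL(q||p), a simple divergence between two variants."""
    log_p, log_q = torch.log(p + 1e-12), torch.log(q + 1e-12)
    return float((p * (log_p - log_q)).sum() + (q * (log_q - log_p)).sum())


# Two phrasings of the same task; a large divergence or entropy gap between
# them is a candidate signal that the phrasing, not the task, drives behavior.
variant_a = "Summarize the report in three bullet points:"
variant_b = "Could you briefly summarize the report?"

p, q = next_token_dist(variant_a), next_token_dist(variant_b)
print(f"entropy A: {entropy(p):.3f} nats, entropy B: {entropy(q):.3f} nats")
print(f"symmetric KL between variants: {symmetric_kl(p, q):.3f}")
```

Varying the `temperature` argument sharpens or flattens both distributions before comparison, which gives a rough sense of how sensitive the divergence is to decoding settings.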