Scientists have discovered clear evidence that AI language models store memory and reasoning in separate neural pathways. The discovery could lead to more secure and transparent systems that can “forget” sensitive data without losing the ability to think.
Large language models, such as those in the GPT family, rely on two basic capabilities:
- Memorization, which allows them to recall facts, quotes, or passages from training data.
- Reasoning, which enables them to apply general principles to solve new problems.
Until now, scientists weren't sure whether these two functions were deeply intertwined or shared the same internal architecture. The new study found that the separation is surprisingly clean: rote memorization relies on narrow, specialized neural pathways, while logical reasoning and problem solving use broader, shared components. Most importantly, the researchers showed that the memory circuits could be removed with minimal impact on the model's ability to think.
In experiments on language models, the researchers ranked millions of neural weights by a property called curvature, which measures how sensitive the model's performance is to small changes in each weight. High curvature indicates flexible, general-purpose pathways; low curvature indicates narrow, specialized ones. When the researchers removed the low-curvature components – essentially disabling the “memory circuits” – the model lost 97% of its ability to recall training data but retained almost all of its reasoning ability.
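The ranking-and-ablation idea can be sketched in miniature. The toy below is not the paper's method: it uses a tiny linear model and approximates per-weight curvature with a second-order finite difference of the loss, then zeroes out the lowest-curvature weights. All names (`loss`, `w_ablated`, the choice of two ablated weights) are illustrative assumptions.

```python
import numpy as np

# Toy sketch (simplified, not the study's technique): rank each weight of a
# small linear model by estimated loss curvature, then "ablate" the
# lowest-curvature weights by setting them to zero.

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))            # inputs
w_true = rng.normal(size=8)
y = X @ w_true                          # targets
w = w_true + 0.1 * rng.normal(size=8)   # imperfectly "trained" weights

def loss(w):
    r = X @ w - y
    return float(r @ r) / len(y)        # mean squared error

# Curvature along each weight axis via a central second difference:
# (f(w+d) - 2 f(w) + f(w-d)) / eps^2 approximates the second derivative.
eps = 1e-3
curvature = np.empty_like(w)
for i in range(len(w)):
    d = np.zeros_like(w)
    d[i] = eps
    curvature[i] = (loss(w + d) - 2 * loss(w) + loss(w - d)) / eps**2

# Ablate the two lowest-curvature weights (the narrow, "specialized" ones
# in this toy) and compare the loss before and after.
low = np.argsort(curvature)[:2]
w_ablated = w.copy()
w_ablated[low] = 0.0
print("curvature per weight:", np.round(curvature, 3))
print("loss before / after ablation:", round(loss(w), 4), round(loss(w_ablated), 4))
```

In this quadratic toy, curvature reduces to the squared column norms of the data, so "low curvature" simply means directions the loss barely depends on; the real study applies the analogous idea to millions of weights in a trained network.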
One of the most unexpected discoveries was that arithmetic uses the same neural pathways as memorization, not reasoning. After the memory-related components were removed, math performance dropped dramatically, while logical problem solving remained almost unchanged.
This suggests that, for now, these models are “memorizing” math rather than calculating it, like a student reciting times tables rather than working out the answers. The finding may explain why language models often struggle with even simple arithmetic without external tools.
The research team visualized the model's internal “loss landscape” – a conceptual map showing how accurate or inaccurate the AI's predictions become as its internal settings change. Using a mathematical tool called K-FAC (Kronecker-Factored Approximate Curvature), they identified which regions of the network corresponded to memory and which to reasoning.
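A loss landscape can be probed directly: evaluate the loss as the weights move along a chosen direction. The sketch below (a simplified stand-in, not K-FAC) traces a 1-D slice of a toy model's loss and estimates the curvature along that direction; sharp directions bend the curve steeply, flat ones leave it nearly constant. The direction `d` and step grid `ts` are illustrative assumptions.

```python
import numpy as np

# Sketch: probe a toy model's loss landscape along one random unit direction
# in weight space, then estimate curvature from the slice.

rng = np.random.default_rng(1)
X = rng.normal(size=(128, 10))
w_star = rng.normal(size=10)   # weights at the loss minimum
y = X @ w_star

def loss(w):
    r = X @ w - y
    return float(r @ r) / len(y)

d = rng.normal(size=10)
d /= np.linalg.norm(d)         # unit-length probe direction

# Evaluate the loss at w_star + t*d for a grid of step sizes t.
ts = np.linspace(-1.0, 1.0, 9)
slice_losses = [loss(w_star + t * d) for t in ts]

# Curvature along d: central second difference at t = 0 (index 4 of the grid).
h = ts[1] - ts[0]
curv_along_d = (slice_losses[5] - 2 * slice_losses[4] + slice_losses[3]) / h**2
print("loss along the slice:", np.round(slice_losses, 3))
print("curvature along d:", round(curv_along_d, 3))
```

Repeating this over many directions (or, more efficiently, approximating the full curvature matrix as K-FAC does) is what lets researchers separate sharp, broadly used regions of the network from flat, specialized ones.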
Tests on multiple systems, including vision models trained on intentionally mislabeled images, confirmed the pattern: after the memorization components were removed, recall of memorized data dropped to just 3%, while reasoning tasks such as logical deduction, common-sense reasoning, and scientific reasoning remained stable at 95-106% of baseline.
Understanding these internal divisions can have profound implications for AI security and management. Models that memorize text verbatim risk leaking private information, copyrighted data, or malicious content. If engineers can selectively disable or edit memory circuits, they can build systems that retain intelligence while removing sensitive or biased data.
While the current technique does not guarantee permanent deletion, as “forgotten” data can sometimes reappear during retraining, the study is an important step towards improving AI transparency.