How Do LLMs Acquire New Knowledge?

A Knowledge Circuits Perspective on Continual Pre-Training

Zhejiang University · Huawei Noah's Ark Lab · National University of Singapore, NUS-NCS Joint Lab, Singapore · Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies, China
*Corresponding Author

Abstract

Despite their exceptional capabilities in knowledge-intensive tasks, a critical gap remains in understanding how Large Language Models (LLMs) internalize new knowledge, particularly how they structurally embed acquired knowledge in their neural computations. We address this issue through the lens of Knowledge Circuits Evolution, identifying computational subgraphs that facilitate knowledge storage and processing. Our systematic analysis of circuit evolution throughout continual pre-training reveals several key findings: (1) the acquisition of new knowledge is influenced by its relevance to pre-existing knowledge; (2) the evolution of knowledge circuits exhibits a distinct phase shift from formation to optimization; (3) the evolution of knowledge circuits follows a deep-to-shallow pattern. These insights not only advance our theoretical understanding of the mechanisms of new knowledge acquisition in LLMs, but also provide potential implications for improving continual pre-training strategies to enhance model performance.

Overview

Figure 1: Illustration of our findings: Phase shift from formation to optimization in the evolution of knowledge circuits, each phase characterized by distinct features at the performance, topology, and component levels.



In this paper, we investigate the mechanism of new knowledge acquisition in LLMs from the perspective of knowledge circuits. By analyzing the evolution of knowledge circuits throughout continual pre-training, we uncover several interesting findings, as illustrated in Figure 1. The key findings of the paper are summarized as follows:

  • (Performance-level) The acquisition of new knowledge is significantly influenced by its relevance to pre-existing knowledge, with relevant new knowledge being integrated more efficiently than completely new knowledge.
  • (Topology-level) In the process of knowledge acquisition, the evolution of knowledge circuits exhibits a distinct phase shift from formation to optimization, each phase marked by unique structural and behavioral characteristics.
  • (Components-level) The evolution of knowledge circuits follows a deep-to-shallow pattern, where mid-to-deeper layers first develop the extraction function, and later, lower layers enrich their knowledge representations.




Performance-level Analysis

Figure 2: Hit@10 performance of the knowledge circuits in GPT-2 Small, GPT-2 Medium, and Phi-1.5 throughout training. Left: Performance of circuits discovered for different types of knowledge, where K_rel and K_compl represent relevant new knowledge and completely new knowledge, respectively. Right: Performance of circuits discovered for different frequencies of knowledge, where Low-freq, Medium-freq, and High-freq represent knowledge with frequencies in the ranges [1, 2), [2, 5], and (5, 27], respectively. Note that we smooth the curves using a window size of 3 epochs for all settings.



The results depicted in Figure 2 reveal a consistent growth pattern in the Hit@10 metric until it approaches its upper bound, which demonstrates the sustained knowledge acquisition capability of knowledge circuits throughout continual pre-training. Notably, the K_rel performance curve consistently lies above the curve for K_compl, suggesting that LLMs learn more efficiently when assimilating knowledge extensions within existing conceptual frameworks than when acquiring completely new knowledge.
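For concreteness, below is a minimal sketch of how a Hit@10-style check could be computed for a single knowledge probe. The tensor layout of the circuit's output logits and the `run_circuit` helper in the usage comment are assumptions for illustration, not the paper's implementation.

```python
import torch

def hit_at_k(logits: torch.Tensor, target_id: int, k: int = 10) -> bool:
    """Return True if the target token is among the top-k next-token predictions.

    `logits` is assumed to be the circuit's output logits for the final
    position of the prompt, with shape [vocab_size].
    """
    topk_ids = torch.topk(logits, k).indices
    return target_id in topk_ids.tolist()

# Hypothetical usage: average Hit@10 over a set of (prompt, target) probes.
# hits = [hit_at_k(run_circuit(prompt), target_id) for prompt, target_id in probes]
# hit_at_10 = sum(hits) / len(hits)
```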


Takeaway: Knowledge Relevance Principle

The acquisition of new knowledge is influenced by its relevance to pre-existing knowledge. LLMs exhibit learning efficiency advantages when acquiring relevant new knowledge versus completely new knowledge.

This insight could motivate the utilization of data curriculums in continual pre-training, by organizing the data in a way that mimics the structure and distribution of the original corpus, thereby enabling the model to integrate new information more efficiently.



Topology-level Analysis

Structural Consistency and Topological Centralization


Figure 3: Top: Edges Jaccard Similarity of intermediate knowledge circuits with the circuits at the final checkpoint. Bottom: Knowledge Circuit Entropy of knowledge circuits throughout training. K_rel and K_compl represent relevant new knowledge and completely new knowledge, respectively. Low-freq, Medium-freq, and High-freq represent knowledge with frequencies in the ranges [1, 2), [2, 5], and (5, 27], respectively.



We first quantify the structural consistency of knowledge circuits by measuring the Jaccard Similarity between the edge sets of the circuits at intermediate checkpoints and that of the final circuit (Figure 3, Top). The metric exhibits a consistent monotonic upward trend throughout training, indicating that the intermediate knowledge circuits become increasingly similar to the final circuit. This convergence pattern suggests an evolutionary process in which knowledge circuits progressively stabilize their core architecture as knowledge acquisition progresses.
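As a small illustration, the Jaccard Similarity between two circuits' edge sets could be computed as below; representing edges as (source, target) node pairs is an assumption about the data structure, not a detail taken from the paper.

```python
def edge_jaccard(edges_a: set[tuple[str, str]], edges_b: set[tuple[str, str]]) -> float:
    """Jaccard similarity between two circuits' edge sets.

    Edges are represented here as (source_node, target_node) pairs; this
    representation is an assumption.
    """
    if not edges_a and not edges_b:
        return 1.0
    return len(edges_a & edges_b) / len(edges_a | edges_b)

# Hypothetical usage: similarity of each intermediate checkpoint to the final one.
# similarities = [edge_jaccard(edges_at[t], edges_at[final]) for t in checkpoints]
```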
Our results in Figure 3 (Bottom) show a steady downward trend in the knowledge circuit entropy of the edges in the subgraph across all models, suggesting that the identified knowledge circuits become increasingly centralized, with the importance of critical edges growing as knowledge acquisition progresses. We also observe that the downward trend of the knowledge circuit entropy slows down significantly after a certain turning point during the training of all models. We attribute this phenomenon to a phase shift in the evolution of knowledge circuits across continual pre-training.
In the initial formation phase, less efficient knowledge circuits gradually take shape within the models, resulting in a rapid decrease in circuit entropy. At the phase shift point, the knowledge circuits reach a stable state in which the most critical nodes and edges are already involved. In the subsequent optimization phase, the topology composed of these critical nodes and edges remains largely stable, while the computations within these components are optimized to represent and retrieve knowledge more efficiently, leading to a slowdown in the rate of decrease in circuit entropy.
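One plausible way to operationalize a circuit entropy of this kind is the Shannon entropy of normalized edge-importance scores, sketched below; the exact scoring and normalization used in the paper may differ, so treat this as an assumption.

```python
import numpy as np

def circuit_entropy(edge_scores: np.ndarray) -> float:
    """Shannon entropy of normalized edge-importance scores.

    A lower value means importance is concentrated on a few critical edges,
    i.e. the circuit is more centralized. Taking absolute scores and this
    particular normalization are assumptions, not the paper's definition.
    """
    weights = np.abs(edge_scores)
    probs = weights / weights.sum()
    probs = probs[probs > 0]  # drop zero-probability edges to avoid log(0)
    return float(-(probs * np.log(probs)).sum())
```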


Aligning Topology with Specific Knowledge Circuits


Figure 4: Hit@10 performance of aligned knowledge circuits in GPT-2 Small throughout training. Init, Before, After, and Last represent the circuits whose topologies align with those at the initial checkpoint, the checkpoint before the phase shift, the checkpoint after the phase shift, and the final checkpoint, respectively. Original represents the original knowledge circuits at each checkpoint. Note that we smooth the curves using a window size of 3 epochs.



To clarify the influence of the topology of knowledge circuits on performance, we conduct a detailed examination of the knowledge circuits at several key training checkpoints. Specifically, we focus on the knowledge circuits at the initial checkpoint, the checkpoint immediately before the phase shift point, the checkpoint immediately after the phase shift point, and the last checkpoint. We then align the topology of the knowledge circuits at every checkpoint throughout training with each of these reference topologies and evaluate the performance of the aligned circuits.
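A minimal sketch of this alignment experiment is given below, assuming a hypothetical helper `evaluate_circuit(model, edge_set, probes)` that runs a checkpoint restricted to a fixed edge set and returns its Hit@10 on the knowledge probes.

```python
def evaluate_aligned_circuits(checkpoints, reference_edges, probes, evaluate_circuit):
    """Keep the reference topology (edge set) fixed and evaluate it with the
    weights of every training checkpoint."""
    return [evaluate_circuit(model, reference_edges, probes) for model in checkpoints]

# e.g., with reference_edges taken from the circuit just after the phase shift:
# scores_after = evaluate_aligned_circuits(checkpoints, edges_after_shift, probes, evaluate_circuit)
```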
The results in Figure 4 reveal that the performance of all aligned circuits remains unchanged during the formation phase. However, each circuit begins to improve during the optimization phase, with those aligned with the post-phase-shift topologies (After and Last) ultimately performing, on average, 54% better than those aligned with the pre-phase-shift topologies (Init and Before). This observation suggests that the evolution of the circuit topology at the phase shift point plays a crucial role in improving circuit performance.



Takeaway: Biphasic Circuit Evolution

The evolution of knowledge circuits exhibits a distinct phase shift from formation to optimization, each marked by unique structural and behavioral characteristics.


This finding suggests that the state of the knowledge circuits could serve as a valuable indicator for tracking the continual pre-training process, enabling more informed adjustments to the training method or data in response to different phases.

Components-level Analysis

Specialized Nodes


Figure 5: Proportion of specialized attention heads among all nodes of the knowledge circuits throughout training for GPT-2 Small and GPT-2 Medium. Note that we smooth the curves using a window size of 3 epochs.


We track the emergence of specialized attention heads (mover heads, relation heads, and mixture heads) and their proportion among all possible nodes of the knowledge circuits throughout training, and present our results in Figure 5. We observe that during the circuit formation phase, mover heads gradually emerge from nearly zero, while the proportion of relation heads decreases until the phase shift. In the circuit optimization phase, the proportions of all kinds of attention heads stabilize. The proportion of mixture heads remains stable throughout training.
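The proportions plotted in Figure 5 could be tracked with a simple counting routine such as the sketch below; the `classify_head` helper, which labels a node as a mover, relation, or mixture head, is hypothetical and stands in for the head-classification criteria used in the paper.

```python
def head_type_proportions(circuit_nodes, classify_head):
    """Proportion of each specialized head type among all circuit nodes.

    `classify_head(node)` is a hypothetical helper returning 'mover',
    'relation', 'mixture', or None for non-specialized nodes.
    """
    if not circuit_nodes:
        return {"mover": 0.0, "relation": 0.0, "mixture": 0.0}
    counts = {"mover": 0, "relation": 0, "mixture": 0}
    for node in circuit_nodes:
        kind = classify_head(node)
        if kind in counts:
            counts[kind] += 1
    return {kind: n / len(circuit_nodes) for kind, n in counts.items()}
```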


Figure 6: Top: Layer distribution of mover heads in the knowledge circuits of GPT-2 Small throughout training. Bottom: Layer distribution of relation heads in the knowledge circuits of GPT-2 Small throughout training.


We further examine the layer-wise distribution of mover heads and relation heads within knowledge circuits throughout training. Our results in Figure 6 reveal that the increase in mover heads and the decrease in relation heads primarily occur in the mid-to-deeper layers during the circuit formation phase.

Activated Edges


Figure 7: Layer distribution of the edge activation ratio within the knowledge circuits in GPT-2 Small.


Next, we investigate how the nodes within knowledge circuits propagate information to subsequent components through the edges. Specifically, we analyze the variation in edge activation patterns across different layers of the network throughout training. We quantify the edge activation ratio for each layer as the proportion of edges originating from that layer within the knowledge circuit, relative to all possible edges originating from that layer in the whole model.
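A minimal sketch of this per-layer ratio follows, assuming edges are stored as (source, target) node pairs and that a `layer_of` mapping from node names to layer indices is available; both representations are assumptions made for illustration.

```python
from collections import Counter

def edge_activation_ratio(circuit_edges, all_edges, layer_of):
    """Per-layer ratio of circuit edges to all possible model edges, grouped
    by the layer of each edge's source node.

    Edges are assumed to be (source, target) node pairs, and `layer_of` is a
    hypothetical mapping from a node name to its layer index.
    """
    circuit_counts = Counter(layer_of(src) for src, _ in circuit_edges)
    total_counts = Counter(layer_of(src) for src, _ in all_edges)
    return {
        layer: circuit_counts.get(layer, 0) / total
        for layer, total in total_counts.items()
    }
```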
Our results in Figure 7 reveal that, during the circuit formation phase, the edge activation ratios in the lower layers gradually decrease, while those in the mid-to-deeper layers exhibit a corresponding increase. However, as training progresses, a transition occurs around the phase shift point, after which the edge activation ratios begin to stabilize.





Evolutionary Pattern of Components

During the early training phase of circuit formation, the model primarily develops the extraction function within the nodes of the mid-to-deeper layers of the knowledge circuits. This is reflected in the increased emergence of mover heads and activated edges, along with a decrease in the presence of relation heads in these layers. This process continues until the extraction function is fully established at the phase shift point, as demonstrated by the performance advantage of circuits aligned with the post-phase-shift topologies over those aligned with the pre-phase-shift topologies (Figure 4). In the subsequent training phase of circuit optimization, the focus shifts to enriching knowledge representations in the lower layers, evidenced by a stabilized topology and component structure accompanied by a rapid improvement in the performance of knowledge circuits.

Takeaway: Deep-to-Shallow Pattern

The evolution of knowledge circuits follows a deep-to-shallow pattern, where mid-to-deeper layers first develop the extraction function, and later, lower layers enrich their knowledge representations.

BibTeX

@misc{ou2025llmsacquirenewknowledge,
    title={How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training}, 
    author={Yixin Ou and Yunzhi Yao and Ningyu Zhang and Hui Jin and Jiacheng Sun and Shumin Deng and Zhenguo Li and Huajun Chen},
    year={2025},
    eprint={2502.11196},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/2502.11196}, 
}

This website is adapted from Nerfies, licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.