Shared Memory
SharedMemory (mtf/memory.py) is a plain Python object that holds a list of MemoryEntry
objects. It is passed by reference to every phase, agent, and the debate engine, and it is
the single shared state accumulator for the entire pipeline run.
Data model
class MemoryEntry:
    kind: MemoryKind  # canonical tag (see table below)
    content: str      # the stored text
    metadata: dict    # arbitrary key-value pairs (agent_id, phase, filename, …)
SharedMemory.add(kind, content, **metadata) appends a new entry.
SharedMemory.filter(*kinds) returns entries matching any of the specified kinds.
SharedMemory.format_context(*kinds) produces the prompt-ready context block (see below).
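The add/filter behaviour described above can be sketched as follows. This is a minimal illustration, not the code in mtf/memory.py: the dataclass decorator, the `entries` attribute name, the three-member enum subset, and the example filename are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryKind(Enum):
    # Illustrative subset only; the real enum defines many more kinds.
    PHENOMENON = "PHENOMENON"
    IMAGE_DATA = "IMAGE_DATA"
    USER_FEEDBACK = "USER_FEEDBACK"


@dataclass
class MemoryEntry:
    kind: MemoryKind
    content: str
    metadata: dict = field(default_factory=dict)


class SharedMemory:
    def __init__(self) -> None:
        # Hypothetical attribute name; entries are kept in insertion order.
        self.entries: list[MemoryEntry] = []

    def add(self, kind: MemoryKind, content: str, **metadata) -> None:
        # Keyword arguments become the entry's metadata dict.
        self.entries.append(MemoryEntry(kind, content, dict(metadata)))

    def filter(self, *kinds: MemoryKind) -> list[MemoryEntry]:
        # Return entries whose kind matches any requested kind, in order.
        return [e for e in self.entries if e.kind in kinds]


memory = SharedMemory()
memory.add(MemoryKind.PHENOMENON, "We observe a sharp resistance drop...")
memory.add(MemoryKind.IMAGE_DATA, "## Image Type\nLine graph", filename="rt_curve.png")
memory.add(MemoryKind.USER_FEEDBACK, "Increase the temperature range.", phase="literature")

selected = memory.filter(MemoryKind.IMAGE_DATA, MemoryKind.USER_FEEDBACK)
print([e.kind.name for e in selected])
```

Because the object is shared by reference, any phase that calls add() makes the new entry visible to every later reader without copying.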
MemoryKind values
| Kind | Written by | What it contains |
|---|---|---|
| | | Quantitative digest of one image/PDF, or the cross-file synthesis when multiple files are provided. Each entry carries |
| | | Full structured report from one literature agent instance: relevant papers, ranked hypotheses with basis/verification/failure-mode classification, key equations. |
| | | Synthesis produced at the end of each phase’s debate call. Carries |
| | | Free-text guidance entered by the user when rejecting a debate round. Read by literature and fitting agents in subsequent rounds. |
| | Literature phase | Each approved hypothesis extracted from the literature synthesis (one entry per hypothesis line). Passed to the fitting phase as the list to iterate over. |
| | | The |
| | | Full review report from one reviewer agent instance, including per-hypothesis SUPPORTED/PLAUSIBLE/SPECULATIVE/REJECTED verdicts with check IDs cited. |
| | Literature phase (GPD) | Physics convention defaults for one subfield, returned by GPD |
| | Fitting phase, literature phase, | Structured check results from GPD verification tools. Written by: |
| | Fitting phase (GPD) | Pre-dispatch pitfall warnings combining domain |
| | Literature phase (GPD) | Cross-session convention-pitfall patterns pre-fetched before the first literature fan-out via |
| | | Audit trail of auto-detected physics domains when |
| | Review phase ( | Synthesized list of proposed new measurements, ordered by discriminating power. Written directly by the review phase after calling |
| | | Summary of data and model items parsed from complex user-supplied input by the tool-builder agent. |
| | | Qualitative hypothesis evaluation report produced when |
| | Qualitative phase | Flag entry written when the fitting phase is skipped. Content: |
| | | Original user phenomenon text, written once at run start before any phase. Guards against double-write. Never overwritten. |
| | | Fabrication/integrity warnings from post-exec checks in |
Context injection
Each concrete agent specifies which MemoryKind values it needs by passing extra_kinds to
BaseAgent._query(). Only the requested kinds are included in the context block prepended
to the agent’s prompt.
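A rough sketch of this selection step is shown below. The `_query()` signature, the `(kind, content)` pair representation of memory, and the exact set of kinds requested by FittingAgent are assumptions for illustration; only the idea that each concrete agent passes its own extra_kinds comes from the text above.

```python
from enum import Enum, auto


class MemoryKind(Enum):
    # Illustrative subset of kinds.
    PHENOMENON = auto()
    IMAGE_DATA = auto()
    LITERATURE = auto()
    USER_FEEDBACK = auto()


class BaseAgent:
    def __init__(self, entries):
        # entries: list of (MemoryKind, content) pairs standing in for SharedMemory.
        self.entries = entries

    def _query(self, task, extra_kinds=()):
        # Keep only the kinds the concrete agent asked for, preserving order.
        wanted = [(k, c) for k, c in self.entries if k in extra_kinds]
        context = "\n".join(f"[{k.name}] {c}" for k, c in wanted)
        # A real agent would send this prompt to the model; here it is returned.
        return f"{context}\n\nTask: {task}"


class FittingAgent(BaseAgent):
    def run(self, task):
        # Hypothetical selection: this agent requests two kinds only.
        return self._query(
            task, extra_kinds=(MemoryKind.IMAGE_DATA, MemoryKind.USER_FEEDBACK)
        )


entries = [
    (MemoryKind.PHENOMENON, "We observe a sharp resistance drop..."),
    (MemoryKind.IMAGE_DATA, "## Image Type: line graph"),
    (MemoryKind.LITERATURE, "Report from literature agent 1"),
    (MemoryKind.USER_FEEDBACK, "Increase the temperature range."),
]
prompt = FittingAgent(entries).run("Fit the data")
print(prompt)
```

Entries of kinds that were not requested (here PHENOMENON and LITERATURE) never reach the prompt, which keeps each agent's context small.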
Context format
SharedMemory.format_context(*kinds) produces:
=== SHARED CONTEXT ===
--- INDEX ---
[1] [USER_FEEDBACK] Increase the temperature range in your search...
[2] [IMAGE_DATA] ## Image Type\nLine graph …
[3] [CONVENTIONS] {"subfield": "condensed_matter", ...
[4] [PHENOMENON] We observe a sharp resistance drop...
--- FULL ENTRIES BELOW ---
[USER_FEEDBACK] Increase the temperature range in your search.
[IMAGE_DATA] ## Image Type\nLine graph …\n## Axes and Units …
[CONVENTIONS] {"subfield": "condensed_matter", "metric_signature": "+---", …}
[PHENOMENON] We observe a sharp resistance drop to zero at 92 K…
=== END CONTEXT ===
The index is prepended automatically when more than 3 entries are present, giving agents a navigable table of contents. _format_index(entries) is the private helper; format_index() is its public thin wrapper.
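The index rule can be sketched like this. It is an approximation under stated assumptions: entries are plain `(kind, content)` pairs, the preview length `PREVIEW_CHARS` is invented for the example (the real truncation length is not documented here), and the helper bodies are not taken from the actual implementation.

```python
INDEX_THRESHOLD = 3  # the index appears only when MORE than 3 entries are present
PREVIEW_CHARS = 50   # assumption: the real preview length may differ


def format_index(entries):
    # One numbered line per entry: position, kind tag, and a short preview.
    lines = ["--- INDEX ---"]
    for i, (kind, content) in enumerate(entries, start=1):
        preview = content[:PREVIEW_CHARS].replace("\n", "\\n")
        suffix = "..." if len(content) > PREVIEW_CHARS else ""
        lines.append(f"[{i}] [{kind}] {preview}{suffix}")
    lines.append("--- FULL ENTRIES BELOW ---")
    return "\n".join(lines)


def format_context(entries):
    parts = ["=== SHARED CONTEXT ==="]
    if len(entries) > INDEX_THRESHOLD:
        parts.append(format_index(entries))
    parts.extend(f"[{kind}] {content}" for kind, content in entries)
    parts.append("=== END CONTEXT ===")
    return "\n".join(parts)


three = [
    ("USER_FEEDBACK", "Increase the temperature range in your search."),
    ("IMAGE_DATA", "## Image Type\nLine graph"),
    ("CONVENTIONS", '{"subfield": "condensed_matter"}'),
]
four = three + [("PHENOMENON", "We observe a sharp resistance drop to zero at 92 K")]

short_block = format_context(three)
long_block = format_context(four)
print(long_block)
```

With three entries the block contains only the full entries; adding a fourth triggers the index section ahead of them.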
BaseAgent._build_prompt() prepends this block:
=== SHARED CONTEXT ===
…
=== END CONTEXT ===
Task: Investigate the following experimental phenomenon …
After the context block and task text, _build_prompt() always appends an honesty-enforcement reminder (_HONESTY_REMINDER). When CONVENTIONS entries are present, a convention-lock reminder (_CONVENTION_REMINDER) is also appended.
Which kinds each agent reads
Agent |
|
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
DebateEngine.synthesize() always calls memory.format_context() with no arguments, so it
receives all entries regardless of kind. It also explicitly appends CONVENTIONS and
PHYSICS_VERDICT entries to the user content block.
MemoryKind.PHENOMENON is always present in memory after orchestrator start. It is not listed in any agent’s extra_kinds, so it reaches a prompt only when format_context() is called with no arguments (as DebateEngine.synthesize() does, receiving all kinds) or when a requested set of kinds happens to cover all kinds. Its primary role is as an audit anchor and context for the debate synthesis, not as a per-agent prompt injection.
Why IMAGE_DATA is included in every agent’s context
LiteratureAgent, FittingAgent, and ReviewerAgent all include IMAGE_DATA in their
extra_kinds. This means extracted numerical data from user-supplied plots (axis values,
data series as Python lists, peak positions, slopes, error bars) is automatically visible
to every agent that performs analysis — no explicit passing of data is required.
Accumulation order
Entries accumulate in chronological order within a single pipeline run:
[PHENOMENON]          one entry at run start                  (orchestrator init)
[IMAGE_DATA]          per file, then cross-file synthesis     (phase 0)
[CONVENTIONS]         per domain                              (start of phase 1)
[LITERATURE]          N entries, one per lit agent            (phase 1 round 1 …)
[USER_FEEDBACK]       0 or more, one per rejection            (phase 1)
[DEBATE]              one per round (phase="literature")      (phase 1)
[HYPOTHESIS]          one per approved hypothesis line        (phase 1 approval)
# Fitting path (default):
  [FITTING_WARNINGS]  0 or more, per domain × hypothesis      (phase 2 pre-dispatch)
  [PHYSICS_VERDICT]   0 or more (convention check pre-exec)   (phase 2 fit)
  [FIT_RESULT]        M × N_hypotheses entries                (phase 2)
  [INTEGRITY_WARNING] 0 or more (if fabrication detected)     (phase 2 fit)
  [PHYSICS_VERDICT]   0 or more (checks 5.1 + 5.3)            (phase 2 post-fit)
  [DEBATE]            one (phase="fitting")                   (phase 2)
  [PHYSICS_VERDICT]   0 or more (dimensional postscript)      (phase 2 debate)
# Qualitative path (--no-fitting):
  [QUALITATIVE_EVAL]  N entries, one per eval agent           (phase 2)
  [FITTING_SKIPPED]   one flag entry                          (phase 2)
  [DEBATE]            one (phase="qualitative")               (phase 2)
[REVIEW]              K entries, one per reviewer             (phase 3)
[PHYSICS_VERDICT]     0 or more (run_check per hypothesis)    (phase 3)
[DEBATE]              one (phase="review")                    (phase 3)
[PHYSICS_VERDICT]     0 or more (dimensional postscript)      (phase 3 debate)
[PROPOSALS]           one (proposal synthesis)                (phase 3)
Thread safety
SharedMemory contains no locks. It is safe under asyncio’s single-threaded event loop
for the main pipeline, where only one coroutine runs at a time.
The GUI (StreamlitInterface) runs the orchestrator in a separate daemon thread with its own
event loop. Communication between the orchestrator thread and the Streamlit UI thread goes
through queue.Queue pairs. SharedMemory is never read or written from within the Streamlit
thread, so no cross-thread access occurs.
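The threading arrangement can be sketched with the standard library alone. This is not the StreamlitInterface code: the function names, the `None` shutdown sentinel, and the use of a plain list in place of SharedMemory are assumptions made to keep the example self-contained.

```python
import asyncio
import queue
import threading


async def pipeline(commands: queue.Queue, events: queue.Queue) -> None:
    memory = []  # stands in for SharedMemory; mutated only on the orchestrator thread
    loop = asyncio.get_running_loop()
    while True:
        # Wait for the blocking queue in an executor so the event loop stays free.
        command = await loop.run_in_executor(None, commands.get)
        if command is None:  # hypothetical shutdown sentinel
            break
        memory.append(command)
        # The UI thread receives results as messages, never the memory object.
        events.put(f"stored {len(memory)} entries")


def run_orchestrator(commands: queue.Queue, events: queue.Queue) -> None:
    # asyncio.run() creates a fresh event loop owned by this thread.
    asyncio.run(pipeline(commands, events))


commands, events = queue.Queue(), queue.Queue()
worker = threading.Thread(
    target=run_orchestrator, args=(commands, events), daemon=True
)
worker.start()

# The "UI" side: send a command, read the reply, then request shutdown.
commands.put("user feedback text")
first_event = events.get(timeout=5)
commands.put(None)
worker.join(timeout=5)
print(first_event)
```

Since only messages cross the thread boundary, the lock-free memory object needs no extra synchronization in this arrangement.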