API Reference

Auto-generated from source docstrings.

Core#

Configuration dataclass for MTF.

class mtf.config.MTFConfig(n_literature=2, citation_verification=True, citation_verification_max=10, n_fitting=2, n_qualitative=2, n_reviewer=2, reviewer_verification_passes=1, reviewer_models=<factory>, literature_model='claude-haiku-4-5-20251001', fitting_model='claude-sonnet-4-6', reviewer_model='claude-sonnet-4-6', debate_model='claude-sonnet-4-6', image_digest_model='claude-sonnet-4-6', max_debate_rounds=3, fitting_enabled=True, fitting_scope='per_hypothesis', fitting_semaphore_limit=6, toolkit_items=<factory>, enable_gpd_mcp=True, physics_domains=<factory>, gpd_servers=<factory>, auto_detect_domains=False, gpd_domain_detection_max_domains=3, literature_plausibility_screen=True, auto_reject_physics_failures=False, fitting_convention_check=True, fitting_max_convention_retries=1, fitting_result_integrity_check=True, n_proposal=2, proposal_model='claude-sonnet-4-6', followup_model='claude-sonnet-4-6', pdf_enhanced_extraction=True, pdf_figure_extraction_max_tokens=8192, pdf_min_size_kb_for_enhanced=200)[source]#

Bases: object

Top-level configuration for an MTF run.

Parameters:

n_literature (int)
citation_verification (bool)
citation_verification_max (int)
n_fitting (int)
n_qualitative (int)
n_reviewer (int)
reviewer_verification_passes (int)
reviewer_models (list[str])
literature_model (str)
fitting_model (str)
reviewer_model (str)
debate_model (str)
image_digest_model (str)
max_debate_rounds (int)
fitting_enabled (bool)
fitting_scope (str)
fitting_semaphore_limit (int)
toolkit_items (dict[str, object])
enable_gpd_mcp (bool)
physics_domains (list[str])
gpd_servers (list[str])
auto_detect_domains (bool)
gpd_domain_detection_max_domains (int)
literature_plausibility_screen (bool)
auto_reject_physics_failures (bool)
fitting_convention_check (bool)
fitting_max_convention_retries (int)
fitting_result_integrity_check (bool)
n_proposal (int)
proposal_model (str)
followup_model (str)
pdf_enhanced_extraction (bool)
pdf_figure_extraction_max_tokens (int)
pdf_min_size_kb_for_enhanced (int)

auto_detect_domains: bool = False#

auto_reject_physics_failures: bool = False#

citation_verification: bool = True#

citation_verification_max: int = 10#

debate_model: str = 'claude-sonnet-4-6'#

enable_gpd_mcp: bool = True#

fitting_convention_check: bool = True#

fitting_enabled: bool = True#

fitting_max_convention_retries: int = 1#

fitting_model: str = 'claude-sonnet-4-6'#

fitting_result_integrity_check: bool = True#

fitting_scope: str = 'per_hypothesis'#

fitting_semaphore_limit: int = 6#

followup_model: str = 'claude-sonnet-4-6'#

gpd_domain_detection_max_domains: int = 3#

gpd_servers: list[str]#

image_digest_model: str = 'claude-sonnet-4-6'#

literature_model: str = 'claude-haiku-4-5-20251001'#

literature_plausibility_screen: bool = True#

max_debate_rounds: int = 3#

n_fitting: int = 2#

n_literature: int = 2#

n_proposal: int = 2#

n_qualitative: int = 2#

n_reviewer: int = 2#

pdf_enhanced_extraction: bool = True#

pdf_figure_extraction_max_tokens: int = 8192#

pdf_min_size_kb_for_enhanced: int = 200#

physics_domains: list[str]#

proposal_model: str = 'claude-sonnet-4-6'#

reviewer_model: str = 'claude-sonnet-4-6'#

reviewer_models: list[str]#

reviewer_verification_passes: int = 1#

toolkit_items: dict[str, object]#
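The `<factory>` defaults in the signature suggest that the list and dict fields are built per instance rather than shared. The sketch below illustrates that behaviour with a hypothetical stand-in, `ConfigSketch`, which copies only four field names from `MTFConfig`; the empty-list default for `physics_domains` is an assumption, since this reference does not show what the real factories produce.

```python
from dataclasses import dataclass, field, replace


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for MTFConfig with only four of its fields."""

    n_literature: int = 2
    max_debate_rounds: int = 3
    fitting_enabled: bool = True
    # Shown as <factory> in the signature; the empty default is an assumption.
    physics_domains: list[str] = field(default_factory=list)


default = ConfigSketch()
# Keyword overrides leave every other field at its default.
quick = ConfigSketch(n_literature=1, fitting_enabled=False)
# replace() builds a new config with selected fields changed.
longer = replace(default, max_debate_rounds=5)

# Separately constructed instances own separate lists.
quick.physics_domains.append("optics")
print(default.physics_domains, quick.physics_domains)
```

With the real class the same keyword-override pattern should apply, for example passing `fitting_enabled=False` to skip the fitting phase, though the exact effect of each flag is defined by the MTF source rather than by this sketch.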

Shared in-process memory for all agents and phases.

class mtf.memory.MemoryEntry(kind, content, metadata=<factory>)[source]#

Bases: object

Parameters:

kind (MemoryKind)
content (str)
metadata (dict[str, Any])

content: str#

kind: MemoryKind#

metadata: dict[str, Any]#

class mtf.memory.MemoryKind(value)[source]#

Bases: str, Enum

CONVENTIONS = 'conventions'#

DEBATE = 'debate'#

DOMAIN_CLASSIFICATION = 'domain_classification'#

DOMAIN_PATTERNS = 'domain_patterns'#

FITTING_SKIPPED = 'fitting_skipped'#

FITTING_WARNINGS = 'fitting_warnings'#

FIT_RESULT = 'fit_result'#

HYPOTHESIS = 'hypothesis'#

IMAGE_DATA = 'image_data'#

INTEGRITY_WARNING = 'integrity_warning'#

LITERATURE = 'literature'#

PHENOMENON = 'phenomenon'#

PHYSICS_VERDICT = 'physics_verdict'#

PROPOSALS = 'proposals'#

QUALITATIVE_EVAL = 'qualitative_eval'#

REVIEW = 'review'#

TOOLKIT_DIGEST = 'toolkit_digest'#

USER_FEEDBACK = 'user_feedback'#
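Because `MemoryKind` derives from both `str` and `Enum`, its members behave like plain strings in comparisons. A minimal sketch of that behaviour, using a hypothetical three-member subset named `KindSketch` rather than the real class:

```python
from enum import Enum


class KindSketch(str, Enum):
    """Hypothetical subset of MemoryKind with the same str/Enum bases."""

    HYPOTHESIS = "hypothesis"
    FIT_RESULT = "fit_result"
    REVIEW = "review"


# Lookup by value mirrors the documented MemoryKind(value) constructor.
kind = KindSketch("fit_result")

# Members are str instances, so they compare equal to their raw values.
is_plain_string = kind == "fit_result"
as_stored_value = kind.value
```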

class mtf.memory.SharedMemory[source]#

Bases: object

Central store passed by reference to all agents and phases.

Safe for concurrent use by coroutines, because all access happens on the single-threaded asyncio event loop.

add(kind, content, **metadata)[source]#

Return type:

None

Parameters:

kind (MemoryKind)
content (str)
metadata (Any)

filter(*kinds)[source]#

Return type: list[MemoryEntry]
Parameters: kinds (MemoryKind)

fit_results()[source]#

Return type: list[MemoryEntry]

format_context(*kinds)[source]#

Return type: str
Parameters: kinds (MemoryKind)

format_index()[source]#

Return a compact one-line-per-entry table of contents for all entries.

Return type: str

hypotheses()[source]#

Return type: list[str]
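The `add`/`filter`/`hypotheses` methods above can be pictured as operations on an append-only list of entries. The following is a sketch under that assumption, not the actual implementation: `EntrySketch` and `MemorySketch` are hypothetical names, kinds are plain strings instead of `MemoryKind` members, and the stored texts are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EntrySketch:
    """Hypothetical stand-in for MemoryEntry (kind kept as a plain string)."""

    kind: str
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


class MemorySketch:
    """Hypothetical append-only store with the documented add/filter shape."""

    def __init__(self) -> None:
        self._entries: list[EntrySketch] = []

    def add(self, kind: str, content: str, **metadata: Any) -> None:
        # Extra keyword arguments become the entry's metadata dict.
        self._entries.append(EntrySketch(kind, content, dict(metadata)))

    def filter(self, *kinds: str) -> list[EntrySketch]:
        # Keep insertion order; return only entries of the requested kinds.
        return [e for e in self._entries if e.kind in kinds]

    def hypotheses(self) -> list[str]:
        return [e.content for e in self.filter("hypothesis")]


memory = MemorySketch()
memory.add("phenomenon", "Resistance drops sharply at low temperature.")
memory.add("hypothesis", "Superconducting transition", agent_id="lit-1")
memory.add("hypothesis", "Contact artefact", agent_id="lit-2")
print(memory.hypotheses())
```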

MTFOrchestrator: sequences all phases.

class mtf.orchestrator.MTFOrchestrator(config=None, interface=None, toolkit=None)[source]#

Bases: object

Top-level orchestrator that sequences literature → fitting → review phases.

Parameters:

config (MTFConfig | None)
interface (HumanInterface | None)
toolkit (ToolkitRegistry | None)

async run(phenomenon, files=None)[source]#

Run the full MTF pipeline on a phenomenon description.

Parameters:

phenomenon (str) – Text description of the experimental phenomenon.
files (list[str | Path] | None) – Optional list of input file paths (images: PNG, JPG, GIF, WebP; documents: PDF) to digest before running the main analysis phases. Extracted data is stored in SharedMemory and made available to all downstream agents.

Return type:

str

Returns:

Final report as a string.
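Since `run()` is a coroutine, callers need an event loop to drive it. The sketch below shows that calling pattern with a hypothetical stub, `OrchestratorSketch`, that only mirrors the documented `run(phenomenon, files=None)` signature; the phenomenon text and file paths are made up, and the stub does none of the real pipeline work.

```python
from __future__ import annotations

import asyncio
from pathlib import Path


class OrchestratorSketch:
    """Hypothetical stub with the documented run(phenomenon, files=None) shape."""

    async def run(self, phenomenon: str, files: list[str | Path] | None = None) -> str:
        # The real pipeline runs its phases here; the stub only echoes its inputs.
        names = [Path(f).name for f in files or []]
        return f"Report for: {phenomenon} (files: {names})"


async def main() -> str:
    orchestrator = OrchestratorSketch()
    # run() is a coroutine, so it must be awaited inside an event loop.
    return await orchestrator.run(
        "Oscillations in magnetoresistance",
        files=["data/sweep.png", Path("notes/setup.pdf")],
    )


report = asyncio.run(main())
print(report)
```

Both `str` and `Path` entries are accepted in `files`, matching the documented `list[str | Path] | None` type.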

DebateEngine: synthesizes multiple agent reports into a single coherent summary.

class mtf.debate.DebateEngine(config, memory, gpd=None)[source]#

Bases: object

Single-shot Anthropic API call that synthesizes N agent reports.

Parameters:

config (MTFConfig)
memory (SharedMemory)
gpd (GPDMCPClient | None)

async synthesize(reports, phase, extra_context='', store_as_debate=True)[source]#

Synthesize reports from multiple agents into one summary.

Not an agentic call — a plain messages.create() for speed. For fitting and review phases, appends an objective dimensional check postscript using GPD if available (Addition 8).

Parameters:

store_as_debate (bool) – If True (default), store the result as MemoryKind.DEBATE. Pass False when the caller handles storage itself (e.g. review_phase stores proposals as MemoryKind.PROPOSALS to avoid duplicating the text).
reports (list[str])
phase (str)
extra_context (str)

Return type:

str

Agents#

Base agent wrapping claude-agent-sdk query().

class mtf.agents.base.BaseAgent(agent_id, model, tools, memory, system_prompt)[source]#

Bases: object

Wraps sdk.query() and injects shared memory context into every prompt.

Parameters:

agent_id (str)
model (str)
tools (list[Any])
memory (SharedMemory)
system_prompt (str)

File Digest Agent: extracts quantitative data from experimental images, plots, and PDFs.

class mtf.agents.image_digest.FileDigestSubagent(config, client)[source]#

Bases: object

Digests a single file (image or PDF) and returns a structured analysis.

This is the leaf-level subagent: it handles exactly one file and has no SharedMemory interaction — results are returned to the coordinating ImageDigestAgent which stores them.

Parameters:

config (MTFConfig)
client (anthropic.Anthropic)

async digest(path)[source]#

Analyse a single file and return the digest text.

Return type: str
Parameters: path (Path)

class mtf.agents.image_digest.ImageDigestAgent(config, memory)[source]#

Bases: object

Coordinates parallel file digestion and synthesizes a unified cross-file analysis.

For each input file, spawns a FileDigestSubagent in parallel. After all subagents complete, a synthesis call combines their outputs into a unified analysis stored in SharedMemory alongside the individual digests.

Supports images (PNG, JPG, GIF, WebP) and PDFs.

Parameters:

config (MTFConfig)
memory (SharedMemory)

async digest(file_path)[source]#

Analyse a single file and store the result in SharedMemory.

Returns the structured digest text.

Return type: str
Parameters: file_path (str | Path)

async digest_all(file_paths)[source]#

Digest multiple files in parallel via subagents, then synthesize.

Spawns one FileDigestSubagent per file concurrently. Individual digests are stored in SharedMemory as they complete. If more than one file is provided, a final synthesis call combines all digests into a unified cross-file analysis (also stored in SharedMemory).

Returns the list of individual digest texts.

Return type: list[str]
Parameters: file_paths (list[str | Path])
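The fan-out described for `digest_all()` resembles a standard `asyncio.gather()` pattern. Here is a minimal sketch of that pattern; `digest_one`, `digest_and_store`, and `digest_all_sketch` are hypothetical helpers, a plain list stands in for SharedMemory, and the model calls are replaced by a placeholder `sleep`.

```python
import asyncio


async def digest_one(path: str) -> str:
    """Hypothetical per-file worker standing in for FileDigestSubagent.digest()."""
    await asyncio.sleep(0)  # placeholder for the model call
    return f"digest of {path}"


async def digest_and_store(path: str, store: list[str]) -> str:
    text = await digest_one(path)
    store.append(text)  # stored as soon as this file finishes
    return text


async def digest_all_sketch(paths: list[str], store: list[str]) -> list[str]:
    # One task per file; gather() returns results in input order.
    digests = await asyncio.gather(*(digest_and_store(p, store) for p in paths))
    # A cross-file synthesis is only produced for more than one file.
    if len(digests) > 1:
        store.append(f"synthesis of {len(digests)} digests")
    return list(digests)


store: list[str] = []
result = asyncio.run(digest_all_sketch(["a.png", "b.pdf"], store))
```

Note that the return value contains only the individual digests; the synthesis is stored but not returned, as in the documented behaviour.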

LiteratureAgent: searches arXiv and Semantic Scholar for relevant papers.

class mtf.agents.literature.LiteratureAgent(agent_id, model, memory, gpd_tools=None, config=None)[source]#

Bases: BaseAgent

Parameters:

agent_id (str)
model (str)
memory (SharedMemory)
gpd_tools (list[Any] | None)
config (Any | None)

async investigate(phenomenon)[source]#

Return type: str
Parameters: phenomenon (str)

FittingAgent: proposes and runs fits for a given hypothesis.

class mtf.agents.fitting.FittingAgent(agent_id, model, memory, toolkit, gpd_tools=None, gpd=None, config=None)[source]#

Bases: BaseAgent

Parameters:

agent_id (str)
model (str)
memory (SharedMemory)
toolkit (ToolkitRegistry)
gpd_tools (list[Any] | None)
gpd (GPDMCPClient | None)
config (MTFConfig | None)

async fit(hypothesis)[source]#

Generate and execute fitting code for this hypothesis.

Return type: str
Parameters: hypothesis (str)

async identify_needed_toolkit_items(hypothesis)[source]#

Ask the agent which toolkit items it needs to fit this hypothesis.

Return type: list[str]
Parameters: hypothesis (str)

ReviewerAgent: evaluates theory validity and suggests further experiments.

class mtf.agents.reviewer.ReviewerAgent(agent_id, model, memory, gpd_tools=None)[source]#

Bases: BaseAgent

Parameters:

agent_id (str)
model (str)
memory (SharedMemory)
gpd_tools (list[Any] | None)

async review(phenomenon)[source]#

Return type: str
Parameters: phenomenon (str)

Toolkit#

ToolkitRegistry: holds user-provided experimental data and model functions.

class mtf.toolkit.registry.ToolkitRegistry[source]#

Bases: object

Registry for user-supplied data arrays and model callables.

Users register items before calling MTFOrchestrator.run() so that FittingAgents can request them by name.

all_data()[source]#

Return type: dict[str, Any]

describe()[source]#

Return type: str

get_data(name)[source]#

Return type: Any
Parameters: name (str)

get_model(name)[source]#

Return type: Callable[..., Any]
Parameters: name (str)

list_data()[source]#

Return type: list[str]

list_models()[source]#

Return type: list[str]

register_data(name, value)[source]#

Return type:

None

Parameters:

name (str)
value (Any)

register_model(name, fn)[source]#

Return type:

None

Parameters:

name (str)
fn (Callable[[...], Any])
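The register-then-look-up workflow can be sketched as follows. `RegistrySketch` is a hypothetical stand-in that mirrors the documented method names with two dicts; the sorted ordering of `list_data()` and `list_models()`, the data values, and the `linear` model are all assumptions made for illustration.

```python
from typing import Any, Callable


class RegistrySketch:
    """Hypothetical stand-in mirroring the documented ToolkitRegistry methods."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}
        self._models: dict[str, Callable[..., Any]] = {}

    def register_data(self, name: str, value: Any) -> None:
        self._data[name] = value

    def register_model(self, name: str, fn: Callable[..., Any]) -> None:
        self._models[name] = fn

    def get_data(self, name: str) -> Any:
        return self._data[name]

    def get_model(self, name: str) -> Callable[..., Any]:
        return self._models[name]

    def list_data(self) -> list[str]:
        return sorted(self._data)  # ordering is an assumption

    def list_models(self) -> list[str]:
        return sorted(self._models)  # ordering is an assumption


def linear(x: float, slope: float, intercept: float) -> float:
    """Illustrative model callable: a straight line."""
    return slope * x + intercept


toolkit = RegistrySketch()
toolkit.register_data("temperature_K", [2.0, 4.0, 6.0])
toolkit.register_data("resistance_ohm", [0.1, 0.9, 1.1])
toolkit.register_model("linear", linear)

# A fitting step can then fetch items by name.
model = toolkit.get_model("linear")
predicted = [model(t, slope=0.5, intercept=0.0) for t in toolkit.get_data("temperature_K")]
```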

Tools#

Fitting tools: executes agent-generated lmfit code in a sandboxed namespace.

mtf.tools.fitting_tools.run_fitting_code(code, data, integrity_check=True)[source]#

Execute agent-generated fitting code in a sandboxed namespace.

The code has access to: numpy (np), lmfit (Model, Parameters, minimize), scipy.optimize, scipy.stats, and the ‘data’ dict provided by the user toolkit.

The code MUST assign its results to a variable named ‘result’ which is returned to the caller.

When integrity_check=True (default), sentinel wrappers detect whether a real optimizer was called and _validate_result() checks for obvious fabrications. When integrity_check=False, the real lmfit symbols are injected directly and no validation is performed.

Parameters:

code (str) – Python source code string produced by a FittingAgent.
data (dict[str, Any]) – Dict of user-provided arrays / scalars from the toolkit registry.
integrity_check (bool) – Whether to run anti-fabrication checks (default True).

Return type:

dict[str, Any]

Returns:

The value of ‘result’ from the executed code, or an error dict. May include an ‘integrity_warnings’ key with a list of warning strings.
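The contract above (code receives `data`, must assign `result`, and failures come back as a dict) can be illustrated with a much simpler executor. This is a sketch only: `run_code_sketch` is hypothetical, performs no sandboxing or integrity checking, injects `math` instead of numpy and lmfit, and the `"error"` key is an assumed shape for the error dict.

```python
import math
from typing import Any


def run_code_sketch(code: str, data: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical, simplified executor: no sandboxing and no integrity checks."""
    namespace: dict[str, Any] = {"math": math, "data": data}
    try:
        exec(code, namespace)
    except Exception as exc:  # report failures as an error dict
        return {"error": f"{type(exc).__name__}: {exc}"}
    if "result" not in namespace:
        return {"error": "code did not assign a variable named 'result'"}
    return namespace["result"]


# Closed-form least-squares line, written the way agent-generated code might be.
fit_code = """
xs, ys = data["x"], data["y"]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
result = {"slope": slope, "intercept": mean_y - slope * mean_x}
"""

ok = run_code_sketch(fit_code, {"x": [0.0, 1.0, 2.0], "y": [1.0, 3.0, 5.0]})
missing = run_code_sketch("value = 1", {})
failed = run_code_sketch("result = data['absent']", {})
```

Plain `exec` offers no isolation, so this sketch should not be run on untrusted code.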

GPD MCP client: bridges GPD MCP servers to sdk.SdkMcpTool objects.

Each GPD server runs as a subprocess communicating over stdio MCP protocol. A dedicated background event loop handles all async MCP I/O, exposing SdkMcpTool-compatible callables to mtf agents.
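One common way to arrange a dedicated background event loop is a daemon thread running `loop.run_forever()`, with `asyncio.run_coroutine_threadsafe()` as the bridge from other threads. The sketch below shows that general technique together with a sync and an async calling path; `BackgroundLoopSketch` and `fake_tool` are hypothetical, and nothing here starts a real MCP server or reflects how `GPDMCPClient` is actually written.

```python
import asyncio
import threading


class BackgroundLoopSketch:
    """Hypothetical helper: runs coroutines on a private loop in a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def call(self, coro, timeout: float = 5.0):
        # Schedule on the background loop, then block the calling thread for the result.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout=timeout)

    async def async_call(self, coro):
        # Move the blocking wait to a worker thread so the caller's loop stays free.
        return await asyncio.to_thread(self.call, coro)

    def close(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_tool(name: str) -> str:
    """Stand-in for an MCP tool round trip."""
    await asyncio.sleep(0.01)
    return f"{name}: ok"


bridge = BackgroundLoopSketch()
sync_result = bridge.call(fake_tool("get_checklist"))


async def fan_out() -> list[str]:
    return await asyncio.gather(
        bridge.async_call(fake_tool("a")), bridge.async_call(fake_tool("b"))
    )


async_results = asyncio.run(fan_out())
bridge.close()
```

Keeping the MCP I/O on its own loop means synchronous tool functions can block on a result without stalling the main asyncio loop, which is the motivation given in the class description.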

class mtf.tools.gpd_mcp.GPDMCPClient[source]#

Bases: object

Manages live connections to one or more GPD MCP servers.

All async MCP I/O runs in a dedicated background thread with its own event loop so that sync tool functions can block-call into the async MCP session without conflicting with the main asyncio event loop used by the rest of mtf.

Usage:

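from mtf.tools.gpd_mcp import GPDMCPClient

client = GPDMCPClient()
# Silently does nothing if the get-physics-done package is not installed.
client.start(["verification", "errors", "protocols", "conventions", "patterns"])
# make_tool() returns None when the named server was not started.
tool = client.make_tool("verification", "get_checklist",
                        "Get domain-specific physics checklist.")
# ... pass tool to agents ...
client.close()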

async async_call(server, tool_name, **kwargs)[source]#

Async wrapper around call() for use inside asyncio.gather() fan-outs.

Offloads the blocking wait to a thread-pool thread so it does not block the main asyncio event loop.

Return type:

str

Parameters:

server (str)
tool_name (str)
kwargs (Any)

property available: bool#

True if at least one GPD MCP server started successfully.

call(server, tool_name, **kwargs)[source]#

Synchronously call an MCP tool and return its text result.

Returns an empty string if the server is unavailable.

Return type:

str

Parameters:

server (str)
tool_name (str)
kwargs (Any)

close()[source]#

Shut down all MCP server subprocesses and stop the background loop.

Return type: None

make_tool(server, tool_name, description)[source]#

Return an sdk.SdkMcpTool backed by the given MCP server tool.

Returns None if the server was not started (e.g. GPD not installed), so callers can filter None values out of their tool lists.

Return type:

SdkMcpTool | None

Parameters:

server (str)
tool_name (str)
description (str)

start(server_names)[source]#

Start the requested GPD MCP servers.

Silently no-ops if the get-physics-done package is not installed.

Return type: None
Parameters: server_names (list[str])

Interface#

Human interface abstraction and CLI implementation.

class mtf.interface.CLIInterface[source]#

Bases: HumanInterface

Rich-based CLI interface.

async ask(prompt)[source]#

Return type: str
Parameters: prompt (str)

async ask_for_files()[source]#

Interactively ask the user for optional input file paths (images or PDFs).

Return type: list[str]

async confirm(prompt)[source]#

Return type: bool
Parameters: prompt (str)

async show(content, title='')[source]#

Return type:

None

Parameters:

content (str)
title (str)

class mtf.interface.HumanInterface[source]#

Bases: ABC

Abstract interface for human interaction.

abstractmethod async ask(prompt)[source]#

Return type: str
Parameters: prompt (str)

async ask_for_files()[source]#

Prompt the user to optionally provide input file paths (images or PDFs).

Returns a (possibly empty) list of valid file path strings. Default implementation returns an empty list; override in subclasses that support interactive input.

Return type: list[str]

abstractmethod async confirm(prompt)[source]#

Return type: bool
Parameters: prompt (str)

abstractmethod async show(content, title='')[source]#

Return type:

None

Parameters:

content (str)
title (str)
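A non-interactive implementation of the abstract interface might look like the sketch below, which could be handy for unattended runs. `InterfaceSketch` is a hypothetical stand-in that copies the documented method signatures of `HumanInterface`, and `ScriptedInterface` is an invented subclass; with the real package one would presumably subclass `mtf.interface.HumanInterface` instead.

```python
import asyncio
from abc import ABC, abstractmethod


class InterfaceSketch(ABC):
    """Hypothetical stand-in for HumanInterface with the documented methods."""

    @abstractmethod
    async def ask(self, prompt: str) -> str: ...

    @abstractmethod
    async def confirm(self, prompt: str) -> bool: ...

    @abstractmethod
    async def show(self, content: str, title: str = "") -> None: ...

    async def ask_for_files(self) -> list[str]:
        # Not abstract: the documented default is an empty list.
        return []


class ScriptedInterface(InterfaceSketch):
    """Answers from a fixed script instead of prompting a person."""

    def __init__(self, answers: list[str]) -> None:
        self._answers = list(answers)
        self.shown: list[tuple[str, str]] = []

    async def ask(self, prompt: str) -> str:
        return self._answers.pop(0)

    async def confirm(self, prompt: str) -> bool:
        return True

    async def show(self, content: str, title: str = "") -> None:
        self.shown.append((title, content))


async def demo() -> tuple[str, bool, list[str]]:
    ui = ScriptedInterface(["Anomalous Hall signal"])
    await ui.show("Starting run", title="MTF")
    return await ui.ask("Phenomenon?"), await ui.confirm("Proceed?"), await ui.ask_for_files()


answers = asyncio.run(demo())
```

Only `ask`, `confirm`, and `show` need overriding; `ask_for_files` can be left at its default when no input files are expected.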
