API Reference
Auto-generated from source docstrings.
Core
Configuration dataclass for MTF.
- class mtf.config.MTFConfig(n_literature=2, citation_verification=True, citation_verification_max=10, n_fitting=2, n_qualitative=2, n_reviewer=2, reviewer_verification_passes=1, reviewer_models=<factory>, literature_model='claude-haiku-4-5-20251001', fitting_model='claude-sonnet-4-6', reviewer_model='claude-sonnet-4-6', debate_model='claude-sonnet-4-6', image_digest_model='claude-sonnet-4-6', max_debate_rounds=3, fitting_enabled=True, fitting_scope='per_hypothesis', fitting_semaphore_limit=6, toolkit_items=<factory>, enable_gpd_mcp=True, physics_domains=<factory>, gpd_servers=<factory>, auto_detect_domains=False, gpd_domain_detection_max_domains=3, literature_plausibility_screen=True, auto_reject_physics_failures=False, fitting_convention_check=True, fitting_max_convention_retries=1, fitting_result_integrity_check=True, n_proposal=2, proposal_model='claude-sonnet-4-6', followup_model='claude-sonnet-4-6', pdf_enhanced_extraction=True, pdf_figure_extraction_max_tokens=8192, pdf_min_size_kb_for_enhanced=200)
Bases: object
Top-level configuration for an MTF run.
- Parameters:
n_literature (int)
citation_verification (bool)
citation_verification_max (int)
n_fitting (int)
n_qualitative (int)
n_reviewer (int)
reviewer_verification_passes (int)
reviewer_models (list[str])
literature_model (str)
fitting_model (str)
reviewer_model (str)
debate_model (str)
image_digest_model (str)
max_debate_rounds (int)
fitting_enabled (bool)
fitting_scope (str)
fitting_semaphore_limit (int)
toolkit_items (dict[str, object])
enable_gpd_mcp (bool)
physics_domains (list[str])
gpd_servers (list[str])
auto_detect_domains (bool)
gpd_domain_detection_max_domains (int)
literature_plausibility_screen (bool)
auto_reject_physics_failures (bool)
fitting_convention_check (bool)
fitting_max_convention_retries (int)
fitting_result_integrity_check (bool)
n_proposal (int)
proposal_model (str)
followup_model (str)
pdf_enhanced_extraction (bool)
pdf_figure_extraction_max_tokens (int)
pdf_min_size_kb_for_enhanced (int)
- auto_detect_domains: bool = False
- auto_reject_physics_failures: bool = False
- citation_verification: bool = True
- citation_verification_max: int = 10
- debate_model: str = 'claude-sonnet-4-6'
- enable_gpd_mcp: bool = True
- fitting_convention_check: bool = True
- fitting_enabled: bool = True
- fitting_max_convention_retries: int = 1
- fitting_model: str = 'claude-sonnet-4-6'
- fitting_result_integrity_check: bool = True
- fitting_scope: str = 'per_hypothesis'
- fitting_semaphore_limit: int = 6
- followup_model: str = 'claude-sonnet-4-6'
- gpd_domain_detection_max_domains: int = 3
- gpd_servers: list[str]
- image_digest_model: str = 'claude-sonnet-4-6'
- literature_model: str = 'claude-haiku-4-5-20251001'
- literature_plausibility_screen: bool = True
- max_debate_rounds: int = 3
- n_fitting: int = 2
- n_literature: int = 2
- n_proposal: int = 2
- n_qualitative: int = 2
- n_reviewer: int = 2
- pdf_enhanced_extraction: bool = True
- pdf_figure_extraction_max_tokens: int = 8192
- pdf_min_size_kb_for_enhanced: int = 200
- physics_domains: list[str]
- proposal_model: str = 'claude-sonnet-4-6'
- reviewer_model: str = 'claude-sonnet-4-6'
- reviewer_models: list[str]
- reviewer_verification_passes: int = 1
- toolkit_items: dict[str, object]
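Fields whose default is shown as <factory> in the signature (reviewer_models, toolkit_items, physics_domains, gpd_servers) are listed without a literal default, which is how a dataclass renders field(default_factory=...). The sketch below illustrates that behaviour with a stand-in dataclass rather than the real MTFConfig: the name ConfigSketch, the empty-list factories, and the "optics" domain are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigSketch:
    """Stand-in for a few MTFConfig fields; not the real class."""

    n_literature: int = 2
    max_debate_rounds: int = 3
    fitting_scope: str = "per_hypothesis"
    # Assumption: the real factories may return non-empty defaults.
    reviewer_models: list[str] = field(default_factory=list)
    physics_domains: list[str] = field(default_factory=list)


default_cfg = ConfigSketch()
# Documented defaults are overridden by keyword; "optics" is a made-up domain.
custom_cfg = ConfigSketch(n_literature=4, physics_domains=["optics"])

# Each instance owns its own list, so mutating one does not leak into another.
custom_cfg.reviewer_models.append("claude-sonnet-4-6")
```

Because the factory runs once per instance, default_cfg.reviewer_models stays empty after the append on custom_cfg.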
Shared in-process memory for all agents and phases.
- class mtf.memory.MemoryEntry(kind, content, metadata=<factory>)
Bases: object
- Parameters:
kind (MemoryKind)
content (str)
metadata (dict[str, Any])
- content: str
- kind: MemoryKind
- metadata: dict[str, Any]
- class mtf.memory.MemoryKind(value)
Bases: str, Enum
- CONVENTIONS = 'conventions'
- DEBATE = 'debate'
- DOMAIN_CLASSIFICATION = 'domain_classification'
- DOMAIN_PATTERNS = 'domain_patterns'
- FITTING_SKIPPED = 'fitting_skipped'
- FITTING_WARNINGS = 'fitting_warnings'
- FIT_RESULT = 'fit_result'
- HYPOTHESIS = 'hypothesis'
- IMAGE_DATA = 'image_data'
- INTEGRITY_WARNING = 'integrity_warning'
- LITERATURE = 'literature'
- PHENOMENON = 'phenomenon'
- PHYSICS_VERDICT = 'physics_verdict'
- PROPOSALS = 'proposals'
- QUALITATIVE_EVAL = 'qualitative_eval'
- REVIEW = 'review'
- TOOLKIT_DIGEST = 'toolkit_digest'
- USER_FEEDBACK = 'user_feedback'
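Because MemoryKind derives from both str and Enum, its members behave as plain strings as well as enum members. The sketch below shows the consequences using a stand-in enum with three of the documented values; KindSketch is a hypothetical name, not part of mtf.

```python
from enum import Enum


class KindSketch(str, Enum):
    """Stand-in with three of the documented members."""

    DEBATE = "debate"
    FIT_RESULT = "fit_result"
    REVIEW = "review"


# Lookup by value returns the existing singleton member.
kind = KindSketch("fit_result")
is_same_member = kind is KindSketch.FIT_RESULT

# The str mixin makes members compare equal to their plain-string values.
equals_plain_string = KindSketch.DEBATE == "debate"

# An unknown value raises ValueError rather than creating a new member.
try:
    KindSketch("not_a_kind")
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```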
Bases: object
Central store passed by reference to all agents and phases.
Thread-safe by virtue of asyncio's single-threaded event loop.
- Return type:
None
- Parameters:
kind (MemoryKind)
content (str)
metadata (Any)
- Return type:
list[MemoryEntry]
- Parameters:
kinds (MemoryKind)
- Return type:
list[MemoryEntry]
- Return type:
str
- Parameters:
kinds (MemoryKind)
Return a compact one-line-per-entry table of contents for all entries.
- Return type:
str
- Return type:
list[str]
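The concurrency claim above can be illustrated with a small stand-in store. This is a sketch under assumptions, not the mtf implementation: the names MemorySketch, EntrySketch, add, and get are hypothetical, chosen only to mirror the parameter lists shown above. The point it demonstrates is that coroutines on one event loop can share a plain list without locks, provided each mutation contains no await.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EntrySketch:
    kind: str
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


class MemorySketch:
    """Hypothetical in-process store; method names are illustrative."""

    def __init__(self) -> None:
        self._entries: list[EntrySketch] = []

    def add(self, kind: str, content: str, **metadata: Any) -> None:
        # No await inside this method, so no other coroutine can interleave here.
        self._entries.append(EntrySketch(kind, content, dict(metadata)))

    def get(self, *kinds: str) -> list[EntrySketch]:
        return [e for e in self._entries if e.kind in kinds]


async def agent(agent_id: int, memory: MemorySketch) -> None:
    await asyncio.sleep(0)  # yield to the event loop, as a real agent would on I/O
    memory.add("literature", f"report from agent {agent_id}", agent=agent_id)


async def main() -> MemorySketch:
    memory = MemorySketch()  # one instance, passed by reference to every agent
    await asyncio.gather(*(agent(i, memory) for i in range(5)))
    return memory


memory = asyncio.run(main())
```

The guarantee covers coroutines on a single loop only; code that touches the store from another thread would need its own synchronisation.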
MTFOrchestrator: sequences all phases.
- class mtf.orchestrator.MTFOrchestrator(config=None, interface=None, toolkit=None)
Bases: object
Top-level orchestrator that sequences literature → fitting → review phases.
- Parameters:
config (MTFConfig | None)
interface (HumanInterface | None)
toolkit (ToolkitRegistry | None)
- async run(phenomenon, files=None)
Run the full MTF pipeline on a phenomenon description.
- Parameters:
phenomenon (str) – Text description of the experimental phenomenon.
files (list[str | Path] | None) – Optional list of input file paths (images: PNG, JPG, GIF, WebP; documents: PDF) to digest before running the main analysis phases. Extracted data is stored in SharedMemory and made available to all downstream agents.
- Return type:
str
- Returns:
Final report as a string.
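Since files accepts only the image and PDF formats listed above, callers may want to sort their inputs before passing them in. The helper below is a hypothetical pre-flight check, not part of mtf: split_supported and its group names are invented for illustration, and treating ".jpeg" as an image suffix is an assumption.

```python
from pathlib import Path

# Extensions taken from the formats listed above; ".jpeg" is an added assumption.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
DOCUMENT_SUFFIXES = {".pdf"}


def split_supported(files):
    """Group paths into images, documents, and unsupported files by suffix."""
    groups = {"images": [], "documents": [], "unsupported": []}
    for raw in files:
        path = Path(raw)
        suffix = path.suffix.lower()  # compare case-insensitively
        if suffix in IMAGE_SUFFIXES:
            groups["images"].append(path)
        elif suffix in DOCUMENT_SUFFIXES:
            groups["documents"].append(path)
        else:
            groups["unsupported"].append(path)
    return groups


groups = split_supported(["spectrum.PNG", Path("notes/paper.pdf"), "raw.csv"])
```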
DebateEngine: synthesizes multiple agent reports into a single coherent summary.
- class mtf.debate.DebateEngine(config, memory, gpd=None)
Bases: object
Single-shot Anthropic API call that synthesizes N agent reports.
- Parameters:
config (MTFConfig)
memory (SharedMemory)
gpd (GPDMCPClient | None)
- async synthesize(reports, phase, extra_context='', store_as_debate=True)
Synthesize reports from multiple agents into one summary.
Not an agentic call: this is a plain messages.create() call, chosen for speed. For fitting and review phases, appends an objective dimensional check postscript using GPD if available (Addition 8).
- Parameters:
store_as_debate (bool) – If True (default), store the result as MemoryKind.DEBATE. Pass False when the caller handles storage itself (e.g. review_phase stores proposals as MemoryKind.PROPOSALS to avoid duplicating the text).
reports (list[str])
phase (str)
extra_context (str)
- Return type:
str
Agents
Base agent wrapping claude-agent-sdk query().
- class mtf.agents.base.BaseAgent(agent_id, model, tools, memory, system_prompt)
Bases: object
Wraps sdk.query() and injects shared memory context into every prompt.
- Parameters:
agent_id (str)
model (str)
tools (list[Any])
memory (SharedMemory)
system_prompt (str)
File Digest Agent: extracts quantitative data from experimental images, plots, and PDFs.
- class mtf.agents.image_digest.FileDigestSubagent(config, client)
Bases: object
Digests a single file (image or PDF) and returns a structured analysis.
This is the leaf-level subagent: it handles exactly one file and has no SharedMemory interaction. Results are returned to the coordinating ImageDigestAgent, which stores them.
- Parameters:
config (MTFConfig)
client (anthropic.Anthropic)
- class mtf.agents.image_digest.ImageDigestAgent(config, memory)
Bases: object
Coordinates parallel file digestion and synthesizes a unified cross-file analysis.
For each input file, spawns a FileDigestSubagent in parallel. After all subagents complete, a synthesis call combines their outputs into a unified analysis stored in SharedMemory alongside the individual digests.
Supports images (PNG, JPG, GIF, WebP) and PDFs.
- Parameters:
config (MTFConfig)
memory (SharedMemory)
- async digest(file_path)
Analyse a single file and store the result in SharedMemory.
Returns the structured digest text.
- Return type:
str
- Parameters:
file_path (str | Path)
- async digest_all(file_paths)
Digest multiple files in parallel via subagents, then synthesize.
Spawns one FileDigestSubagent per file concurrently. Individual digests are stored in SharedMemory as they complete. If more than one file is provided, a final synthesis call combines all digests into a unified cross-file analysis (also stored in SharedMemory).
Returns the list of individual digest texts.
- Return type:
list[str]
- Parameters:
file_paths (list[str | Path])
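The fan-out described for digest_all can be sketched with asyncio.gather. This is an illustration under assumptions rather than the actual implementation: digest_one and synthesize are stand-ins for the model calls, a plain list stands in for SharedMemory, and every function name here is hypothetical.

```python
import asyncio


async def digest_one(path: str) -> str:
    """Stand-in for a per-file subagent; a real one would call a model."""
    await asyncio.sleep(0)
    return f"digest of {path}"


async def synthesize(digests: list[str]) -> str:
    """Stand-in for the cross-file synthesis call."""
    await asyncio.sleep(0)
    return f"unified analysis of {len(digests)} files"


async def digest_all_sketch(paths: list[str], store: list[str]) -> list[str]:
    async def digest_and_store(path: str) -> str:
        text = await digest_one(path)
        store.append(text)  # stored as soon as this file finishes
        return text

    # gather() runs the subagents concurrently and preserves input order.
    digests = await asyncio.gather(*(digest_and_store(p) for p in paths))
    if len(digests) > 1:
        store.append(await synthesize(digests))  # only for multi-file input
    return list(digests)


store: list[str] = []
digests = asyncio.run(digest_all_sketch(["a.png", "b.pdf"], store))

single_store: list[str] = []
single = asyncio.run(digest_all_sketch(["a.png"], single_store))
```

With two files the store ends up holding both digests plus the synthesis; with one file it holds only the single digest, matching the "more than one file" condition above.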
LiteratureAgent: searches arXiv and Semantic Scholar for relevant papers.
- class mtf.agents.literature.LiteratureAgent(agent_id, model, memory, gpd_tools=None, config=None)
Bases: BaseAgent
- Parameters:
agent_id (str)
model (str)
memory (SharedMemory)
gpd_tools (list[Any] | None)
config (Any | None)
FittingAgent: proposes and runs fits for a given hypothesis.
- class mtf.agents.fitting.FittingAgent(agent_id, model, memory, toolkit, gpd_tools=None, gpd=None, config=None)
Bases: BaseAgent
- Parameters:
agent_id (str)
model (str)
memory (SharedMemory)
toolkit (ToolkitRegistry)
gpd_tools (list[Any] | None)
gpd (GPDMCPClient | None)
config (MTFConfig | None)
ReviewerAgent: evaluates theory validity and suggests further experiments.
Toolkit
ToolkitRegistry: holds user-provided experimental data and model functions.
- class mtf.toolkit.registry.ToolkitRegistry
Bases: object
Registry for user-supplied data arrays and model callables.
Users register items before calling MTFOrchestrator.run() so that FittingAgents can request them by name.
Tools
Fitting tools: executes agent-generated lmfit code in a sandboxed namespace.
- mtf.tools.fitting_tools.run_fitting_code(code, data, integrity_check=True)
Execute agent-generated fitting code in a sandboxed namespace.
The code has access to: numpy (np), lmfit (Model, Parameters, minimize), scipy.optimize, scipy.stats, and the ‘data’ dict provided by the user toolkit.
The code MUST assign its results to a variable named ‘result’ which is returned to the caller.
When integrity_check=True (default), sentinel wrappers detect whether a real optimizer was called and _validate_result() checks for obvious fabrications. When integrity_check=False, the real lmfit symbols are injected directly and no validation is performed.
- Parameters:
code (str) – Python source code string produced by a FittingAgent.
data (dict[str, Any]) – Dict of user-provided arrays / scalars from the toolkit registry.
integrity_check (bool) – Whether to run anti-fabrication checks (default True).
- Return type:
dict[str, Any]
- Returns:
The value of ‘result’ from the executed code, or an error dict. May include an ‘integrity_warnings’ key with a list of warning strings.
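The contract described above (run the code in its own namespace, require a variable named result, and use a sentinel wrapper to see whether an optimizer was really called) can be sketched with the standard library alone. This is a simplified illustration, not the mtf implementation: run_code_sketch, toy_minimize, the error messages, and the warning text are all hypothetical, and a toy optimizer stands in for lmfit. A fresh exec namespace limits which names the code sees but is not a security boundary by itself.

```python
import traceback


def run_code_sketch(code, data, optimizer, integrity_check=True):
    """Run code in a fresh namespace and return its 'result' or an error dict."""
    calls = {"count": 0}

    def sentinel(*args, **kwargs):
        # Record that the optimizer was really invoked, then delegate to it.
        calls["count"] += 1
        return optimizer(*args, **kwargs)

    namespace = {"data": data, "minimize": sentinel if integrity_check else optimizer}
    try:
        exec(code, namespace)
    except Exception:
        return {"error": traceback.format_exc(limit=1)}
    if "result" not in namespace:
        return {"error": "code did not assign a variable named 'result'"}

    result = namespace["result"]
    if integrity_check and calls["count"] == 0 and isinstance(result, dict):
        result.setdefault("integrity_warnings", []).append("optimizer was never called")
    return result


def toy_minimize(func, candidates):
    """Toy optimizer: pick the candidate with the smallest cost."""
    return min(candidates, key=func)


honest = """
cost = lambda a: sum((a * x - y) ** 2 for x, y in zip(data["x"], data["y"]))
result = {"slope": minimize(cost, [0, 1, 2, 3])}
"""
fabricated = "result = {'slope': 2}"
data = {"x": [1, 2, 3], "y": [2, 4, 6]}

honest_out = run_code_sketch(honest, data, toy_minimize)
fabricated_out = run_code_sketch(fabricated, data, toy_minimize)
```

Both snippets report the same slope, but only the fabricated one, which never calls minimize, comes back with an integrity_warnings entry.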
GPD MCP client: bridges GPD MCP servers to sdk.SdkMcpTool objects.
Each GPD server runs as a subprocess communicating over stdio MCP protocol. A dedicated background event loop handles all async MCP I/O, exposing SdkMcpTool-compatible callables to mtf agents.
- class mtf.tools.gpd_mcp.GPDMCPClient
Bases: object
Manages live connections to one or more GPD MCP servers.
All async MCP I/O runs in a dedicated background thread with its own event loop so that sync tool functions can block-call into the async MCP session without conflicting with the main asyncio event loop used by the rest of mtf.
Usage:
    client = GPDMCPClient()
    client.start(["verification", "errors", "protocols", "conventions", "patterns"])
    tool = client.make_tool("verification", "get_checklist", "Get domain-specific physics checklist.")
    # ... pass tool to agents ...
    client.close()
- async async_call(server, tool_name, **kwargs)
Async wrapper around call() for use inside asyncio.gather() fan-outs.
Offloads the blocking wait to a thread-pool thread so it does not block the main asyncio event loop.
- Return type:
str
- Parameters:
server (str)
tool_name (str)
kwargs (Any)
- property available: bool
True if at least one GPD MCP server started successfully.
- call(server, tool_name, **kwargs)
Synchronously call an MCP tool and return its text result.
Returns an empty string if the server is unavailable.
- Return type:
str
- Parameters:
server (str)
tool_name (str)
kwargs (Any)
- close()
Shut down all MCP server subprocesses and stop the background loop.
- Return type:
None
- make_tool(server, tool_name, description)
Return an sdk.SdkMcpTool backed by the given MCP server tool.
Returns None if the server was not started (e.g. GPD not installed), so callers can filter None values out of their tool lists.
- Return type:
SdkMcpTool | None
- Parameters:
server (str)
tool_name (str)
description (str)
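The threading model described for GPDMCPClient (a private event loop in a background thread, a blocking call(), and an async_call() that moves the blocking wait to a worker thread) can be sketched with the standard library. This is an assumption-laden illustration, not the real client: BackgroundLoopSketch and _fake_tool are hypothetical, and a short sleep stands in for the MCP round-trip over stdio.

```python
import asyncio
import threading


class BackgroundLoopSketch:
    """Runs coroutines on a private event loop in a daemon thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    async def _fake_tool(self, name: str) -> str:
        await asyncio.sleep(0.01)  # stands in for an MCP stdio round-trip
        return f"{name}: ok"

    def call(self, name: str) -> str:
        # Submit to the background loop and block this thread until it finishes.
        future = asyncio.run_coroutine_threadsafe(self._fake_tool(name), self._loop)
        return future.result(timeout=5)

    async def async_call(self, name: str) -> str:
        # Move the blocking wait to a worker thread so the caller's loop stays free.
        return await asyncio.to_thread(self.call, name)

    def close(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join(timeout=5)
        self._loop.close()


async def fan_out(client: BackgroundLoopSketch) -> list[str]:
    return await asyncio.gather(*(client.async_call(n) for n in ("a", "b", "c")))


client = BackgroundLoopSketch()
sync_result = client.call("get_checklist")
async_results = asyncio.run(fan_out(client))
client.close()
```

Calling call() directly from a coroutine on the main loop would stall that loop until the result arrives, which is the situation async_call() is described as avoiding.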
Interface
Human interface abstraction and CLI implementation.
- class mtf.interface.CLIInterface
Bases: HumanInterface
Rich-based CLI interface.
- class mtf.interface.HumanInterface
Bases: ABC
Abstract interface for human interaction.