GPD Physics Verification#
MTF integrates with Get Physics Done (GPD) to shift hypothesis selection from chi-squared toward physical correctness. Rather than reimplementing physics verification, MTF uses GPD’s existing MCP servers as callable tools — the same way it uses arxiv and Semantic Scholar.
Install with pip install -e ".[gpd]". The integration is controlled by config.enable_gpd_mcp (default True); if the package is missing, it degrades gracefully to a no-op.
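The graceful no-op described above can be pictured with a small guard around an optional dependency. This is only a sketch: the import name "gpd" and the helper names are assumptions made for illustration, not MTF's actual code.

```python
import importlib.util


def gpd_available(package: str = "gpd") -> bool:
    """Return True if a top-level package with this name is importable.

    The default name "gpd" is an assumption; the real import name may differ.
    """
    return importlib.util.find_spec(package) is not None


def resolve_gpd_enabled(enable_gpd_mcp: bool, package: str = "gpd") -> bool:
    """Hypothetical helper: the flag only takes effect if the package exists."""
    return enable_gpd_mcp and gpd_available(package)
```

With a guard of this shape, a run with the flag left at True but without the extra installed simply skips the GPD tools instead of raising an import error.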
GPD servers#
| Server | What MTF gets | Which agents / phases call it |
|---|---|---|
| verification | Structured checks: dimensional (5.1), symmetry (5.2), limiting cases (5.3), fit-family (5.18) | |
| errors | 104 curated error classes with detection strategies (sign errors, missing 2π factors, gauge artifacts, etc.) | |
| protocols | Step-by-step methodology with checkpoints for 47+ physics domains | |
| conventions | Canonical defaults for 18 subfields (Fourier convention, metric signature, natural units, gauge choice) | Convention locking (literature phase) |
| patterns | Persistent cross-session error pattern library | |
| skills | Programmatic discovery of available GPD capabilities and physics domain routing | |
Physics-first ranking#
When GPD is active, DebateEngine.synthesize() adds a physics-first ranking criterion to the system prompt for fitting and review phases:
A model with χ²=1.5 that passes all verification checks ranks above a model with χ²=0.9 that fails dimensional analysis.
For fitting and review phases, DebateEngine also extracts LaTeX/dimensional expressions from the synthesis text and calls dimensional_check as an objective postscript — appended to the synthesis and stored as PHYSICS_VERDICT.
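The ordering implied by this criterion can be illustrated as a two-level sort. In MTF the criterion is injected into the system prompt rather than computed this way, so the sketch below only makes the intended ordering concrete. The Candidate class, its fields, and the tie-break on lower χ² among candidates with the same verification status are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    chi2: float         # goodness-of-fit value reported for the model
    failed_checks: int  # number of failed physics verification checks


def physics_first_rank(candidates):
    """Sort so that any candidate passing all checks precedes any that fails.

    Within each group, lower chi2 ranks first (an assumed tie-break).
    """
    return sorted(candidates, key=lambda c: (c.failed_checks > 0, c.chi2))


ranked = physics_first_rank([
    Candidate("tight_fit_bad_units", chi2=0.9, failed_checks=1),
    Candidate("looser_fit_clean", chi2=1.5, failed_checks=0),
])
print([c.name for c in ranked])
```

The key is a tuple, so fit quality only matters among candidates with the same verification status.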
Auto domain classification#
When config.auto_detect_domains is enabled, MTFOrchestrator._classify_domains() runs before the literature phase: it calls route_protocol and route_skill with the phenomenon description, parses known GPD domain names from the responses, and overwrites config.physics_domains for the current run only (the change is not persisted). If no domains are detected, it falls back to the configured default.
Controlled by:
config.auto_detect_domains: bool = False
config.gpd_domain_detection_max_domains: int = 3
The detected domains (or fallback notice) are written to MemoryKind.DOMAIN_CLASSIFICATION as an audit trail.
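The parse-and-fall-back step can be sketched as a whole-token match against a list of known domain names. The domain list below is taken from the examples on this page and is not exhaustive. The function name, the first-appearance ordering, and the default value are assumptions.

```python
import re

# Illustrative subset only; the real set of GPD domain names is larger.
KNOWN_DOMAINS = ("condensed_matter", "qft", "gr", "nuclear", "amo")


def detect_domains(response_text, known=KNOWN_DOMAINS, max_domains=3,
                   default=("condensed_matter",)):
    """Return up to max_domains known names found in the text, else the default."""
    found = []
    # Match whole tokens so that "gr" is not picked up inside "program".
    for match in re.finditer(r"[a-z_]+", response_text.lower()):
        token = match.group(0)
        if token in known and token not in found:
            found.append(token)
    return found[:max_domains] if found else list(default)
```

Capping the result mirrors gpd_domain_detection_max_domains, and returning the default when nothing matches mirrors the fallback described above.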
Convention locking and pre-exec validation#
At the start of the literature phase, MTF calls subfield_defaults once per domain in config.physics_domains and stores the result as MemoryKind.CONVENTIONS. Every subsequent agent sees these locked conventions in its prompt context, preventing silent mismatches (Fourier sign, metric signature, natural-unit choices) between agents.
Additionally, FittingAgent.fit() calls convention_check on generated fitting code before exec() — a phase-level check separate from the agent’s own subfield_defaults call. If the check returns FAIL, the violation is written to PHYSICS_VERDICT and the agent retries once with the violation text in context.
Controlled by:
config.fitting_convention_check: bool = True
config.fitting_max_convention_retries: int = 1
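The check-then-retry flow can be sketched as a bounded loop with the generator and the checker passed in as callables. Both callables and their signatures are hypothetical stand-ins for the agent's code generation and the convention_check tool. Returning None once retries are exhausted is also an assumption, since this page does not say what happens in that case.

```python
def generate_with_convention_check(generate, check, max_retries=1):
    """Generate code, validate it, and retry with the violation in context.

    generate(violation_or_None) -> str
    check(code) -> (status, message), where status is "PASS" or "FAIL"
    Returns (code_or_None, recorded_violations).
    """
    violations = []
    violation = None
    for _ in range(max_retries + 1):
        code = generate(violation)
        status, message = check(code)
        if status != "FAIL":
            return code, violations
        violations.append(message)  # would be written to PHYSICS_VERDICT
        violation = message         # fed back into the next attempt
    return None, violations


# Stub generator: fixes the Fourier sign once it is told about the violation.
def fake_generate(violation):
    return "exp(+1j*k*x)" if violation is None else "exp(-1j*k*x)"


def fake_check(code):
    return ("PASS", "") if "-1j" in code else ("FAIL", "Fourier sign mismatch")


code, violations = generate_with_convention_check(fake_generate, fake_check)
print(code, violations)
```

Nothing is executed until the check passes, which is the point of validating before exec().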
Literature plausibility screening#
After each literature debate synthesis, _screen_hypothesis_plausibility() runs limiting_case_check on each candidate hypothesis (extracted from the synthesis text) with generic limits: classical_limit, zero_coupling, large_N. Results are shown to the user as [PASS] / [WARN] / [FAIL] badges before the approval gate, and stored as PHYSICS_VERDICT.
If config.auto_reject_physics_failures = True, hypotheses receiving a CRITICAL FAIL verdict are filtered from the approved list (with a non-empty fallback in case all hypotheses fail).
Controlled by:
config.literature_plausibility_screen: bool = True
config.auto_reject_physics_failures: bool = False
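The auto-reject rule and its non-empty fallback can be sketched as a filter over per-hypothesis verdicts. The verdict strings and function name are illustrative. What the fallback actually returns is not specified here, so keeping the full original list is an assumption.

```python
def filter_hypotheses(verdicts, auto_reject=False):
    """Drop hypotheses with a CRITICAL FAIL verdict when auto_reject is on.

    verdicts maps hypothesis text to a verdict string such as
    "PASS", "WARN", "FAIL" or "CRITICAL FAIL".
    """
    hypotheses = list(verdicts)
    if not auto_reject:
        return hypotheses
    kept = [h for h in hypotheses if verdicts[h] != "CRITICAL FAIL"]
    # Never hand an empty list to the approval gate (assumed fallback: keep all).
    return kept or hypotheses
```

With auto_reject left at its default of False, the verdicts are only displayed as badges and nothing is removed.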
Cross-session pattern memory#
GPD’s ~/.gpd/ pattern store is the only persistent cross-session memory in the MTF pipeline. Patterns are used at three points:
- Literature pre-fetch: _prefetch_domain_patterns() calls lookup_pattern(category="convention-pitfall") per domain before the first literature fan-out. Results are stored as DOMAIN_PATTERNS and appear in LiteratureAgent prompt context.
- Fitting pre-fetch: _prefetch_fitting_warnings() calls lookup_pattern(category="sign-error"), lookup_pattern(category="convergence-issue"), and check_error_classes per hypothesis before the fitting fan-out. Results are stored as FITTING_WARNINGS and appear in FittingAgent prompt context.
- Review and recording: ReviewerAgent calls lookup_pattern before reviewing and add_pattern after finding new errors. LiteratureAgent and FittingAgent also have add_pattern as a tool to record systematic errors found during their work.
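The pre-fetch steps above share one shape: call a lookup once per domain and category, then collect the hits for the prompt context. A minimal sketch follows, with the lookup passed in as a callable. The domain keyword argument and the list-of-strings return type are assumptions, since only the category argument appears in this page.

```python
def prefetch_patterns(lookup, domains, categories):
    """Collect pattern hits per domain across the given categories.

    lookup(category=..., domain=...) -> list of pattern descriptions
    (a hypothetical stand-in for the lookup_pattern tool).
    """
    results = {}
    for domain in domains:
        hits = []
        for category in categories:
            hits.extend(lookup(category=category, domain=domain))
        results[domain] = hits
    return results


# Stub store keyed by (category, domain), used only for illustration.
_STORE = {("sign-error", "qft"): ["check the sign of the propagator term"]}


def fake_lookup(category, domain):
    return _STORE.get((category, domain), [])


warnings = prefetch_patterns(fake_lookup, ["qft", "gr"],
                             ["sign-error", "convergence-issue"])
print(warnings)
```

Domains with no recorded patterns map to an empty list, so downstream prompt assembly needs no special case.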
Usage#
# Recommended (GPD enabled by default)
mtf "anomalous resistivity plateau"
# Disable for faster iteration
mtf "anomalous resistivity plateau" --no-gpd
# Cross-domain
mtf "neutron star cooling anomaly" --physics-domains gr nuclear amo
# Specific GPD servers only
mtf "..." --gpd-servers verification errors
Python API#
config = MTFConfig(
    enable_gpd_mcp=True,
    physics_domains=["condensed_matter", "qft"],
    # Domain auto-detection
    auto_detect_domains=False,
    gpd_domain_detection_max_domains=3,
    # Literature plausibility screen
    literature_plausibility_screen=True,
    auto_reject_physics_failures=False,
    # Pre-exec convention validation
    fitting_convention_check=True,
    fitting_max_convention_retries=1,
)