Policy-Localizer-Validator | designpattern.fyi

Advantages

Each role uses the smallest sufficient model, so total cost can be lower than with a single monolithic model.
Failures attribute cleanly: bad plan, bad grounding, or bad commit decision.
The Validator provides an independent stop signal that is uncorrelated with the planner’s optimism.
Specialist VLMs can be fine-tuned from open-weight models without retraining the planner.
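
The attribution claim above can be illustrated with a minimal sketch. It assumes the three roles are plain callables and that the only thing the policy hands to the localizer is a textual action description. All names here (`FailureStage`, `StepResult`, `run_step`, `task_done_label`) are hypothetical and chosen for illustration; they are not taken from any published implementation of the pattern.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Tuple


class FailureStage(Enum):
    PLAN = "bad plan"
    GROUNDING = "bad grounding"
    COMMIT = "bad commit decision"


@dataclass
class StepResult:
    action_text: Optional[str]         # textual action description (the inter-model contract)
    target: Optional[Tuple[int, int]]  # grounded coordinates from the localizer
    done: bool                         # validator's stop decision
    failure: Optional[FailureStage] = None


def run_step(
    policy: Callable[[str], str],
    localizer: Callable[[str, str], Optional[Tuple[int, int]]],
    validator: Callable[[str, str], bool],
    observation: str,
    task_done_label: Optional[bool] = None,  # hypothetical ground truth, for offline evaluation only
) -> StepResult:
    """Run one policy -> localizer -> validator step and tag the failing stage, if any."""
    action_text = policy(observation)
    if not action_text:
        # The policy produced no usable action description.
        return StepResult(None, None, False, FailureStage.PLAN)

    target = localizer(observation, action_text)
    if target is None:
        # The plan was expressible, but nothing on screen matched it.
        return StepResult(action_text, None, False, FailureStage.GROUNDING)

    done = validator(observation, action_text)
    failure = None
    if task_done_label is not None and done != task_done_label:
        # The validator stopped too early or too late relative to the label.
        failure = FailureStage.COMMIT
    return StepResult(action_text, target, done, failure)


# Stub roles standing in for the three models.
policy = lambda obs: "click the Submit button"
localizer = lambda obs, text: (120, 48) if "Submit" in text else None
validator = lambda obs, text: False

result = run_step(policy, localizer, validator, "screenshot-0")
```

Because each stage returns early with its own tag, a logged `StepResult` indicates which model to inspect first. Real systems would likely need richer checks (for example, a plan that is well-formed but wrong), which this sketch does not attempt.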

Disadvantages

Three models mean three deployment targets, three training pipelines, and three versioning surfaces.
The inter-model interface (textual action description) becomes a contract that must stay stable.
The Validator must be calibrated; otherwise it stops too early or too late.
Until the Validator is trained on the target domain, completion judgments are weak.
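
One way to approach the calibration concern is to choose the Validator's stop threshold from labeled episodes in the target domain. The sketch below assumes the Validator emits a "done" score in [0, 1] and that a small set of labeled steps is available; the function names and the `max_premature_rate` budget are hypothetical, and this is only one simple thresholding strategy among many.

```python
from typing import Sequence, Tuple


def stop_error_rates(
    scores: Sequence[float], labels: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Return (premature_rate, late_rate) at a given threshold.

    premature: score >= threshold although the task is not done (stops too early).
    late:      score <  threshold although the task is done (stops too late).
    """
    not_done = [s for s, y in zip(scores, labels) if not y]
    done = [s for s, y in zip(scores, labels) if y]
    premature = sum(s >= threshold for s in not_done) / len(not_done) if not_done else 0.0
    late = sum(s < threshold for s in done) / len(done) if done else 0.0
    return premature, late


def pick_threshold(
    scores: Sequence[float], labels: Sequence[bool], max_premature_rate: float = 0.05
) -> float:
    """Pick the lowest observed score whose premature-stop rate fits the budget.

    The premature rate can only fall as the threshold rises, and the late rate
    can only grow, so the lowest threshold within budget also gives the lowest
    late rate among the candidates.
    """
    for t in sorted(set(scores)):
        premature, _ = stop_error_rates(scores, labels, t)
        if premature <= max_premature_rate:
            return t
    return float("inf")  # no candidate fits: never stop on the validator alone


# Toy labeled data (made up for illustration, not measured results).
scores = [0.1, 0.4, 0.6, 0.8, 0.9]
labels = [False, False, True, False, True]

strict = pick_threshold(scores, labels, max_premature_rate=0.0)
lenient = pick_threshold(scores, labels, max_premature_rate=0.34)
```

On this toy data the strict budget pushes the threshold up and makes late stops more likely, while the lenient budget accepts some premature stops in exchange. If the labeled episodes come from a different domain than deployment, the chosen threshold may not transfer.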