Prompt Versioning
Treat prompts as immutable, hashed, semver'd artifacts in a registry — deploy and roll back like code, tie eval results to specific versions.
Intent & Description
🎯 Intent
Treat prompts as immutable, hashed, semver’d artifacts in a registry — deploy and roll back like code.
📋 Context
Multiple engineers edit prompts, sometimes inline in code, sometimes through a management tool. Without versioning, you can’t answer “what exact prompt was live when this bad output was generated?” and rolling back a prompt requires a code redeploy.
💡 Solution
Prompts live in a registry as immutable, hashed, version-tagged artifacts. Code references prompts by name + version (semver). Deployments pin specific versions; rollback is a version change. The eval harness ties metric outcomes to prompt versions. Optionally sign artifacts for provenance.
Real-world Use Case
- Prompts are edited often and audit, rollback, or A/B comparison is required.
- Eval outcomes need to be tied to specific prompt versions.
- A registry can hold immutable, hashed, semver-tagged artifacts.
Source
📌 TL;DR
Version your prompts like code — immutable, hashed, semver’d artifacts in a registry so you can roll back a bad prompt in seconds and tie every eval result to a specific version.
Advantages
- Prompt rollback without a code redeploy — just pin the previous version.
- Eval results map to specific, reproducible prompt versions.
Disadvantages
- Registry infrastructure adds setup and maintenance overhead.
- Version-pinning means prompts stop tracking model upgrades automatically — requires intentional bumps.