Documentation · v0.10

Self-host the whole prompt & agent layer.

Eight resources, four storage drivers, three SDKs. Read the docs you'd write yourself.

v0.10 — traces, runs, evals, labels · 29 endpoints
Quickstart · 5 min
API reference · 29 endpoints
Storage drivers · 4 backends
Self-host guide · prod-ready
Quickstart · 5 min

Install, generate a key, and create your first prompt.

PromptMetrics ships as a single Node binary. Pick a path, set API_KEY_SALT, and you're running.

shell · npm
# 1. Install + start
$ npm install -g promptmetrics
$ export API_KEY_SALT=$(openssl rand -hex 32)
$ promptmetrics-server                    # listening on :3000

# 2. Mint a key (workspace=default, scopes=read,write)
$ node $(npm root -g)/promptmetrics/dist/scripts/generate-api-key.js \
    --workspace default read,write
# → pm_a91f0b3c…  (store this once)

# 3. Create a prompt
$ promptmetrics create-prompt --file welcome.json
✓ welcome · v1.0.0 · committed 7a3fe2c
Master keys cross workspaces.
Pass --workspace '*' to generate-api-key to create a key that ignores the X-Workspace-Id header. Use sparingly — it's an admin tool.
Authentication

Two headers. Three scopes.

Every endpoint except /health requires X-API-Key. Workspaces are scoped via X-Workspace-Id. Keys are HMAC-SHA256 hashed with a server-side salt — only the hash is stored.
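The hashing scheme above can be sketched offline. This is illustrative only: the key and salt values are invented, and the server's exact encoding may differ.

```shell
# HMAC-SHA256 the raw key with the server-side salt; only the hex digest
# would ever be stored. Key and salt here are made-up examples.
API_KEY="pm_a91f0b3c0000000000000000"
API_KEY_SALT="f2c1d8a94b7e3c6d"
HASH=$(printf '%s' "$API_KEY" | openssl dgst -sha256 -hmac "$API_KEY_SALT" -r | cut -d' ' -f1)
echo "$HASH"    # 64 hex characters; the raw key is never persisted
```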

read
GET endpoints — prompts, logs, traces, runs, labels, evals.
write
POST/PATCH/DELETE on prompts, logs, traces, spans, runs, labels, eval results.
admin
API keys, audit logs, config, evaluation suite delete.
curl
curl http://localhost:3000/v1/prompts/welcome \
  -H "X-API-Key: pm_xxxxxxxxxxxxxxxx" \
  -H "X-Workspace-Id: eu-prod"
Prompts · /v1/prompts

Git-backed prompt storage.

Each version is an immutable Git tag (prompts/{name}/{version}) backed by a JSON blob. SQLite indexes name → version_tag → commit_sha for sub-millisecond reads.
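Illustratively, the tag layout described above can be reproduced in a local stand-in repo (this is a sketch of the naming pattern, not the server's actual repository):

```shell
# A throwaway repo whose tags follow the documented prompts/{name}/{version} pattern.
git init -q tag-demo
git -C tag-demo -c user.name=docs -c user.email=docs@example.com \
  commit -q --allow-empty -m "welcome v1.0.0"
git -C tag-demo tag prompts/welcome/1.0.0
git -C tag-demo tag -l 'prompts/welcome/*'
# → prompts/welcome/1.0.0
```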

POST /v1/prompts
{
  "name": "welcome",
  "version": "1.0.0",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user",   "content": "Hello {{name}}!" }
  ],
  "variables": { "name": { "type": "string", "required": true } },
  "model_config": { "model": "gpt-4o", "temperature": 0.7 },
  "tags": ["greeting"]
}
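Variable rendering substitutes `{{name}}` at read time. A minimal sketch of the substitution semantics (not the server's implementation):

```shell
# What rendering does to the user message above, given name="Ada".
TEMPLATE='Hello {{name}}!'
printf '%s\n' "$TEMPLATE" | sed 's/{{name}}/Ada/'
# → Hello Ada!
```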
Traces & spans · new in v0.10

First-class agent telemetry.

Track agent loops with trace_id → spans. Spans can nest via parent_id. Status is ok or error. metadata accepts up to 50 top-level keys with nested objects and arrays.

POST /v1/traces
{
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "prompt_name": "welcome",
  "version_tag": "1.0.0",
  "metadata": { "agent": "headline-agent", "loop": 1 }
}
POST /v1/traces/:trace_id/spans
{
  "name": "llm-call",
  "status": "ok",
  "start_time": 1000,
  "end_time": 2500,
  "metadata": { "model": "gpt-4o", "tokens_in": 120, "tokens_out": 340 }
}
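Assuming millisecond timestamps as in the span above, duration is just the difference between the two fields:

```shell
# start_time/end_time from the example span; duration in ms.
START=1000
END=2500
echo $(( END - START ))
# → 1500
```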
Workflow runs · new in v0.10

High-level outcome → low-level steps.

Runs track end-to-end workflow executions. A run can optionally link to a trace_id, letting you drill from "this workflow failed" to "this span errored." Status: running · completed · failed.

POST /v1/runs
{
  "workflow_name": "headline-generator",
  "input":  { "topic": "AI" },
  "trace_id": "550e8400-e29b-41d4-a716-446655440001",
  "metadata": { "agent": "headline-v2" }
}
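To close a run out, update its status via PATCH. A hedged sketch: the run ID and the output shape are illustrative, not values the API guarantees.

```shell
# Mark the run completed and attach its output (illustrative payload).
curl -X PATCH http://localhost:3000/v1/runs/run_7a3fe2c \
  -H "X-API-Key: pm_xxxxxxxxxxxxxxxx" \
  -H "X-Workspace-Id: eu-prod" \
  -H "Content-Type: application/json" \
  -d '{"status": "completed", "output": {"headline": "AI, explained"}}'
```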
Prompt labels

Resolve production instead of v1.4.2.

Labels are pointers. A label name is unique per prompt (enforced by a unique constraint). Move labels to do staged rollouts without redeploys. Labels are workspace-scoped.

POST /v1/prompts/:name/labels
{
  "name": "production",
  "version_tag": "1.0.0"
}
// → 201 Created — or 409 Conflict if "production" already exists
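One way to move production to a new version, assuming there is no dedicated move call (the endpoint list only shows POST and DELETE), is delete-then-recreate:

```shell
# Repoint "production" from 1.0.0 to 1.1.0 (assumption: no atomic move endpoint).
curl -X DELETE http://localhost:3000/v1/prompts/welcome/labels/production \
  -H "X-API-Key: pm_xxxxxxxxxxxxxxxx" \
  -H "X-Workspace-Id: eu-prod"
curl -X POST http://localhost:3000/v1/prompts/welcome/labels \
  -H "X-API-Key: pm_xxxxxxxxxxxxxxxx" \
  -H "X-Workspace-Id: eu-prod" \
  -H "Content-Type: application/json" \
  -d '{"name": "production", "version_tag": "1.1.0"}'
```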
Evaluations · new in v0.10

Score prompts. Track quality over time.

Define a suite with criteria (free-form JSON). Submit results tied to a run_id. Query history over time; deleting a suite cascades to its results.

POST /v1/evaluations
{
  "name": "accuracy-check",
  "description": "Check output accuracy",
  "prompt_name": "welcome",
  "version_tag": "1.0.0",
  "criteria": { "min_score": 0.8 }
}
POST /v1/evaluations/:id/results
{
  "run_id": "run_7a3fe2c",
  "score": 0.95,
  "metadata": { "judge": "gpt-4" }
}
Endpoint reference

All 29 endpoints, grouped.

Prompts · 4 endpoints
GET /v1/prompts · List prompts (paginated, searchable).
GET /v1/prompts/:name · Get a prompt with optional variable rendering.
GET /v1/prompts/:name/versions · List all versions of a prompt.
POST /v1/prompts · Create a new prompt version.
Logs · 2 endpoints
POST /v1/logs · Log metadata for an LLM request.
GET /v1/logs · List logs with pagination.
Traces · new in v0.10 · 5 endpoints
POST /v1/traces · Create a trace for an agent loop.
GET /v1/traces · List traces with pagination.
GET /v1/traces/:trace_id · Fetch a trace with all spans.
POST /v1/traces/:trace_id/spans · Add a span to an existing trace.
GET /v1/traces/:trace_id/spans/:span_id · Get a single span by ID.
Runs · new in v0.10 · 4 endpoints
POST /v1/runs · Create a workflow run.
GET /v1/runs · List runs with pagination.
GET /v1/runs/:run_id · Get a run by ID.
PATCH /v1/runs/:run_id · Update run status / output / metadata.
Labels · 4 endpoints
POST /v1/prompts/:name/labels · Tag a prompt version.
GET /v1/prompts/:name/labels · List all labels for a prompt.
GET /v1/prompts/:name/labels/:label_name · Resolve a label to a version.
DELETE /v1/prompts/:name/labels/:label_name · Remove a label.
Evaluations · new in v0.10 · 6 endpoints
POST /v1/evaluations · Create an evaluation suite.
GET /v1/evaluations · List evaluations.
GET /v1/evaluations/:id · Get a single evaluation.
POST /v1/evaluations/:id/results · Add a result to an evaluation.
GET /v1/evaluations/:id/results · List results.
DELETE /v1/evaluations/:id · Delete an evaluation (cascades).
API Keys · 3 endpoints
POST /v1/api-keys · Create an API key (admin scope).
GET /v1/api-keys · List API keys (admin scope).
DELETE /v1/api-keys/:id · Revoke an API key.
Audit · 1 endpoint
GET /v1/audit-logs · Query audit logs (admin scope).
Production

Deploy to production.

The default SQLite setup handles millions of traces on a single node. When you need multi-node, switch to Postgres with one environment variable.

docker-compose.yml
version: "3.8"
services:
  promptmetrics:
    image: promptmetrics/promptmetrics:latest
    ports:
      - "3000:3000"
    environment:
      - API_KEY_SALT=${API_KEY_SALT}
      - DATABASE_URL=${DATABASE_URL:-sqlite:./pm.db}
      - DRIVER=${DRIVER:-filesystem}
    volumes:
      - ./data:/app/data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3

Environment variables

Variable | Default | Required | Description
API_KEY_SALT | — | yes | Server-side salt for HMAC-SHA256 key hashing.
DATABASE_URL | sqlite:./pm.db | no | Connection string. SQLite by default; set postgres:// for multi-node.
DRIVER | filesystem | no | Storage driver: filesystem, github, s3, or postgres.
GITHUB_REPO | — | no | owner/repo when DRIVER=github.
S3_BUCKET | — | no | Bucket name when DRIVER=s3.
OTEL_EXPORTER_OTLP_ENDPOINT | — | no | OpenTelemetry collector endpoint.

Health checks & upgrades

curl
curl -f http://localhost:3000/health
docker compose
docker compose pull
docker compose up -d
Migration

Migrate from Langfuse or LangSmith.

Both platforms export standard formats that PromptMetrics can ingest. Most migrations finish in under an hour.

export from langfuse
curl https://api.langfuse.com/api/public/traces \
  -H "Authorization: Basic $(echo -n 'pk:sk' | base64)" \
  --output traces.json
convert + import
npx promptmetrics migrate --from langfuse --input traces.json --endpoint http://localhost:3000
verify
curl http://localhost:3000/v1/traces/count -H "X-API-Key: pm_xxx"
curl http://localhost:3000/v1/prompts/count -H "X-API-Key: pm_xxx"

Field mapping

Langfuse field | PromptMetrics field | Notes
trace.id | trace_id | UUID string. Preserved verbatim.
trace.name | prompt_name | Maps to the prompt that started the trace.
trace.timestamp | created_at | ISO 8601 → Unix ms.
observation.name | span.name | Each observation becomes a span.
observation.startTime | span.start_time | Relative to trace start.
observation.endTime | span.end_time | Computed duration stored as metadata.
observation.metadata | span.metadata | Merged; nested objects flattened one level.
score.value | eval_result.score | Requires a pre-created evaluation suite.
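The ISO 8601 → Unix ms conversion in the mapping above can be checked with GNU date (BSD/macOS date takes different flags):

```shell
# Convert an ISO 8601 timestamp to Unix milliseconds: epoch seconds × 1000.
ISO="2026-04-24T00:00:00Z"
echo $(( $(date -u -d "$ISO" +%s) * 1000 ))
# → 1776988800000
```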

Rollback strategy

dry-run
npx promptmetrics migrate --from langfuse --input traces.json --endpoint http://localhost:3000 --dry-run
snapshot before import
cp pm.db pm.db.pre-migration-$(date +%s)
# or for Postgres:
# pg_dump $DATABASE_URL > pm-pre-migration.sql
Storage drivers

Pick the backend that matches your team.

filesystem

Default. Local JSON files at ./prompts/{name}/{version}.json. Best for single-node, dev, and air-gapped.

DRIVER=filesystem
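The on-disk layout for one stored version would look like this (file contents abbreviated to a sketch):

```shell
# One JSON blob per version, at the documented prompts/{name}/{version}.json path.
mkdir -p prompts/welcome
echo '{"name": "welcome", "version": "1.0.0"}' > prompts/welcome/1.0.0.json
ls prompts/welcome/
# → 1.0.0.json
```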
github

Bare-clone + Contents API. Sync interval configurable via GITHUB_SYNC_INTERVAL_MS. Webhook-driven instant sync supported.

DRIVER=github
GITHUB_REPO=owner/repo
GITHUB_TOKEN=ghp_…
s3

Object-storage backed. Works with MinIO via S3_ENDPOINT. Keys: prompts/{name}/{version}.json under S3_PREFIX.

DRIVER=s3
S3_BUCKET=…
S3_REGION=…
postgres

Set DATABASE_URL to swap SQLite for Postgres. The DatabaseAdapter abstracts both — same SQL, multi-node ready.

DATABASE_URL=postgres://…
Last updated: April 24, 2026 · v0.10