Living document — updated with every release

SEER API Reference

v1.4 · Stable · Updated Feb 2025

Everything you need to integrate SEER — from your first observe() call to alerts, evaluations, and webhooks. Examples are in Python, Node.js, and cURL. If something's unclear or missing, email us and we'll fix it.

How SEER works #

SEER sits between your code and the model. You pass your model call through seer.observe() instead of calling the model API directly. SEER forwards it, gets the response back, analyses it in parallel, and returns everything to you — the model's normal output plus a seer object attached to the response.

Your AI does the same thing it always did. Your users see the same outputs. What changes is what you know: quality scores, token cost to the cent, exact latency, anomaly flags, and a specific recommended action when something's wrong. All of it inline, in the same response, zero extra round-trips.

Before and after — the only thing that changes
# Before SEER
response = openai.chat.completions.create(model="gpt-4o", messages=[...])

# After SEER — two lines different, everything else stays the same
result = client.observe(model="gpt-4o", messages=[...], context="support")

# Same model output you always got
print(result.choices[0].message.content)

# Plus this — the intelligence layer
print(result.seer.status) # HEALTHY / DEGRADED / ANOMALY
print(result.seer.quality_score) # 0–100
print(result.seer.cost_usd) # e.g. 0.0012
print(result.seer.action) # specific fix if something's wrong, else None
Overhead: SEER's analysis runs in parallel with the model call. The added latency is under 8ms at P99. Your users won't feel it.

Authentication #

Every request needs your API key in the Authorization header as a Bearer token. Create and manage keys in your dashboard under Settings → API Keys.

Header — every request
Authorization: Bearer sk-cdnc-your-key-here
Keep it private. Never commit your key to version control or use it client-side. Set it via environment variable: SEER_API_KEY=sk-cdnc-...

We recommend creating a separate key per environment. If staging gets compromised, you revoke that key without touching production.

Base URL #

All requests go to
https://api.seerlabs.api/v1

The /v1 version prefix is stable. We won't make breaking changes to v1 without at least 60 days' notice and a migration guide. When v2 ships, v1 keeps working until you choose to upgrade.

Quick start #

Install, add your key, wrap one model call. That's it. The seer object comes back in the same response.

Python
pip install cadence-seer
import os
from seer import SEER

client = SEER(api_key=os.environ["SEER_API_KEY"])

result = client.observe(
 model="gpt-4o",
 messages=[{"role": "user", "content": "Summarise this ticket..."}],
 context="support"  # optional — groups calls in your dashboard
)

print(result.choices[0].message.content) # the model's reply, unchanged
print(result.seer.quality_score) # e.g. 94.2
print(result.seer.cost_usd) # e.g. 0.0012
print(result.seer.action) # specific fix, or None
Node.js
npm install @cadence/seer
import { SEER } from '@cadence/seer'

const client = new SEER({ apiKey: process.env.SEER_API_KEY })

const result = await client.observe({
 model: 'gpt-4o',
 messages: [{ role: 'user', content: 'Summarise this ticket...' }],
 context: 'support'
})

console.log(result.choices[0].message.content)
console.log(result.seer.qualityScore)
console.log(result.seer.costUsd)
cURL — works with any language via REST
curl -X POST https://api.seerlabs.api/v1/observe \
 -H "Authorization: Bearer sk-cdnc-..." \
 -H "Content-Type: application/json" \
 -d '{
 "model": "gpt-4o",
 "messages": [{"role": "user", "content": "Summarise this ticket..."}],
 "context": "support"
 }'
Average setup time: 24 minutes. That includes installing the SDK, wrapping your first call, and getting a quality score back from a real model run.

POST /observe #

The core endpoint. Pass a model call through SEER — it runs the call, scores the result, and returns everything back to you inline.

POST /v1/observe — Observe & analyse a model call

Request fields

model (string, required) — Model to call. Any supported model string — gpt-4o, claude-3-5-sonnet-20241022, mistral-large-latest, etc. All providers supported.
messages (array, required) — Standard conversation array: [{"role": "user", "content": "..."}]. Works the same as the OpenAI and Anthropic APIs.
context (string, optional) — A label you choose — e.g. "customer-support" or "summariser". Groups calls in your dashboard and in recommendations. Highly recommended.
eval_rubric (string, optional) — Plain-English quality criteria. Example: "Reply must be under 100 words, friendly, and not mention pricing." SEER scores every output against this.
metadata (object, optional) — Any extra data you want attached to this call's log. Example: {"user_id": "u_123", "session": "s_456"}. Useful for linking calls to your own users.
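Assembled into a request body, the fields above look like this. A minimal Python sketch (only `model` and `messages` are required; `build_observe_payload` is an illustrative helper, not part of the SDK):

```python
import json

def build_observe_payload(model, messages, context=None, eval_rubric=None, metadata=None):
    """Build the JSON body for POST /v1/observe, including optional fields."""
    payload = {"model": model, "messages": messages}
    if context is not None:
        payload["context"] = context
    if eval_rubric is not None:
        payload["eval_rubric"] = eval_rubric
    if metadata is not None:
        payload["metadata"] = metadata
    return payload

body = build_observe_payload(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise this ticket..."}],
    context="support",
    eval_rubric="Reply must be under 100 words, friendly, and not mention pricing.",
    metadata={"user_id": "u_123", "session": "s_456"},
)
print(json.dumps(body, indent=2))
```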

Response — the seer object

Every response from /observe includes the model's normal output plus a seer object. Here it is in full:

result.seer — full example (200 OK)
status: "HEALTHY" — HEALTHY · DEGRADED · ANOMALY
quality_score: 94.2 — 0–100. Higher is better.
cost_usd: 0.0012 — exact dollar cost of this call
latency_ms: 312 — model time only, not SEER overhead
tokens.prompt: 840 — tokens in your input
tokens.completion: 400 — tokens in the model's output
anomaly: null — plain-English description if something's off, else null
action: null — specific fix if there's a problem, else null
call_id: "call_a1b2c3d4" — unique ID — use it to look up the trace in /logs
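If you ship these fields to your own metrics store, a small flattening helper keeps the mapping in one place. A sketch, assuming `result.seer` has been parsed into a plain dict with the fields above:

```python
def seer_log_record(seer):
    """Flatten a parsed seer object into one flat record for logging."""
    return {
        "call_id": seer["call_id"],
        "status": seer["status"],
        "quality_score": seer["quality_score"],
        "cost_usd": seer["cost_usd"],
        "latency_ms": seer["latency_ms"],
        # prompt + completion, handy for per-context token budgets
        "total_tokens": seer["tokens"]["prompt"] + seer["tokens"]["completion"],
        # anything non-HEALTHY, or any recommended action, deserves a look
        "needs_attention": seer["status"] != "HEALTHY" or seer["action"] is not None,
    }
```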

Status values #

Every call comes back with one of three statuses. Here's exactly what each one means and what to do about it.

HEALTHY
Everything is within your configured thresholds. Quality, cost, and latency are all normal for this context. No action needed.
DEGRADED
One or more metrics have drifted outside your thresholds — quality below your minimum, latency above your limit, or cost higher than usual. An alert will fire if configured.
ANOMALY
Something unusual happened — a sudden quality drop, a latency spike, or behaviour SEER hasn't seen before in this context. The anomaly field explains it in plain English.
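In code, the three statuses map to a small dispatch. An illustrative sketch (what you do per status is up to your application):

```python
def handle_seer_status(seer):
    """Pick a follow-up for each documented status value."""
    status = seer["status"]
    if status == "HEALTHY":
        return "ok"  # within thresholds, nothing to do
    if status == "DEGRADED":
        # a metric drifted past a threshold; the action field suggests a fix
        return f"degraded: {seer.get('action') or 'check dashboard'}"
    if status == "ANOMALY":
        # the anomaly field explains what looked unusual, in plain English
        return f"anomaly: {seer.get('anomaly') or 'unexplained'}"
    raise ValueError(f"unexpected status: {status}")
```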

POST /evaluate #

Run a batch of test cases through your prompt before you deploy it. SEER scores each case against your quality criteria and returns a clear pass/fail result. Plug this into your CI pipeline to block regressions before they hit production.

POST /v1/evaluate — Batch-test a prompt version
Python
result = client.evaluate(
 model="gpt-4o",
 prompt_version="support-v4.2",
 test_cases=[
 {"input": "How do I cancel?", "expected": "polite, clear, under 60 words"},
 {"input": "My order is damaged", "expected": "empathetic, offers solution"},
 ],
 pass_threshold=85  # any score below 85 = fail
)

print(result.passed) # True or False
print(result.pass_rate) # e.g. 0.92 → 92% of cases passed
print(result.failed_cases) # the cases that failed, with scores and why
Node.js
const result = await client.evaluate({
 model: 'gpt-4o',
 promptVersion: 'support-v4.2',
 testCases: [
 { input: 'How do I cancel?', expected: 'polite, clear, under 60 words' },
 { input: 'My order is damaged', expected: 'empathetic, offers solution' },
 ],
 passThreshold: 85
})

console.log(result.passed, result.passRate, result.failedCases)
CI integration: A pre-built GitHub Action runs your eval suite on every pull request and blocks the merge if quality drops. See the v1.3.0 changelog entry for setup details.
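If you'd rather wire the gate yourself than use the GitHub Action, the result fields shown above are all you need. A sketch of a CI step (`ci_gate` is illustrative; the field names follow the Python example):

```python
import sys

def ci_gate(result):
    """Return a process exit code: 0 if the eval suite passed, 1 otherwise."""
    if result["passed"]:
        print(f"eval passed: {result['pass_rate']:.0%} of cases met the threshold")
        return 0
    for case in result["failed_cases"]:
        # each failed case carries its score and the reason it fell short
        print(f"FAILED: {case}", file=sys.stderr)
    return 1

# In a CI step: sys.exit(ci_gate(client.evaluate(...)))
```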

POST /alerts #

Define the conditions that should trigger an alert. Once set, SEER watches every call against your rules automatically. Notifications go to Slack, email, or any webhook URL within 30 seconds of a breach.

POST /v1/alerts — Create or update an alert rule
Python
client.alerts.set(
 name="production-health",
 conditions={
 "quality_score_min": 80,       # alert if quality drops below 80
 "latency_max_ms": 2000,        # alert if response takes over 2s
 "cost_spike_multiplier": 3.0,  # alert if cost is 3× the 7-day avg
 "anomaly_any": True            # alert immediately on any anomaly
 },
 notify={
 "slack": "#ai-alerts",
 "email": "alerts@yourdomain.com"
 }
)

GET /logs #

Retrieve your full call history. Filter by time window, model, context, or status. Logs are retained for 7 days on Starter, 90 days on Growth, and indefinitely on Scale.

GET /v1/logs — Retrieve call history with filters
context (string) — Filter to calls with this context label, e.g. ?context=support
status (string) — HEALTHY, DEGRADED, or ANOMALY. Leave out to return all.
model (string) — Filter to a specific model, e.g. ?model=gpt-4o
from (ISO 8601) — Start of time window, e.g. ?from=2025-01-01T00:00:00Z
to (ISO 8601) — End of time window.
limit (integer) — Results to return. Max 1000. Default 100.
after_call_id (string) — Cursor for pagination. Pass the last call_id from the previous page.
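The `after_call_id` cursor walks the history page by page. A sketch of the loop, with the HTTP call injected as `fetch_page` so the pagination logic stands alone (in practice `fetch_page` would GET /v1/logs with those query parameters):

```python
def iter_logs(fetch_page, page_size=100):
    """Yield every log entry by following the after_call_id cursor."""
    cursor = None
    while True:
        page = fetch_page(after_call_id=cursor, limit=page_size)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return  # a short page means we've reached the end
        cursor = page[-1]["call_id"]  # next page starts after the last call_id
```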

GET /recommendations #

Get SEER's current ranked list of recommended improvements, generated from your actual call data and updated daily. Not generic advice — specific to how your stack is behaving right now.

GET /v1/recommendations — Ranked improvement actions from your data
Example response
{
 "recommendations": [
 {
 "rank": 1,
 "category": "latency",
 "title": "Add caching to your top 40 prompt patterns",
 "detail": "68% of your calls use near-identical prompts. A cache layer would serve ~40% of them instantly.",
 "estimated_impact": "+23% speed improvement"
 },
 {
 "rank": 2,
 "category": "cost",
 "title": "Switch your classifier from gpt-4o to gpt-4o-mini",
 "detail": "Scores 1.2pts lower — but costs 94% less per call.",
 "estimated_impact": "-$340/month"
 }
 ]
}
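Client-side, the response is easy to slice by category. A small sketch over the parsed JSON (`top_recommendation` is illustrative, not an SDK method):

```python
def top_recommendation(response, category=None):
    """Return the highest-ranked recommendation, optionally within one category."""
    recs = response["recommendations"]
    if category is not None:
        recs = [r for r in recs if r["category"] == category]
    # rank 1 is the most impactful, so the lowest rank wins
    return min(recs, key=lambda r: r["rank"]) if recs else None
```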

Error codes #

SEER uses standard HTTP status codes. Every error response includes a code string and a message written in plain English — not a cryptic internal code.

Error response shape
{ "code": "INVALID_REQUEST", "message": "Field 'model' is required but was not provided." }
400
INVALID_REQUEST
Something in your request is missing or formatted incorrectly. The message will tell you exactly which field and what's wrong.
401
UNAUTHORIZED
Your API key is missing or incorrect. Check the Authorization header.
403
FORBIDDEN
Your key doesn't have permission for this endpoint. Check you're using the right key for this environment or plan.
429
RATE_LIMITED
Too many requests. Check the X-RateLimit-Remaining and X-RateLimit-Reset headers to see when to retry.
503
MODEL_UNAVAILABLE
The underlying model provider returned an error. The provider's original error is in message.
500
INTERNAL_ERROR
Something went wrong on our end. Email us with your call_id and we'll investigate.
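A client can fold the table above into one decision: which errors are worth retrying. A sketch: 429 and 503 are the transient ones (the same codes the SDKs auto-retry); everything else needs a fix before resending:

```python
RETRYABLE_STATUSES = {429, 503}  # RATE_LIMITED and MODEL_UNAVAILABLE

def classify_error(status_code, body):
    """Map an error response to (code, message, should_retry).

    `body` is the documented error shape: {"code": ..., "message": ...}.
    """
    return (
        body.get("code", "UNKNOWN"),
        body.get("message", ""),
        status_code in RETRYABLE_STATUSES,
    )
```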

Rate limits #

Limits are per API key. Every response includes X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers so you can always see where you stand.

Starter — 60 / min · 50K / month
Growth — 600 / min · 500K / month
Scale — Custom · Unlimited
Need more? Email us with your usage pattern. We can increase limits quickly — often the same day.
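The reset header tells you when capacity returns. A sketch of the wait computation; note this assumes X-RateLimit-Reset holds a Unix timestamp, which the reference above doesn't pin down, so verify against a real 429 response before relying on it:

```python
import time

def seconds_until_reset(headers, now=None):
    """Seconds to wait before retrying after a 429 (0 if already reset).

    Assumes X-RateLimit-Reset is a Unix timestamp. That is an assumption,
    not something these docs specify.
    """
    now = time.time() if now is None else now
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)
```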

Webhooks #

SEER can POST to any URL when an alert fires. Connect it to PagerDuty, your own systems, Zapier, or anything that accepts HTTP. Every payload is signed — verify the signature before acting on it.

Webhook payload
{
 "event": "alert.fired",
 "alert_name": "production-health",
 "triggered_by": "quality_score_min",
 "value": 74.2,        // the value that breached
 "threshold": 80,      // your configured threshold
 "call_id": "call_a1b2c3d4",
 "context": "customer-support",
 "model": "gpt-4o",
 "timestamp": "2025-06-14T10:23:45Z",
 "action": "Review prompt changes deployed today. Quality dropped after v3 update."
}
Verify the signature — Python
import hmac, hashlib

def verify(payload_bytes, signature, secret):
 expected = hmac.new(secret.encode(), payload_bytes, hashlib.sha256).hexdigest()
 return hmac.compare_digest(expected, signature)

# In your webhook handler (how you return the response depends on your framework):
sig = request.headers['X-SEER-Signature']
if not verify(request.body, sig, os.environ['SEER_WEBHOOK_SECRET']):
    return 401  # reject the payload: the signature doesn't match

Python SDK #

Install
pip install cadence-seer
Initialise
from seer import SEER

client = SEER(
 api_key="sk-cdnc-...", # or SEER_API_KEY env var — preferred
 timeout=30, # seconds, default 30
 max_retries=3  # auto-retry on 429 and 503
)

Node.js SDK #

Install
npm install @cadence/seer
Initialise
import { SEER } from '@cadence/seer'

const client = new SEER({
 apiKey: process.env.SEER_API_KEY,
 timeout: 30000, // ms
 maxRetries: 3
})
v1.4.0 · Feb 2025
Node.js SDK was fully rewritten with TypeScript types throughout, async/await everywhere, and better error messages. Backwards compatible with v1 code.

Raw REST #

No SDK? No problem. SEER's API is plain JSON over HTTPS. Set Authorization, send JSON, get JSON back. Works in any language that can make HTTP requests.

Testing quickly? Every endpoint in these docs has a cURL example you can run directly from your terminal. Just swap in your API key.

LangChain #

One import, one callback. Every model call made by your chain or agent is automatically observed. No changes to your chain logic.

Python
from seer.integrations.langchain import SeerCallback

callback = SeerCallback(api_key=os.environ["SEER_API_KEY"])

# Pass to any LangChain chain or agent
chain.invoke(inputs, config={"callbacks": [callback]})

Other frameworks #

LlamaIndex, CrewAI, and AutoGen all work via the same pattern — one import, one callback or wrapper. The integration is always one line once the SDK is installed.

LlamaIndex
from seer.integrations.llamaindex import SeerCallbackManager
Settings.callback_manager = SeerCallbackManager(api_key=os.environ["SEER_API_KEY"])
CrewAI
from seer.integrations.crewai import SeerObserver
crew = Crew(agents=[...], tasks=[...], observers=[SeerObserver()])
v1.4.0 · Feb 2025
CrewAI integration added. LlamaIndex integration improved — now captures full agent trace, not just individual LLM calls.

Something missing or unclear? Email us.