Getting started with the AskVault API
What you can build with the AskVault API
The AskVault REST API exposes the same RAG pipeline that powers the AskVault widget, WhatsApp, Telegram, and live chat. With one HTTP endpoint you can build a custom chat interface, embed AI search inside your own product, drive an internal helpdesk bot, or pipe answers into Slack, Discord, or a customer-success dashboard.
Every API response is grounded in the documents you've indexed in your workspace. The API doesn't hallucinate. If the answer isn't in your knowledge base, it returns confidence: "low" and a graceful fallback message so you can route low-confidence queries to a human via webhook.
There are two endpoints. POST /v1/query returns a synchronous JSON response. POST /v1/query/stream streams the response as Server-Sent Events. Streaming starts in under 300 ms and completes in 1.5 to 4 seconds depending on response length.
Before you start
You need three things to make your first API call:
- An AskVault workspace with content indexed. Sign up at askvault.co/signup (free) and run the onboarding wizard to crawl a website or upload a PDF. Indexing 50 pages takes about 90 seconds.
- An API key in the format ak_xxx. Generate one under Dashboard > API Keys.
- Any HTTP client: curl, Postman, the Python requests library, or whatever you already use.
The free plan includes API access. Starter is ₹2,499 a month for 3,000 queries. Growth is ₹4,999 for 15,000. Business is ₹8,499 for 50,000.
Generate an API key
- Open Dashboard > API Keys.
- Click Create new key.
- Give it a recognizable name like production-website, staging, or mobile-app.
- Pick a rate limit. The default matches your plan; tighter per-key limits are useful when you're exposing AskVault to end users from a multi-tenant SaaS.
- Click Generate. The key (ak_5b45ff_...) is shown once. Copy it now. AskVault stores only a SHA-256 hash; we can't recover the original.
Treat API keys like passwords. Store them in environment variables (ASKVAULT_API_KEY), not in source code. Rotate them every quarter. If a key leaks, revoke it immediately under Dashboard > API Keys > Revoke.
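A minimal Python sketch of the environment-variable approach. The helper name load_api_key is hypothetical, and the ak_ prefix check is an assumption based on the key format shown in this guide, not a documented validation rule.

```python
import os


def load_api_key(env_var: str = "ASKVAULT_API_KEY") -> str:
    """Read the API key from the environment and fail fast if it is absent."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the app")
    # Assumption: keys start with "ak_", as in the examples in this guide.
    if not key.startswith("ak_"):
        raise RuntimeError(f"{env_var} does not look like an AskVault key (expected an 'ak_' prefix)")
    return key
```

Failing at startup rather than on the first request makes a missing or mistyped key easier to spot in deployment logs.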
Your first request
Send your first chat query. Replace ak_xxx with your real API key and wt_xxx with your workspace ID (visible in Dashboard → Settings → General).
curl:

    curl -X POST https://api.askvault.co/v1/query \
      -H "Authorization: Bearer ak_xxx" \
      -H "Content-Type: application/json" \
      -d '{
        "workspace_id": "wt_xxx",
        "message": "How do I reset my password?",
        "top_k": 5
      }'

Python:

    import os
    import requests

    response = requests.post(
        "https://api.askvault.co/v1/query",
        headers={
            "Authorization": f"Bearer {os.environ['ASKVAULT_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "workspace_id": "wt_xxx",
            "message": "How do I reset my password?",
            "top_k": 5,
        },
        timeout=15,
    )
    response.raise_for_status()
    print(response.json())

JavaScript:

    const response = await fetch("https://api.askvault.co/v1/query", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.ASKVAULT_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        workspace_id: "wt_xxx",
        message: "How do I reset my password?",
        top_k: 5,
      }),
    });
    const data = await response.json();
    console.log(data);

A successful response looks like this:
    {
      "answer": "To reset your password, go to askvault.co/forgot-password, enter your email, and click the reset link we send you. The link expires in 60 minutes.",
      "sources": [
        {
          "document_id": "doc_a1b2c3",
          "document_title": "Password reset flow",
          "url": "https://docs.askvault.co/account/password-reset/",
          "relevance_score": 0.94,
          "snippet": "Visit /forgot-password, enter your email..."
        }
      ],
      "confidence": "high",
      "model": "askvault-standard",
      "tokens_used": 187,
      "latency_ms": 612,
      "request_id": "req_5b45ff_xxx"
    }

The sources array is what makes AskVault answers verifiable. Every claim in answer traces back to a snippet you indexed. Store the request_id in your logs. If a customer ever complains about an answer, you can replay the exact retrieval in the dashboard's request inspector.
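One way to consume this response in Python is sketched below: it logs the request_id, collects the citations, and flags low-confidence answers for human handoff. The function name summarize_response and the logger name are hypothetical. The field names are taken from the sample response, and how you actually route a flagged query is left to your application.

```python
import logging

logger = logging.getLogger("askvault_client")  # hypothetical logger name


def summarize_response(data: dict) -> dict:
    """Extract the answer, its citations, and a human-handoff flag from a /v1/query response."""
    # Keep the request_id in logs so the request can be looked up later.
    logger.info(
        "askvault request_id=%s confidence=%s",
        data.get("request_id"),
        data.get("confidence"),
    )
    citations = [
        f"{src['document_title']} ({src['url']})"
        for src in data.get("sources", [])
    ]
    return {
        "answer": data["answer"],
        "citations": citations,
        # Low-confidence answers are candidates for routing to a human.
        "needs_human": data.get("confidence") == "low",
    }
```

Calling summarize_response(response.json()) after the Python request shown earlier would return a dictionary that a chat UI could render directly.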
Stream responses with SSE
For real-time UX (typing-effect chat interfaces, low-perceived-latency interactions), use the streaming endpoint. It uses Server-Sent Events over HTTP.
    const response = await fetch("https://api.askvault.co/v1/query/stream", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.ASKVAULT_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        workspace_id: "wt_xxx",
        message: "How does pricing work?",
      }),
    });

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = ""; // holds a partial line when an event is split across chunks

    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters intact across chunk boundaries
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split("\n");
      buffer = lines.pop(); // the last element may be an incomplete line
      for (const line of lines) {
        if (!line.startsWith("data: ")) continue;
        const event = JSON.parse(line.slice(6));
        if (event.type === "token") process.stdout.write(event.text);
        else if (event.type === "source") console.log("Source:", event.document_title);
        else if (event.type === "done") console.log("\nLatency:", event.latency_ms, "ms");
        else if (event.type === "error") console.error("Stream error:", event);
      }
    }

Event types you'll receive: token (partial answer chunks), source (a citation as it's selected), done (final stats), and error (any failure). The full SSE schema is in the streaming reference.
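The same event handling can be sketched in Python. The helpers parse_sse_lines and collect_answer are hypothetical names. The event fields (type, text) are taken from the JavaScript example above, and the shape of error events is an assumption because this guide does not show one.

```python
import json
from typing import Iterable, Iterator


def parse_sse_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON event per 'data: ' line, skipping blank and other lines."""
    for line in lines:
        if line and line.startswith("data: "):
            yield json.loads(line[len("data: "):])


def collect_answer(lines: Iterable[str]) -> str:
    """Concatenate token events into the full answer; raise on an error event."""
    parts = []
    for event in parse_sse_lines(lines):
        if event.get("type") == "token":
            parts.append(event["text"])
        elif event.get("type") == "error":
            # Assumption: error events carry enough detail to include in the message.
            raise RuntimeError(f"stream failed: {event}")
        elif event.get("type") == "done":
            break
    return "".join(parts)
```

With the requests library, the lines could come from a request made with stream=True and read through response.iter_lines(), decoding each line to text before passing it in.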
Request parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| workspace_id | string | Yes | The workspace to query (found in dashboard Settings) |
| message | string | Yes | The user's question (1 to 4,000 characters) |
| top_k | integer | No | Number of context chunks to retrieve (1 to 10, default 5) |
| temperature | number | No | Answer creativity (0.0 to 1.0, default 0.3), keep low for factual support |
| strictness | string | No | "strict" to refuse answers not in KB, "helpful" (default) to combine KB with reasoning |
| document_ids | string[] | No | Restrict retrieval to specific document IDs (URL allowlist behavior) |
| conversation_id | string | No | Continue a multi-turn conversation, passes prior messages as context |
| user_id | string | No | Anonymous end-user identifier, enables per-user rate limiting + analytics |
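Checking these ranges on the client side can catch mistakes before a request is sent. The sketch below uses a hypothetical helper, build_query_payload; the limits and defaults come from the table above. Whether the server rejects out-of-range values in the same way is not stated in this guide.

```python
from typing import Optional


def build_query_payload(
    workspace_id: str,
    message: str,
    top_k: int = 5,
    temperature: float = 0.3,
    strictness: str = "helpful",
    user_id: Optional[str] = None,
    conversation_id: Optional[str] = None,
) -> dict:
    """Validate parameters against the documented ranges and build the JSON body."""
    if not 1 <= len(message) <= 4000:
        raise ValueError("message must be 1 to 4,000 characters")
    if not 1 <= top_k <= 10:
        raise ValueError("top_k must be between 1 and 10")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if strictness not in ("strict", "helpful"):
        raise ValueError('strictness must be "strict" or "helpful"')
    payload = {
        "workspace_id": workspace_id,
        "message": message,
        "top_k": top_k,
        "temperature": temperature,
        "strictness": strictness,
    }
    # Optional fields are sent only when provided.
    if user_id is not None:
        payload["user_id"] = user_id
    if conversation_id is not None:
        payload["conversation_id"] = conversation_id
    return payload
```

The returned dictionary can be passed as the json argument of requests.post.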
Rate limits
The Starter plan allows 60 requests/minute and 3,000 requests/day per API key. Growth allows 200 requests/minute and 15,000/day. Business allows 1,000 requests/minute and 50,000/day. Enterprise is custom.
Every response includes rate-limit headers so you can avoid hitting the cap:
    X-RateLimit-Limit-Minute: 60
    X-RateLimit-Remaining-Minute: 47
    X-RateLimit-Limit-Day: 3000
    X-RateLimit-Remaining-Day: 2845
    X-RateLimit-Reset-Day: 1747353600

When you exceed the limit you get HTTP 429 Too Many Requests back, with Retry-After in seconds. Back off for that long, then retry. Full reference: rate limits per plan.
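The back-off rule can be sketched as a small retry wrapper. The function call_with_backoff is a hypothetical helper that takes any callable returning (status, headers, body), so it is independent of the HTTP client. The one-second fallback for a missing Retry-After header and the default of three attempts are assumptions, not documented behavior.

```python
import time
from typing import Callable, Mapping, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, Mapping[str, str], object]],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call send() and, on HTTP 429, wait for Retry-After seconds before retrying."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts:
            break
        # Assumption: fall back to 1 second if the header is missing or not a number.
        try:
            delay = float(headers.get("Retry-After", 1))
        except (TypeError, ValueError):
            delay = 1.0
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

With requests, send could be a small function that posts the query and returns (response.status_code, response.headers, response). Injecting sleep keeps the wrapper easy to test without real delays.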
Authentication errors
The most common failures and what to check:
| Status | Meaning | Fix |
|---|---|---|
| 401 | Missing or malformed Authorization header | Header must be exactly Bearer ak_xxx. No "Token" or "Key" prefix. |
| 401 | Invalid or revoked API key | The key was deleted or never existed. Generate a new one in the dashboard. |
| 403 | Workspace not owned by this API key | The workspace_id doesn't belong to the API key's owner. Use a key from the same workspace owner. |
| 404 | Workspace not found | The workspace_id is wrong or the workspace was deleted. |
Every error response includes a JSON body with detail describing the problem. Log it for debugging, but never show raw error bodies to end users.
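A sketch of that separation between logs and user-facing text follows. The function name, logger name, and message wording are all illustrative. Only the status codes and the detail field come from this guide.

```python
import logging

logger = logging.getLogger("askvault_client")  # hypothetical logger name

# User-facing messages are illustrative; the status codes are the ones covered in this guide.
SAFE_MESSAGES = {
    401: "We couldn't authenticate with the answer service.",
    403: "This workspace isn't available to the current key.",
    404: "The requested workspace could not be found.",
    429: "We're receiving a lot of questions right now. Please try again shortly.",
}


def user_facing_error(status: int, body: dict) -> str:
    """Log the raw detail for debugging and return a generic message for the end user."""
    logger.error("askvault error status=%s detail=%s", status, body.get("detail"))
    return SAFE_MESSAGES.get(status, "Something went wrong. Please try again.")
```

The raw detail goes only to the log, so nothing from the error body reaches the end user.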
FAQ
How is API usage billed?
Each /v1/query and /v1/query/stream request that returns a successful answer counts as 1 query against your plan's monthly quota. Failed requests (4xx/5xx) don't count. Indexing operations (uploading documents, crawling URLs) are billed separately by content volume in MB.
Can I use the API from a browser?
Not directly. The API key would be exposed in client-side code. For browser usage, either proxy through your own backend, or use the embeddable widget which authenticates with a public workspace token plus per-domain rate limiting.
How do I support multiple end users?
Pass a unique user_id per end user. AskVault uses it for per-user rate limiting, conversation-history isolation, and analytics. If you're building multi-tenant SaaS, also use the conversation_id field to keep each user's conversation context isolated.
What's the maximum message length?
4,000 characters. Longer messages get truncated. For document analysis ("summarize this 50-page PDF"), upload the document to the knowledge base instead. The agent reads it via retrieval, not as a raw prompt.
Does the API have an SDK?
Official JavaScript and Python clients are on the roadmap. Until they ship, the API is OpenAPI-spec compatible. You can generate a client in any language from openapi.askvault.co/openapi.json.
Related guides
- API authentication: the Bearer ak_xxx flow in depth
- POST /v1/query reference
- POST /v1/query/stream and SSE event types
- Rate limits per plan
- Error codes and recovery
- Webhooks: receive events when conversations happen
- How AskVault grounds answers in your content (RAG explained)