Status page and incident communication

Where to find the status page

Visit status.askvault.co from any browser. No login required.

The page shows:

  • Overall status. Operational / partial outage / major outage.
  • Per-component status. API, widget, dashboard, channels (WhatsApp, Slack, etc.).
  • Active incidents. Description, affected components, last update.
  • Maintenance windows. Planned maintenance with start/end times.
  • Past 90 days of uptime per component.

What's monitored

16 distinct components are tracked, in three groups:

Core:

  • API (api.askvault.co).
  • Dashboard (askvault.co/dashboard).
  • Marketing site (askvault.co).
  • Authentication system.
  • Database tier.

Channels:

  • Widget delivery.
  • WhatsApp routing.
  • Telegram routing.
  • Slack routing.
  • Discord routing.
  • SMS routing.
  • Voice / telephony.
  • Email assistant.
  • Hosted page rendering.

Knowledge:

  • Indexing pipeline.
  • Knowledge retrieval.

Each component reports independently. A WhatsApp outage doesn't mark the widget as down.

How status is determined

Two signals:

  • Automated probes. External monitoring services hit critical paths every 60 seconds. Multiple failures within 5 minutes trigger an incident.
  • Internal metrics. Error rates, p95 latency, and queue depth tracked by our observability stack. A threshold breach triggers an incident.

A component shows "Operational" only when both signals are clear.

Manual status overrides are allowed during postmortem analysis but are rare in practice.
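The probe rule above ("multiple failures within 5 minutes") can be sketched as a sliding-window counter. This is only an illustration: the page does not say how many failures count as "multiple", so the threshold of 3 is an assumption, and `ProbeWindow` is a hypothetical name rather than anything AskVault ships.

```python
from collections import deque

PROBE_INTERVAL_S = 60      # from the text: probes run every 60 seconds
WINDOW_S = 5 * 60          # from the text: failures are counted within 5 minutes
FAILURE_THRESHOLD = 3      # assumption: the text only says "multiple failures"


class ProbeWindow:
    """Tracks recent probe failures for one component (hypothetical helper)."""

    def __init__(self, window_s=WINDOW_S, threshold=FAILURE_THRESHOLD):
        self.window_s = window_s
        self.threshold = threshold
        self.failures = deque()  # timestamps (seconds) of failed probes

    def record(self, timestamp_s, ok):
        """Record one probe result; return True if an incident should open."""
        if not ok:
            self.failures.append(timestamp_s)
        # Drop failures that have aged out of the window.
        while self.failures and timestamp_s - self.failures[0] >= self.window_s:
            self.failures.popleft()
        return len(self.failures) >= self.threshold


window = ProbeWindow()
results = [True, False, False, True, False]  # one probe per minute
triggered = [window.record(i * PROBE_INTERVAL_S, ok) for i, ok in enumerate(results)]
print(triggered)
```

With these assumed settings, the third failure inside the window is the one that would open an incident; isolated failures spaced more than 5 minutes apart never would.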

Subscribing to alerts

Three subscription channels:

Email.

  1. Click "Subscribe" on the status page.
  2. Enter email.
  3. Pick components you care about (all, or specific channels).
  4. Confirm via verification email.

Get notified at incident start and resolution.

SMS. Same flow, but enter a phone number instead of an email. Useful for on-call rotations.

RSS / webhook. RSS feed at status.askvault.co/history.rss. Webhook subscription via the API for programmatic alerting (PagerDuty, Opsgenie, your own systems).

Incident lifecycle

Each incident progresses through:

  1. Investigating. We've detected a problem and are investigating the root cause.
  2. Identified. Cause known; remediation in progress.
  3. Monitoring. Fix deployed; observing to confirm resolution.
  4. Resolved. Service back to normal; postmortem in progress.

Updates are posted at each transition, plus every 30 to 60 minutes while the incident is open.
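If you mirror incident state in your own tooling, the four stages above map naturally onto a small state machine. The sketch below assumes stages only move forward in the listed order; the page does not say whether an incident can skip a stage or move backwards (for example from Monitoring back to Investigating), so treat that as an assumption. All names are illustrative.

```python
from enum import Enum


class IncidentState(Enum):
    INVESTIGATING = "investigating"
    IDENTIFIED = "identified"
    MONITORING = "monitoring"
    RESOLVED = "resolved"


# Forward-only transitions, in the order the stages are listed above.
# Assumption: no stage is skipped and none is revisited.
NEXT_STATE = {
    IncidentState.INVESTIGATING: IncidentState.IDENTIFIED,
    IncidentState.IDENTIFIED: IncidentState.MONITORING,
    IncidentState.MONITORING: IncidentState.RESOLVED,
}


def advance(state):
    """Return the next lifecycle stage, or raise if the incident is resolved."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.name} is a terminal state")
    return NEXT_STATE[state]


state = IncidentState.INVESTIGATING
path = [state]
while state in NEXT_STATE:
    state = advance(state)
    path.append(state)
print([s.name for s in path])
```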

Postmortems

After every customer-impacting incident:

  • Postmortem published within 48 hours of resolution.
  • Root cause analysis (technical detail).
  • Timeline of events from detection to resolution.
  • Customer impact summary (which workspaces, what duration).
  • Action items to prevent recurrence.

Published to the status page archive. Subscribers get an email when posted. Useful for your own audit trails.

Past-90-days uptime

For each component, the status page shows:

  • Daily uptime bar for the last 90 days.
  • Total uptime %.
  • Incidents during the period.

Green bars: 100% uptime that day. Yellow: partial outage. Red: major outage.

Subscribers and prospects use this for SLA evaluation.
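For a sense of how a daily bar relates to the total uptime figure, here is the underlying arithmetic. It is a sketch only: the page does not describe how the status page aggregates probe data, so this assumes downtime is counted in whole minutes per day, and the 90-day history is made up.

```python
MINUTES_PER_DAY = 24 * 60


def daily_uptime_pct(downtime_minutes):
    """Uptime percentage for one day, given minutes of detected downtime."""
    return 100.0 * (MINUTES_PER_DAY - downtime_minutes) / MINUTES_PER_DAY


def period_uptime_pct(daily_downtime_minutes):
    """Uptime percentage across a run of days (e.g. the 90-day window)."""
    total = MINUTES_PER_DAY * len(daily_downtime_minutes)
    down = sum(daily_downtime_minutes)
    return 100.0 * (total - down) / total


# Hypothetical 90-day history: one 35-minute incident, otherwise clean.
history = [0] * 90
history[41] = 35
print(round(daily_uptime_pct(35), 3))
print(round(period_uptime_pct(history), 4))
```

A single short incident makes one day's bar noticeably worse while barely moving the 90-day total, which is why both views are worth reading together.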

Maintenance windows

Planned maintenance is announced at least 48 hours in advance:

  • Subject line: "Scheduled maintenance - [component] - [date]".
  • Description: what's being changed, expected impact, duration.
  • Start time in multiple timezones.

Most maintenance is zero-impact (rolling deploys, capacity changes). Occasionally maintenance requires 5 to 15 minutes of downtime; we batch these into low-traffic windows on weekends.

SLA reference

Per-plan uptime commitments:

  • Free, Starter. Best effort. No SLA.
  • Growth. 99.5% target.
  • Business. 99.9% target.
  • Enterprise. 99.95% or custom per contract.

See SLA per plan for full details including credits.

The status page is the authoritative source for uptime calculation.
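To make the targets concrete, the percentages can be converted into a downtime budget. The sketch below assumes a 30-day measurement window, which is an illustration only; the actual measurement period and credit rules are defined on the SLA per plan page, not here.

```python
SLA_TARGETS_PCT = {
    "Growth": 99.5,
    "Business": 99.9,
    "Enterprise": 99.95,
}


def downtime_budget_minutes(target_pct, days=30):
    """Minutes of downtime a target allows over `days` days (assumed window)."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - target_pct) / 100.0


for plan, target in SLA_TARGETS_PCT.items():
    print(f"{plan}: {downtime_budget_minutes(target):.1f} min per 30 days")
```

Under that assumption, 99.5% allows about 216 minutes of downtime in 30 days, 99.9% about 43.2 minutes, and 99.95% about 21.6 minutes.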

Integration with your monitoring

For teams running their own monitoring:

Webhook subscription. AskVault posts to your webhook on every incident transition.

{
  "event": "incident.created",
  "incident_id": "inc_xxx",
  "component": "whatsapp_routing",
  "severity": "major",
  "started_at": "2026-05-15T10:00:00Z",
  "title": "WhatsApp message delivery delayed"
}
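On the receiving side, a handler might filter these payloads before paging anyone. The sketch below uses only the field names from the sample above. The lowercase `"critical"` value and the paging policy are assumptions (the sample shows only `"major"`), and `handle_webhook` is a hypothetical function, not part of any AskVault SDK.

```python
import json

# Severities worth paging for. The levels come from this page; the exact
# string values other than "major" and the policy itself are assumptions.
PAGE_ON = {"major", "critical"}


def handle_webhook(body, watched_components):
    """Parse a webhook body and return an alert string, or None to ignore it."""
    payload = json.loads(body)
    if payload.get("component") not in watched_components:
        return None
    if payload.get("severity") not in PAGE_ON:
        return None
    title = payload.get("title", "(no title)")
    incident_id = payload.get("incident_id", "(no id)")
    return f"[{payload['severity']}] {title} ({incident_id})"


sample = """{
  "event": "incident.created",
  "incident_id": "inc_xxx",
  "component": "whatsapp_routing",
  "severity": "major",
  "started_at": "2026-05-15T10:00:00Z",
  "title": "WhatsApp message delivery delayed"
}"""
print(handle_webhook(sample, {"whatsapp_routing"}))
print(handle_webhook(sample, {"slack_routing"}))
```

The first call produces an alert string; the second returns `None` because the component is not on the watch list.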

RSS aggregation. Add status.askvault.co/history.rss to your team's RSS reader or status dashboard.
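If you would rather pull the feed into a script than a reader, the standard library is enough. This sketch assumes the feed is plain RSS 2.0 with `<item>`, `<title>`, and `<pubDate>` elements; the page does not document the feed's structure, and the sample XML below is hand-written, not real feed output.

```python
import xml.etree.ElementTree as ET


def parse_incident_feed(rss_xml):
    """Return (title, pubDate) pairs from an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        published = item.findtext("pubDate", default="")
        items.append((title, published))
    return items


# Hand-written sample in standard RSS 2.0 shape (an assumption about the feed).
sample_feed = """<rss version="2.0"><channel>
  <title>Status history</title>
  <item>
    <title>WhatsApp message delivery delayed</title>
    <pubDate>Fri, 15 May 2026 10:00:00 GMT</pubDate>
  </item>
</channel></rss>"""
print(parse_incident_feed(sample_feed))
```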

Status API.

curl https://status.askvault.co/api/v2/status.json

Returns current status as JSON. Useful for embedding live status in your own internal dashboard.
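The same request can be made from a script. Because this page does not document the response schema, the sketch below only fetches and parses the JSON without assuming any field names. `fetch_status` and its `opener` parameter are illustrative, and the demonstration uses a stand-in opener with a made-up payload so it runs without network access.

```python
import io
import json
import urllib.request

STATUS_URL = "https://status.askvault.co/api/v2/status.json"  # from the curl example


def fetch_status(url=STATUS_URL, opener=urllib.request.urlopen, timeout=10):
    """Fetch the status document and return it as parsed JSON."""
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Offline demonstration: a stand-in opener returning a made-up payload.
# This is NOT the real response shape, which this page does not document.
def fake_opener(url, timeout):
    return io.BytesIO(b'{"example": true}')


demo = fetch_status(opener=fake_opener)
print(demo)
```

Calling `fetch_status()` with no arguments performs the real HTTP request; inspect the returned structure before building a dashboard on top of it.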

What counts as an incident

Three severity levels:

  • Minor. Latency above normal but service functional. Few customers notice.
  • Major. A component is degraded or down. Most customers using that component affected.
  • Critical. Multiple components down or full outage. Most customers affected.

We publish all three. Some platforms hide minor incidents; we don't.

Sample incident timeline

An illustrative flow:

10:00 UTC. Probes detect 30% error rate on WhatsApp delivery. Incident "WhatsApp message delivery delayed" published as Investigating.

10:08 UTC. Root cause identified: upstream provider rate-limit. Updated to Identified.

10:15 UTC. Workaround deployed (alternative routing). Updated to Monitoring.

10:35 UTC. Error rate normal for 20 minutes. Updated to Resolved.

Day 2. Postmortem published with full timeline, root cause, and action items.

Subscribers see emails at each transition. Total customer impact: 35 minutes of delayed (not lost) messages.

Planned features (on the roadmap)

None of these are available yet; they are listed so the current behavior is clear:

  • Workspace-scoped incidents. Today, incidents are global. Planned: per-workspace incidents (e.g., "your specific WhatsApp number provider is having issues").
  • Custom-component status. Today, the set of monitored components is fixed. Planned: custom monitored endpoints for Enterprise contracts.
  • Status page embed. Today, you can only link to the external status page. Planned: an embeddable widget showing the components relevant to you.

Limits

  • Probe frequency. Every 60 seconds.
  • Status update latency. Within 5 minutes of detection.
  • Postmortem publication. Within 48 hours of resolution.
  • Historical retention. 18 months of past incidents publicly visible.

Common pitfalls

Not subscribed to alerts. Easy to miss incidents. Subscribe via email or webhook.

Filtering for the wrong component. If you subscribed to "All" but only care about WhatsApp, update your preferences to filter by component.

Trusting "Operational" during an in-progress investigation. Status updates lag detection by 1 to 5 minutes; if your monitoring sees issues before the status page updates, trust your monitoring.

Confusing planned maintenance with outage. Maintenance windows are pre-announced and typically zero-impact. Check the maintenance section, not just incidents.

FAQ

Where is the status page hosted?

Separate infrastructure from the main AskVault API. If AskVault is fully down, the status page is still up.

How accurate is the uptime number?

Calculated from automated probe data. Probe frequency is every 60 seconds; smallest detectable downtime is about 2 minutes.

Can I get SLA credits via the status page?

No. Credits are processed through support tickets. The status page is informational; SLA enforcement happens separately. See SLA per plan.

Will the bot warn me about ongoing incidents during a chat?

Not today. An automatic warning, integrated with the status API, is planned.

Can I see status data going back beyond 90 days?

Past incidents are archived for 18 months. Uptime is shown per day for the last 90 days; older data is aggregated by month.
