BYOK Scraper: bring your own scraping API key
When you need BYOK Scraper
AskVault's built-in scraper handles roughly 99% of websites out of the box. A plain HTTP fetch covers the easy cases in under 2 seconds, escalation to a headless browser for JavaScript-rendered sites takes about 5 seconds, and the anti-bot bypass logic for Cloudflare-protected hosts adds another 3 seconds.
The 1% that resists is usually one of three patterns:
- Aggressive anti-bot protection. The site explicitly targets scrapers with sophisticated detection. Some financial-services and ticketing sites are this hostile.
- Geo-restricted content. The site only serves visitors from specific countries, and AskVault's region might not match.
- Subscription content. The site requires a paid login that the customer has but AskVault doesn't.
For these cases, the customer can bring their own subscription to a premium scraping service. AskVault uses the customer's API key for that specific host, bypassing the built-in scraper.
Available on Business and above.
How it works
A typical scrape attempt goes through four tiers:
- Tier 1: plain HTTP fetch. Fast, free. Works on most pages.
- Tier 2: headless browser. Slower but handles JavaScript-rendered sites.
- Tier 3: built-in anti-bot bypass. Handles Cloudflare and similar challenges.
- Tier 4: BYOK Scraper. Your premium-scraper API key. Last resort.
The crawler tries each tier in sequence. If tier 4 (your BYOK key) succeeds, AskVault remembers the host needed it and uses your key for that host on every future fetch.
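The fallback order and the per-host memory can be sketched as follows. This is a minimal illustration, not AskVault's implementation: the `TieredCrawler` class, the tier names, and the convention that a fetcher returns `None` on failure are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Set, Tuple
from urllib.parse import urlparse

# A fetcher takes a URL and returns page content, or None if that tier failed.
Fetcher = Callable[[str], Optional[str]]


class TieredCrawler:
    """Hypothetical model of the four-tier fallback described above."""

    def __init__(self, tiers: List[Tuple[str, Fetcher]], byok_tier: str = "byok"):
        self.tiers = tiers                 # ordered, cheapest tier first
        self.byok_tier = byok_tier         # name of the tier that uses your key
        self.byok_hosts: Set[str] = set()  # hosts remembered as needing BYOK

    def fetch(self, url: str) -> Tuple[Optional[str], Optional[str]]:
        """Return (tier_name, content), or (None, None) if every tier failed."""
        host = urlparse(url).hostname or ""
        if host in self.byok_hosts:
            # This host needed BYOK before: go straight to the BYOK tier.
            candidates = [t for t in self.tiers if t[0] == self.byok_tier]
        else:
            candidates = self.tiers
        for name, fetcher in candidates:
            content = fetcher(url)
            if content is not None:
                if name == self.byok_tier:
                    self.byok_hosts.add(host)  # remember for future fetches
                return name, content
        return None, None
```

In this sketch the first fetch of a hard host pays for all four attempts, while later fetches of the same host call only the BYOK fetcher.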
Supported premium-scraper services
BYOK Scraper works with any service that provides an HTTP-based scraping API. AskVault has tested integrations for the major ones:
- ScrapingBee.
- Bright Data.
- ScrapFly.
- ZenRows.
- Firecrawl.
- Apify.
For other services with a similar API shape, configure under Settings > BYOK Scraper > Custom Provider. Provide the request format and AskVault adapts.
Setup
Three steps.
- Sign up for a premium scraper. Pick the service that fits your hosts (we recommend ScrapingBee or ScrapFly for general-purpose use). Subscribe at their lowest tier; AskVault only sends requests for pages on hard hosts, usually a small fraction of your total crawl.
- Generate an API key in their dashboard.
- Paste the key into AskVault under Settings > BYOK Scraper > [Provider] > API Key.
That's the setup. AskVault now uses your key for any host that resists the built-in tiers.
Per-host configuration
By default BYOK Scraper applies to any host that fails the built-in tiers. For tighter control, configure per-host rules under Settings > BYOK Scraper > Host Allowlist:
- Allowlist. Only use BYOK for these specific hosts. Other hosts that fail get reported as crawl errors instead.
- Force-use list. Use BYOK for these hosts immediately, skipping the built-in tier 1 to 3 attempts. Useful for hosts you know need it.
- Exclude list. Never use BYOK for these hosts. Useful for hosts you know are unreachable and you want a fast failure rather than premium-scrape cost.
Most teams use the default (auto-fallback) without explicit lists.
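How the three lists might combine can be sketched as a small decision function. The `HostRules` type, the return values, and the precedence (exclude first, then force-use, then allowlist) are assumptions for illustration; the text above does not say how AskVault resolves a host that appears on more than one list.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class HostRules:
    """Hypothetical container for the three per-host lists."""
    allowlist: Set[str] = field(default_factory=set)
    force_use: Set[str] = field(default_factory=set)
    exclude: Set[str] = field(default_factory=set)


def byok_decision(host: str, rules: HostRules) -> str:
    """Return 'never', 'immediately', or 'fallback' for a host.

    Assumed precedence: exclude > force-use > allowlist > default.
    """
    if host in rules.exclude:
        return "never"        # fail fast, no premium-scrape cost
    if host in rules.force_use:
        return "immediately"  # skip built-in tiers 1 to 3
    if rules.allowlist and host not in rules.allowlist:
        return "never"        # reported as a crawl error instead
    return "fallback"         # default: BYOK only after built-in tiers fail


rules = HostRules(force_use={"tickets.example"}, exclude={"dead.example"})
print(byok_decision("tickets.example", rules))  # immediately
print(byok_decision("blog.example", rules))     # fallback
```

With all three lists empty, every host gets `fallback`, which matches the default auto-fallback behavior.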
Cost tracking
BYOK Scraper requests are billed by your scraper provider, not by AskVault. Cost depends on:
- Provider pricing. Most charge $0.0005 to $0.005 per request, depending on the provider and plan.
- Pages per crawl. A 1,000-page initial crawl where 50 pages need BYOK means 50 BYOK requests, or $0.025 to $0.25 at those per-request rates.
- Re-sync frequency. Daily re-sync of changed pages adds ongoing BYOK cost.
AskVault tracks per-day BYOK requests under Settings > BYOK Scraper > Usage. Cross-reference with your provider's invoice.
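The arithmetic can be reproduced with a short estimator. The function names and the 30-day month are assumptions; the default per-request rates are the range quoted above, and you should substitute your provider's actual price.

```python
from typing import Tuple


def byok_cost_range(requests: int, low_rate: float = 0.0005,
                    high_rate: float = 0.005) -> Tuple[float, float]:
    """Return the (low, high) dollar cost for a number of BYOK requests."""
    return requests * low_rate, requests * high_rate


def monthly_cost_range(initial_byok_pages: int, resync_byok_pages_per_day: int,
                       days: int = 30) -> Tuple[float, float]:
    """Initial crawl plus daily re-sync of changed BYOK pages (assumed model)."""
    total_requests = initial_byok_pages + resync_byok_pages_per_day * days
    return byok_cost_range(total_requests)


low, high = byok_cost_range(50)
print(f"Initial crawl: ${low:.3f} to ${high:.2f}")
```

For example, 50 initial pages plus 5 re-synced BYOK pages per day for 30 days comes to 200 requests, or about $0.10 to $1.00 at the quoted rates.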
When BYOK isn't enough
For sites that even premium scrapers can't reach:
- Manual upload. Visit the page in your browser, save as PDF, upload to AskVault.
- Custom integration via custom_webhook. Your own server runs the scrape with whatever logic the host needs.
- Snippet entry. For one-off pages, manually paste the content as a snippet.
Compliance considerations
Premium scrapers don't automatically make your scraping compliant with target sites' terms of service. Three things to keep in mind:
- Respect robots.txt. Even with BYOK enabled, AskVault respects robots.txt. Sites that disallow / remain off-limits.
- Don't scrape competitor sites for training. Using premium scrapers to ingest competitor content for your AI agent is legally and ethically dicey. Stick to content you own or have permission to use.
- Subscription content. If you're using BYOK for a subscription site, you must have a valid subscription. Sharing the credential with users who aren't covered by that subscription may violate the source's TOS.
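To check ahead of time whether a host's robots.txt would block a crawl, Python's standard `urllib.robotparser` can evaluate the rules locally. This is a sketch for your own pre-checks, not a description of AskVault's internals; the user-agent string `ExampleBot` is a placeholder, since the crawler's real user agent is not stated here.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate robots.txt text locally, without any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A site that disallows / blocks every path for every crawler.
blocked = "User-agent: *\nDisallow: /\n"
print(allowed_by_robots(blocked, "ExampleBot", "https://example.com/pricing"))  # False

# A site that only disallows one directory leaves the rest reachable.
partial = "User-agent: *\nDisallow: /private/\n"
print(allowed_by_robots(partial, "ExampleBot", "https://example.com/pricing"))  # True
```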
These are your responsibility, not AskVault's. We don't gate scrapes by content legality; we trust customers to scrape within their rights.
Limits
- Plan availability. Business and above.
- API key storage. Encrypted at rest in your workspace's secret store. Never logged or surfaced in audit logs.
- Provider configurations. Up to 5 different providers configured per workspace.
Common pitfalls
Provider rejects every request. API key expired or wrong format. Test the key directly against the provider's API; if that fails, regenerate.
BYOK fires on hosts that don't need it. Auto-fallback is too aggressive. Configure a host allowlist to limit BYOK to specific hosts.
Provider cost spirals. A host with many pages is hitting BYOK on every page. Investigate whether the host has a sitemap (sitemap-crawl might work instead) or move the content to manual upload.
API key visible in logs. It shouldn't be. AskVault redacts API keys from audit logs by design. Report immediately if you see one.
FAQ
Does BYOK affect AskVault's own crawler costs?
No. BYOK runs on top of AskVault's standard crawl, and your provider bills you separately from AskVault.
Can I use BYOK for cookies-required content?
Yes, if your scraper provider supports cookie passing. Configure cookies under Settings > BYOK Scraper > Cookies alongside the API key.
What happens if my BYOK key is rate-limited?
AskVault detects the rate-limit response, backs off, and retries with exponential delay. After 5 failures, the host enters a temporary skip-list for one hour.
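The retry behavior can be modeled roughly as below. The five-failure limit and the one-hour skip come from the answer above; the 1-second base delay, the doubling factor, and the `RateLimitTracker` class are assumptions, since the actual delays are not documented here.

```python
import time
from typing import Callable, Dict, Optional

MAX_FAILURES = 5     # from the text: five failures trigger the skip-list
SKIP_SECONDS = 3600  # from the text: the skip lasts one hour
BASE_DELAY = 1.0     # assumption: first retry waits 1 second, then doubles


class RateLimitTracker:
    """Hypothetical model of back-off plus a temporary per-host skip-list."""

    def __init__(self, clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self.clock = clock
        self.sleep = sleep
        self.skip_until: Dict[str, float] = {}

    def is_skipped(self, host: str) -> bool:
        return self.clock() < self.skip_until.get(host, float("-inf"))

    def fetch(self, host: str, attempt: Callable[[], Optional[str]]) -> Optional[str]:
        """Call attempt() up to MAX_FAILURES times; None means rate-limited."""
        if self.is_skipped(host):
            return None
        for failure in range(MAX_FAILURES):
            result = attempt()
            if result is not None:
                return result
            if failure < MAX_FAILURES - 1:
                # Exponential delay: 1s, 2s, 4s, 8s between attempts.
                self.sleep(BASE_DELAY * 2 ** failure)
        # All attempts were rate-limited: skip this host for one hour.
        self.skip_until[host] = self.clock() + SKIP_SECONDS
        return None
```

Injecting `clock` and `sleep` keeps the sketch testable without real waiting; in normal use the defaults apply.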
Can I rotate BYOK keys?
Yes. Generate a new key in your provider's dashboard, then update it in AskVault's settings. The new key applies immediately.
Does this work with my Enterprise contract?
Yes. Enterprise contracts often include support for unusual scraping setups, including BYOK with multiple providers per host or fail-safe fallback chains.
Related guides
- URL crawling
- HTTP-first scraper architecture
- How to scrape a JavaScript-rendered website
- Snippets
- Recurring sync