
How to ingest WordPress content as a knowledge source

What this integration does

Connecting WordPress enables two flows:

  1. Knowledge ingestion. Posts, pages, and custom post types are ingested as a knowledge source. The bot retrieves from them when answering customer questions.
  2. content_recommender skill. When a visitor's question relates to a topic you've published about, the bot suggests "Read more: [post link]", pulling from your indexed WordPress content. Available on Growth and above.

For e-commerce stores running WooCommerce on WordPress, also enable the WooCommerce integration to get the order-status flow.

When to use this integration vs simple URL crawling

Two ways to get WordPress content into AskVault:

  • URL crawling. Crawls your WordPress site like any other website. Works on every WordPress install and requires no API setup, but is slower per page (3 to 8 seconds via web fetch).
  • WordPress integration. Use the WP REST API directly. Faster per-page (200 to 500 ms). Pulls structured metadata (categories, tags, author). Works for private and password-protected content.

If your WordPress site is fully public, URL crawling is simpler. If you have private content, custom post types, or want metadata-rich retrieval, the WordPress integration is better.
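
To make the difference concrete, here is a minimal sketch of the kind of request the WP REST API route involves. The /wp-json/wp/v2/ route and the page, per_page, and _fields parameters are standard WordPress; the function name and the chosen field list are illustrative assumptions, not AskVault's actual implementation.

```python
from urllib.parse import urlencode


def collection_url(site_url: str, rest_base: str = "posts",
                   page: int = 1, per_page: int = 100) -> str:
    """Build the URL for one page of a WP REST API collection.

    `rest_base` is "posts" or "pages" for the built-in types.
    The field list below is an illustrative choice.
    """
    if not 1 <= per_page <= 100:
        # WordPress rejects per_page values above 100.
        raise ValueError("per_page must be between 1 and 100")
    query = urlencode({
        "page": page,
        "per_page": per_page,
        # _fields trims the response to the listed top-level keys.
        "_fields": "id,link,modified_gmt,title,content",
    })
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/{rest_base}?{query}"


print(collection_url("https://example.com/", "pages", page=2))
```

One request returns up to 100 items as structured JSON, which is why the API route can be faster per page than fetching and parsing rendered HTML one URL at a time.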

Setup

Five minutes end-to-end.

  1. In your WordPress admin, confirm that Application Passwords is available. It is built into WordPress 5.6+; on older installs, install the Application Passwords plugin.
  2. Generate an Application Password. Go to Users > Profile > Application Passwords. Name: "AskVault". Click Add New Application Password. Copy the 24-character password shown.
  3. In AskVault, open Knowledge > Add Source > WordPress > Add Site.
  4. Paste the site URL, your WordPress username, and the Application Password.
  5. Configure which post types to ingest. Default: posts and pages. Add custom post types if you have them.
  6. Save.

AskVault validates the credentials, walks every item of the selected post types, and begins indexing. Indexing 100 posts takes about 3 minutes; 1,000 posts take about 30 minutes.
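
If you want to check the credentials yourself before saving, the sketch below builds the HTTP Basic header that WordPress Application Passwords use. Sending it in a GET to the standard /wp-json/wp/v2/users/me endpoint should return your user profile when the credentials are valid. The helper names and the sample credentials are hypothetical, and this is not necessarily how AskVault performs its own validation.

```python
import base64


def basic_auth_header(username: str, app_password: str) -> dict[str, str]:
    """Return the Authorization header for a WordPress Application Password.

    WordPress shows the password in space-separated groups; it can be
    pasted as shown.
    """
    raw = f"{username}:{app_password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def current_user_url(site_url: str) -> str:
    """URL of the endpoint that returns the authenticated user."""
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/users/me"


# Hypothetical credentials, for illustration only.
headers = basic_auth_header("editor", "abcd efgh ijkl mnop qrst uvwx")
print(current_user_url("https://example.com"), sorted(headers))
```

Always use an https:// site URL here: Basic authentication sends the credentials with every request.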

What gets ingested

Content types (posts and pages are included by default; custom post types once you select them):

  • Posts (blog articles).
  • Pages (about, contact, policy pages, etc.).
  • Custom post types such as Products, Portfolio, or Events.

For each piece of content, AskVault pulls:

  • Title.
  • Body (HTML-stripped to text).
  • Categories and tags (used as metadata for filtering).
  • Author.
  • Published date.
  • Featured image alt text.
  • Custom fields (configurable mapping).
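
As a rough illustration of the fields listed above, the following sketch turns one post object, shaped like a WP REST API response, into a plain-text document. The title.rendered, content.rendered, categories, tags, author, and date_gmt keys are standard WordPress response fields; the output layout and the tag-handling rules are assumptions, not AskVault's internal format.

```python
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent blocks don't run together.
_BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextExtractor(HTMLParser):
    def __init__(self) -> None:
        super().__init__()  # entities such as &amp; are decoded by default
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in _BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in _BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html: str) -> str:
    """Drop tags, decode entities, and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def to_document(post: dict) -> dict:
    """Map a WP REST post object to a flat document (illustrative layout)."""
    return {
        "title": strip_html(post["title"]["rendered"]),
        "body": strip_html(post["content"]["rendered"]),
        # The REST API returns term and author IDs, not names.
        "category_ids": post.get("categories", []),
        "tag_ids": post.get("tags", []),
        "author_id": post.get("author"),
        "published": post.get("date_gmt"),
    }
```

Category, tag, and author names need a separate lookup (or the _embed parameter), since the post object carries only their numeric IDs.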

Comments aren't indexed by default. Enable under Integrations > WordPress > Include Comments if you want them.

Sync behavior

  • Daily sync (default). Runs at midnight UTC. Picks up new posts, updates changed ones, removes deleted ones from the index.
  • Webhook sync (optional). Configure the WP Webhooks plugin to fire on post-publish/update. AskVault re-indexes within 60 seconds.
  • Manual sync. Trigger one anytime from Knowledge Hub > WordPress > Resync now.
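
Conceptually, a sync pass compares what is already indexed with what the site currently reports and derives three work lists. The sketch below shows one way to do that, assuming each side is a mapping from post ID to its last-modified timestamp (WordPress exposes this as modified_gmt). The function name and data shape are hypothetical, not AskVault's actual sync logic.

```python
def plan_sync(indexed: dict[int, str], remote: dict[int, str]) -> dict[str, list[int]]:
    """Decide which post IDs to add, re-index, or remove.

    Both arguments map post ID -> last-modified timestamp string.
    """
    shared = indexed.keys() & remote.keys()
    return {
        "add": sorted(remote.keys() - indexed.keys()),
        "update": sorted(pid for pid in shared if indexed[pid] != remote[pid]),
        "remove": sorted(indexed.keys() - remote.keys()),
    }


indexed = {1: "2024-01-01T00:00:00", 2: "2024-01-01T00:00:00", 3: "2024-01-01T00:00:00"}
remote = {2: "2024-02-01T00:00:00", 3: "2024-01-01T00:00:00", 4: "2024-02-02T00:00:00"}
print(plan_sync(indexed, remote))
# {'add': [4], 'update': [2], 'remove': [1]}
```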

Private and password-protected content

The WP REST API can read draft posts and password-protected pages if your Application Password user has the right capabilities. By default AskVault ingests only published content. To include drafts and private posts:

  1. Integrations > WordPress > Visibility Filter > All Statuses.
  2. Confirm your user has edit_posts capability or higher.

For password-protected pages, AskVault submits the password during ingestion. Configure passwords per page under Integrations > WordPress > Password Map.
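
At the REST API level, these two settings correspond to standard WordPress query parameters: status on collection requests and password on single-item requests. The sketch below shows how such requests could be formed. The helper names are hypothetical, and the exact set of statuses behind "All Statuses" is an assumption.

```python
from urllib.parse import urlencode


def status_query(include_unpublished: bool) -> str:
    """Query string selecting which post statuses a collection request returns."""
    # Assumption: "All Statuses" is modeled here as publish + draft + private.
    statuses = ["publish", "draft", "private"] if include_unpublished else ["publish"]
    return urlencode({"status": ",".join(statuses)}, safe=",")


def protected_item_url(site_url: str, rest_base: str, item_id: int, password: str) -> str:
    """URL for a single password-protected post or page."""
    query = urlencode({"password": password})
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/{rest_base}/{item_id}?{query}"


print(status_query(True))
# status=publish,draft,private
print(protected_item_url("https://example.com", "pages", 42, "s3 cret"))
# https://example.com/wp-json/wp/v2/pages/42?password=s3+cret
```

Requests for non-published statuses must be authenticated; WordPress rejects them if the user lacks the required capability.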

Custom post types

Many WordPress sites have custom post types like Products, Properties, Vehicles, Events. AskVault discovers them automatically. Pick which to include under Integrations > WordPress > Post Types.
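
WordPress lists its REST-enabled post types at /wp-json/wp/v2/types, keyed by slug, with a rest_base for each. A discovery step could filter that response roughly as follows. The exclusion rules are an assumption for illustration, and AskVault's own discovery may differ.

```python
# Built-in types that are not custom content.
_BUILT_IN = {"post", "page", "attachment", "nav_menu_item"}


def custom_post_types(types_response: dict) -> dict[str, str]:
    """Return {slug: rest_base} for custom post types.

    `types_response` is the parsed JSON from /wp-json/wp/v2/types.
    Slugs starting with "wp_" are skipped, since core uses that prefix
    for internal types such as wp_block and wp_template.
    """
    return {
        slug: info["rest_base"]
        for slug, info in types_response.items()
        if slug not in _BUILT_IN and not slug.startswith("wp_")
    }


sample = {
    "post": {"rest_base": "posts"},
    "page": {"rest_base": "pages"},
    "wp_block": {"rest_base": "blocks"},
    "event": {"rest_base": "events"},
}
print(custom_post_types(sample))
# {'event': 'events'}
```

A custom post type appears in this endpoint only if it was registered with show_in_rest enabled, so a type missing here cannot be read through the REST API.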

Each post type can have its own:

  • File pattern (which fields to include).
  • Audience tags (apply at-ingestion or at-retrieval).
  • Retrieval boost (rank some post types higher).

Multilingual sites (WPML, Polylang)

Multilingual WordPress sites are supported. AskVault indexes each language version as a separate document, and the bot retrieves the version matching the conversation language.

For better multilingual handling, configure the language map under Integrations > WordPress > Language Map. It maps WP locale codes (en_US, fr_FR) to AskVault language identifiers.
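
A language map is essentially a lookup table from WordPress locale to language identifier. The sketch below shows the idea, including a fallback for locales that aren't mapped explicitly. The identifiers "en" and "fr" and the fallback rule are assumptions for illustration; check the Language Map screen for the identifiers AskVault actually accepts.

```python
from typing import Optional

# Hypothetical mapping; the right-hand values are assumed identifiers.
LANGUAGE_MAP = {"en_US": "en", "fr_FR": "fr"}


def resolve_language(wp_locale: str, language_map: dict[str, str]) -> Optional[str]:
    """Map a WP locale to a language identifier, or None if unknown."""
    if wp_locale in language_map:
        return language_map[wp_locale]
    # Fallback: reuse an entry that shares the language part (fr_CA -> fr_FR).
    prefix = wp_locale.split("_", 1)[0]
    for locale, language in language_map.items():
        if locale.split("_", 1)[0] == prefix:
            return language
    return None


print(resolve_language("fr_CA", LANGUAGE_MAP))
# fr
```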

Limits

  • WP REST API rate limits. Most hosts don't throttle hard. AskVault makes about 5 to 20 requests per minute during sync.
  • Post count cap. No hard cap. Sites with 10,000+ posts take 2 to 4 hours for initial indexing.
  • Content size per workspace. 5 MB on Free, 15 MB on Starter, 40 MB on Growth, 100 MB on Business.

Common pitfalls

Authentication fails. Application Passwords require WordPress 5.6+ or, on older installs, the Application Passwords plugin. Check the version first, then confirm the username and password were pasted correctly.

Posts ingest but pages don't. The Page post type wasn't selected during setup. Add it under Integrations > WordPress > Post Types.

Custom field content missing. AskVault indexes standard content by default. Custom fields require explicit mapping under Integrations > WordPress > Field Mapping.

Sync stops mid-way. The usual cause is a WP REST API rate limit on your host. Reduce sync concurrency under Integrations > WordPress > Concurrency.
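
If you script your own calls against a rate-limited host, the usual remedy is to retry with exponential backoff rather than fail on the first throttled response. The sketch below is a generic illustration of that technique, not AskVault's sync code. The fetch callable, the retryable status codes, and the delays are all assumptions.

```python
import time
from typing import Callable


def fetch_with_backoff(fetch: Callable[[], tuple[int, bytes]],
                       max_attempts: int = 5,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> bytes:
    """Call `fetch` until it returns HTTP 200, backing off when throttled.

    `fetch` returns (status_code, body). Statuses 429 and 503 are retried
    with delays of base_delay, 2*base_delay, 4*base_delay, and so on.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body
        out_of_attempts = attempt == max_attempts - 1
        if status not in (429, 503) or out_of_attempts:
            raise RuntimeError(f"request failed with HTTP {status}")
        sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Simulated host that throttles twice, then succeeds.
responses = iter([(429, b""), (429, b""), (200, b"ok")])
print(fetch_with_backoff(lambda: next(responses), sleep=lambda s: None))
# b'ok'
```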

FAQ

Does this work on WordPress.com (hosted)?

Yes, on WordPress.com Business plan and above (where REST API + Application Passwords are enabled). Free and Personal plans don't expose the REST API.

Does it work with WP Engine or Kinsta?

Yes. Both hosts run standard WordPress with full REST API access.

Can I exclude specific posts from indexing?

Yes. Either:

  • Tag posts with noindex_askvault (a custom taxonomy AskVault recognizes).
  • Configure URL pattern exclusions under Integrations > WordPress > Exclude URLs.
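
To preview which URLs a set of exclusion patterns would catch, you can test them locally. The sketch below assumes shell-style wildcards matched against the URL path; the pattern syntax AskVault actually supports may differ, so treat this as an approximation.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_excluded(url: str, patterns: list[str]) -> bool:
    """True if the URL's path matches any shell-style pattern.

    Assumption: patterns apply to the path only, and "*" matches
    any run of characters, including "/".
    """
    path = urlsplit(url).path
    return any(fnmatchcase(path, pattern) for pattern in patterns)


patterns = ["/internal/*", "/tag/*"]
print(is_excluded("https://example.com/internal/roadmap/", patterns))
# True
print(is_excluded("https://example.com/blog/hello-world/", patterns))
# False
```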

Does this expose draft posts to the chatbot?

Only if you explicitly enable draft visibility. Default is published-only.

How does this compare to the URL crawl for SEO content?

URL crawl gets you the rendered HTML the public sees. WordPress integration gets you the source content and metadata via API. Crawl is simpler; integration is richer.
