Built-in website crawling tools

Web Scraper (Built-in)

When a Site has a Website URL, two built-in tools are enabled automatically:
  • list_site_pages: Reads /sitemap.xml to list available pages
  • fetch_page_content: Fetches and cleans a page's main content for the AI
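To make the sitemap step concrete, here is a minimal sketch of how a list of pages can be read from a sitemap document. It is not the product's actual implementation: the helper name list_pages_from_sitemap and the sample URLs are assumptions for illustration, and only the standard sitemaps.org XML format is relied on.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_pages_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> URL in a sitemap document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text and loc.text.strip()
    ]


# Sample sitemap body; the URLs are placeholders.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

print(list_pages_from_sitemap(SAMPLE))
```

A site whose /sitemap.xml follows this format gives the AI a complete list of pages to choose from before it fetches any of them.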

How it works

  1. You set Website URL on the Site
  2. The crawler and sitemap reader become available to the AI
  3. Results are cached (sitemap ~12h, pages ~2h) for speed
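The caching in step 3 can be pictured as a small time-to-live (TTL) cache. The sketch below is an assumption about the general technique, not the product's code: the class TTLCache, its method get_or_fetch, and the exact TTL constants are hypothetical names chosen to mirror the approximate 12-hour and 2-hour windows mentioned above.

```python
import time
from typing import Callable, Dict, Tuple

# Approximate lifetimes from the steps above, in seconds (assumed exact here).
SITEMAP_TTL = 12 * 60 * 60
PAGE_TTL = 2 * 60 * 60


class TTLCache:
    """Keep fetched results until their time-to-live expires (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get_or_fetch(self, key: str, ttl: float, fetch: Callable[[], str]) -> str:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the network call
        value = fetch()
        self._entries[key] = (now + ttl, value)
        return value
```

With a cache like this, repeated questions about the same page within the TTL window reuse the stored copy, which is why recent edits to a page may take up to a couple of hours to show up in answers.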

Tips

  • Ensure your site serves a sitemap at /sitemap.xml
  • Keep important content in the main HTML (avoid heavy JS-rendered text)
  • Update Website URL if you move domains; the cache will refresh
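The tip about keeping content in the main HTML can be illustrated with a simple text extractor. This is a sketch under the assumption that the page fetcher reads the served HTML without executing JavaScript; the names VisibleTextParser and visible_text and both sample pages are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the text a non-JavaScript reader would see in the HTML."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Same sentence, served as static HTML versus injected by a script.
STATIC = "<main><h1>Pricing</h1><p>Plans start at $10.</p></main>"
JS_ONLY = '<div id="app"></div><script>render("Plans start at $10.")</script>'

print(visible_text(STATIC))
print(repr(visible_text(JS_ONLY)))
```

The static page yields its heading and paragraph, while the script-rendered page yields an empty string, so text that only appears after JavaScript runs would be missing for a reader that works this way.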

When to disable

If answers must come strictly from your knowledge base or tools, leave Website URL empty to disable crawling.