Supadata
Supadata agent for extracting video transcripts and for scraping, crawling, and mapping web content.
Available Tools (6)
supadata_transcript
Extract a transcript from supported video platforms (YouTube, TikTok, Instagram, Twitter) or file URLs using Supadata's transcript API.

**Purpose:** Get transcripts from video content across multiple platforms.

**Best for:** Video content analysis, subtitle extraction, content indexing.

**Usage Example:**

```json
{
  "name": "supadata_transcript",
  "arguments": {
    "url": "https://youtube.com/watch?v=example",
    "lang": "en",
    "text": false,
    "mode": "auto"
  }
}
```

**Returns:**
- Either immediate transcript content
- Or a job ID for asynchronous processing (use supadata_check_transcript_status)

**Supported Platforms:** YouTube, TikTok, Instagram, Twitter, and file URLs
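Because the tool may answer with either the transcript itself or a job ID, callers need to handle both shapes. The sketch below illustrates that branch; it assumes a generic `call_tool(name, arguments)` helper for invoking Supadata tools and assumes an asynchronous response exposes its identifier under a `jobId` field. Both names are illustrative assumptions, not documented API.

```python
from typing import Any, Callable, Dict

# Hypothetical helper type: however your client invokes a Supadata tool
# and returns its JSON result as a dict.
CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]

def request_transcript(call_tool: CallTool, url: str, lang: str = "en") -> Dict[str, Any]:
    """Request a transcript and report whether it still needs polling."""
    result = call_tool("supadata_transcript", {
        "url": url,
        "lang": lang,
        "text": False,
        "mode": "auto",
    })
    # Assumption: an asynchronous response carries the job id as "jobId".
    if "jobId" in result:
        # Defer to supadata_check_transcript_status to poll for the result.
        return {"pending": True, "job_id": result["jobId"]}
    # Otherwise the transcript content came back immediately.
    return {"pending": False, "transcript": result}
```

A pending job ID feeds straight into supadata_check_transcript_status below.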
supadata_crawl
Create a crawl job to extract content from all pages on a website using Supadata's crawling API.

**Purpose:** Crawl an entire website and get the content of all its pages.

**Best for:** Extracting content from multiple related pages when you need comprehensive coverage.

**Workflow:**
1. Create crawl job
2. Receive job ID
3. Check job status and retrieve results

**Crawling Behavior:**
- Follows only child links within the specified domain
- Example: For https://supadata.ai/blog, crawls https://supadata.ai/blog/article-1 but not https://supadata.ai/about
- To crawl an entire website, use a top-level URL like https://supadata.ai

**Usage Example:**

```json
{
  "name": "supadata_crawl",
  "arguments": {
    "url": "https://example.com",
    "limit": 100
  }
}
```

**Returns:** Job ID for status checking. Use supadata_check_crawl_status to check progress.

**Job Status:** Possible statuses are 'scraping', 'completed', 'failed', or 'cancelled'.

**Important:** Respect robots.txt and website terms of service when crawling web content.
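The first two steps of the workflow (create the job, capture its ID) can be wired up in a few lines. The sketch below again assumes a generic `call_tool` helper and assumes the job identifier is returned under a `jobId` field; polling the job is shown under supadata_check_crawl_status.

```python
from typing import Any, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def start_crawl(call_tool: CallTool, url: str, limit: int = 100) -> str:
    """Create a crawl job for every child page under `url` and return its job id."""
    result = call_tool("supadata_crawl", {"url": url, "limit": limit})
    # Assumption: the job identifier comes back as "jobId".
    return result["jobId"]

# Crawling only follows child links, so pass the top-level URL
# (e.g. "https://supadata.ai") to cover the whole site.
```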
supadata_check_transcript_status
Check the status and retrieve results of a transcript job created with supadata_transcript.

**Purpose:** Monitor transcript job progress and retrieve completed results.

**Workflow:** Use the job ID returned from supadata_transcript to check status and get results.

**Usage Example:**

```json
{
  "name": "supadata_check_transcript_status",
  "arguments": {
    "id": "550e8400-e29b-41d4-a716-446655440000"
  }
}
```

**Returns:**
- Job status: 'queued', 'active', 'completed', 'failed'
- For completed jobs: full transcript content
- Error details if the job failed

**Tip:** Poll this endpoint periodically until status is 'completed' or 'failed'.
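A typical consumer polls this tool until the job settles. The loop below is a minimal sketch of that pattern, reusing the hypothetical `call_tool` helper from the earlier examples and assuming the response reports its state under a `status` field (an assumption about the response shape, not documented API).

```python
import time
from typing import Any, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def wait_for_transcript(call_tool: CallTool, job_id: str, interval: float = 5.0) -> Dict[str, Any]:
    """Poll supadata_check_transcript_status until the job completes or fails."""
    while True:
        result = call_tool("supadata_check_transcript_status", {"id": job_id})
        # Assumption: the job state is reported under "status".
        if result.get("status") in ("completed", "failed"):
            return result
        time.sleep(interval)  # Still 'queued' or 'active': wait before polling again.
```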
supadata_scrape
Extract content from any web page to Markdown format using Supadata's powerful scraping API.

**Purpose:** Single-page content extraction with automatic formatting to Markdown.

**Best for:** When you know exactly which page contains the information you need.

**Usage Example:**

```json
{
  "name": "supadata_scrape",
  "arguments": {
    "url": "https://example.com",
    "noLinks": false,
    "lang": "en"
  }
}
```

**Returns:**
- URL of the scraped page
- Extracted content in Markdown format
- Page name and description
- Character count
- List of URLs found on the page
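For an end-to-end picture, the sketch below scrapes a single page and writes the returned Markdown to disk. It reuses the hypothetical `call_tool` helper and assumes the Markdown body is returned under a `content` field, which is an illustrative guess at the response shape rather than documented behavior.

```python
from pathlib import Path
from typing import Any, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def scrape_to_markdown(call_tool: CallTool, url: str, out_path: str) -> None:
    """Scrape one page and save its Markdown rendering to a local file."""
    result = call_tool("supadata_scrape", {"url": url, "noLinks": False, "lang": "en"})
    # Assumption: the extracted Markdown is returned under "content".
    Path(out_path).write_text(result["content"], encoding="utf-8")
```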
supadata_map
Crawl a whole website and get all URLs on it using Supadata's mapping API.

**Purpose:** Extract all links found on a website for content discovery and sitemap creation.

**Best for:** Website content discovery, SEO analysis, content aggregation, automated web scraping and indexing.

**Use cases:** Creating a sitemap, running a crawler to fetch content from all pages of a website.

**Usage Example:**

```json
{
  "name": "supadata_map",
  "arguments": {
    "url": "https://example.com"
  }
}
```

**Returns:** Array of URLs found on the website.
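Mapping pairs naturally with supadata_scrape: first discover a site's URLs, then fetch only the pages you need. The sketch below shows that pattern with the same hypothetical `call_tool` helper, assuming the discovered links are returned under a `urls` field; that field name is an assumption for illustration.

```python
from typing import Any, Callable, Dict, List

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def map_then_scrape(call_tool: CallTool, site: str, max_pages: int = 10) -> List[Dict[str, Any]]:
    """Discover a site's URLs, then scrape the first few pages to Markdown."""
    mapping = call_tool("supadata_map", {"url": site})
    # Assumption: discovered links are returned under "urls".
    pages = []
    for url in mapping["urls"][:max_pages]:
        pages.append(call_tool("supadata_scrape", {"url": url, "noLinks": False, "lang": "en"}))
    return pages
```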
supadata_check_crawl_status
Check the status and retrieve results of a crawl job created with supadata_crawl.

**Purpose:** Monitor crawl job progress and retrieve completed results.

**Workflow:** Use the job ID returned from supadata_crawl to check status and get results.

**Usage Example:**

```json
{
  "name": "supadata_check_crawl_status",
  "arguments": {
    "id": "550e8400-e29b-41d4-a716-446655440000"
  }
}
```

**Returns:**
- Job status: 'scraping', 'completed', 'failed', or 'cancelled'
- For completed jobs: URL, Markdown content, page title, and description for each crawled page
- Progress information and any error details if applicable

**Tip:** Poll this endpoint periodically until status is 'completed' or 'failed'.
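Polling a crawl job looks much like polling a transcript job, with the extra 'scraping' and 'cancelled' states to account for. The loop below reuses the hypothetical `call_tool` helper and assumes a `status` field on the response, again as an illustrative assumption about the response shape.

```python
import time
from typing import Any, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def wait_for_crawl(call_tool: CallTool, job_id: str, interval: float = 10.0) -> Dict[str, Any]:
    """Poll supadata_check_crawl_status until the crawl finishes, fails, or is cancelled."""
    while True:
        result = call_tool("supadata_check_crawl_status", {"id": job_id})
        # Assumption: the job state is reported under "status".
        if result.get("status") in ("completed", "failed", "cancelled"):
            return result
        time.sleep(interval)  # Still 'scraping': wait before polling again.
```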