Supadata
Supadata agent for extracting video transcripts and for scraping, crawling, and mapping web content.
Available Tools (6)
supadata_transcript
Extract a transcript from supported video platforms (YouTube, TikTok, Instagram, Twitter) or file URLs using Supadata's transcript API.

**Purpose:** Get transcripts from video content across multiple platforms.

**Best for:** Video content analysis, subtitle extraction, content indexing.

**Usage Example:**

```json
{
  "name": "supadata_transcript",
  "arguments": {
    "url": "https://youtube.com/watch?v=example",
    "lang": "en",
    "text": false,
    "mode": "auto"
  }
}
```

**Returns:**
- Either immediate transcript content
- Or a job ID for asynchronous processing (use supadata_check_transcript_status)

**Supported Platforms:** YouTube, TikTok, Instagram, Twitter, and file URLs
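Because the tool may answer with either the transcript itself or a job ID, callers need to handle both shapes. The sketch below illustrates that branch; it assumes a generic `call_tool(name, arguments)` helper for invoking Supadata tools and assumes an asynchronous response exposes its identifier under a `jobId` field. Both names are illustrative assumptions, not documented API.

```python
from typing import Any, Callable, Dict

# Hypothetical helper type: however your client invokes a Supadata tool
# and returns its JSON result as a dict.
CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]

def request_transcript(call_tool: CallTool, url: str, lang: str = "en") -> Dict[str, Any]:
    """Request a transcript and report whether it still needs polling."""
    result = call_tool("supadata_transcript", {
        "url": url,
        "lang": lang,
        "text": False,
        "mode": "auto",
    })
    # Assumption: an asynchronous response carries the job id as "jobId".
    if "jobId" in result:
        # Defer to supadata_check_transcript_status to poll for the result.
        return {"pending": True, "job_id": result["jobId"]}
    # Otherwise the transcript content came back immediately.
    return {"pending": False, "transcript": result}
```

A pending job ID feeds straight into supadata_check_transcript_status below.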
supadata_crawl
Create a crawl job to extract content from all pages on a website using Supadata's crawling API.

**Purpose:** Crawl an entire website and get the content of all its pages.

**Best for:** Extracting content from multiple related pages when you need comprehensive coverage.

**Workflow:**
1. Create crawl job
2. Receive job ID
3. Check job status and retrieve results

**Crawling Behavior:**
- Follows only child links within the specified domain
- Example: For https://supadata.ai/blog, crawls https://supadata.ai/blog/article-1 but not https://supadata.ai/about
- To crawl an entire website, use a top-level URL like https://supadata.ai

**Usage Example:**

```json
{
  "name": "supadata_crawl",
  "arguments": {
    "url": "https://example.com",
    "limit": 100
  }
}
```

**Returns:** Job ID for status checking. Use supadata_check_crawl_status to check progress.

**Job Status:** Possible statuses are 'scraping', 'completed', 'failed', or 'cancelled'.

**Important:** Respect robots.txt and website terms of service when crawling web content.
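The first two steps of the workflow (create the job, capture its ID) can be wired up in a few lines. The sketch below again assumes a generic `call_tool` helper and assumes the job identifier is returned under a `jobId` field; polling the job is shown under supadata_check_crawl_status.

```python
from typing import Any, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def start_crawl(call_tool: CallTool, url: str, limit: int = 100) -> str:
    """Create a crawl job for every child page under `url` and return its job id."""
    result = call_tool("supadata_crawl", {"url": url, "limit": limit})
    # Assumption: the job identifier comes back as "jobId".
    return result["jobId"]

# Crawling only follows child links, so pass the top-level URL
# (e.g. "https://supadata.ai") to cover the whole site.
```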
supadata_check_transcript_status
Check the status and retrieve results of a transcript job created with supadata_transcript.

**Purpose:** Monitor transcript job progress and retrieve completed results.

**Workflow:** Use the job ID returned from supadata_transcript to check status and get results.

**Usage Example:**

```json
{
  "name": "supadata_check_transcript_status",
  "arguments": {
    "id": "550e8400-e29b-41d4-a716-446655440000"
  }
}
```

**Returns:**
- Job status: 'queued', 'active', 'completed', 'failed'
- For completed jobs: full transcript content
- Error details if the job failed

**Tip:** Poll this endpoint periodically until status is 'completed' or 'failed'.
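A typical consumer polls this tool until the job settles. The loop below is a minimal sketch of that pattern, reusing the hypothetical `call_tool` helper from the earlier examples and assuming the response reports its state under a `status` field (an assumption about the response shape, not documented API).

```python
import time
from typing import Any, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def wait_for_transcript(call_tool: CallTool, job_id: str, interval: float = 5.0) -> Dict[str, Any]:
    """Poll supadata_check_transcript_status until the job completes or fails."""
    while True:
        result = call_tool("supadata_check_transcript_status", {"id": job_id})
        # Assumption: the job state is reported under "status".
        if result.get("status") in ("completed", "failed"):
            return result
        time.sleep(interval)  # Still 'queued' or 'active': wait before polling again.
```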
supadata_scrape
Extract content from any web page to Markdown format using Supadata's powerful scraping API.

**Purpose:** Single-page content extraction with automatic formatting to Markdown.

**Best for:** When you know exactly which page contains the information you need.

**Usage Example:**

```json
{
  "name": "supadata_scrape",
  "arguments": {
    "url": "https://example.com",
    "noLinks": false,
    "lang": "en"
  }
}
```

**Returns:**
- URL of the scraped page
- Extracted content in Markdown format
- Page name and description
- Character count
- List of URLs found on the page
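For an end-to-end picture, the sketch below scrapes a single page and writes the returned Markdown to disk. It reuses the hypothetical `call_tool` helper and assumes the Markdown body is returned under a `content` field, which is an illustrative guess at the response shape rather than documented behavior.

```python
from pathlib import Path
from typing import Any, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def scrape_to_markdown(call_tool: CallTool, url: str, out_path: str) -> None:
    """Scrape one page and save its Markdown rendering to a local file."""
    result = call_tool("supadata_scrape", {"url": url, "noLinks": False, "lang": "en"})
    # Assumption: the extracted Markdown is returned under "content".
    Path(out_path).write_text(result["content"], encoding="utf-8")
```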
supadata_map
Crawl a whole website and get all URLs on it using Supadata's mapping API.

**Purpose:** Extract all links found on a website for content discovery and sitemap creation.

**Best for:** Website content discovery, SEO analysis, content aggregation, automated web scraping and indexing.

**Use cases:** Creating a sitemap, running a crawler to fetch content from all pages of a website.

**Usage Example:**

```json
{
  "name": "supadata_map",
  "arguments": {
    "url": "https://example.com"
  }
}
```

**Returns:** Array of URLs found on the website.
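Mapping pairs naturally with supadata_scrape: first discover a site's URLs, then fetch only the pages you need. The sketch below shows that pattern with the same hypothetical `call_tool` helper, assuming the discovered links are returned under a `urls` field; that field name is an assumption for illustration.

```python
from typing import Any, Callable, Dict, List

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def map_then_scrape(call_tool: CallTool, site: str, max_pages: int = 10) -> List[Dict[str, Any]]:
    """Discover a site's URLs, then scrape the first few pages to Markdown."""
    mapping = call_tool("supadata_map", {"url": site})
    # Assumption: discovered links are returned under "urls".
    pages = []
    for url in mapping["urls"][:max_pages]:
        pages.append(call_tool("supadata_scrape", {"url": url, "noLinks": False, "lang": "en"}))
    return pages
```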
supadata_check_crawl_status
Check the status and retrieve results of a crawl job created with supadata_crawl.

**Purpose:** Monitor crawl job progress and retrieve completed results.

**Workflow:** Use the job ID returned from supadata_crawl to check status and get results.

**Usage Example:**

```json
{
  "name": "supadata_check_crawl_status",
  "arguments": {
    "id": "550e8400-e29b-41d4-a716-446655440000"
  }
}
```

**Returns:**
- Job status: 'scraping', 'completed', 'failed', or 'cancelled'
- For completed jobs: URL, Markdown content, page title, and description for each crawled page
- Progress information and any error details if applicable

**Tip:** Poll this endpoint periodically until status is 'completed' or 'failed'.
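Polling a crawl job looks much like polling a transcript job, with the extra 'scraping' and 'cancelled' states to account for. The loop below reuses the hypothetical `call_tool` helper and assumes a `status` field on the response, again as an illustrative assumption about the response shape.

```python
import time
from typing import Any, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Dict[str, Any]]  # hypothetical client helper

def wait_for_crawl(call_tool: CallTool, job_id: str, interval: float = 10.0) -> Dict[str, Any]:
    """Poll supadata_check_crawl_status until the crawl finishes, fails, or is cancelled."""
    while True:
        result = call_tool("supadata_check_crawl_status", {"id": job_id})
        # Assumption: the job state is reported under "status".
        if result.get("status") in ("completed", "failed", "cancelled"):
            return result
        time.sleep(interval)  # Still 'scraping': wait before polling again.
```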