FireCrawl

Action ID: firecrawl

Description

Extract structured data from web pages using Firecrawl's AI-powered extraction capabilities. Provide a URL and a natural language prompt describing what data to extract, and Firecrawl will intelligently parse the page content and return the requested information.

Input Parameters

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | ✓ | | URL to crawl and extract data from |
| prompt | string | ✓ | | The prompt to use for the extraction without a schema |

View JSON Schema

{
  "description": "FireCrawl node input.",
  "properties": {
    "url": {
      "description": "URL to crawl.",
      "title": "URL",
      "type": "string"
    },
    "prompt": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "The prompt to use for the extraction without a schema.",
      "title": "Prompt"
    }
  },
  "required": [
    "url",
    "prompt"
  ],
  "title": "FireCrawlNodeInput",
  "type": "object"
}
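
The required fields above, together with the http:// or https:// protocol check listed under Error Handling, can be verified before a run. The following is a minimal sketch, not part of the platform: `build_firecrawl_input` is a hypothetical helper that only mirrors the schema, and it treats a null or blank prompt as invalid because `prompt` is listed as required.

```python
from typing import Optional
from urllib.parse import urlparse


def build_firecrawl_input(url: str, prompt: Optional[str]) -> dict:
    """Return a payload shaped like FireCrawlNodeInput, or raise ValueError.

    Hypothetical helper for illustration; not part of any SDK.
    """
    parsed = urlparse(url)
    # Require an explicit http:// or https:// protocol and a host name.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"URL must start with http:// or https://: {url!r}")
    # The schema's type allows null, but both fields are listed as required.
    if prompt is None or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    return {"url": url, "prompt": prompt.strip()}


payload = build_firecrawl_input(
    "https://example.com/products/laptop-pro",
    "Extract the product name, price, description, and availability status",
)
```

Rejecting bad input locally avoids spending an extraction request on a call that would fail with an Invalid URL error anyway.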

Output Parameters

| Name | Type | Description |
| --- | --- | --- |
| data | string (json_str) | The extracted data from the web URL in JSON format |

View JSON Schema

{
  "description": "Web scraping node output.",
  "properties": {
    "data": {
      "description": "The extracted data from the web URL.",
      "title": "Data",
      "type": "string"
    }
  },
  "required": [
    "data"
  ],
  "title": "FireCrawlNodeOutput",
  "type": "object"
}
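
Because `data` is a string rather than an object, a consumer has to decode it before reading any fields. The sketch below shows one defensive way to do that; `parse_extraction` is a hypothetical name, and wrapping non-object results under a `value` key is an assumption made for illustration, not documented behavior.

```python
import json


def parse_extraction(output: dict) -> dict:
    """Decode the JSON string in the node's `data` field into a dict."""
    raw = output["data"]  # `data` is required by the output schema
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"`data` is not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        # Assumption: callers expect an object, so lists or scalars are wrapped.
        return {"value": parsed}
    return parsed


node_output = {"data": "{\"title\": \"The Future of AI in 2024\", \"author\": \"Jane Smith\"}"}
result = parse_extraction(node_output)
print(result["author"])  # Jane Smith
```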

How It Works

This node uses Firecrawl's AI-powered extraction engine to intelligently parse web pages based on natural language prompts. It loads the target URL, renders any JavaScript content, and then uses the provided prompt to identify and extract the specific data you need. The AI understands the page structure and content context, extracting data according to your instructions and returning it as structured JSON.

Usage Examples

Example 1: Extract Product Information

Input:

url: "https://example.com/products/laptop-pro"
prompt: "Extract the product name, price, description, and availability status"

Output:

data: "{\"product_name\": \"Laptop Pro 15\", \"price\": \"$1,299.99\", \"description\": \"High-performance laptop with 16GB RAM and 512GB SSD\", \"availability\": \"In Stock\"}"

Example 2: Extract Article Metadata

Input:

url: "https://news.example.com/tech-article-2024"
prompt: "Get the article title, author name, publication date, and main topic"

Output:

data: "{\"title\": \"The Future of AI in 2024\", \"author\": \"Jane Smith\", \"publication_date\": \"2024-01-15\", \"main_topic\": \"Artificial Intelligence\"}"

Example 3: Extract Contact Information

Input:

url: "https://company.example.com/contact"
prompt: "Find the company's email address, phone number, and physical address"

Output:

data: "{\"email\": \"[email protected]\", \"phone\": \"+1-555-123-4567\", \"address\": \"123 Main St, San Francisco, CA 94105\"}"
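
The values in these example outputs are all strings, so downstream steps may need to convert them to typed values. The sketch below uses fields from Examples 1 and 2. It assumes prices formatted like `$1,299.99` and ISO 8601 dates, which real pages may not follow.

```python
import json
from datetime import date
from decimal import Decimal

product = json.loads(
    '{"product_name": "Laptop Pro 15", "price": "$1,299.99", "availability": "In Stock"}'
)
article = json.loads(
    '{"title": "The Future of AI in 2024", "publication_date": "2024-01-15"}'
)

# Strip the currency symbol and thousands separators before converting.
price = Decimal(product["price"].replace("$", "").replace(",", ""))
# Normalize the availability text to a boolean.
in_stock = product["availability"].strip().lower() == "in stock"
# Parse the ISO-formatted publication date.
published = date.fromisoformat(article["publication_date"])

print(price, in_stock, published.year)  # 1299.99 True 2024
```

If the format matters downstream, stating it in the prompt (for example, asking for the price as a plain number) may reduce the need for this kind of cleanup.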

Common Use Cases

Product Data Extraction: Extract product details, pricing, and specifications from e-commerce sites
Lead Generation: Collect contact information and business details from company websites
News Monitoring: Extract article content, headlines, and metadata from news sources
Real Estate Data: Gather property listings, prices, and details from real estate websites
Job Listings: Extract job titles, descriptions, requirements, and application details
Research Data Collection: Collect structured data from various websites for research purposes
Competitive Intelligence: Monitor competitor websites for pricing, product, and content changes

Error Handling

| Error Type | Cause | Solution |
| --- | --- | --- |
| Invalid URL | URL is malformed or inaccessible | Verify the URL is correctly formatted with http:// or https:// protocol |
| Page Not Found | The URL returns a 404 error or doesn't exist | Check that the URL is correct and the page is currently available |
| Extraction Failed | AI couldn't find data matching the prompt | Refine your prompt to be more specific or check if the data exists on the page |
| Timeout Error | Page took too long to load or process | Retry the request or check if the website is experiencing issues |
| Rate Limit Exceeded | Too many requests to Firecrawl API | Implement delays between requests or upgrade your Firecrawl plan |
| Authentication Required | Page requires login or authentication | Use a different scraping approach for authenticated pages |
| Prompt Too Vague | Prompt doesn't provide enough context for extraction | Make your prompt more specific about what data to extract and where |
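
The advice to retry after a timeout and to add delays after a rate limit error can be combined into one retry loop with exponential backoff. This is a sketch under stated assumptions: `RateLimitError`, `ExtractionTimeout`, and the `run_node` callable are hypothetical stand-ins, since this page does not specify how the node reports these errors in code.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a 'Rate Limit Exceeded' failure."""


class ExtractionTimeout(Exception):
    """Hypothetical stand-in for a 'Timeout Error' failure."""


def run_with_retry(run_node, payload, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call run_node(payload), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_node(payload)
        except (RateLimitError, ExtractionTimeout):
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Usage with a fake node that fails twice, then succeeds.
calls = {"n": 0}


def fake_node(payload):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("too many requests")
    return {"data": "{}"}


delays = []  # record delays instead of actually sleeping
result = run_with_retry(
    fake_node, {"url": "https://example.com", "prompt": "x"}, sleep=delays.append
)
print(result, delays)  # {'data': '{}'} [1.0, 2.0]
```

Only the two transient error types are retried. Errors such as Invalid URL or Prompt Too Vague would fail the same way on every attempt, so they are left to propagate.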

Notes

Natural Language Prompts: Write clear, specific prompts describing exactly what data you want to extract from the page.
AI-Powered: Firecrawl uses AI to understand page context, making it more flexible than traditional CSS selector-based scraping.
JavaScript Rendering: The service renders JavaScript, so it works with modern single-page applications.
Prompt Quality: More specific prompts yield better results—describe the data type, format, and location if possible.
JSON Output: Extracted data is returned as a JSON string, which can be parsed for use in subsequent workflow nodes.
No Schema Required: Unlike structured extraction APIs, this node works with natural language prompts instead of predefined schemas.
Best Practices: Test your prompts on sample pages first to ensure they extract the correct data.
Cost Considerations: Each extraction counts toward your Firecrawl API usage quota.
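
When testing a prompt on sample pages, one simple check is whether every field you asked for came back non-empty. The sketch below uses a hypothetical helper, `missing_fields`. It assumes the output decodes to a JSON object and that you know which key names to expect, although the names the extraction chooses may vary between runs.

```python
import json


def missing_fields(data: str, expected: list[str]) -> list[str]:
    """Return expected keys that are absent or empty in the extracted JSON object."""
    extracted = json.loads(data)
    return [key for key in expected if extracted.get(key) in (None, "", [], {})]


sample = "{\"email\": \"\", \"phone\": \"+1-555-123-4567\"}"
print(missing_fields(sample, ["email", "phone", "address"]))  # ['email', 'address']
```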

Last updated 3 months ago