Firecrawl Map
Action ID: firecrawl_map
Description
Generate a comprehensive map of all URLs on a website. This node quickly discovers and lists all accessible URLs on a domain, optionally filtered by search terms, subdomains, and configured limits.
Provider
Firecrawl
Connection
Firecrawl Connection
The Firecrawl connection to use for the map.
✓
firecrawl
Input Parameters
url
string
✓
-
The URL of the website to map
search
string
-
-
Use the search feature to find URLs relevant to your query. For example, entering 'blog' will retrieve all URLs related to 'blog'.
include_subdomains
boolean
-
true
Include subdomains of the URL in the results, such as docs.example.com or blog.example.com.
ignore_sitemap
boolean
-
false
Ignore the website sitemap when mapping
limit
integer
-
1000
The maximum number of URLs to return. Maximum is 5000.
Output Parameters
result
array
The array of URL strings discovered by the Firecrawl map
How It Works
This node analyzes a website's structure and generates a complete or filtered list of all accessible URLs. It can use the website's sitemap for efficiency, search for specific URL patterns, and optionally include subdomain URLs. The result is an array of URL strings that can be used for further processing.
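The inputs described above can be sketched as a payload builder. This is a minimal illustration using this node's parameter names; the exact field names and casing of the underlying Firecrawl API may differ (the camelCase keys here are an assumption):

```python
def build_map_payload(url, search=None, include_subdomains=True,
                      ignore_sitemap=False, limit=1000):
    """Assemble a map request body from this node's inputs.

    Field names mirror the node's parameters; the real API's
    field names may differ.
    """
    if limit > 5000:
        raise ValueError("limit may not exceed 5000")
    payload = {
        "url": url,
        "includeSubdomains": include_subdomains,
        "ignoreSitemap": ignore_sitemap,
        "limit": limit,
    }
    if search:
        # Optional filter: only URLs relevant to this keyword.
        payload["search"] = search
    return payload
```

For instance, `build_map_payload("https://example.com", search="blog", limit=500)` produces a payload matching Example 2 below.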
Usage Examples
Example 1: Map Entire Website
Input:
url: "https://example.com"
search: null
include_subdomains: false
ignore_sitemap: false
limit: 1000
Output:
result: [
"https://example.com/",
"https://example.com/about",
"https://example.com/products",
"https://example.com/products/item1",
"https://example.com/products/item2",
"https://example.com/contact",
"https://example.com/blog"
]
Example 2: Map Blog URLs Only
Input:
url: "https://example.com"
search: "blog"
include_subdomains: false
ignore_sitemap: false
limit: 500
Output:
result: [
"https://example.com/blog",
"https://example.com/blog/post1",
"https://example.com/blog/post2",
"https://example.com/blog/post3",
"https://example.com/blog/category/tech",
"https://example.com/blog/category/news"
]
Example 3: Map with Subdomains
Input:
url: "https://example.com"
search: null
include_subdomains: true
ignore_sitemap: false
limit: 2000
Output:
result: [
"https://example.com/",
"https://docs.example.com/",
"https://docs.example.com/api",
"https://blog.example.com/",
"https://blog.example.com/post1",
"https://api.example.com/v1"
]
Common Use Cases
SEO Auditing: Discover all pages on a website for SEO analysis and optimization
Website Crawling: Generate a complete URL list before scraping or analyzing a site
Link Analysis: Map internal links and site structure for analysis
Backup Planning: Create a comprehensive list of all URLs before migrating or backing up a site
Content Inventory: Take inventory of all content pages on a website
API Discovery: Find all API documentation pages across a domain
Competitive Analysis: Map competitor websites to understand their structure
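For a content inventory or link analysis, the `result` array can be grouped by top-level path segment. `section_counts` is a hypothetical helper for post-processing this node's output, not part of the node itself:

```python
from collections import Counter
from urllib.parse import urlparse

def section_counts(urls):
    """Group mapped URLs by their first path segment to build a
    quick content inventory, e.g. {"blog": 12, "products": 8}."""
    counts = Counter()
    for u in urls:
        path = urlparse(u).path.strip("/")
        counts[path.split("/")[0] if path else "(root)"] += 1
    return dict(counts)
```

Running this over the Example 1 output would show how many URLs fall under each section of the site.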
Error Handling
Invalid URL
Cause: The URL format is incorrect or the domain doesn't exist.
Resolution: Verify the URL is valid and properly formatted.

Sitemap Not Found
Cause: The website doesn't have a sitemap.xml file.
Resolution: Set ignore_sitemap to true and let the crawler discover URLs.

Access Denied
Cause: The website blocks automated crawling.
Resolution: Check robots.txt and the website's terms; verify bot access is allowed.

Timeout
Cause: The website structure is too complex or large.
Resolution: Increase the timeout or reduce the limit parameter.

Empty Result
Cause: No URLs were found matching the search criteria.
Resolution: Verify the search term is correct, or remove the search filter.

Rate Limited
Cause: Too many requests were sent to the same domain.
Resolution: Space out requests or reduce the limit per request.
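The rate-limiting advice above pairs naturally with exponential backoff. This sketch assumes HTTP 429 signals rate limiting (a common convention, not something this node guarantees) and takes the request function as a callable so it works with any client:

```python
import time

def call_with_backoff(do_request, attempts=5, base=1.0):
    """Retry a map request on rate limiting with exponential backoff.

    `do_request` is any zero-argument callable returning
    (status_code, body). Waits 1s, 2s, 4s, ... between retries.
    """
    for i in range(attempts):
        status, body = do_request()
        if status != 429:  # not rate limited: return immediately
            return status, body
        time.sleep(base * 2 ** i)
    return status, body  # still rate limited after all attempts
```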
Notes
Sitemap Usage: By default, the mapper uses sitemap.xml for efficiency. Set ignore_sitemap to true to crawl instead.
Search Filter: The search parameter filters URLs containing specific keywords. Use lowercase for best results.
Subdomain Inclusion: Including subdomains significantly increases the URL count. Use carefully on large sites.
URL Limits: The maximum limit is 5000 URLs. For larger sites, use multiple requests with search filters.
Performance: Mapping large websites can take time. Start with a lower limit to test, then increase as needed.
URL Patterns: Results may include URLs with query parameters and fragments, depending on the site structure.
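The multi-request approach for sites larger than the 5000-URL cap can be sketched as follows. Here `map_fn` stands in for whatever invokes this node, and the section keywords are hypothetical examples you would choose for your site:

```python
def map_in_chunks(map_fn, url, sections, limit=5000):
    """Issue one map request per search keyword and merge the
    deduplicated results, working around the per-request URL cap.

    `map_fn(url=..., search=..., limit=...)` must return a list
    of URL strings, like this node's `result` output.
    """
    seen = set()
    for section in sections:
        seen.update(map_fn(url=url, search=section, limit=limit))
    return sorted(seen)
```

For example, calling it with `sections=["blog", "docs", "products"]` yields one combined, deduplicated URL list covering all three areas.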