# Firecrawl Map

**Action ID:** `firecrawl_map`

## Description

Generate a comprehensive map of all URLs on a website. This node quickly discovers and lists the accessible URLs on a domain, optionally filtered by a search term, with subdomain inclusion and a result limit that can be configured.

## Provider

**Firecrawl**

## Connection

| Name                 | Description                                  | Required | Category  |
| -------------------- | -------------------------------------------- | :------: | --------- |
| Firecrawl Connection | The Firecrawl connection to use for the map. |     ✓    | firecrawl |

## Input Parameters

| Name                | Type    | Required | Default | Description                                                                                                                        |
| ------------------- | ------- | :------: | ------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| url                 | string  |     ✓    | -       | The root URL of the website to map                                                                                                 |
| search              | string  |     -    | -       | Use the search feature to find URLs relevant to your query. For example, entering 'blog' will retrieve all URLs related to 'blog'. |
| include\_subdomains | boolean |     -    | true    | Include subdomains of the URL (e.g., docs.\*, blog.\*) in the results.                                                             |
| ignore\_sitemap     | boolean |     -    | false   | Ignore the website sitemap when mapping                                                                                            |
| limit               | integer |     -    | 1000    | The maximum number of URLs to return. Maximum is 5000.                                                                             |

<details>

<summary>View JSON Schema</summary>

```json
{
  "description": "Firecrawl map node input.",
  "properties": {
    "url": {
      "title": "URL",
      "type": "string",
      "format": "uri",
      "description": "The root URL of the website to map."
    },
    "search": {
      "title": "Search",
      "type": "string",
      "description": "Use the search feature to find URLs relevant to your query. For example, entering 'blog' will retrieve all URLs related to 'blog'."
    },
    "include_subdomains": {
      "title": "Include Subdomains",
      "type": "boolean",
      "default": true,
      "description": "Include subdomains of the url in the result such as docs.*, blog.*, etc."
    },
    "ignore_sitemap": {
      "title": "Ignore Sitemap",
      "type": "boolean",
      "default": false,
      "description": "Ignore the website sitemap when mapping."
    },
    "limit": {
      "title": "Limit",
      "type": "integer",
      "default": 1000,
      "description": "The maximum number of URLs to return. Maximum is 5000."
    }
  },
  "required": [
    "url"
  ],
  "title": "FirecrawlMapInput",
  "type": "object"
}
```

</details>

## Output Parameters

| Name   | Type  | Description                       |
| ------ | ----- | --------------------------------- |
| result | array | The list of URLs discovered by the map |

<details>

<summary>View JSON Schema</summary>

```json
{
  "description": "Firecrawl map node output.",
  "properties": {
    "result": {
      "title": "Result",
      "type": "array",
      "items": {"type": "string"},
      "description": "The output from the Firecrawl map."
    }
  },
  "required": [
    "result"
  ],
  "title": "FirecrawlMapOutput",
  "type": "object"
}
```

</details>

## How It Works

This node analyzes a website's structure and returns a complete or filtered list of its accessible URLs. It uses the website's sitemap for efficiency when available, can filter for URLs matching a search term, and can optionally include subdomain URLs. The result is an array of URL strings suitable for further processing, such as feeding into a scrape step.
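If you call the Firecrawl API directly rather than through this node, the node's inputs map onto a request body. The sketch below builds such a payload; the camelCase field names (`includeSubdomains`, `ignoreSitemap`) are an assumption about the underlying API and may differ in your SDK version.

```python
def build_map_payload(url, search=None, include_subdomains=True,
                      ignore_sitemap=False, limit=1000):
    """Build a map request body mirroring this node's input parameters."""
    if not 1 <= limit <= 5000:
        raise ValueError("limit must be between 1 and 5000")
    payload = {
        "url": url,
        "includeSubdomains": include_subdomains,  # field names are an assumption
        "ignoreSitemap": ignore_sitemap,
        "limit": limit,
    }
    if search:  # omit the search filter entirely when unused
        payload["search"] = search
    return payload
```

Defaults match the table above: subdomains included, sitemap respected, up to 1000 URLs.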

## Usage Examples

### Example 1: Map Entire Website

**Input:**

```
url: "https://example.com"
search: null
include_subdomains: false
ignore_sitemap: false
limit: 1000
```

**Output:**

```
result: [
  "https://example.com/",
  "https://example.com/about",
  "https://example.com/products",
  "https://example.com/products/item1",
  "https://example.com/products/item2",
  "https://example.com/contact",
  "https://example.com/blog"
]
```

### Example 2: Map Blog URLs Only

**Input:**

```
url: "https://example.com"
search: "blog"
include_subdomains: false
ignore_sitemap: false
limit: 500
```

**Output:**

```
result: [
  "https://example.com/blog",
  "https://example.com/blog/post1",
  "https://example.com/blog/post2",
  "https://example.com/blog/post3",
  "https://example.com/blog/category/tech",
  "https://example.com/blog/category/news"
]
```
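The search filter behaves like a case-insensitive keyword match over discovered URLs. A local equivalent (a hypothetical helper, not the Firecrawl implementation) can be sketched as:

```python
def filter_urls(urls, keyword):
    """Keep URLs containing the keyword, case-insensitively."""
    kw = keyword.lower()
    return [u for u in urls if kw in u.lower()]

site = [
    "https://example.com/about",
    "https://example.com/blog",
    "https://example.com/blog/post1",
]
```

Here `filter_urls(site, "blog")` keeps only the two blog URLs, mirroring the example output above.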

### Example 3: Map with Subdomains

**Input:**

```
url: "https://example.com"
search: null
include_subdomains: true
ignore_sitemap: false
limit: 2000
```

**Output:**

```
result: [
  "https://example.com/",
  "https://docs.example.com/",
  "https://docs.example.com/api",
  "https://blog.example.com/",
  "https://blog.example.com/post1",
  "https://api.example.com/v1"
]
```
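Subdomain inclusion boils down to a hostname check: a URL belongs to the site if its host equals the root domain or, when subdomains are included, ends with `.` plus the root domain. A minimal sketch of that rule (an illustration, not Firecrawl's internal logic):

```python
from urllib.parse import urlparse

def is_same_site(url, root_domain, include_subdomains=True):
    """Decide whether a URL belongs to root_domain, optionally counting subdomains."""
    host = urlparse(url).hostname or ""
    if host == root_domain:
        return True
    # Requires a dot boundary, so "evil-example.com" does not match "example.com".
    return include_subdomains and host.endswith("." + root_domain)
```

With `include_subdomains=False`, `docs.example.com` URLs from the output above would be excluded.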

## Common Use Cases

* **SEO Auditing**: Discover all pages on a website for SEO analysis and optimization
* **Website Crawling**: Generate a complete URL list before scraping or analyzing a site
* **Link Analysis**: Map internal links and site structure for analysis
* **Backup Planning**: Create a comprehensive list of all URLs before migrating or backing up a site
* **Content Inventory**: Take inventory of all content pages on a website
* **API Discovery**: Find all API documentation pages across a domain
* **Competitive Analysis**: Map competitor websites to understand their structure

## Error Handling

| Error Type        | Cause                                           | Solution                                                         |
| ----------------- | ----------------------------------------------- | ---------------------------------------------------------------- |
| Invalid URL       | URL format is incorrect or domain doesn't exist | Verify the URL is valid and properly formatted                   |
| Sitemap Not Found | Website doesn't have a sitemap.xml file         | Set ignore\_sitemap to true and let the crawler discover URLs    |
| Access Denied     | Website blocks automated crawling               | Check robots.txt and website terms; verify bot access is allowed |
| Timeout           | Website structure is too complex or large       | Increase timeout or reduce limit parameter                       |
| Empty Result      | No URLs found matching the search criteria      | Verify search term is correct or remove search filter            |
| Rate Limited      | Too many requests to the same domain            | Space out requests or reduce limit per request                   |
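For the rate-limited case, spacing out requests is usually implemented as exponential backoff. A generic sketch, with the rate-limit error stood in by `RuntimeError` (the real exception type depends on your HTTP client or SDK):

```python
import time

def with_backoff(call, retries=3, base_delay=1.0, sleep=time.sleep):
    """Retry a rate-limited call with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(retries):
        try:
            return call()
        except RuntimeError:  # stand-in for a 429 / rate-limit error
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)
```

The injectable `sleep` parameter keeps the helper testable without real delays.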

## Notes

* **Sitemap Usage**: By default, the mapper uses sitemap.xml for efficiency. Set ignore\_sitemap to true to crawl instead.
* **Search Filter**: The search parameter filters URLs containing specific keywords. Use lowercase for best results.
* **Subdomain Inclusion**: Including subdomains significantly increases the URL count. Use carefully on large sites.
* **URL Limits**: The maximum limit is 5000 URLs. For larger sites, use multiple requests with search filters.
* **Performance**: Mapping large websites can take time. Start with a lower limit to test, then increase as needed.
* **URL Patterns**: Results may include URLs with query parameters or fragments, depending on the site's structure.
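When a site exceeds the 5000-URL cap, the multi-request strategy above produces several overlapping result arrays. A small helper (hypothetical, for post-processing the node's output) can merge them while preserving order:

```python
def merge_maps(*url_lists):
    """Merge several map results, de-duplicating while preserving first-seen order."""
    seen, merged = set(), []
    for urls in url_lists:
        for u in urls:
            if u not in seen:
                seen.add(u)
                merged.append(u)
    return merged
```

For example, merging a `search: "blog"` run with a `search: "docs"` run yields one de-duplicated list covering both sections.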


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.agenticflow.ai/reference/nodes/firecrawl_map.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
