Extract Structured Data with Claude

Action ID: claude_extract_structured_data

Description

Extract structured data from text or images using Claude. This node allows you to define schemas in simple or advanced mode and extract data in a structured format.

Provider

Anthropic Claude

Connection

Name

Description

Required

Input Parameters

Name

Type

Required

Default

Description

model

dropdown

claude-3-haiku-20240307

The model to use for extracting structured data. Available options: claude-3-haiku-20240307, claude-3-sonnet-20240229, claude-3-opus-20240229, claude-3-5-sonnet-latest, claude-3-5-haiku-latest, claude-3-7-sonnet-latest

text

string

Text to extract structured data from.

images

array

Images to extract structured data from. Supported formats: JPG, PNG, JPEG

prompt

string

"Extract the following data from the provided content."

Prompt to guide the AI in extracting structured data.

schema_mode

dropdown

simple

Mode for defining the schema. Available options: simple, advanced

simple_schema

array

Schema definition in simple mode. Each entry should define a field with name, description, type, and whether it's required.

advanced_schema

object

JSON Schema for advanced mode. Provide a complete JSON Schema object for complex data extraction.

max_tokens

integer

2000

The maximum number of tokens to generate.

View JSON Schema

{
  "description": "Extract Structured Data node input.",
  "properties": {
    "model": {
      "default": "claude-3-haiku-20240307",
      "description": "The model to use for extracting structured data.",
      "enum": [
        "claude-3-haiku-20240307",
        "claude-3-sonnet-20240229",
        "claude-3-opus-20240229",
        "claude-3-5-sonnet-latest",
        "claude-3-5-haiku-latest",
        "claude-3-7-sonnet-latest"
      ],
      "title": "Model",
      "type": "string"
    },
    "text": {
      "default": null,
      "description": "Text to extract structured data from.",
      "title": "Text",
      "type": "string"
    },
    "images": {
      "default": null,
      "description": "Images to extract structured data from.",
      "items": {
        "type": "string"
      },
      "title": "Images",
      "type": "array"
    },
    "prompt": {
      "default": "Extract the following data from the provided content.",
      "description": "Prompt to guide the AI in extracting structured data.",
      "title": "Guide Prompt",
      "type": "string"
    },
    "schema_mode": {
      "default": "simple",
      "description": "Mode for defining the schema.",
      "enum": [
        "simple",
        "advanced"
      ],
      "title": "Schema Mode",
      "type": "string"
    },
    "simple_schema": {
      "default": null,
      "description": "Schema definition in simple mode.",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "title": "Name",
            "description": "Name of the field to extract."
          },
          "description": {
            "type": "string",
            "title": "Description",
            "description": "Description of the field to extract."
          },
          "type": {
            "type": "string",
            "title": "Data Type",
            "description": "Data type of the field to extract.",
            "default": "string"
          },
          "is_required": {
            "type": "boolean",
            "title": "Required",
            "description": "Whether the field is required.",
            "default": false
          }
        }
      },
      "title": "Simple Schema",
      "type": "array"
    },
    "advanced_schema": {
      "default": null,
      "description": "JSON Schema for advanced mode.",
      "title": "Advanced Schema",
      "type": "object"
    },
    "max_tokens": {
      "default": 2000,
      "description": "The maximum number of tokens to generate.",
      "title": "Maximum Tokens",
      "type": "integer"
    }
  },
  "required": [],
  "title": "ExtractStructuredDataInput",
  "type": "object"
}

Output Parameters

Name

Type

Description

data

object

The structured data extracted from the input

View JSON Schema

{
  "description": "Extract Structured Data node output.",
  "properties": {
    "data": {
      "title": "Extracted Data",
      "type": "object",
      "description": "The structured data extracted from the input."
    }
  },
  "required": [
    "data"
  ],
  "title": "ExtractStructuredDataOutput",
  "type": "object"
}

How It Works

This node sends your text or image content to Claude along with your defined schema. Claude analyzes the content against your schema definitions and extracts structured data that matches your specified fields. The extracted data is returned as a JSON object with the fields you defined. If your schema defines required fields but they're not found in the input, those fields may be null or omitted depending on the schema settings.

Usage Examples

Example 1: Extract Contact Information (Simple Mode)

Input:

text: "John Smith, Email: [email protected], Phone: 555-123-4567"
schema_mode: "simple"
simple_schema: [
  {"name": "full_name", "description": "Full name", "type": "string", "is_required": true},
  {"name": "email", "description": "Email address", "type": "string", "is_required": true},
  {"name": "phone", "description": "Phone number", "type": "string", "is_required": false}
]

Output:

data: {
  "full_name": "John Smith",
  "email": "[email protected]",
  "phone": "555-123-4567"
}

Example 2: Extract Product Details (Advanced Mode)

Input:

text: "Product: Blue Running Shoes, Size: 10, Price: $89.99, Colors: Blue, Red"
schema_mode: "advanced"
advanced_schema: {
  "type": "object",
  "properties": {
    "product_name": {"type": "string"},
    "size": {"type": "string"},
    "price": {"type": "number"},
    "available_colors": {"type": "array", "items": {"type": "string"}}
  }
}

Output:

data: {
  "product_name": "Blue Running Shoes",
  "size": "10",
  "price": 89.99,
  "available_colors": ["Blue", "Red"]
}

Example 3: Extract from Image

Input:

images: ["https://example.com/invoice.jpg"]
prompt: "Extract invoice details"
schema_mode: "simple"
simple_schema: [
  {"name": "invoice_number", "type": "string", "is_required": true},
  {"name": "total_amount", "type": "number", "is_required": true},
  {"name": "vendor", "type": "string", "is_required": true}
]

Output:

data: {
  "invoice_number": "INV-2024-001",
  "total_amount": 1250.50,
  "vendor": "Acme Supplies Inc"
}

Common Use Cases

Form Data Extraction: Extract structured information from unstructured form submissions
Document Processing: Pull key information from invoices, receipts, and other business documents
Web Scraping: Extract data from web pages and convert to structured JSON format
Image Analysis: Extract structured information from images like screenshots or scanned documents
API Response Parsing: Convert complex API responses into simplified structured formats
Bulk Data Migration: Transform CSV, email, or text data into consistent structured formats
Contact List Building: Extract names, emails, and contact details from various sources

Error Handling

Error Type

Cause

Solution

Invalid Schema

Schema definition doesn't follow JSON Schema format

Review your schema syntax and ensure it's valid JSON Schema in advanced mode

Extraction Failed

Content doesn't contain the required fields

Verify the input content has the information you're trying to extract

Null Values

Required fields not found in the input

Adjust your schema to make fields optional or improve your guide prompt

Type Mismatch

Extracted data doesn't match the defined type

Update your schema or guide prompt to clarify the expected data types

Token Limit Exceeded

Input content is too large

Reduce input size or increase max_tokens parameter

Ambiguous Schema

Schema is too vague to extract consistently

Add more detailed field descriptions in your schema definition

Notes

Simple Mode: Use simple mode for straightforward field extraction. Define fields with names, descriptions, data types, and required status.
Advanced Mode: Use advanced mode for complex schemas with nested objects, arrays, and conditional fields. Provide a complete JSON Schema.
Schema Design: Be specific in your schema definitions. The more detailed your schema, the more accurate the extraction.
Image Support: You can extract structured data from images (JPG, PNG, JPEG) in addition to text.
Model Selection: Opus and Sonnet models handle complex extractions better than Haiku, especially for detailed schemas.
Guide Prompt: Customize the guide prompt to provide additional context about how to extract and interpret the data.

PreviousAsk Claude NextComfy UI

Last updated 3 months ago

{ "description": "Extract Structured Data node input.", "properties": { "model": { "default": "claude-3-haiku-20240307", "description": "The model to use for extracting structured data.", "enum": [ "claude-3-haiku-20240307", "claude-3-sonnet-20240229", "claude-3-opus-20240229", "claude-3-5-sonnet-latest", "claude-3-5-haiku-latest", "claude-3-7-sonnet-latest" ], "title": "Model", "type": "string" }, "text": { "default": null, "description": "Text to extract structured data from.", "title": "Text", "type": "string" }, "images": { "default": null, "description": "Images to extract structured data from.", "items": { "type": "string" }, "title": "Images", "type": "array" }, "prompt": { "default": "Extract the following data from the provided content.", "description": "Prompt to guide the AI in extracting structured data.", "title": "Guide Prompt", "type": "string" }, "schema_mode": { "default": "simple", "description": "Mode for defining the schema.", "enum": [ "simple", "advanced" ], "title": "Schema Mode", "type": "string" }, "simple_schema": { "default": null, "description": "Schema definition in simple mode.", "items": { "type": "object", "properties": { "name": { "type": "string", "title": "Name", "description": "Name of the field to extract." }, "description": { "type": "string", "title": "Description", "description": "Description of the field to extract." }, "type": { "type": "string", "title": "Data Type", "description": "Data type of the field to extract.", "default": "string" }, "is_required": { "type": "boolean", "title": "Required", "description": "Whether the field is required.", "default": false } } }, "title": "Simple Schema", "type": "array" }, "advanced_schema": { "default": null, "description": "JSON Schema for advanced mode.", "title": "Advanced Schema", "type": "object" }, "max_tokens": { "default": 2000, "description": "The maximum number of tokens to generate.", "title": "Maximum Tokens", "type": "integer" } }, "required": [], "title": "ExtractStructuredDataInput", "type": "object" }

hashtagDescription

hashtagProvider

hashtagConnection

hashtagInput Parameters

hashtagOutput Parameters

hashtagHow It Works

hashtagUsage Examples

hashtagExample 1: Extract Contact Information (Simple Mode)

hashtagExample 2: Extract Product Details (Advanced Mode)

hashtagExample 3: Extract from Image

hashtagCommon Use Cases

hashtagError Handling

hashtagNotes

Description

Provider

Connection

Input Parameters

Output Parameters

How It Works

Usage Examples

Example 1: Extract Contact Information (Simple Mode)

Example 2: Extract Product Details (Advanced Mode)

Example 3: Extract from Image

Common Use Cases

Error Handling

Notes