Extract Structured Data with Claude
Action ID: claude_extract_structured_data
Description
Extract structured data from text or images using Claude. This node allows you to define schemas in simple or advanced mode and extract data in a structured format.
Provider
Anthropic Claude
Connection
Claude Connection (type: claude, required): The Claude connection to use for extracting structured data.
Input Parameters
model (dropdown, optional, default: claude-3-haiku-20240307): The model to use for extracting structured data. Available options: claude-3-haiku-20240307, claude-3-sonnet-20240229, claude-3-opus-20240229, claude-3-5-sonnet-latest, claude-3-5-haiku-latest, claude-3-7-sonnet-latest
text (string, optional): Text to extract structured data from.
images (array, optional): Images to extract structured data from. Supported formats: JPG, PNG, JPEG
prompt (string, optional, default: "Extract the following data from the provided content."): Prompt to guide the AI in extracting structured data.
schema_mode (dropdown, optional, default: simple): Mode for defining the schema. Available options: simple, advanced
simple_schema (array, optional): Schema definition in simple mode. Each entry should define a field with name, description, type, and whether it's required.
advanced_schema (object, optional): JSON Schema for advanced mode. Provide a complete JSON Schema object for complex data extraction.
max_tokens (integer, optional, default: 2000): The maximum number of tokens to generate.
Output Parameters
data (object): The structured data extracted from the input.
How It Works
This node sends your text or image content to Claude along with your defined schema. Claude analyzes the content against your schema definitions and extracts structured data that matches your specified fields. The extracted data is returned as a JSON object with the fields you defined. If your schema defines required fields but they're not found in the input, those fields may be null or omitted depending on the schema settings.
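To make the relationship between the two schema modes concrete, the sketch below turns simple-mode entries into an equivalent JSON Schema object. The function name simple_to_json_schema is hypothetical, and the mapping is an assumption: this page does not describe how the node converts a simple schema internally. The entry keys (name, description, type, is_required) follow the usage examples in this document.

```python
import json

def simple_to_json_schema(simple_schema):
    """Build a JSON Schema object from simple-mode field entries.

    Assumed mapping, not the node's documented behavior: each entry
    becomes one property, and entries with is_required set to True are
    listed under "required".
    """
    properties = {}
    required = []
    for field in simple_schema:
        prop = {"type": field["type"]}
        # Descriptions are optional in the examples, so copy them only if present.
        if field.get("description"):
            prop["description"] = field["description"]
        properties[field["name"]] = prop
        if field.get("is_required", False):
            required.append(field["name"])
    schema = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema

contact_fields = [
    {"name": "full_name", "description": "Full name", "type": "string", "is_required": True},
    {"name": "email", "description": "Email address", "type": "string", "is_required": True},
    {"name": "phone", "description": "Phone number", "type": "string", "is_required": False},
]

contact_schema = simple_to_json_schema(contact_fields)
print(json.dumps(contact_schema, indent=2))
```

A schema built this way could be pasted into advanced_schema as a starting point when a simple schema needs to grow nested fields.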
Usage Examples
Example 1: Extract Contact Information (Simple Mode)
Input:
text: "John Smith, Email: [email protected], Phone: 555-123-4567"
schema_mode: "simple"
simple_schema: [
{"name": "full_name", "description": "Full name", "type": "string", "is_required": true},
{"name": "email", "description": "Email address", "type": "string", "is_required": true},
{"name": "phone", "description": "Phone number", "type": "string", "is_required": false}
]
Output:
data: {
"full_name": "John Smith",
"email": "[email protected]",
"phone": "555-123-4567"
}
Example 2: Extract Product Details (Advanced Mode)
Input:
text: "Product: Blue Running Shoes, Size: 10, Price: $89.99, Colors: Blue, Red"
schema_mode: "advanced"
advanced_schema: {
"type": "object",
"properties": {
"product_name": {"type": "string"},
"size": {"type": "string"},
"price": {"type": "number"},
"available_colors": {"type": "array", "items": {"type": "string"}}
}
}
Output:
data: {
"product_name": "Blue Running Shoes",
"size": "10",
"price": 89.99,
"available_colors": ["Blue", "Red"]
}
Example 3: Extract from Image
Input:
images: ["https://example.com/invoice.jpg"]
prompt: "Extract invoice details"
schema_mode: "simple"
simple_schema: [
{"name": "invoice_number", "type": "string", "is_required": true},
{"name": "total_amount", "type": "number", "is_required": true},
{"name": "vendor", "type": "string", "is_required": true}
]
Output:
data: {
"invoice_number": "INV-2024-001",
"total_amount": 1250.50,
"vendor": "Acme Supplies Inc"
}
Common Use Cases
Form Data Extraction: Extract structured information from unstructured form submissions
Document Processing: Pull key information from invoices, receipts, and other business documents
Web Scraping: Extract data from web pages and convert to structured JSON format
Image Analysis: Extract structured information from images like screenshots or scanned documents
API Response Parsing: Convert complex API responses into simplified structured formats
Bulk Data Migration: Transform CSV, email, or text data into consistent structured formats
Contact List Building: Extract names, emails, and contact details from various sources
Error Handling
Invalid Schema: Occurs when the schema definition doesn't follow JSON Schema format. Review your schema syntax and ensure it is valid JSON Schema in advanced mode.
Extraction Failed: Occurs when the content doesn't contain the required fields. Verify that the input content has the information you're trying to extract.
Null Values: Occurs when required fields are not found in the input. Adjust your schema to make the fields optional, or improve your guide prompt.
Type Mismatch: Occurs when the extracted data doesn't match the defined type. Update your schema or guide prompt to clarify the expected data types.
Token Limit Exceeded: Occurs when the input content is too large. Reduce the input size. If the generated output is being cut off instead, increase the max_tokens parameter, which limits the number of tokens generated.
Ambiguous Schema: Occurs when the schema is too vague to extract consistently. Add more detailed field descriptions in your schema definition.
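The Null Values and Type Mismatch cases above can be caught in a later workflow step before the data is used. The following is a minimal sketch of such a check, assuming the simple-mode entry keys shown in the usage examples. The helper check_extraction and the TYPE_CHECKS table are hypothetical and not part of the node. Only "string" and "number" appear as simple-mode types in this document; the other type names are assumed to follow JSON Schema.

```python
# Hypothetical post-processing check; the node itself does not expose this.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}

def check_extraction(data, simple_schema):
    """Return a list of problems found in extracted data, in schema order."""
    problems = []
    for field in simple_schema:
        name = field["name"]
        value = data.get(name)
        if value is None:
            # Covers both an omitted key and an explicit null.
            if field.get("is_required", False):
                problems.append(f"{name}: required field is null or missing")
            continue
        check = TYPE_CHECKS.get(field["type"])
        if check is not None and not check(value):
            problems.append(
                f"{name}: expected {field['type']}, got {type(value).__name__}"
            )
    return problems

invoice_fields = [
    {"name": "invoice_number", "type": "string", "is_required": True},
    {"name": "total_amount", "type": "number", "is_required": True},
    {"name": "vendor", "type": "string", "is_required": True},
]

# Example output with a string where a number was expected and a null vendor.
extracted = {"invoice_number": "INV-2024-001", "total_amount": "1250.50", "vendor": None}
for problem in check_extraction(extracted, invoice_fields):
    print(problem)
```

An empty result means every required field is present and every value matches its declared type, so the data can be passed on unchanged.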
Notes
Simple Mode: Use simple mode for straightforward field extraction. Define fields with names, descriptions, data types, and required status.
Advanced Mode: Use advanced mode for complex schemas with nested objects, arrays, and conditional fields. Provide a complete JSON Schema.
Schema Design: Be specific in your schema definitions. The more detailed your schema, the more accurate the extraction.
Image Support: You can extract structured data from images (JPG, PNG, JPEG) in addition to text.
Model Selection: Opus and Sonnet models handle complex extractions better than Haiku, especially for detailed schemas.
Guide Prompt: Customize the guide prompt to provide additional context about how to extract and interpret the data.
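As an illustration of the nested objects and arrays mentioned under Advanced Mode, the sketch below builds a JSON Schema as a Python dictionary and serializes it for use as advanced_schema. All field names (order_id, customer, items, sku, and so on) are hypothetical, and the description texts are examples of the kind of detail the Schema Design note recommends.

```python
import json

# Illustrative advanced-mode schema; every field name here is hypothetical.
order_schema = {
    "type": "object",
    "properties": {
        "order_id": {
            "type": "string",
            "description": "Order identifier exactly as printed on the document",
        },
        # Nested object: customer details grouped under one key.
        "customer": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "description": "Customer full name"},
                "email": {"type": "string", "description": "Customer email address"},
            },
            "required": ["name"],
        },
        # Array of objects: one entry per line item.
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "sku": {"type": "string"},
                    "quantity": {"type": "integer"},
                    "unit_price": {"type": "number", "description": "Price per unit, without currency symbol"},
                },
                "required": ["sku", "quantity"],
            },
        },
    },
    "required": ["order_id", "items"],
}

# Serialize to JSON text that can be supplied as the advanced_schema value.
advanced_schema_json = json.dumps(order_schema, indent=2)
print(advanced_schema_json)
```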