Extract Structured Data

Learn how to use the Extract Structured Data action to pull specific, organized information from unstructured text.

The Extract Structured Data action parses unstructured text and turns it into organized, predictable data. It uses an AI model (such as OpenAI's GPT or Anthropic's Claude) to find and extract the information you need, based on a schema you provide.

This is useful for processing emails, documents, or any other block of text from which you need to pull specific details such as names, dates, companies, or order numbers.

Configuration

There are two main parts to configuring this action: the text to process and the schema to extract.

Input Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| Unstructured Text | Text | The block of text you want to extract data from. This can be static text or a variable from a previous action (e.g., {{email_body.content}}). |
| Data Definition | Array | This is where you define the schema of the data you want to extract. It's an array of objects, where each object defines one piece of information to find. |

Defining Your Schema

For each piece of data you want to extract, you need to define the following:

  • Name: A unique name for the data point (e.g., company_name, invoice_date). This will become the key in the output JSON.

  • Description: A clear description of what the AI should look for. Be specific! For example, "The full name of the customer who sent the email."

  • Data Type: The type of data to expect (string, number, or boolean).

  • Fail if Not Present?: If checked, the action fails when it cannot find this piece of data in the text. In the example configuration below, this setting appears as prop_is_required.
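Before pasting a schema into the action, it can help to check it for simple mistakes such as duplicate names or unsupported data types. The following is a minimal Python sketch of such a check, not part of the product: the function check_data_definition is hypothetical, and the key names (prop_name, prop_description, prop_data_type, prop_is_required) are taken from the example configuration in this article.

```python
# Hypothetical pre-flight check for a Data Definition. The key names follow
# the example configuration in this article; the checker itself is an
# illustration and is not provided by the action.
ALLOWED_TYPES = {"string", "number", "boolean"}
EXPECTED_KEYS = {"prop_name", "prop_description", "prop_data_type", "prop_is_required"}


def check_data_definition(definition):
    """Return a list of problems found in a Data Definition (empty if none)."""
    problems = []
    seen_names = set()
    for index, entry in enumerate(definition):
        missing = EXPECTED_KEYS - entry.keys()
        if missing:
            problems.append(f"entry {index}: missing keys {sorted(missing)}")
            continue
        name = entry["prop_name"]
        # Names become keys in the output JSON, so they must be unique.
        if name in seen_names:
            problems.append(f"entry {index}: duplicate name '{name}'")
        seen_names.add(name)
        if entry["prop_data_type"] not in ALLOWED_TYPES:
            problems.append(
                f"entry {index}: unsupported data type '{entry['prop_data_type']}'"
            )
        if not entry["prop_description"].strip():
            problems.append(f"entry {index}: empty description")
    return problems


definition = [
    {
        "prop_name": "company_name",
        "prop_description": "The full legal name of the company.",
        "prop_data_type": "string",
        "prop_is_required": True,
    },
    {
        "prop_name": "company_name",  # duplicate on purpose
        "prop_description": "The date printed on the invoice.",
        "prop_data_type": "date",  # not one of string, number, boolean
        "prop_is_required": False,
    },
]

for problem in check_data_definition(definition):
    print(problem)
```

The second entry is deliberately wrong in two ways, so the sketch reports both a duplicate name and an unsupported data type.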

Example Configuration

Imagine you have the following text in a variable {{email_content}}:

"Hi team, please be advised that our client, Innovate Corp, has scheduled their quarterly review for July 25, 2024. The invoice number is INV-9583. Thanks, John."

You could configure the Data Definition like this:

[
  {
    "prop_name": "client_company",
    "prop_description": "The name of the client company mentioned in the email.",
    "prop_data_type": "string",
    "prop_is_required": true
  },
  {
    "prop_name": "review_date",
    "prop_description": "The date of the scheduled review.",
    "prop_data_type": "string",
    "prop_is_required": true
  },
  {
    "prop_name": "invoice_number",
    "prop_description": "The unique invoice number.",
    "prop_data_type": "string",
    "prop_is_required": false
  }
]
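If you are used to JSON Schema, a Data Definition like this one maps onto it fairly directly. The Python sketch below shows that correspondence for illustration only. How the action actually passes your schema to the AI model is not documented here, and the helper to_json_schema is a hypothetical name.

```python
import json


def to_json_schema(definition):
    """Translate a Data Definition into an equivalent JSON Schema object.

    Illustration only: this is not necessarily what the action does internally.
    """
    properties = {}
    required = []
    for entry in definition:
        properties[entry["prop_name"]] = {
            "type": entry["prop_data_type"],  # string, number, or boolean
            "description": entry["prop_description"],
        }
        if entry["prop_is_required"]:
            required.append(entry["prop_name"])
    return {"type": "object", "properties": properties, "required": required}


definition = json.loads("""
[
  {"prop_name": "client_company",
   "prop_description": "The name of the client company mentioned in the email.",
   "prop_data_type": "string", "prop_is_required": true},
  {"prop_name": "review_date",
   "prop_description": "The date of the scheduled review.",
   "prop_data_type": "string", "prop_is_required": true},
  {"prop_name": "invoice_number",
   "prop_description": "The unique invoice number.",
   "prop_data_type": "string", "prop_is_required": false}
]
""")

schema = to_json_schema(definition)
print(json.dumps(schema, indent=2))
```

In this view, each prop_name becomes a property, and only the entries with prop_is_required set to true end up in the required list.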

Output

The action outputs a single JSON object containing the data it successfully extracted based on your schema.

Output Parameter

| Parameter | Type | Description |
| --- | --- | --- |
| data | JSON | A JSON object where the keys are the prop_name values from your schema and the values are the extracted information. |

Example Output

Based on the example above, the output {{extract_action.data}} would look like this:

{
  "client_company": "Innovate Corp",
  "review_date": "July 25, 2024",
  "invoice_number": "INV-9583"
}
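If a later step depends on this output, you may want to confirm that it matches the schema before using it. Here is a minimal Python sketch of such a downstream check, assuming you have both the Data Definition and the data object available as JSON. The function check_extracted_data is hypothetical, and treating a missing optional field as acceptable is an assumption based on the "Fail if Not present?" setting.

```python
import json

# Python types accepted for each schema data type.
PYTHON_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def check_extracted_data(definition, data):
    """Compare an extracted data object with the Data Definition that produced it."""
    problems = []
    for entry in definition:
        name = entry["prop_name"]
        data_type = entry["prop_data_type"]
        if name not in data or data[name] is None:
            # Only required fields count as a problem when absent.
            if entry["prop_is_required"]:
                problems.append(f"required field '{name}' is missing")
            continue
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly
        # for anything that is not declared as boolean.
        if isinstance(value, bool) and data_type != "boolean":
            problems.append(f"field '{name}' should be a {data_type}")
        elif not isinstance(value, PYTHON_TYPES[data_type]):
            problems.append(f"field '{name}' should be a {data_type}")
    return problems


definition = [
    {"prop_name": "client_company", "prop_data_type": "string", "prop_is_required": True},
    {"prop_name": "review_date", "prop_data_type": "string", "prop_is_required": True},
    {"prop_name": "invoice_number", "prop_data_type": "string", "prop_is_required": False},
]

data = json.loads("""
{
  "client_company": "Innovate Corp",
  "review_date": "July 25, 2024",
  "invoice_number": "INV-9583"
}
""")

print(check_extracted_data(definition, data))
```

Note that review_date is declared as a string, so the date arrives as written in the text ("July 25, 2024"). If you need a real date value, parse it in a later step.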
