Knowledge Retrieval

Action ID: knowledge_retrieval

Description

Retrieve knowledge from a dataset using semantic similarity search.

Input Parameters

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| dataset | dropdown | ✓ | | The dataset to retrieve knowledge from. |
| query | string | ✓ | | The query used to search the dataset (1-1000 characters). |
| top_k | integer | | 5 | The number of documents to return (1-100). |

JSON Schema:

{
  "description": "Knowledge Retrieval node input.",
  "properties": {
    "dataset": {
      "description": "The dataset to retrieve knowledge from.",
      "title": "Dataset",
      "type": "string"
    },
    "query": {
      "description": "The query to retrieve knowledge from the dataset.",
      "maxLength": 1000,
      "minLength": 1,
      "title": "Query",
      "type": "string"
    },
    "top_k": {
      "default": 5,
      "description": "The number of documents to return.",
      "maximum": 100,
      "minimum": 1,
      "title": "Top K",
      "type": "integer"
    }
  },
  "required": [
    "dataset",
    "query"
  ],
  "title": "KnowledgeRetrievalNodeInput",
  "type": "object"
}
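
The limits in the schema above can be checked before the node is called. The following is a minimal sketch in plain Python, not part of the product: `validate_input` is a hypothetical helper that mirrors the documented constraints only (required `dataset` and `query`, a query of 1-1000 characters, `top_k` between 1 and 100 with a default of 5). The whitespace check follows the "Empty Query" row in the Error Handling section rather than the schema itself.

```python
from typing import Any, Dict

DEFAULT_TOP_K = 5  # default taken from the JSON schema


def validate_input(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a normalized copy of params, or raise ValueError.

    Hypothetical helper: it only mirrors the documented limits.
    """
    for name in ("dataset", "query"):
        if name not in params:
            raise ValueError(f"missing required parameter: {name}")

    dataset = params["dataset"]
    query = params["query"]
    top_k = params.get("top_k", DEFAULT_TOP_K)

    if not isinstance(dataset, str):
        raise ValueError("dataset must be a string")
    if not isinstance(query, str) or not 1 <= len(query) <= 1000:
        raise ValueError("query must be a string of 1-1000 characters")
    if not query.strip():
        raise ValueError("query must not be only whitespace")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(top_k, bool) or not isinstance(top_k, int) or not 1 <= top_k <= 100:
        raise ValueError("top_k must be an integer between 1 and 100")

    return {"dataset": dataset, "query": query, "top_k": top_k}
```

Calling `validate_input({"dataset": "support_kb_001", "query": "Shipping delays"})` fills in `top_k` with the default, which keeps later steps from having to handle a missing value.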

Output Parameters

| Name | Type | Description |
|---|---|---|
| documents | array | The documents that are relevant to the query. |

Document Structure

Each document in the documents array contains:

| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier of the row |
| content | string | Combined content from all cell values |
| metadata | object | Metadata about the document, including the source dataset ID |

JSON Schema:

{
  "description": "Knowledge Retrieval node output.",
  "properties": {
    "documents": {
      "description": "The documents that is relevant to the query.",
      "items": {
        "additionalProperties": true,
        "type": "object"
      },
      "title": "Documents",
      "type": "array"
    }
  },
  "required": [
    "documents"
  ],
  "title": "KnowledgeRetrievalNodeOutput",
  "type": "object"
}
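
Because the schema declares each item as an open object (`additionalProperties: true`), code that consumes the output is safer reading fields defensively. The sketch below is one possible way to do that in Python; `RetrievedDocument` and `parse_documents` are hypothetical names, and any keys beyond `id`, `content`, and `metadata` (for example the `score` shown in the usage examples) are kept in an `extra` dictionary rather than assumed to exist.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

KNOWN_FIELDS = {"id", "content", "metadata"}


@dataclass
class RetrievedDocument:
    """Hypothetical typed view of one item in the documents array."""
    id: str
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    extra: Dict[str, Any] = field(default_factory=dict)  # any other keys


def parse_documents(output: Dict[str, Any]) -> List[RetrievedDocument]:
    """Convert the node output into typed documents, tolerating missing keys."""
    docs = []
    for raw in output.get("documents", []):
        docs.append(
            RetrievedDocument(
                id=str(raw.get("id", "")),
                content=str(raw.get("content", "")),
                metadata=dict(raw.get("metadata") or {}),
                extra={k: v for k, v in raw.items() if k not in KNOWN_FIELDS},
            )
        )
    return docs
```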

How It Works

This node performs semantic similarity search on a dataset to retrieve relevant documents. It uses vector embeddings to find rows whose content is semantically similar to the query text. The node combines all cell values from each matching row into a single content string and returns the top K most relevant documents along with their IDs and metadata. This enables natural language querying of structured data for knowledge retrieval and context-aware applications.
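
The steps described above can be illustrated with a small, self-contained Python sketch. It is not the node's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model (so it matches shared words, not meaning), and the row layout with `id` and `cells` keys is an assumption made for the example. What it does show is the overall flow: combine each row's cell values into one content string, score it against the query with cosine similarity, and return the `top_k` best matches.

```python
import math
import re
from collections import Counter
from typing import Any, Dict, List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(rows: List[Dict[str, Any]], query: str, top_k: int = 5) -> List[Dict[str, Any]]:
    """Rank rows by similarity to the query and return the top_k documents."""
    query_vec = embed(query)
    scored = []
    for row in rows:
        # Combine all cell values of the row into a single content string.
        content = " ".join(str(value) for value in row["cells"].values())
        scored.append(
            {"id": row["id"], "content": content, "score": cosine(query_vec, embed(content))}
        )
    # Most relevant documents first.
    scored.sort(key=lambda doc: doc["score"], reverse=True)
    return scored[:top_k]
```

A production system would replace `embed` with a learned embedding model and typically store the row vectors in an index instead of recomputing them per query.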

Usage Examples

Example 1: Product Documentation Search

Input:

dataset: "product_docs_2024"
query: "How do I reset my password?"
top_k: 3

Output:

documents: [
  {
    "id": "doc_456",
    "content": "Password Reset: Navigate to settings, click 'Security', then 'Reset Password'...",
    "score": 0.92,
    "metadata": {"category": "authentication", "updated": "2024-01-10"}
  },
  {
    "id": "doc_123",
    "content": "Account Security: Managing your password and security settings...",
    "score": 0.87,
    "metadata": {"category": "security", "updated": "2024-01-05"}
  },
  {
    "id": "doc_789",
    "content": "Login Issues: Troubleshooting common authentication problems...",
    "score": 0.78,
    "metadata": {"category": "troubleshooting", "updated": "2023-12-20"}
  }
]

Example 2: Customer Support Knowledge Base

Input:

dataset: "support_kb_001"
query: "Shipping delays international orders"
top_k: 5

Output:

documents: [
  {
    "id": "kb_234",
    "content": "International shipping typically takes 7-14 business days. Customs processing may cause additional delays...",
    "score": 0.89
  },
  {
    "id": "kb_567",
    "content": "Track your international shipment using the tracking number provided...",
    "score": 0.82
  },
  ...
]

Example 3: Research Paper Database

Input:

dataset: "research_papers_ai"
query: "transformer architecture attention mechanisms"
top_k: 10

Output:

documents: [
  {
    "id": "paper_1123",
    "title": "Attention Is All You Need",
    "abstract": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks...",
    "score": 0.95,
    "authors": ["Vaswani et al."],
    "year": 2017
  },
  ...
]

Common Use Cases

Customer Support: Retrieve relevant help articles and documentation based on customer questions
Question Answering Systems: Find contextually relevant information to answer user queries
Semantic Search: Implement intelligent search that understands meaning beyond keywords
RAG (Retrieval-Augmented Generation): Provide context to AI models by retrieving relevant documents
Document Discovery: Help users discover related content and documents in large repositories
Knowledge Base Navigation: Enable natural language search across organizational knowledge bases
Research and Analysis: Find relevant research papers, reports, or documents based on topics

Error Handling

| Error Type | Cause | Solution |
|---|---|---|
| Dataset Not Found | Dataset ID doesn't exist | Verify the dataset parameter contains a valid dataset ID |
| Empty Query | Query string is empty or contains only whitespace | Provide a meaningful query string with at least 1 character |
| Query Too Long | Query exceeds 1000 characters | Shorten your query to 1000 characters or less |
| Invalid Top K | top_k value is outside range 1-100 | Set top_k to a value between 1 and 100 |
| Dataset Not Indexed | Dataset lacks vector embeddings | Ensure the dataset has been properly indexed for semantic search |
| No Results Found | No documents match the query | Try rephrasing your query or expanding the search criteria |
| Embedding Error | Failed to generate query embeddings | Check embedding service availability and retry |
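
The "retry" advice for embedding errors can be sketched as a small wrapper. This page does not say how errors surface to calling code, so everything here is an assumption for illustration: `EmbeddingError` and `NoResultsFound` are hypothetical exception classes, and `call` stands for whatever function actually runs the retrieval. Transient embedding failures are retried with exponential backoff, while an empty result is returned immediately so the caller can rephrase the query.

```python
import time
from typing import Any, Callable, Dict, List


class EmbeddingError(Exception):
    """Hypothetical: query embeddings could not be generated."""


class NoResultsFound(Exception):
    """Hypothetical: no documents matched the query."""


def retrieve_with_retry(
    call: Callable[[], List[Dict[str, Any]]],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, Any]]:
    """Run call(), retrying only on EmbeddingError."""
    for attempt in range(attempts):
        try:
            return call()
        except NoResultsFound:
            return []  # not transient: retrying the same query will not help
        except EmbeddingError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    return []  # only reached when attempts is 0
```

The `sleep` parameter is injectable so the backoff can be tested without real delays.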

Notes

Semantic vs Keyword Search: This node uses semantic search, which understands context and meaning rather than just matching keywords.
Query Quality: More specific, well-formed queries generally produce better results. Avoid overly vague or generic queries.
Top K Selection: Balance retrieving enough context (higher top_k) against maintaining relevance (lower top_k). The default of 5 works well for most cases.
Result Ranking: Documents are returned in order of relevance, with the most relevant documents first.
Score Interpretation: Similarity scores typically range from 0 to 1, with higher scores indicating greater relevance.
Dataset Preparation: Ensure your dataset is properly indexed with embeddings before using this node.
Performance: Retrieval speed depends on dataset size. Larger datasets may take slightly longer to search.
Use with AI Nodes: Combine with AI nodes like Claude or GPT to build RAG systems that answer questions based on retrieved context.
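
As an illustration of the RAG pattern mentioned in the notes above, the following Python sketch turns retrieved documents into a prompt for a downstream AI node. `build_prompt`, its `min_score` and `max_chars` parameters, and the prompt wording are all hypothetical choices, not part of this node. The sketch assumes documents arrive most relevant first and treats `score` as optional, since it appears in the usage examples but not in the documented document structure.

```python
from typing import Any, Dict, List


def build_prompt(
    question: str,
    documents: List[Dict[str, Any]],
    min_score: float = 0.0,
    max_chars: int = 4000,
) -> str:
    """Assemble a context-grounded prompt from retrieved documents."""
    parts = []
    used = 0
    for doc in documents:  # assumed to be ordered most relevant first
        # Documents without a score are kept rather than filtered out.
        if doc.get("score", 1.0) < min_score:
            continue
        snippet = f"[{doc.get('id', '?')}] {doc.get('content', '')}"
        if used + len(snippet) > max_chars:
            break  # stay within the context budget
        parts.append(snippet)
        used += len(snippet)
    context = "\n".join(parts) if parts else "(no relevant documents found)"
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Prefixing each snippet with its document ID makes it easier for the answering model to cite which document a statement came from.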

Last updated 2 months ago
