Knowledge

Enhance LLMs with Your Own Context and Data

The Knowledge feature lets the Large Language Model (LLM) in your workflow understand and answer questions about specific topics or domains that its general training data may not cover. This document is a guide to using the Knowledge feature to help your LLM answer user queries accurately.

Uploading Knowledge

When you create a new workflow, you’ll see a section called “Knowledge” in the builder. This is where you can upload documents or datasets that you want your LLM to learn from.

Adding Data

Click on “Add data” to open a modal where you can add data from an uploaded file, a website crawl, a third-party integration, or an existing dataset.

Supported Data Types

  • File: CSV, Excel, PDF, and audio files are supported. For PDF and audio files, text is automatically extracted and stored.

  • Website: Crawl a website to extract content.

  • Third-party: Integrate with third-party services to pull in relevant data.

  • Existing dataset: Use a dataset that has already been uploaded and processed.

Adding Knowledge to LLM

Once you have selected the knowledge items you'd like to use in the workflow, they become available in LLM actions.

Using Knowledge in LLM Actions

When you add an LLM action to the workflow, you can inject knowledge into the prompt with the variable {{ knowledge }}. This variable expands to information from your selected datasets, based on the settings you have configured.
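For example, a prompt template might look like the following. The surrounding wording and the {{ user_question }} variable are illustrative placeholders; only {{ knowledge }} comes from the feature itself.

    You are a support assistant. Answer using only the context below.

    Context:
    {{ knowledge }}

    Question: {{ user_question }}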

If you have multiple knowledge datasets selected, you can reference them individually by using {{ knowledge.dataset_name }}.
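For instance, if a workflow has two datasets selected (the names below are hypothetical), each one can be placed in its own part of the prompt:

    Product documentation:
    {{ knowledge.product_docs }}

    Recent support tickets:
    {{ knowledge.support_tickets }}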

How is Knowledge Selected?

LLMs have a limit on the amount of context you can feed them, and that limit depends on the model you use. Because of this, if a dataset is too large, it must be reduced to fit within the model's context window. The feature gives you a number of configurable options for how this selection is done.
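To illustrate the constraint, here is a minimal Python sketch that trims a set of documents to an estimated token budget. The 4-characters-per-token estimate and the simple truncation strategy are simplifying assumptions for illustration, not the feature's actual selection logic.

    # A minimal sketch of fitting knowledge into a fixed token budget.
    # The chars-per-token ratio and truncation strategy are illustrative
    # assumptions, not how any specific product selects knowledge.

    def fit_to_context(documents, max_tokens=8000, chars_per_token=4):
        """Concatenate documents until the estimated budget is spent."""
        budget = max_tokens * chars_per_token  # rough character budget
        selected, used = [], 0
        for doc in documents:
            remaining = budget - used
            if remaining <= 0:
                break
            selected.append(doc[:remaining])  # truncate the last document
            used += len(selected[-1])
        return "\n\n".join(selected)

    knowledge = fit_to_context(["First dataset text...", "Second dataset text..."])
    prompt = "Use the context below to answer.\n\n" + knowledge + "\n\nQuestion: ..."

In practice, more sophisticated strategies (such as retrieving only the chunks most relevant to the query) are common, since blind truncation can drop exactly the information the question needs.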

You can check the context limit of each model and learn how to manage large datasets for optimal use with your LLMs.
