🤖 AI & LLM Nodes

Complete reference for all AI and language model nodes in AgenticFlow.

AI nodes are the intelligence layer of AgenticFlow workflows. They wrap language models, image generators, and specialized AI services so that a workflow can transform data, generate content, analyze information, and make decisions.


🧠 AI Node Categories

Text Generation & Analysis (12 nodes)

Advanced language models for text processing, conversation, and content creation.

Document Analysis (8 nodes)

AI-powered document processing, OCR, and content extraction.

Image & Video Processing (15 nodes)

AI-driven visual content creation, editing, and analysis.

Audio Processing (6 nodes)

Speech synthesis, transcription, and audio generation.

Search & Knowledge (4 nodes)

Intelligent search, research, and knowledge retrieval.


📝 Text Generation & Analysis Nodes

Primary Language Models

OpenAI GPT Models

| Node | Model | Use Cases | Cost Level |
| --- | --- | --- | --- |
| openai_ask_assistant | GPT-4/GPT-3.5 | Complex reasoning, analysis | Medium-High |
| openai_ask_chat_gpt | GPT-4/GPT-3.5 | Conversations, Q&A | Medium-High |

Configuration Options:

  • Model Selection: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo

  • Temperature: 0.0-2.0 (creativity control)

  • Max Tokens: Response length limit

  • System Prompt: Role and behavior definition

  • Response Format: JSON, text, or structured output

Anthropic Claude

| Node | Model | Strengths | Cost Level |
| --- | --- | --- | --- |
| claude_ask | Claude 3.5 Sonnet/Haiku | Analysis, writing, reasoning | Medium-High |

Key Features:

  • Large Context Window: Handle extensive documents

  • Strong Reasoning: Excellent for analysis tasks

  • Safety Focused: Built-in content filtering

  • Code Understanding: Programming and technical content

Google Gemini

| Node | Model | Features | Cost Level |
| --- | --- | --- | --- |
| google_gen_ai_ask_gemini | Gemini Pro/Flash | Multimodal, code generation | Medium |

Capabilities:

  • Multimodal Input: Text, images, and documents

  • Code Generation: Programming in multiple languages

  • Real-time Data: Access to current information

  • Cost Effective: Competitive pricing

Specialized Models

| Node | Provider | Specialization | Cost Level |
| --- | --- | --- | --- |
| pml_llm | PixelML | Custom fine-tuned models | Low-Medium |
| llama3 | Meta | Open source, self-hosted | Low |
| straico_prompt_completion | Straico | Multi-model platform | Variable |

Structured Data Extraction

Transform unstructured text into structured data formats.

| Node | Input | Output | Use Cases |
| --- | --- | --- | --- |
| claude_extract_structured_data | Raw text | JSON/XML | Form processing, data mining |
| openai_extract_structured_data | Documents | Structured data | Invoice processing, resume parsing |
| string_to_json | Text | JSON | API response parsing |

Common Extraction Patterns:

{
  "contact_info": {
    "name": "John Smith",
    "email": "john.smith@example.com",
    "phone": "+1-555-123-4567"
  },
  "company": {
    "name": "Tech Corp",
    "industry": "Technology"
  }
}
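
Extracted output is only useful downstream if it actually parses and contains the fields you expect. The sketch below shows one way to check that in plain Python. It is not an AgenticFlow API: parse_extraction and REQUIRED_PATHS are hypothetical names, and the required fields are an assumption based on the example object above.

```python
import json

# Assumed required fields, written as key paths into the extracted object.
REQUIRED_PATHS = [
    ("contact_info", "name"),
    ("contact_info", "email"),
    ("company", "name"),
]


def parse_extraction(raw_text):
    """Parse model output as JSON and report problems.

    Returns (data, problems). data is None when the text cannot be used at all;
    problems lists missing fields as dotted paths.
    """
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]

    if not isinstance(data, dict):
        return None, ["top-level value must be a JSON object"]

    problems = []
    for path in REQUIRED_PATHS:
        node = data
        for key in path:
            # Stop at the first missing level and record the full path.
            if not isinstance(node, dict) or key not in node:
                problems.append(".".join(path))
                break
            node = node[key]
    return data, problems
```

A workflow could route records with a non-empty problems list to a regeneration or manual-review step instead of writing them to a database.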

📄 Document Analysis Nodes

Text Extraction

| Node | Input Formats | Output | Features |
| --- | --- | --- | --- |
| text_extract | PDF, DOC, images | Plain text | OCR, layout preservation |
| describe_image | Images | Text description | Visual analysis, content description |

Content Processing

| Node | Function | Use Cases |
| --- | --- | --- |
| url_to_markdown | Convert web pages | Content archival, documentation |
| html_to_image | Render HTML as image | Report generation, screenshots |

Supported Document Types:

  • PDFs: Text extraction, table parsing

  • Images: OCR, handwriting recognition

  • Web Pages: Content extraction, formatting preservation

  • Office Documents: Word, Excel, PowerPoint


🎨 Image & Video Processing Nodes

Image Generation

Create images from text descriptions using various AI models.

| Node | Model | Style | Resolution |
| --- | --- | --- | --- |
| generate_image | Stable Diffusion | Realistic, artistic | Up to 1024x1024 |
| generate_image_v2 | Enhanced models | Professional quality | Up to 2048x2048 |
| openai_generate_image | DALL-E 3 | Creative, detailed | 1024x1024, 1792x1024 |
| straico_image_generate | Multiple models | Various styles | Model-dependent |
| imagine_v4 | Midjourney-style | Artistic, creative | High resolution |

Generation Parameters:

  • Prompt Engineering: Detailed descriptions, style modifiers

  • Aspect Ratios: Square, portrait, landscape

  • Style Controls: Photorealistic, artistic, cartoon

  • Quality Settings: Speed vs quality trade-offs

Image Enhancement & Manipulation

| Node | Function | Technology |
| --- | --- | --- |
| enhance_image_v2 | Quality improvement | AI upscaling |
| magic_upscale | Resolution enhancement | Advanced upscaling |
| remove_background | Background removal | Object detection |
| face_swap | Face replacement | Facial recognition |
| face_detailer | Facial enhancement | Feature improvement |
| inpainting | Fill/remove objects | Content-aware fill |
| outpainting | Extend image borders | Context generation |

Video Generation & Processing

| Node | Input | Output | Duration |
| --- | --- | --- | --- |
| text_to_video | Text prompt | MP4 video | 3-10 seconds |
| image_to_video | Static image | Animated video | 3-5 seconds |
| image_to_video_v2 | Enhanced input | Higher quality | 5-10 seconds |
| image_to_video_v3 | Latest model | Best quality | Up to 10 seconds |
| create_video | Multiple inputs | Custom video | Variable |
| edit_video | Existing video | Modified video | Original length |

Video Capabilities:

  • Motion Generation: Natural movement from static images

  • Style Transfer: Apply artistic styles to video

  • Object Animation: Animate specific elements

  • Quality Enhancement: Upscaling and stabilization


🎵 Audio Processing Nodes

Speech Synthesis

Convert text to natural-sounding speech with various voices and styles.

| Node | Provider | Voice Options | Languages |
| --- | --- | --- | --- |
| text_to_speech | Multiple | 50+ voices | 40+ languages |
| text_to_speech_custom | Custom voices | Brand-specific | Custom training |
| text_to_speech_voice_clone | Voice cloning | Personal voices | Any language |
| openai_text_to_speech | OpenAI | 6 premium voices | Multiple languages |

Voice Features:

  • Emotional Control: Happy, sad, excited, calm

  • Speed Control: 0.5x to 2.0x playback speed

  • Pitch Adjustment: Higher or lower pitch

  • SSML Support: Advanced speech markup

Speech Recognition

| Node | Provider | Accuracy | Features |
| --- | --- | --- | --- |
| speech_to_text | Generic | High | Multiple formats |
| openai_transcriptions | Whisper | Very High | 99+ languages, punctuation |

Audio Generation

| Node | Function | Output |
| --- | --- | --- |
| text_to_music | Generate music | MP3, WAV |
| search_sound_in_yt_sound_lib | Find audio clips | YouTube library |


🔍 Search & Knowledge Nodes

| Node | Data Source | Search Type | Results |
| --- | --- | --- | --- |
| knowledge_retrieval | Internal KB | Semantic | Relevant documents |
| openai_search | Vector database | Embedding-based | Contextual matches |
| google_search | Web | Traditional | Web pages |
| perplexity_search | Web + AI | AI-powered | Synthesized answers |
| research_deep_research | Multiple sources | Comprehensive | Research reports |

Search Capabilities:

  • Semantic Understanding: Intent-based matching

  • Multi-source Aggregation: Combine multiple data sources

  • Contextual Ranking: Relevance-based result ordering

  • Real-time Updates: Access to current information


⚙️ AI Node Configuration

Common Parameters

Model Settings

{
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 1000,
  "top_p": 0.9,
  "frequency_penalty": 0.0
}
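
Settings like these are easy to mistype, so it can help to check them before a run. The following is a minimal sketch, not AgenticFlow's own validation: validate_model_settings is a hypothetical helper, the temperature range (0.0-2.0) is the one listed earlier on this page, and the bounds used for top_p and frequency_penalty are assumptions borrowed from common LLM API conventions.

```python
def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_model_settings(settings):
    """Return a list of problems found; an empty list means the settings look sane."""
    problems = []

    model = settings.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")

    # Assumed ranges: only the temperature range is stated in this reference.
    ranges = {
        "temperature": (0.0, 2.0),
        "top_p": (0.0, 1.0),
        "frequency_penalty": (-2.0, 2.0),
    }
    for name, (low, high) in ranges.items():
        if name in settings:
            value = settings[name]
            if not _is_number(value) or not low <= value <= high:
                problems.append(f"{name} must be between {low} and {high}")

    if "max_tokens" in settings:
        max_tokens = settings["max_tokens"]
        if not _is_number(max_tokens) or not isinstance(max_tokens, int) or max_tokens <= 0:
            problems.append("max_tokens must be a positive integer")

    return problems
```

Returning a list of problems rather than raising on the first one lets a workflow author see every bad value in a single pass.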

System Prompts

You are a professional assistant that helps with [specific task].

Guidelines:
- Be concise and accurate
- Provide actionable advice
- Ask clarifying questions when needed
- Format responses in [specific format]

Output Formatting

  • JSON Schema: Structured data output

  • Template-based: Consistent formatting

  • Conditional Logic: Adaptive responses

  • Multi-format: Text, JSON, CSV, etc.

Performance Optimization

Cost Management

  • Right-size Models: Use appropriate model for task complexity

  • Token Limits: Set reasonable response length limits

  • Batch Processing: Group similar requests

  • Caching: Store frequently used results
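
As an illustration of the caching point, here is a minimal in-memory sketch. ResultCache and its compute callback are hypothetical names, not part of AgenticFlow. The sketch assumes that identical model, prompt, and settings should return the stored result, which is only reasonable when the output is expected to be stable (for example, at low temperature).

```python
import hashlib
import json


class ResultCache:
    """Cache model outputs keyed by a hash of model, prompt, and settings."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, prompt, settings):
        # sort_keys makes the key independent of dictionary ordering.
        payload = json.dumps(
            {"model": model, "prompt": prompt, "settings": settings},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_compute(self, model, prompt, settings, compute):
        """Return the cached result, or call compute(prompt) once and store it."""
        key = self._key(model, prompt, settings)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = compute(prompt)
        self._store[key] = result
        return result
```

The hits and misses counters give a rough view of how much a cache is saving. A production setup would also need expiry and a size limit, which this sketch omits.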

Quality Control

  • Temperature Settings: Balance creativity with consistency

  • Validation: Check output format and content

  • Fallback Models: Use backup models for high availability

  • A/B Testing: Compare model performance
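
Validation and fallback models combine naturally: try each model in order and accept the first output that passes a check. The sketch below assumes a caller-supplied call_model(model, prompt) function and an is_valid(output) predicate. Both are hypothetical stand-ins for whatever node or API call a workflow actually uses.

```python
def run_with_fallback(prompt, models, call_model, is_valid):
    """Try models in order; return (model, output) for the first valid output.

    Raises RuntimeError listing every failure if no model produces valid output.
    """
    errors = []
    for model in models:
        try:
            output = call_model(model, prompt)
        except Exception as exc:  # treat any call failure as a reason to fall back
            errors.append(f"{model}: {exc!r}")
            continue
        if is_valid(output):
            return model, output
        errors.append(f"{model}: output failed validation")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Ordering the list from preferred to backup model keeps cost and quality predictable in the common case while still covering outages.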


🛠️ Integration Patterns

Common AI Workflows

Content Generation Pipeline

Input Data → AI Analysis → Content Generation → Quality Check → Output

Document Processing

Upload Document → Text Extraction → AI Analysis → Structured Data → Database

Conversational AI

User Input → Intent Detection → Knowledge Retrieval → Response Generation → User

Media Creation

Text Prompt → Image Generation → Enhancement → Video Creation → Publishing
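
Each workflow above is a linear chain in which one stage's output becomes the next stage's input. The sketch below models that idea with plain functions, using the Document Processing flow as the example. The stage functions are stand-ins invented for illustration. In a real workflow each would be an AI node rather than local string handling.

```python
def run_pipeline(stages, payload):
    """Pass payload through each (name, function) stage in order.

    Returns the final payload and the list of stage names that ran.
    """
    trace = []
    for name, stage in stages:
        payload = stage(payload)
        trace.append(name)
    return payload, trace


# Stand-in stages; real nodes would call extraction and language-model services.
def extract_text(document):
    return document["raw"].strip()


def analyze(text):
    return {"text": text, "word_count": len(text.split())}


def to_record(analysis):
    return {"summary": analysis["text"][:40], "words": analysis["word_count"]}


stages = [
    ("Text Extraction", extract_text),
    ("AI Analysis", analyze),
    ("Structured Data", to_record),
]
record, trace = run_pipeline(stages, {"raw": "  Invoice 42 from Tech Corp  "})
```

Keeping stages as named, single-purpose functions makes it straightforward to swap one model for another or to insert a quality check between two steps.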

Error Handling

  • Model Failures: Automatic retry with exponential backoff

  • Rate Limiting: Queue management for API limits

  • Quality Issues: Content validation and regeneration

  • Timeout Handling: Graceful degradation for slow responses
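
To make the retry behavior concrete, here is a generic sketch of exponential backoff with a small random jitter. It illustrates the technique only and does not describe how AgenticFlow implements retries internally. retry_with_backoff and its default values are assumptions chosen for the example.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=4, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait doubles after each failure (base_delay, 2x, 4x, ...), is capped
    at max_delay, and gets up to 10% random jitter so that many clients do
    not retry in lockstep. The last failure is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

The sleep parameter is injectable so the function can be exercised in tests without real waiting. In practice it is usually better to retry only on errors known to be transient, such as timeouts or rate-limit responses.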


📊 Performance Metrics

Key Metrics to Monitor

  • Response Time: Average processing duration

  • Success Rate: Percentage of successful completions

  • Cost per Operation: Credit consumption tracking

  • Quality Scores: Output relevance and accuracy
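
The first three metrics can be computed from a simple log of node runs. The sketch below assumes a hypothetical RunRecord with a duration, a success flag, and a credit cost. The field names are illustrative and do not reflect an AgenticFlow export format.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    duration_s: float   # processing time for one node run, in seconds
    succeeded: bool     # whether the run completed successfully
    credits: float      # credits consumed by the run


def summarize(records):
    """Compute average response time, success rate, and cost per operation."""
    if not records:
        raise ValueError("at least one record is required")
    total = len(records)
    return {
        "avg_response_time_s": sum(r.duration_s for r in records) / total,
        "success_rate": sum(1 for r in records if r.succeeded) / total,
        "cost_per_operation": sum(r.credits for r in records) / total,
    }
```

Quality scores are left out because they depend on a task-specific judgment of relevance and accuracy rather than on run logs alone.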

Optimization Strategies

  • Model Selection: Choose optimal model for each task

  • Prompt Engineering: Improve input quality

  • Batch Processing: Reduce overhead costs

  • Caching Strategy: Store and reuse common results


AI nodes are the core intelligence of AgenticFlow workflows. By combining different AI capabilities, you can create sophisticated automation that understands context, generates content, and makes intelligent decisions. The key to success is choosing the right AI model for each specific task and optimizing for both quality and cost.

Need help with AI node selection? Join our Discord community where AI experts share optimization strategies and implementation patterns.
