AI & LLM Nodes
Complete reference for all AI and language model nodes in AgenticFlow.
AI nodes are the intelligence layer of AgenticFlow workflows, powered by state-of-the-art language models, image generators, and specialized AI services. These nodes transform data, generate content, analyze information, and make intelligent decisions in your automation workflows.
AI Node Categories
Text Generation & Analysis (12 nodes)
Advanced language models for text processing, conversation, and content creation.
Document Analysis (8 nodes)
AI-powered document processing, OCR, and content extraction.
Image & Video Processing (15 nodes)
AI-driven visual content creation, editing, and analysis.
Audio Processing (6 nodes)
Speech synthesis, transcription, and audio generation.
Search & Knowledge (4 nodes)
Intelligent search, research, and knowledge retrieval.
Text Generation & Analysis Nodes
Primary Language Models
OpenAI GPT Models
| Node | Model | Use Cases | Cost Level |
| --- | --- | --- | --- |
| openai_ask_assistant | GPT-4/GPT-3.5 | Complex reasoning, analysis | Medium-High |
| openai_ask_chat_gpt | GPT-4/GPT-3.5 | Conversations, Q&A | Medium-High |
Configuration Options:
Model Selection: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo
Temperature: 0.0-2.0 (creativity control)
Max Tokens: Response length limit
System Prompt: Role and behavior definition
Response Format: JSON, text, or structured output
Anthropic Claude
| Node | Model | Strengths | Cost Level |
| --- | --- | --- | --- |
| claude_ask | Claude 3.5 Sonnet/Haiku | Analysis, writing, reasoning | Medium-High |
Key Features:
Large Context Window: Handle extensive documents
Strong Reasoning: Excellent for analysis tasks
Safety Focused: Built-in content filtering
Code Understanding: Programming and technical content
Google Gemini
| Node | Model | Features | Cost Level |
| --- | --- | --- | --- |
| google_gen_ai_ask_gemini | Gemini Pro/Flash | Multimodal, code generation | Medium |
Capabilities:
Multimodal Input: Text, images, and documents
Code Generation: Programming in multiple languages
Real-time Data: Access to current information
Cost Effective: Competitive pricing
Specialized Models
| Node | Provider | Specialization | Cost Level |
| --- | --- | --- | --- |
| pml_llm | PixelML | Custom fine-tuned models | Low-Medium |
| llama3 | Meta | Open source, self-hosted | Low |
| straico_prompt_completion | Straico | Multi-model platform | Variable |
Structured Data Extraction
Transform unstructured text into structured data formats.
| Node | Input | Output | Use Cases |
| --- | --- | --- | --- |
| claude_extract_structured_data | Raw text | JSON/XML | Form processing, data mining |
| openai_extract_structured_data | Documents | Structured data | Invoice processing, resume parsing |
| string_to_json | Text | JSON | API response parsing |
Common Extraction Patterns:
{
  "contact_info": {
    "name": "John Smith",
    "email": "[email protected]",
    "phone": "+1-555-123-4567"
  },
  "company": {
    "name": "Tech Corp",
    "industry": "Technology"
  }
}
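A minimal sketch of checking an extraction result shaped like the pattern above before passing it downstream. The function name `parse_contact_record` and the required-field table are assumptions made for illustration; they are not part of AgenticFlow or of any extraction node.

```python
import json

# Fields expected in each section. This is an assumption based on the
# example pattern above, not a schema published by AgenticFlow.
REQUIRED_FIELDS = {
    "contact_info": ("name", "email", "phone"),
    "company": ("name", "industry"),
}


def parse_contact_record(raw_text: str) -> dict:
    """Parse model output as JSON and verify the expected fields exist."""
    record = json.loads(raw_text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object at the top level")

    missing = []
    for section, fields in REQUIRED_FIELDS.items():
        section_value = record.get(section)
        if not isinstance(section_value, dict):
            missing.append(section)
            continue
        missing.extend(f"{section}.{name}" for name in fields if name not in section_value)

    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return record
```

Rejecting incomplete records at this point keeps a malformed model response from reaching a database or a later node.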
Document Analysis Nodes
Text Extraction
| Node | Input Formats | Output | Features |
| --- | --- | --- | --- |
| text_extract | PDF, DOC, images | Plain text | OCR, layout preservation |
| describe_image | Images | Text description | Visual analysis, content description |
Content Processing
| Node | Function | Use Cases |
| --- | --- | --- |
| url_to_markdown | Convert web pages | Content archival, documentation |
| html_to_image | Render HTML as image | Report generation, screenshots |
Supported Document Types:
PDFs: Text extraction, table parsing
Images: OCR, handwriting recognition
Web Pages: Content extraction, formatting preservation
Office Documents: Word, Excel, PowerPoint
Image & Video Processing Nodes
Image Generation
Create images from text descriptions using various AI models.
| Node | Model | Style | Resolution |
| --- | --- | --- | --- |
| generate_image | Stable Diffusion | Realistic, artistic | Up to 1024x1024 |
| generate_image_v2 | Enhanced models | Professional quality | Up to 2048x2048 |
| openai_generate_image | DALL-E 3 | Creative, detailed | 1024x1024, 1792x1024 |
| straico_image_generate | Multiple models | Various styles | Model-dependent |
| imagine_v4 | Midjourney-style | Artistic, creative | High resolution |
Generation Parameters:
Prompt Engineering: Detailed descriptions, style modifiers
Aspect Ratios: Square, portrait, landscape
Style Controls: Photorealistic, artistic, cartoon
Quality Settings: Speed vs quality trade-offs
Image Enhancement & Manipulation
| Node | Function | Technology |
| --- | --- | --- |
| enhance_image_v2 | Quality improvement | AI upscaling |
| magic_upscale | Resolution enhancement | Advanced upscaling |
| remove_background | Background removal | Object detection |
| face_swap | Face replacement | Facial recognition |
| face_detailer | Facial enhancement | Feature improvement |
| inpainting | Fill/remove objects | Content-aware fill |
| outpainting | Extend image borders | Context generation |
Video Generation & Processing
| Node | Input | Output | Duration |
| --- | --- | --- | --- |
| text_to_video | Text prompt | MP4 video | 3-10 seconds |
| image_to_video | Static image | Animated video | 3-5 seconds |
| image_to_video_v2 | Enhanced input | Higher quality | 5-10 seconds |
| image_to_video_v3 | Latest model | Best quality | Up to 10 seconds |
| create_video | Multiple inputs | Custom video | Variable |
| edit_video | Existing video | Modified video | Original length |
Video Capabilities:
Motion Generation: Natural movement from static images
Style Transfer: Apply artistic styles to video
Object Animation: Animate specific elements
Quality Enhancement: Upscaling and stabilization
Audio Processing Nodes
Speech Synthesis
Convert text to natural-sounding speech with various voices and styles.
| Node | Provider | Voice Options | Languages |
| --- | --- | --- | --- |
| text_to_speech | Multiple | 50+ voices | 40+ languages |
| text_to_speech_custom | Custom voices | Brand-specific | Custom training |
| text_to_speech_voice_clone | Voice cloning | Personal voices | Any language |
| openai_text_to_speech | OpenAI | 6 premium voices | Multiple languages |
Voice Features:
Emotional Control: Happy, sad, excited, calm
Speed Control: 0.5x to 2.0x playback speed
Pitch Adjustment: Higher or lower pitch
SSML Support: Advanced speech markup
Speech Recognition
| Node | Provider | Accuracy | Features |
| --- | --- | --- | --- |
| speech_to_text | Generic | High | Multiple formats |
| openai_transcriptions | Whisper | Very High | 99+ languages, punctuation |
Audio Generation
| Node | Function | Output |
| --- | --- | --- |
| text_to_music | Generate music | MP3, WAV |
| search_sound_in_yt_sound_lib | Find audio clips | YouTube library |
Search & Knowledge Nodes
Intelligent Search
| Node | Data Source | Search Type | Results |
| --- | --- | --- | --- |
| knowledge_retrieval | Internal KB | Semantic | Relevant documents |
| openai_search | Vector database | Embedding-based | Contextual matches |
| google_search | Web | Traditional | Web pages |
| perplexity_search | Web + AI | AI-powered | Synthesized answers |
| research_deep_research | Multiple sources | Comprehensive | Research reports |
Search Capabilities:
Semantic Understanding: Intent-based matching
Multi-source Aggregation: Combine multiple data sources
Contextual Ranking: Relevance-based result ordering
Real-time Updates: Access to current information
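Multi-source aggregation and relevance-based ordering can be approximated in a few lines. The sketch below is illustrative only: `merge_results` is a hypothetical helper, and it assumes each result is a dict with a `url` and a `score` already normalized to a comparable scale across sources, which real search nodes may not guarantee.

```python
def merge_results(*sources: list[dict], limit: int = 5) -> list[dict]:
    """Combine result lists, keep the best-scoring copy of each URL, rank by score."""
    best_by_url: dict[str, dict] = {}
    for results in sources:
        for item in results:
            url = item["url"]
            # Keep only the highest-scoring occurrence of a duplicated URL.
            if url not in best_by_url or item["score"] > best_by_url[url]["score"]:
                best_by_url[url] = item
    ranked = sorted(best_by_url.values(), key=lambda item: item["score"], reverse=True)
    return ranked[:limit]
```

For example, merging a web result list with a knowledge-base result list returns one entry per URL, ordered from most to least relevant.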
AI Node Configuration
Common Parameters
Model Settings
{
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 1000,
  "top_p": 0.9,
  "frequency_penalty": 0.0
}
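Settings like these are easy to mistype, so it can help to check them before a run. The following is a sketch, not AgenticFlow behavior: `validate_model_settings` is a hypothetical helper, the temperature range comes from the configuration options listed earlier, and the `top_p` and `frequency_penalty` ranges are assumptions that follow OpenAI's published limits.

```python
# Allowed numeric ranges. temperature matches the 0.0-2.0 range stated earlier;
# the other two are assumptions based on OpenAI's documented limits.
NUMERIC_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
}


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_model_settings(settings: dict) -> list[str]:
    """Return a list of problems; an empty list means the settings look sane."""
    problems = []
    model = settings.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")

    for name, (low, high) in NUMERIC_RANGES.items():
        if name in settings:
            value = settings[name]
            if not _is_number(value) or not low <= value <= high:
                problems.append(f"{name} must be a number between {low} and {high}")

    if "max_tokens" in settings:
        max_tokens = settings["max_tokens"]
        if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
            problems.append("max_tokens must be a positive integer")
    return problems
```

Returning a list of problems rather than raising on the first one lets a workflow report every misconfigured field at once.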
System Prompts
You are a professional assistant that helps with [specific task].
Guidelines:
- Be concise and accurate
- Provide actionable advice
- Ask clarifying questions when needed
- Format responses in [specific format]
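The bracketed placeholders in the template above can be filled programmatically. This is a minimal sketch; `build_system_prompt` and its parameter names are hypothetical and only illustrate the substitution.

```python
SYSTEM_PROMPT_TEMPLATE = """You are a professional assistant that helps with {task}.
Guidelines:
- Be concise and accurate
- Provide actionable advice
- Ask clarifying questions when needed
- Format responses in {response_format}"""


def build_system_prompt(task: str, response_format: str) -> str:
    """Fill the two placeholders, rejecting empty values."""
    task = task.strip()
    response_format = response_format.strip()
    if not task or not response_format:
        raise ValueError("task and response_format must be non-empty")
    return SYSTEM_PROMPT_TEMPLATE.format(task=task, response_format=response_format)
```

Calling `build_system_prompt("invoice triage", "JSON")` yields a prompt whose first line names the task and whose last guideline names the format.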
Output Formatting
JSON Schema: Structured data output
Template-based: Consistent formatting
Conditional Logic: Adaptive responses
Multi-format: Text, JSON, CSV, etc.
Performance Optimization
Cost Management
Right-size Models: Use appropriate model for task complexity
Token Limits: Set reasonable response length limits
Batch Processing: Group similar requests
Caching: Store frequently used results
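The caching idea can be sketched as a small in-memory store keyed by a hash of the request. `ResponseCache` and the `call` callback are hypothetical names for illustration; a production setup would likely add expiry and persistent storage, and caching is only safe when identical inputs should produce identical outputs (for example, at low temperature).

```python
import hashlib
import json


class ResponseCache:
    """In-memory cache keyed by a hash of model, prompt, and settings."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str, settings: dict) -> str:
        # sort_keys makes the key independent of dict insertion order.
        payload = json.dumps(
            {"model": model, "prompt": prompt, "settings": settings}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str, settings: dict, call):
        """Return a cached result, or invoke call(model, prompt, settings) and store it."""
        key = self._key(model, prompt, settings)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call(model, prompt, settings)
        self._store[key] = result
        return result
```

The hit and miss counters give a rough view of how much spend the cache is actually saving.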
Quality Control
Temperature Settings: Balance creativity with consistency
Validation: Check output format and content
Fallback Models: Use backup models for high availability
A/B Testing: Compare model performance
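Validation and fallback models combine naturally: try each model in order and accept the first output that passes a check. The sketch below assumes a caller-supplied `call_model(model, prompt)` function and an `is_valid(output)` predicate; both are hypothetical stand-ins rather than AgenticFlow APIs.

```python
def ask_with_fallback(prompt: str, models: list[str], call_model, is_valid):
    """Try models in order; return (model, output) for the first valid response."""
    errors = []
    for model in models:
        try:
            output = call_model(model, prompt)
        except Exception as exc:  # any provider failure triggers the next model
            errors.append(f"{model}: {exc}")
            continue
        if is_valid(output):
            return model, output
        errors.append(f"{model}: output failed validation")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Returning the model name alongside the output makes it possible to track how often the backup is actually used.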
Integration Patterns
Common AI Workflows
Content Generation Pipeline
Input Data → AI Analysis → Content Generation → Quality Check → Output
Document Processing
Upload Document → Text Extraction → AI Analysis → Structured Data → Database
Conversational AI
User Input → Intent Detection → Knowledge Retrieval → Response Generation → User
Media Creation
Text Prompt → Image Generation → Enhancement → Video Creation → Publishing
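Each of these workflows is a linear chain of stages, where the output of one step feeds the next. The sketch below models the document processing chain with plain functions; `run_pipeline` and the three stage functions are hypothetical stand-ins for real nodes, and the "AI analysis" here is just a word count.

```python
from typing import Any, Callable

Stage = Callable[[Any], Any]


def run_pipeline(data: Any, stages: list[tuple[str, Stage]]) -> tuple[Any, list[str]]:
    """Pass data through each named stage in order and record which stages ran."""
    trace = []
    for name, stage in stages:
        data = stage(data)
        trace.append(name)
    return data, trace


# Stand-in stages; real workflows would call extraction and LLM nodes instead.
def extract_text(document: bytes) -> str:
    return document.decode("utf-8").strip()


def analyze(text: str) -> dict:
    return {"text": text, "word_count": len(text.split())}


def to_record(analysis: dict) -> dict:
    return {"summary": analysis["text"][:40], "words": analysis["word_count"]}
```

Keeping stages as small single-purpose functions makes it easy to swap one model or node for another without touching the rest of the chain.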
Error Handling
Model Failures: Automatic retry with exponential backoff
Rate Limiting: Queue management for API limits
Quality Issues: Content validation and regeneration
Timeout Handling: Graceful degradation for slow responses
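Retry with exponential backoff can be sketched as follows. The helper name, the default delays, and the 10% jitter are assumptions chosen for illustration; they do not describe how AgenticFlow schedules its own retries.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=4, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call operation() until it succeeds, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delays grow as base_delay * 1, 2, 4, ... and are capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # A little random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter exists so the wait can be replaced in tests; in normal use the default `time.sleep` applies.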
Performance Metrics
Key Metrics to Monitor
Response Time: Average processing duration
Success Rate: Percentage of successful completions
Cost per Operation: Credit consumption tracking
Quality Scores: Output relevance and accuracy
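The first three metrics reduce to simple averages over run records. The sketch below assumes a hypothetical `RunRecord` shape with a duration, a success flag, and credits consumed; quality scores are left out because they depend on a task-specific evaluation.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    duration_seconds: float
    succeeded: bool
    credits_used: float


def summarize_runs(runs: list[RunRecord]) -> dict:
    """Compute average response time, success rate, and cost per operation."""
    if not runs:
        raise ValueError("at least one run is required")
    total = len(runs)
    return {
        "average_response_time": sum(r.duration_seconds for r in runs) / total,
        "success_rate": sum(1 for r in runs if r.succeeded) / total,
        "cost_per_operation": sum(r.credits_used for r in runs) / total,
    }
```

Note that cost per operation here counts failed runs too, since failed calls can still consume credits.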
Optimization Strategies
Model Selection: Choose optimal model for each task
Prompt Engineering: Improve input quality
Batch Processing: Reduce overhead costs
Caching Strategy: Store and reuse common results
AI nodes are the core intelligence of AgenticFlow workflows. By combining different AI capabilities, you can create sophisticated automation that understands context, generates content, and makes intelligent decisions. The key to success is choosing the right AI model for each specific task and optimizing for both quality and cost.
Need help with AI node selection? Join our Discord community where AI experts share optimization strategies and implementation patterns.