Text to speech with voice clone

Action ID: text_to_speech_voice_clone

Description

Convert text to speech with voice cloning across 32 languages.

Connection

| Name | Description | Required |
|------|-------------|----------|

Input Parameters

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| voice_id | string | ✓ | | Voice ID of the cloned voice to use |
| text | string | ✓ | | Text to convert to speech |
| file_name | string | ✓ | | Output file name for the generated audio |

Input Schema

{
  "description": "Text To speech with voice clone node input.",
  "properties": {
    "voice_id": {
      "description": "Voice ID",
      "title": "Voice ID",
      "type": "string"
    },
    "text": {
      "description": "Text",
      "title": "Text",
      "type": "string"
    },
    "file_name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "File name",
      "title": "File name"
    }
  },
  "required": [
    "voice_id",
    "text",
    "file_name"
  ],
  "title": "TextToSpeechVoiceCloneNodeInput",
  "type": "object"
}
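
The schema above lists all three fields as required, while file_name may be either a string or null. As a minimal sketch of what that means for a payload, the check below mirrors those rules in plain Python. The function name validate_input is hypothetical and not part of the node; it only illustrates the constraints stated in the schema.

```python
from typing import Any

# Field names taken from the "required" list of the input schema.
REQUIRED_FIELDS = ("voice_id", "text", "file_name")


def validate_input(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload fits the schema."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in payload:
            problems.append(f"missing required field: {field}")
    for field in ("voice_id", "text"):
        if field in payload and not isinstance(payload[field], str):
            problems.append(f"{field} must be a string")
    # file_name must be present as a key, but the schema accepts string or null.
    if "file_name" in payload:
        value = payload["file_name"]
        if value is not None and not isinstance(value, str):
            problems.append("file_name must be a string or null")
    return problems
```

Note that a payload with "file_name" set to null passes this check, whereas a payload that omits the key does not.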

Output Parameters

| Name | Type | Description |
|------|------|-------------|
| voice_url | string | URL of the generated audio file with cloned voice |

Output Schema

{
  "description": "Text to speech voice clone node output.",
  "properties": {
    "voice_url": {
      "title": "Audio URL",
      "type": "string"
    }
  },
  "required": [
    "voice_url"
  ],
  "title": "TextToSpeechVoiceCloneNodeOutput",
  "type": "object"
}
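
A downstream step usually needs only the voice_url string. The sketch below shows one way to read it defensively from the node output. The helper name extract_voice_url is hypothetical, and the http/https check is an extra assumption layered on top of the schema, which only requires a string.

```python
from urllib.parse import urlparse


def extract_voice_url(output: dict) -> str:
    """Return voice_url from a node output dict, or raise if it looks wrong."""
    if "voice_url" not in output:
        raise KeyError("voice_url is required by the output schema")
    url = output["voice_url"]
    if not isinstance(url, str):
        raise TypeError("voice_url must be a string")
    # Assumption beyond the schema: the URL should be an absolute http(s) URL.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"voice_url is not an http(s) URL: {url!r}")
    return url
```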

How It Works

This node converts text to natural-sounding speech using a cloned voice profile. You provide a voice_id that references a previously cloned voice, the text you want to convert, and a file_name for the generated audio. The node uses the voice characteristics stored in the cloned profile to generate speech that matches the tone, accent, and speaking style of the original voice in any of the 32 supported languages, and returns a voice_url that points to the resulting audio file.

Usage Examples

Example 1: Basic Voice Cloning

Input:

voice_id: "voice_abc123xyz"
text: "Hello, welcome to our platform. We're excited to have you here."
file_name: "welcome-message.mp3"

Output:

voice_url: "https://storage.pixelml.com/welcome-message.mp3"

Example 2: Multilingual Content

Input:

voice_id: "voice_def456uvw"
text: "Bonjour! Comment allez-vous aujourd'hui?"
file_name: "french-greeting.mp3"

Output:

voice_url: "https://storage.pixelml.com/french-greeting.mp3"

Example 3: Long-Form Content

Input:

voice_id: "voice_ghi789rst"
text: "In today's episode, we'll explore the fascinating world of artificial intelligence and how it's transforming the way we live and work. From machine learning to natural language processing, AI is revolutionizing every industry."
file_name: "podcast-intro.mp3"

Output:

voice_url: "https://storage.pixelml.com/podcast-intro.mp3"

Common Use Cases

Content Localization: Create multilingual audio content using the same voice across 32 languages
Podcast Production: Generate podcast episodes with consistent voice characteristics
Audiobook Creation: Convert written content to audiobooks with a specific narrator's voice
Video Narration: Create voiceovers for videos using cloned voice profiles
Virtual Assistants: Build personalized voice assistants with custom voice characteristics
E-Learning: Produce educational content with consistent instructor voices
Personalized Messages: Generate custom audio messages for customers or users

Error Handling

| Error Type | Cause | Solution |
|------------|-------|----------|
| Invalid Voice ID | Voice ID doesn't exist or is inaccessible | Verify the voice_id is correct and the voice has been properly cloned |
| Text Too Long | Input text exceeds maximum length | Split text into smaller chunks and process separately |
| Empty Text | Text field is empty or only whitespace | Provide valid text content to convert to speech |
| Language Not Supported | Text language is not among the 32 supported languages | Use text in one of the supported languages |
| Voice Profile Error | Voice clone profile is corrupted or incomplete | Re-clone the voice or use a different voice_id |
| Connection Failed | Unable to access PixelML API | Check PixelML connection credentials and API availability |
| Processing Timeout | Audio generation took too long | Try with shorter text or retry the operation |
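
Two of the errors above (Connection Failed and Processing Timeout) can be transient, and Empty Text can be caught before the node runs at all. The following is a rough sketch of a wrapper that does both. Everything in it is an assumption for illustration: run_node stands for whatever callable actually executes the node in your environment, and TransientNodeError is a hypothetical exception that you would map the platform's real connection and timeout errors onto.

```python
import time
from typing import Callable


class TransientNodeError(Exception):
    """Hypothetical stand-in for "Connection Failed" or "Processing Timeout"."""


def run_with_retry(
    run_node: Callable[[dict], dict],
    payload: dict,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Reject empty text up front, then retry transient failures with backoff."""
    # Pre-flight check for the "Empty Text" case.
    if not str(payload.get("text") or "").strip():
        raise ValueError("text is empty or only whitespace")
    for attempt in range(attempts):
        try:
            return run_node(payload)
        except TransientNodeError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Errors such as Invalid Voice ID or Language Not Supported are deliberately not retried here, since repeating the same request would not change the result.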

Notes

Voice Quality: The cloned voice quality depends on the quality and characteristics of the original voice sample used for cloning.
Language Support: This node supports 32 languages, making it ideal for international content creation.
Text Length: Longer text may take more time to process. Consider splitting very long content into smaller segments.
Voice Consistency: Using the same voice_id ensures consistent voice characteristics across multiple audio generations.
File Naming: Use descriptive file names to easily identify and organize your generated audio files.
Processing Time: Generation typically takes 5-15 seconds depending on text length and complexity.
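
The Text Length note suggests splitting very long content into smaller segments. One possible way to do that is sketched below: sentences are packed greedily into chunks, and each chunk gets its own numbered file name. The max_chars limit is a parameter you must choose yourself, since this page does not state the node's actual maximum, and the helper names split_text and build_payloads are hypothetical.

```python
import re


def split_text(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of up to max_chars characters.

    A single sentence longer than max_chars is kept whole, so such a chunk
    can exceed the limit.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def build_payloads(voice_id: str, text: str, base_name: str, max_chars: int) -> list[dict]:
    """Build one node input per chunk, reusing the same voice_id for consistency."""
    return [
        {"voice_id": voice_id, "text": chunk, "file_name": f"{base_name}-{i:03d}.mp3"}
        for i, chunk in enumerate(split_text(text, max_chars), start=1)
    ]
```

Reusing one voice_id across all payloads follows the Voice Consistency note, and the numbered file names keep the segments in order for later concatenation.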

Last updated 3 months ago
