Lipsync

Action ID: lipsync

Description

Synchronize lip movements in a video to match a provided audio track.

Connection

| Name | Description | Required | Category |
| --- | --- | --- | --- |
| PixelML Connection | The PixelML connection used to call the PixelML API. | True | pixelml |

Input Parameters

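| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| input_video_url | string | True | - | Input video URL to synchronize |
| audio_url | string | True | - | Audio URL to match with video lip movements |
| file_name | string | True | - | Output file name for the lip-synced video (the schema also accepts null) |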

Input Schema

{
  "description": "Lipsync node input.",
  "properties": {
    "input_video_url": {
      "description": "Input video URL",
      "title": "Input video URL",
      "type": "string"
    },
    "audio_url": {
      "description": "Audio URL",
      "title": "Audio URL",
      "type": "string"
    },
    "file_name": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "description": "File name",
      "title": "File name"
    }
  },
  "required": [
    "input_video_url",
    "audio_url",
    "file_name"
  ],
  "title": "LipsyncNodeInput",
  "type": "object"
}
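
The schema above can be checked on the client side before the node runs. The following is a minimal sketch in plain Python, not part of the PixelML tooling: the function name `validate_lipsync_input` and the wording of the messages are assumptions made for illustration. It mirrors the schema as written, so `file_name` must be present but may be null.

```python
from typing import Any

# Field names come from the "required" list in the Input Schema.
REQUIRED_FIELDS = ("input_video_url", "audio_url", "file_name")


def validate_lipsync_input(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload fits the schema.

    Hypothetical helper for illustration only.
    """
    problems: list[str] = []

    # Every field in the schema's "required" list must be present.
    for field in REQUIRED_FIELDS:
        if field not in payload:
            problems.append(f"missing required field: {field}")

    # Both URL fields are typed as plain strings.
    for field in ("input_video_url", "audio_url"):
        if field in payload and not isinstance(payload[field], str):
            problems.append(f"{field} must be a string")

    # file_name is "anyOf" string or null, so None is accepted.
    file_name = payload.get("file_name")
    if file_name is not None and not isinstance(file_name, str):
        problems.append("file_name must be a string or null")

    return problems
```

This only checks presence and types. It does not confirm that the URLs are reachable or that the media formats are supported.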

Output Parameters

| Name | Type | Description |
| --- | --- | --- |
| video_url | string | URL of the lip-synced video |

Output Schema

{
  "description": "Lipsync node output.",
  "properties": {
    "video_url": {
      "title": "Video URL",
      "type": "string"
    }
  },
  "required": [
    "video_url"
  ],
  "title": "LipsyncNodeOutput",
  "type": "object"
}
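
A downstream step may want to confirm that the node output matches this schema before using it. The sketch below assumes the output arrives as a dictionary. `LipsyncOutput` and `parse_lipsync_output` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class LipsyncOutput:
    """Typed view of the node output; mirrors the Output Schema."""
    video_url: str


def parse_lipsync_output(raw: Mapping[str, Any]) -> LipsyncOutput:
    """Check the single required field and wrap it in a dataclass.

    Hypothetical helper for illustration only.
    """
    video_url = raw.get("video_url")
    if not isinstance(video_url, str):
        raise ValueError("output must contain a string 'video_url'")
    return LipsyncOutput(video_url=video_url)
```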

How It Works

This node analyzes both the input video and the audio track, then synchronizes the lip movements of visible faces in the video to the provided audio. Processing involves three steps: detecting faces in the video, analyzing the audio waveform and phonemes, and generating lip movements that match the spoken words or sounds in the audio track.

Usage Examples

Example 1: Video Dubbing

Input:

input_video_url: "https://example.com/original-video.mp4"
audio_url: "https://example.com/translated-audio.mp3"
file_name: "dubbed-video.mp4"

Output:

video_url: "https://storage.pixelml.com/dubbed-video.mp4"

Example 2: Voice Replacement

Input:

input_video_url: "https://cdn.example.com/interview.mp4"
audio_url: "https://cdn.example.com/cleaned-audio.wav"
file_name: "enhanced-interview.mp4"

Output:

video_url: "https://storage.pixelml.com/enhanced-interview.mp4"

Example 3: Content Localization

Input:

input_video_url: "https://media.example.com/tutorial-english.mp4"
audio_url: "https://media.example.com/tutorial-spanish.mp3"
file_name: "tutorial-spanish.mp4"

Output:

video_url: "https://storage.pixelml.com/tutorial-spanish.mp4"
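
The inputs in the examples above map directly onto the Input Schema. As a rough sketch, the snippet below builds that input as JSON for Example 1. The helper name `build_lipsync_input` is an assumption, and how the JSON is handed to the node (workflow editor, API call, or otherwise) is not covered here.

```python
import json
from typing import Optional


def build_lipsync_input(
    input_video_url: str,
    audio_url: str,
    file_name: Optional[str],
) -> str:
    """Serialize the three schema fields into a JSON string.

    Hypothetical helper for illustration only.
    """
    payload = {
        "input_video_url": input_video_url,
        "audio_url": audio_url,
        "file_name": file_name,
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    # Values taken from Example 1: Video Dubbing.
    print(
        build_lipsync_input(
            "https://example.com/original-video.mp4",
            "https://example.com/translated-audio.mp3",
            "dubbed-video.mp4",
        )
    )
```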

Common Use Cases

  • Video Dubbing: Create dubbed versions of videos in different languages with matching lip movements

  • Content Localization: Adapt video content for international audiences with synchronized audio

  • Audio Quality Enhancement: Replace poor-quality audio in videos while maintaining natural lip sync

  • Voiceover Matching: Synchronize professional voiceovers with existing video footage

  • Dialogue Replacement: Replace or edit dialogue in videos while maintaining realistic lip movements

  • Video Production: Create polished video content with closely synchronized audio and video

  • Educational Content: Produce multi-language educational videos with accurate lip sync

Error Handling

| Error Type | Cause | Solution |
| --- | --- | --- |
| No Face Detected | Video does not contain visible faces | Ensure the video has clearly visible faces for lip sync processing |
| Invalid Video Format | Video format is not supported | Use common video formats like MP4, MOV, or AVI |
| Invalid Audio Format | Audio format is not supported | Convert audio to MP3, WAV, or AAC format |
| Video Too Long | Video exceeds maximum duration limit | Split the video into shorter segments and process separately |
| Audio Mismatch | Audio duration significantly differs from video | Ensure audio length is compatible with the video duration |
| Connection Failed | Unable to access PixelML API | Check PixelML connection credentials and API availability |
| Processing Timeout | Processing took longer than expected | Try with a shorter video or retry the operation |
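
Some of the format errors above can be caught before the node runs. The sketch below checks URL file extensions against the formats named in the Solution column. It is a heuristic under stated assumptions: the extension sets are only the formats listed above and may not be the full set the node accepts, and `preflight_problems` is a hypothetical helper name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: only the formats named in the error table are listed here.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".aac"}


def url_extension(url: str) -> str:
    """Return the lowercased file extension of the URL path, ignoring the query."""
    return PurePosixPath(urlparse(url).path).suffix.lower()


def preflight_problems(input_video_url: str, audio_url: str) -> list[str]:
    """Return likely format problems; an empty list means none were spotted."""
    problems: list[str] = []
    video_ext = url_extension(input_video_url)
    audio_ext = url_extension(audio_url)
    if video_ext not in VIDEO_EXTENSIONS:
        problems.append(f"video extension {video_ext!r} is not in the listed formats")
    if audio_ext not in AUDIO_EXTENSIONS:
        problems.append(f"audio extension {audio_ext!r} is not in the listed formats")
    return problems
```

An extension check cannot detect a missing face, an overlong video, or a duration mismatch. Those still have to be handled when the node reports them.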

Notes

  • Video Quality: Higher-resolution videos generally produce better lip sync results but may take longer to process.

  • Face Visibility: Ensure faces are clearly visible and well-lit for optimal lip synchronization.

  • Audio Quality: Use clear audio files for best results. Background noise may affect synchronization accuracy.

  • Processing Time: Lip sync processing can take several minutes depending on video length and complexity.

  • Multiple Faces: The node can handle multiple faces in a video, synchronizing all visible speakers.

  • File Naming: Use descriptive file names to easily identify your lip-synced videos in storage.
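
Because processing can take several minutes and may time out, callers may want to retry the operation, as the Error Handling table suggests. The following is a minimal retry sketch with exponential backoff. How the node actually surfaces timeouts and connection failures is not documented here, so Python's built-in `TimeoutError` and `ConnectionError` stand in as assumptions, and `operation` is any caller-supplied function that runs the node.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or the attempts run out.

    The delay doubles after each failure: base_delay, 2 * base_delay, and so on.
    Exception types in retry_on are assumed stand-ins, not documented behavior.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            # Re-raise on the final attempt so the caller sees the real error.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Retrying only helps with transient failures. Errors such as No Face Detected or Invalid Video Format will fail again with the same input.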
