Audio Transcription

A guide to the Audio Transcription action for converting speech into written text.

The Audio Transcription Action (also known as Speech-to-Text) uses AI speech recognition models to convert spoken audio into written text. It takes an audio file as input and returns a readable transcript.

Typical uses include:

- Transcribing interviews, meetings, or customer service calls.
- Creating subtitles for videos.
- Making audio content searchable.
- Analyzing spoken content with text-based AI tools.

How It Works

You provide an audio file as input. The action sends this file to an advanced speech recognition model, which analyzes the audio and generates a text transcription. For the highest accuracy, it's best to use audio that is clear and has minimal background noise.

Configuration

Input Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| Audio | File | The audio file you want to transcribe (e.g., MP3, WAV, M4A). This should be a file object from a previous action or an uploaded file. |
| Language | Select / Text | The language spoken in the audio. You can select from a list or, if left blank, the model will attempt to auto-detect the language. |
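To make the two inputs concrete, here is a minimal Python sketch of a pre-flight check you might run before passing a file to the action. The helper name `build_transcription_inputs` and the returned dictionary shape are hypothetical and not part of the platform. The extension list only mirrors the examples given above (MP3, WAV, M4A); the action may well accept other formats.

```python
from pathlib import Path
from typing import Optional

# Formats named as examples in the parameter table; this is an assumption,
# not an exhaustive list of what the action supports.
EXAMPLE_EXTENSIONS = {".mp3", ".wav", ".m4a"}


def build_transcription_inputs(audio_path: str, language: Optional[str] = None) -> dict:
    """Return a dict mirroring the action's two inputs (hypothetical helper)."""
    suffix = Path(audio_path).suffix.lower()
    if suffix not in EXAMPLE_EXTENSIONS:
        raise ValueError(f"Unexpected audio format: {suffix or '(none)'}")
    # A missing or whitespace-only language is treated as "auto-detect".
    cleaned = (language or "").strip()
    return {"audio": audio_path, "language": cleaned or None}


print(build_transcription_inputs("support_call.mp3"))
print(build_transcription_inputs("interview.wav", "English"))
```

Representing a blank Language as `None` keeps the "auto-detect" case explicit rather than relying on an empty string.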

Output Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| Output | Text | The full text transcription of the audio file. |

Example: Transcribe and Summarize a Customer Call

Imagine you want to automatically log and summarize every customer support call.

1. Get the Audio: Your workflow is triggered when a new call recording is saved. The audio file is the trigger's output.
2. Configure the Audio Transcription Action:
   - Audio: Connect the call recording file from the trigger ({{trigger.audio_file}}).
   - Language: Leave blank for auto-detection, or specify "English" if all your calls are in English.
3. Summarize the Transcript:
   - The Output of the transcription action will be the full text of the conversation.
   - Connect this output to an LLM Action.
   - Set the LLM's prompt to: Please summarize the following customer service call transcript and identify the customer's main issue and the resolution: {{audio_transcription_action.output}}
4. Log the Summary:
   - Connect the LLM's output to a Google Sheet Action or your CRM to log the call summary for future reference.

This workflow transforms an unstructured audio recording into a concise, actionable summary, saving significant manual effort.
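The placeholder in the LLM prompt, {{audio_transcription_action.output}}, is filled in with the transcript when the workflow runs. The Python sketch below illustrates that substitution under stated assumptions: `render_template` is a hypothetical helper, the nested dictionary stands in for the workflow's step outputs, and the transcript text is made up. The platform's own templating engine may behave differently.

```python
import re
from typing import Any, Mapping

# Matches placeholders such as {{audio_transcription_action.output}}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render_template(template: str, context: Mapping[str, Any]) -> str:
    """Replace each {{dotted.path}} with the value found in a nested mapping."""

    def lookup(match: re.Match) -> str:
        value: Any = context
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if a step's output is missing
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


prompt_template = (
    "Please summarize the following customer service call transcript and "
    "identify the customer's main issue and the resolution: "
    "{{audio_transcription_action.output}}"
)

# Made-up transcript standing in for the transcription action's Output.
context = {
    "audio_transcription_action": {
        "output": "Agent: Hello. Customer: My invoice is wrong."
    }
}

prompt = render_template(prompt_template, context)
print(prompt)
```

Failing loudly with a `KeyError` when a referenced output is missing is a deliberate choice in this sketch, since a silently empty transcript would produce a meaningless summary.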
