Audio Transcription
Build a workflow to transcribe audio files, identify different speakers, and use an LLM to generate a summary and extract key topics and quotes.
This guide shows how to build a workflow that transcribes an audio file and then analyzes the transcript: it generates a summary, identifies key themes, and extracts notable quotes. It is well suited to interviews, meetings, and other recorded conversations.
Goal
The workflow will take an audio file as input and produce a structured analysis containing:
An accurate transcription with speaker labels and timestamps.
A high-level summary of the conversation.
A list of the main topics and themes discussed.
A collection of key quotes from the audio.
Required Nodes & MCPs
File Input Node: To upload the audio file (e.g., MP3, WAV, M4A).
OpenAI MCP: To use the Whisper model for transcription and a GPT model for analysis.
Save to File Node: To save the final, structured analysis.
Workflow Steps
Step 1: Transcribe the Audio
The first step is to convert the audio into text using OpenAI's Whisper model.
Node: OpenAI MCP
Purpose: To get a clean transcription of the audio file.
Setup:
Action: Create Transcription
File: Use the output from your File Input node: {{file_input_1.file}}.
Response Format: verbose_json (this provides detailed segments with timestamps).
Enable Speaker Diarization: True (this labels the different speakers, e.g., Speaker 0, Speaker 1).
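To make the shape of the transcription output more concrete, here is a minimal sketch that turns verbose_json-style segments into speaker-labeled, timestamped lines. The `start`, `end`, and `text` fields follow OpenAI's verbose_json segment format; the `speaker` field, the sample data, and the helper names are assumptions for illustration, since the exact shape of the diarized output depends on the OpenAI MCP node.

```python
from typing import Any


def format_timestamp(seconds: float) -> str:
    """Render a position in seconds as MM:SS (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(segments: list[dict[str, Any]]) -> str:
    """Build one '[MM:SS-MM:SS] Speaker: text' line per segment."""
    lines = []
    for seg in segments:
        # "speaker" is an assumed field name; fall back if it is absent.
        speaker = seg.get("speaker", "Unknown speaker")
        span = f"{format_timestamp(seg['start'])}-{format_timestamp(seg['end'])}"
        lines.append(f"[{span}] {speaker}: {seg['text'].strip()}")
    return "\n".join(lines)


# Hypothetical sample data shaped like verbose_json segments.
sample_segments = [
    {"start": 0.0, "end": 4.2, "text": " Thanks for joining.", "speaker": "Speaker 0"},
    {"start": 4.2, "end": 65.0, "text": " Happy to be here.", "speaker": "Speaker 1"},
]

print(format_segments(sample_segments))
```

A flat, labeled transcript like this is often easier for the analysis step to attribute quotes from than raw segment JSON.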
Step 2: Analyze the Transcription
Now that we have the text, we'll use a GPT model to analyze it.
Node: OpenAI MCP (a second one)
Purpose: To generate a summary, identify themes, and extract quotes.
Setup:
Action: Chat
Model: gpt-4-turbo
Prompt:
Analyze the following audio transcription. Please provide a response in a structured JSON format with three keys: "summary", "themes", and "quotes".
- "summary": A concise, one-paragraph summary of the entire conversation.
- "themes": A list of the top 5 most important topics discussed.
- "quotes": A list of the 3-5 most impactful or representative quotes from the text, including who said them if possible.
Here is the transcription:
{{openai_mcp_1.text}}
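Because the prompt asks for JSON but the model is not guaranteed to return it cleanly, it can help to check the reply before passing it on. The following is a minimal sketch, assuming the reply arrives as a plain string; `parse_analysis` and `REQUIRED_KEYS` are hypothetical names, not part of the workflow platform.

```python
import json

REQUIRED_KEYS = ("summary", "themes", "quotes")
FENCE = "`" * 3  # Markdown code-fence marker


def parse_analysis(content: str) -> dict:
    """Parse the model's reply and check it has the three requested keys."""
    text = content.strip()
    # Models sometimes wrap JSON in a Markdown code fence; strip it if present.
    if text.startswith(FENCE):
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"analysis is missing keys: {missing}")
    if not isinstance(data["themes"], list) or not isinstance(data["quotes"], list):
        raise ValueError("'themes' and 'quotes' must be lists")
    return data
```

The sketch deliberately does not check the shape of each quote, since the prompt leaves open whether a quote is a plain string or an object with a speaker.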
Step 3: Save the Structured Output
Finally, we'll save the JSON output from the analysis step into a file for easy access.
Node: Save to File
Purpose: To store the complete analysis.
Setup:
File Name: analysis_output.json
Content: {{openai_mcp_2.choices[0].message.content}}
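For comparison, here is roughly what this save step amounts to if done by hand. This is a sketch under assumptions: `save_analysis` is a hypothetical helper, and it assumes the content string is valid JSON; the Save to File node itself may simply write the string as-is.

```python
import json
from pathlib import Path


def save_analysis(content: str, path: str = "analysis_output.json") -> Path:
    """Validate that content is JSON, then write it pretty-printed as UTF-8."""
    data = json.loads(content)  # raises json.JSONDecodeError on invalid JSON
    target = Path(path)
    target.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")
    return target
```

Parsing before writing means a malformed reply fails at this step rather than producing a file that later steps cannot read.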
Final Workflow
This workflow efficiently transforms a raw audio file into a structured, analyzable JSON object. You can easily access the summary, themes, or quotes in subsequent workflow steps for further processing, such as sending the summary in an email or saving the quotes to a database.
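As one illustration of such a follow-up step, the sketch below builds an email from the summary and themes using Python's standard library. The function name, subject line, and addresses are placeholders chosen for this example, and actually sending the message (for instance with `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_summary_email(analysis: dict, sender: str, recipient: str) -> EmailMessage:
    """Compose a plain-text email containing the summary and the theme list."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Audio analysis summary"  # placeholder subject
    themes = "\n".join(f"- {theme}" for theme in analysis["themes"])
    message.set_content(f"{analysis['summary']}\n\nThemes:\n{themes}\n")
    return message
```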