Audio Transcription + High-Level Analysis

Transcribe audio with speaker labels and timestamps, plus optional high-level analysis (topic identification, quote extraction, and summary generation).

Introduction

Welcome to the documentation for the "Audio Transcription + High-Level Analysis" Workflow. This Workflow transcribes audio files and can also summarize the main themes and topics and extract quotes. Whether you are a journalist, researcher, or business professional, it helps you pull the information you need out of audio recordings quickly. It combines speech recognition with an intuitive interface to deliver transcription and high-level analysis within seconds.

Overview

The "Audio Transcription + High-Level Analysis" Workflow uses state-of-the-art speech recognition to transcribe audio files accurately. Beyond the plain transcript, it provides metadata such as speaker labels and timestamps. Optionally, it also identifies high-level themes and topics and generates a summary, so that you can grasp the key points quickly. The analysis step relies on large language models, which saves you the time and effort of extracting this information from your recordings by hand.

Key Features

Accurate Audio Transcription

The "Audio Transcription + High-Level Analysis" Workflow utilizes advanced speech recognition technology to transcribe audio files accurately. It can handle various audio formats, including MP3, WAV, and M4A, and convert them into text with high precision. This feature ensures that you receive reliable and accurate transcriptions, saving you time and effort in manual transcription.

When working with video files, extract the audio track before uploading and work with the audio file directly. This avoids upload issues.
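One way to extract the audio is with the ffmpeg command-line tool, driven here from Python. This is a minimal sketch, not part of the Workflow itself: it assumes ffmpeg is installed, on your PATH, and built with the libmp3lame encoder. The function names and the example file name are illustrative.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_extract_audio_command(video_path: str, audio_path: Optional[str] = None) -> List[str]:
    """Build an ffmpeg command that writes the audio track of a video to an MP3 file."""
    src = Path(video_path)
    # Default output: same name as the video, with an .mp3 extension.
    dst = Path(audio_path) if audio_path else src.with_suffix(".mp3")
    return [
        "ffmpeg",
        "-i", str(src),         # input video file
        "-vn",                  # skip the video stream
        "-c:a", "libmp3lame",   # encode the audio as MP3
        "-q:a", "2",            # variable-bitrate quality setting
        str(dst),
    ]


def extract_audio(video_path: str, audio_path: Optional[str] = None) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_extract_audio_command(video_path, audio_path), check=True)
```

Calling `extract_audio("interview.mp4")` would write `interview.mp3` next to the video, which you can then upload to the Workflow.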

Theme/Topic Identification

In addition to transcription, the Workflow identifies the main themes and topics discussed. It analyzes the transcribed text and extracts key points using state-of-the-art language models. This frees you from having to listen to the recording repeatedly to understand the gist.

High-Level Summary

The Workflow provides a high-level summary of the audio content. It analyzes the transcribed text and extracts key points, main ideas, and important details. This summary allows you to quickly grasp the essence of the audio recording without having to listen to the entire file. It is a valuable time-saving feature for busy professionals who need to extract information efficiently.

Customizable Analysis

You can customize the high-level analysis. Whether you want to exclude the first speaker, get a concise summary, or get a more detailed overview, you can specify the desired depth of analysis. This customization ensures that the high-level summary matches your needs.

Speaker Identification

The Workflow can identify different speakers in the audio recording. This is particularly useful for interviews, panel discussions, or any recording with multiple participants, because it lets you see who said what in the transcribed text. Speakers are numbered in the order they are first heard: the first voice is labeled 1, the second 2, and so on.

To emphasize interviewees, trim your audio files so that the first voice heard is the interviewer/moderator. You can then use the "Exclude the first speaker" option.
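The effect of excluding the first speaker can be pictured as a simple filter over labeled utterances. The sketch below is illustrative only: the `Utterance` structure and the function name are assumptions, not the Workflow's actual output format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    speaker: int   # 1 = first voice heard, 2 = second voice, and so on
    start: float   # start time in seconds (hypothetical field)
    text: str


def exclude_first_speaker(utterances: List[Utterance]) -> List[Utterance]:
    """Keep every utterance except those spoken by speaker 1."""
    return [u for u in utterances if u.speaker != 1]


transcript = [
    Utterance(1, 0.0, "Thanks for joining. How did the project start?"),
    Utterance(2, 4.2, "It started as a weekend experiment."),
    Utterance(1, 9.8, "And when did it grow?"),
    Utterance(2, 12.1, "About a year later."),
]
interviewee_only = exclude_first_speaker(transcript)
```

If the recording starts with the interviewee instead of the interviewer, the same filter would drop the wrong voice, which is why trimming the audio first matters.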

Timestamps

The Workflow provides timestamps in the transcribed text, indicating the exact timing of each segment/utterance. These timestamps enable you to navigate through the audio recording and locate specific parts of interest quickly. It is a convenient feature for reviewing and referencing the audio content.
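As an illustration of how timestamps help you navigate a recording, the sketch below formats a time offset and selects the utterances that start inside a time window. The tuple layout `(start_seconds, speaker, text)` is an assumption made for the example and may differ from the transcript format the Workflow returns.

```python
from typing import List, Tuple

# Hypothetical utterance layout: (start time in seconds, speaker label, text)
TimedUtterance = Tuple[float, int, str]


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def utterances_between(utterances: List[TimedUtterance], start: float, end: float) -> List[TimedUtterance]:
    """Return utterances whose start time falls in the half-open window [start, end)."""
    return [u for u in utterances if start <= u[0] < end]


timed = [
    (0.0, 1, "Welcome everyone."),
    (65.5, 2, "Let me share the results."),
    (3725.4, 2, "To wrap up, here are the next steps."),
]
first_two_minutes = utterances_between(timed, 0, 120)
```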

How to Use the Workflow

Locate the Workflow on the templates page and click "Use template." You can use the Workflow as is or clone it.

Steps to Transcribe and Analyze Audio Files

  1. Upload Audio File: Select the audio file you want to transcribe and analyze. The Workflow supports various audio formats, including MP3, WAV, and M4A. Ensure that the audio file is accessible and available for upload.

  2. Customize Analysis (Optional): If desired, you can customize the level of detail in the high-level analysis. Select your desired options:

    • Only transcribe / Transcribe and further analysis (i.e., a summary of themes and quote extraction)

    • Keep all utterances / Exclude speaker 1

    • Instructions for summary and quotes

    Note: Summary and quote extraction run only when the "Transcribe and further analysis" option is selected. If you leave the options unspecified, the default analysis settings are used.

  3. Run the Workflow: Once you have provided the input and configured your desired setup, click the “Run Workflow” button (on the App page) or use the run options on your data table (bulk/single run) to initiate the analysis process.

  4. View Results: The Workflow will display the transcribed text and provide a high-level summary of the audio content in separate tabs (columns in bulk run). You can explore the transcribed text, review the speaker identification, and navigate through the timestamps. The high-level summary will help you quickly grasp the key points and insights from the audio recording.
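Before running, it can help to sanity-check your input against the steps above. The sketch below is a hypothetical pre-flight check, not part of the Workflow: `RunOptions` and `check_run` are illustrative names, and the extension list contains only the formats named in this documentation, so other formats may also be accepted.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List

# Formats named in this documentation; the actual list may be longer.
DOCUMENTED_EXTENSIONS = {".mp3", ".wav", ".m4a"}


@dataclass
class RunOptions:
    analyze: bool = False            # False = only transcribe
    exclude_speaker_1: bool = False  # False = keep all utterances
    instructions: str = ""           # instructions for summary and quotes


def check_run(audio_path: str, options: RunOptions) -> List[str]:
    """Return human-readable warnings; an empty list means nothing looks off."""
    warnings = []
    if Path(audio_path).suffix.lower() not in DOCUMENTED_EXTENSIONS:
        warnings.append(f"{audio_path}: format is not one of MP3, WAV, or M4A")
    if options.instructions and not options.analyze:
        # Summary and quote extraction only run when analysis is selected.
        warnings.append("instructions are ignored unless analysis is enabled")
    return warnings
```

For example, `check_run("clip.mov", RunOptions(instructions="Focus on pricing"))` returns two warnings: one for the file format and one for the unused instructions.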

Bulk Run

Apply audio Workflows to multiple files at the same time:

  1. Upload your audio files together in a data table.

  2. Add an enrichment/bulk-run using the desired audio Workflow.

  3. Select the file_url field from the table to access the audio file for the bulk-run. (Note: The URLs expire within 72 hours from upload.)

  4. When the source file name is required, use the source_file_name field from the table.
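Because file URLs expire within 72 hours of upload, it is worth checking a table's age before starting a bulk run. The sketch below shows one way to flag rows that need a fresh upload. It assumes each row is a dictionary with the `source_file_name` field and a hypothetical `uploaded_at` timestamp (timezone-aware), which your data table may or may not expose.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

URL_LIFETIME = timedelta(hours=72)  # expiry window stated in the note above


def url_is_expired(uploaded_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once 72 hours or more have passed since upload."""
    now = now or datetime.now(timezone.utc)
    return now - uploaded_at >= URL_LIFETIME


def rows_needing_reupload(rows: List[Dict], now: Optional[datetime] = None) -> List[str]:
    """List the source file names of rows whose file_url has likely expired."""
    return [
        row["source_file_name"]
        for row in rows
        if url_is_expired(row["uploaded_at"], now)  # uploaded_at is a hypothetical field
    ]
```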

Deep Dive into the Workflow

Workflow Components

If you clone a template or create a Workflow from scratch, you will have access to the Build tab. This is where different components are put together to build a Workflow suitable for your needs.

User Inputs

  • File to Text: An easy-to-use, one-step component that takes care of all you need when uploading a file and extracting text from it.

  • Text Input: An input text component suitable for short text pieces, such as names, topics, or questions.

Workflow Actions

  • Large Language Model (LLM): A component set up to provide access to GPT and other LLMs. In the prompt section, you will provide the required information and instructions for what needs to be done.

Crafting a Good Prompt

  • Provide the context at the top.

  • Be short and precise with your instructions/request.

  • Explicitly note constraints and goals.

  • Include a few examples when possible.
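The guidelines above can be turned into a small prompt template. This is a minimal sketch with illustrative names and section labels. The LLM component's prompt field accepts free text, so the layout here is a suggestion rather than a required format.

```python
from typing import Sequence


def build_prompt(
    context: str,
    instruction: str,
    constraints: Sequence[str] = (),
    examples: Sequence[str] = (),
) -> str:
    """Assemble a prompt: context first, then the task, constraints, and examples."""
    parts = [f"Context:\n{context}", f"Task:\n{instruction}"]
    if constraints:
        # State constraints and goals explicitly, one per line.
        parts.append("Constraints:\n" + "\n".join(f"- {c}" for c in constraints))
    if examples:
        parts.append("Examples:\n" + "\n\n".join(examples))
    return "\n\n".join(parts)


prompt = build_prompt(
    context="The text below is a transcript of a customer interview.",
    instruction="Summarize the main themes and extract one quote per theme.",
    constraints=["Use at most three sentences per theme.", "Quote speakers verbatim."],
    examples=["Theme: Pricing\nQuote: \"The annual plan felt expensive.\""],
)
```

Keeping the context at the top and the constraints in a separate list makes it easy to adjust one part of the prompt without rewriting the rest.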

By following these steps and utilizing the powerful features of the "Audio Transcription + High-Level Analysis" Workflow, you can efficiently transcribe and analyze your audio recordings to gain valuable insights and save time.
