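Column: Dance with GenAI

Everything about generative AI (AIGC)

Batch-transcribing MP3 audio files from your local computer to text for free with Google Colab

Dance with GenAI · WeChat Official Account · 2024-10-17 17:22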


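First, upload the MP3 audio files to Google Drive.

You can also install the desktop client, Google Drive for desktop, which makes uploading more convenient.

The audio files are then synced to Drive automatically and quickly.

Next, use OpenAI's Whisper model: https://github.com/openai/whisper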
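
Before transcribing, it can help to confirm that the sync has finished. The following is a minimal sketch, assuming Drive is already mounted in Colab at /content/drive and that the files sit under a folder named myaudio (the folder name used in the prompt in this article). The helper name count_audio_files is hypothetical.

```python
import os
from collections import Counter

AUDIO_EXTENSIONS = ('.mp3', '.wav', '.m4a')


def count_audio_files(root_folder, extensions=AUDIO_EXTENSIONS):
    """Return a Counter mapping each folder path to the number of audio files in it."""
    counts = Counter()
    for folder, _dirs, files in os.walk(root_folder):
        # Compare in lower case so that names such as "TALK.MP3" are counted too.
        n = sum(1 for name in files if name.lower().endswith(extensions))
        if n:
            counts[folder] = n
    return counts


if __name__ == "__main__":
    # Assumed mount point and folder name; adjust to your own Drive layout.
    root = '/content/drive/MyDrive/myaudio'
    for folder, n in sorted(count_audio_files(root).items()):
        print(f"{folder}: {n} audio file(s)")
```

If the counts match what is on the local computer, the upload is complete.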

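Enter the following prompt in ChatGPT:

You are an expert programmer. Write a Google Colab ipynb script that performs the following tasks:

Download the Whisper large-v3-turbo speech transcription model files from Hugging Face and save them to the myaudio folder in Google Drive;

Read the audio files in all subfolders of the myaudio directory in Google Drive;

Load the Whisper large-v3-turbo model from Google Drive and transcribe every audio file to text, saving each result as a txt file that has the same name as the audio file and sits in the same folder as the audio file;

Notes:

Install the pydub library;

Install ffmpeg;

When calling the Whisper model, explicitly set the language to English;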

Code example:

import whisper
model = whisper.load_model("turbo")
result = model.transcribe("audio.mp3")
print(result["text"])

Source code generated by ChatGPT:

# Install the required libraries
!pip install pydub
!apt-get install -y ffmpeg
!pip install -q git+https://github.com/openai/whisper.git
!pip install tqdm

import os
import whisper
from tqdm import tqdm
from google.colab import drive
from pydub import AudioSegment

# Mount Google Drive
drive.mount('/content/drive')

# Download and load the Whisper model
model = whisper.load_model("large-v3-turbo")

# Set the path of the audio folder
audio_folder = '/content/drive/MyDrive/myaudio'

# Walk through all subfolders and find the audio files
for root, dirs, files in os.walk(audio_folder):
    for file in tqdm(files):
        if file.lower().endswith(('.mp3', '.wav', '.m4a')):  # supported audio formats
            audio_path = os.path.join(root, file)
            print(f"Transcribing: {audio_path}")

            # Transcribe the audio, explicitly setting the language to English
            result = model.transcribe(audio_path, language="en")
            transcript = result['text']

            # Save the transcript next to the audio file, under the same base name
            txt_filename = os.path.splitext(file)[0] + '.txt'
            txt_path = os.path.join(root, txt_filename)
            with open(txt_path, 'w', encoding='utf-8') as txt_file:
                txt_file.write(transcript)
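
The generated script leaves two points open: it does not store the model under myaudio in Drive as the prompt requested, and rerunning it transcribes every file again. The following is a sketch of one possible way to handle both, not the script ChatGPT produced. It assumes openai-whisper is installed as above and uses the download_root argument of whisper.load_model to choose where the checkpoint is cached. The subfolder name whisper_model and the helper names load_model_from_drive and transcribe_missing are hypothetical.

```python
import os

AUDIO_EXTENSIONS = ('.mp3', '.wav', '.m4a')


def load_model_from_drive(model_dir, name="large-v3-turbo"):
    """Load Whisper, caching the checkpoint in model_dir (downloaded only if missing)."""
    import whisper  # imported lazily so the helper below also works without it
    os.makedirs(model_dir, exist_ok=True)
    return whisper.load_model(name, download_root=model_dir)


def transcribe_missing(model, audio_folder, language="en"):
    """Transcribe audio files that have no matching .txt yet; return the new .txt paths."""
    written = []
    for root, _dirs, files in os.walk(audio_folder):
        for file in sorted(files):
            if not file.lower().endswith(AUDIO_EXTENSIONS):
                continue
            txt_path = os.path.join(root, os.path.splitext(file)[0] + '.txt')
            if os.path.exists(txt_path):
                continue  # already transcribed in an earlier run
            result = model.transcribe(os.path.join(root, file), language=language)
            with open(txt_path, 'w', encoding='utf-8') as txt_file:
                txt_file.write(result['text'])
            written.append(txt_path)
    return written


# Possible usage in Colab after drive.mount('/content/drive') (paths are assumptions):
#   model = load_model_from_drive('/content/drive/MyDrive/myaudio/whisper_model')
#   transcribe_missing(model, '/content/drive/MyDrive/myaudio')
```

Because the existence check happens per file, an interrupted Colab session can be resumed by running the same cell again. One caveat of this sketch: two audio files in the same folder that share a base name (for example a.mp3 and a.wav) map to the same txt file, so only the first is transcribed.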
