
Python: How to get a raw audio file with pyaudio for the Google Cloud Speech API

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:59:07

I am running the program at the link below on Linux.

https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/cloud-client/quickstart.py

My problem is how to use pyaudio to record my own raw audio file from the microphone, so that the program above can return the text of what I recorded.

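I have the pyaudio program below, but it gives me a WAV file. I want to save a raw audio file for the Google Cloud Speech API instead. I do not want to convert the WAV file to raw audio; I want to save the raw audio file directly with pyaudio.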

import pyaudio
import wave

FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
CHUNK = 1024
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "file.wav"

audio = pyaudio.PyAudio()

# start recording
stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE, input=True,
                    frames_per_buffer=CHUNK)
print("recording...")
frames = []

# read the microphone one chunk at a time
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("finished recording")


# stop recording
stream.stop_stream()
stream.close()
audio.terminate()

# write the recorded frames into a WAV container
waveFile = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
waveFile.setnchannels(CHANNELS)
waveFile.setsampwidth(audio.get_sample_size(FORMAT))
waveFile.setframerate(RATE)
waveFile.writeframes(b''.join(frames))
waveFile.close()

Best answer

I found the answer. Sorry for posting the question. I am new to programming.

import pyaudio

FORMAT = pyaudio.paInt16   # 16-bit signed samples

CHANNELS = 1               # mono
RATE = 16000               # samples per second
CHUNK = int(RATE / 10)     # 100 ms of audio per read
RECORD_SECONDS = 5

audio = pyaudio.PyAudio()

# start recording
stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE, input=True,
                    frames_per_buffer=CHUNK)
print("recording...")
frames = []

# read the microphone one chunk at a time
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("finished recording")


# stop recording
stream.stop_stream()
stream.close()
audio.terminate()

# write the samples as-is, with no WAV header; binary mode ("wb") is
# required because the frames are bytes
with open("newfile.raw", "wb") as raw_file:
    raw_file.write(b''.join(frames))
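
The raw file written above is just the joined sample bytes with no header, so its length can be checked against the recording parameters. The sketch below is only an illustration: the helper names raw_duration_seconds and wav_payload are hypothetical, and synthetic bytes stand in for microphone data so that it runs without pyaudio or audio hardware. It assumes 16-bit mono audio at 16000 Hz, as in the answer, and shows that the frames stored inside a WAV container are the same bytes that go into the raw file.

```python
import io
import wave

RATE = 16000        # samples per second, as in the answer above
CHANNELS = 1        # mono
SAMPLE_WIDTH = 2    # bytes per sample for pyaudio.paInt16


def raw_duration_seconds(num_bytes, rate=RATE, channels=CHANNELS,
                         sample_width=SAMPLE_WIDTH):
    """Return the playing time of headerless PCM data of the given size."""
    return num_bytes / float(rate * channels * sample_width)


def wav_payload(pcm_bytes, rate=RATE, channels=CHANNELS,
                sample_width=SAMPLE_WIDTH):
    """Wrap PCM bytes in an in-memory WAV file and read the frames back."""
    buffer = io.BytesIO()
    wav_out = wave.open(buffer, "wb")
    wav_out.setnchannels(channels)
    wav_out.setsampwidth(sample_width)
    wav_out.setframerate(rate)
    wav_out.writeframes(pcm_bytes)
    wav_out.close()

    buffer.seek(0)
    wav_in = wave.open(buffer, "rb")
    payload = wav_in.readframes(wav_in.getnframes())
    wav_in.close()
    return payload


if __name__ == "__main__":
    # Stand-in for 50 chunks of 1600 frames each (assumption: not real audio).
    fake_frames = [b"\x01\x00" * 1600 for _ in range(50)]
    pcm = b"".join(fake_frames)
    print(raw_duration_seconds(len(pcm)))   # duration implied by the byte count
    print(wav_payload(pcm) == pcm)          # WAV frames equal the raw bytes
```

For a real recording, passing os.path.getsize("newfile.raw") to raw_duration_seconds should give a value close to RECORD_SECONDS; a very different value suggests that the rate, channel count, or sample width does not match what was recorded.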

For "Python: How to get a raw audio file with pyaudio for the Google Cloud Speech API", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41533047/
