
python - How to pass an edited WAV between functions without saving the WAV in between?

Reposted. Author: 行者123. Updated: 2023-12-02 23:01:24

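I have a WAV recording of a conversation between 2 people (a customer and tech support).
I have 3 separate functions that extract 1 voice, trim it to 10 seconds, and convert it to an embedding.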

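import scipy.io.wavfile as wf  # assumed source of wf.read / wf.write
# VoiceActivityDetection comes from the asker's own VAD code (import not shown)

def get_customer_voice(file):
    print('getting customer voice only')
    sr, data = wf.read(file)  # sample rate, samples
    # a mono file loads as a 1-D array, so check before indexing a channel
    if data.ndim == 1 or data.shape[1] == 1:
        raise SystemExit('need a 2-channel WAV')
    c1 = data[:, 1]  # customer voice always in 1st track

    vad = VoiceActivityDetection()
    vad.process(c1)
    voice_samples = vad.get_voice_samples()
    # this is trouble - how to pass it without saving anywhere as wav?
    wf.write('%s_customer.wav' % file, sr, voice_samples)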
The function below trims the WAV file produced by the function above to 10 seconds.
import sys
from pydub import AudioSegment

def get_customer_voice_10_seconds(file):
    voice = AudioSegment.from_wav(file)
    new_voice = voice[0:10000]  # pydub slices in milliseconds: first 10 s
    file = str(file) + '_10seconds.wav'
    new_voice.export(file, format='wav')


if __name__ == '__main__':
    if len(sys.argv) < 2:
        print('give wav file to process!')
    else:
        print(sys.argv)
        get_customer_voice_10_seconds(sys.argv[1])
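How can I pass it on as a WAV, or in some other format, without saving it to a directory? It will be used in a REST API, and I don't know where it would save that WAV, so it would be best to pass it along somehow.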

Best answer

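I figured it out. The function below works without saving anything, without buffers, and so on.
It takes a WAV file, edits it, and sends the result straight to the function that computes the embedding: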

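from pydub import AudioSegment
from scipy.io.wavfile import read  # assumed source of read()
# VoiceActivityDetection and get_embedding come from the author's own code (not shown)

def get_customer_voice_and_cutting_10_seconds_embedding(file):
    print('getting customer voice only')
    sr, data = read(file)  # sample rate, samples shaped (n_samples, n_channels)
    c1 = data[:, 1]

    vad = VoiceActivityDetection()
    vad.process(c1)
    voice_samples = vad.get_voice_samples()
    # build the segment from raw sample bytes; sample_width is bytes per sample
    audio_segment = AudioSegment(voice_samples.tobytes(), frame_rate=sr,
                                 sample_width=voice_samples.dtype.itemsize, channels=1)
    audio_segment = audio_segment[0:10000]  # first 10 s (milliseconds)

    # assumption: get_embedding accepts the in-memory AudioSegment;
    # no '_10seconds.wav' file is written in this function
    return get_embedding(audio_segment)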
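The key is the tobytes() call when building the AudioSegment: it turns the NumPy samples into raw bytes, and with channels=1 pydub treats them as a single track.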
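
For cases where the next function expects a WAV file rather than an AudioSegment (for example behind a REST endpoint), the same idea can be sketched with the standard library alone. This is not the answer's method: it is a minimal illustration, assuming raw little-endian PCM bytes like those produced by tobytes(), and the helper names first_seconds and pcm_to_wav_buffer are hypothetical. It trims the bytes to 10 seconds by arithmetic on the frame size, then wraps them in an in-memory WAV container.

```python
import io
import struct
import wave


def first_seconds(pcm, frame_rate, sample_width, channels=1, seconds=10.0):
    """Return the first `seconds` of raw PCM bytes, entirely in memory."""
    frame_size = sample_width * channels   # bytes per frame
    n_frames = int(frame_rate * seconds)   # number of frames to keep
    return pcm[: n_frames * frame_size]


def pcm_to_wav_buffer(pcm, frame_rate, sample_width, channels=1):
    """Wrap raw PCM bytes in a WAV container held in an io.BytesIO buffer."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(frame_rate)
        w.writeframes(pcm)
    buf.seek(0)  # rewind so the next reader starts at the RIFF header
    return buf


# Synthetic stand-in for voice_samples: 12 s of 16-bit mono audio at 8 kHz.
rate, width = 8000, 2
samples = [i % 100 for i in range(rate * 12)]
pcm = struct.pack("<%dh" % len(samples), *samples)

clip = first_seconds(pcm, rate, width)
wav_buf = pcm_to_wav_buffer(clip, rate, width)

# Any reader that accepts a file-like object can consume the buffer.
with wave.open(wav_buf, "rb") as r:
    duration = r.getnframes() / r.getframerate()
print(duration)  # 10.0
```

The buffer behaves like an open file, so it should be usable wherever a library accepts a file-like object instead of a path, which scipy.io.wavfile.read and pydub's AudioSegment.from_file both do. Nothing is written to disk at any point.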

Regarding "python - How to pass an edited WAV between functions without saving the WAV in between?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/63467345/
