
python - Azure Speech to Text - Continuous Recognition

Reposted. Author: 行者123. Updated: 2023-12-02 22:57:23

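I want to evaluate the accuracy of the Azure Speech service, specifically speech-to-text from an audio file.

I have been reading the documentation at https://learn.microsoft.com/en-us/python/api/azure-cognitiveservices-speech/?view=azure-python and trying the code suggested on the MS Quickstart page. The code works and I do get a transcription, but it only covers the beginning of the audio (the first sentence):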

import azure.cognitiveservices.speech as speechsdk

speechKey = 'xxx'
service_region = 'westus'

speech_config = speechsdk.SpeechConfig(subscription=speechKey, region=service_region, speech_recognition_language="es-MX")
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=False, filename='lala.wav')

sr = speechsdk.SpeechRecognizer(speech_config, audio_config)

es = speechsdk.EventSignal(sr.recognized, sr.recognized)

result = sr.recognize_once()

if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

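According to the documentation, it looks like I have to use signals and events with the start_continuous_recognition method to capture the full audio (the method is not documented for Python, but it and the related classes appear to be implemented). I tried to follow other examples in C# and Java, but could not get this working in Python.

Has anyone managed to do this and can provide some pointers? Many thanks!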

Best answer

Check the Azure Python sample: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/python/console/speech_sample.py

Or the samples for other languages: https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples

Basically, it looks like this:

import time

import azure.cognitiveservices.speech as speechsdk

# speech_key, service_region and weatherfilename are module-level variables
# in the linked sample; set them to your own key, region and .wav path.


def speech_recognize_continuous_from_file():
    """performs continuous speech recognition with input from an audio file"""
    # <SpeechContinuousRecognitionWithFile>
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    audio_config = speechsdk.audio.AudioConfig(filename=weatherfilename)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    done = False

    def stop_cb(evt):
        """callback that stops continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        speech_recognizer.stop_continuous_recognition()
        nonlocal done
        done = True

    # Connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
    speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
    speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
    speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
    speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
    # stop continuous recognition on either session stopped or canceled events
    speech_recognizer.session_stopped.connect(stop_cb)
    speech_recognizer.canceled.connect(stop_cb)

    # Start continuous speech recognition
    speech_recognizer.start_continuous_recognition()
    while not done:
        time.sleep(.5)
    # </SpeechContinuousRecognitionWithFile>
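
The sample above only prints each event. Since the question asks for the transcript of the whole file, here is a minimal sketch of one way to collect the final phrases instead. The helper names collect_transcript and transcribe_file are made up for illustration and are not part of the SDK. The sketch assumes that non-matching results arrive with empty text and that it is fine to call stop_continuous_recognition from the main thread once the session has ended. It waits on a threading.Event rather than polling a flag.

```python
import threading


def collect_transcript(recognizer, timeout=None):
    """Run continuous recognition until the session ends; return final phrases.

    `recognizer` is expected to behave like speechsdk.SpeechRecognizer
    (hypothetical helper, not an SDK function).
    """
    phrases = []
    finished = threading.Event()

    def on_recognized(evt):
        # Assumption: results with no recognized speech carry empty text.
        if evt.result.text:
            phrases.append(evt.result.text)

    def on_finished(evt):
        # Fired on session_stopped or canceled (e.g. end of the audio file).
        finished.set()

    recognizer.recognized.connect(on_recognized)
    recognizer.session_stopped.connect(on_finished)
    recognizer.canceled.connect(on_finished)

    recognizer.start_continuous_recognition()
    try:
        finished.wait(timeout)
    finally:
        recognizer.stop_continuous_recognition()
    return phrases


def transcribe_file(path, key, region, language="es-MX"):
    """Hypothetical wrapper that builds the recognizer as in the question."""
    # Imported here so collect_transcript can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription=key, region=region, speech_recognition_language=language)
    audio_config = speechsdk.audio.AudioConfig(filename=path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config)
    return " ".join(collect_transcript(recognizer))
```

With the values from the question this would be called as transcribe_file('lala.wav', speechKey, service_region). Passing a timeout to collect_transcript guards against waiting forever if neither session_stopped nor canceled is ever raised.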

Regarding python - Azure Speech to Text - Continuous Recognition, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54166387/
