
python - How to listen and speak at the same time with Azure Cognitive Services

Reposted · Author: 行者123 · Updated: 2023-12-03 02:04:12

I am trying to build a chatbot that can speak and listen at the same time. I am using Azure Cognitive Services, and currently I have two functions, one for speaking and one for listening:

Speaking:

def speak(input, voice="en-US-ChristopherNeural"):
    audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
    speech_config.speech_synthesis_voice_name = voice
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

    speech_synthesis_result = speech_synthesizer.speak_text_async(input).get()
    if speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = speech_synthesis_result.cancellation_details
        print("Azure Speech synthesis canceled: {}".format(cancellation_details.reason))

    return True

Listening:

def listen(language):
    speech_config.speech_recognition_language = language
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    print("Speak into your microphone.")
    speech_recognition_result = speech_recognizer.recognize_once_async().get()

    if speech_recognition_result.reason == speechsdk.ResultReason.RecognizedSpeech:
        print("Recognized: {}".format(speech_recognition_result.text))

    return speech_recognition_result.text

I want to be able to interrupt the spoken text if the user starts talking. That means I have to listen to the microphone input continuously to detect an attempt to speak.

I am currently experimenting with multithreading, but every example I have tried blocks on the line:

speech_synthesis_result = speech_synthesizer.speak_text_async(input).get()
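One way around that blocking call is to push it onto a worker thread so the event loop stays responsive; `asyncio.to_thread` (Python 3.9+) wraps any blocking call in an awaitable. A minimal sketch of the pattern, with a stand-in function replacing the real `speak_text_async(...).get()` call (the Azure pieces are assumed, not shown):

```python
import asyncio
import time

def blocking_speak(text):
    # Stand-in for speech_synthesizer.speak_text_async(text).get(),
    # which blocks until synthesis finishes.
    time.sleep(0.05)
    return f"spoke: {text}"

async def main():
    # to_thread runs the blocking call in a worker thread, so the
    # event loop can keep running (e.g. a listener) while "speaking".
    speak_task = asyncio.create_task(asyncio.to_thread(blocking_speak, "hello"))
    # ... other coroutines could run here concurrently ...
    result = await speak_task
    print(result)

asyncio.run(main())
```

The same wrapper works for `recognize_once_async().get()`, so a listener coroutine could run concurrently with the speaker.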

This is what I have so far, but it does not say anything:

import asyncio
import os
import azure.cognitiveservices.speech as speechsdk
from azure.cognitiveservices.speech import SpeechConfig, SpeechSynthesisOutputFormat, SpeechSynthesizer

# Replace with your own subscription key and region identifier
speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), region=os.environ.get('SPEECH_REGION'))

# Set up the speech synthesizer


# Define the phrase to be spoken
phrase = "Hello, I'm a chatbot. How can I help you today? You can interrupt me whenever you want"

async def listen_for_user_input():
    speech_config.speech_recognition_language = "en-US-ChristopherNeural"
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    result = await speech_recognizer.start_continuous_recognition_async()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        print("Recognized: {}".format(result.text))
    await speech_recognizer.stop_continuous_recognition_async()

async def speak_phrase(phrase):
    audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
    # Speak the defined phrase
    result = await synthesizer.speak_text_async(phrase)
    if result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print("Azure Speech synthesis canceled: {}".format(cancellation_details.reason))


# Start the program
async def main():
    task1 = asyncio.create_task(speak_phrase(phrase))
    task2 = asyncio.create_task(listen_for_user_input())
    done, pending = await asyncio.wait([task1, task2], return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()

asyncio.run(main())

Best Answer

As of now, the Azure Speech SDK for Python (version 1.x.x) does not support async methods with asyncio/await; we are working to provide that support in the next major release. See below for an alternative, modified approach that runs speech-to-text and text-to-speech in parallel.

import threading

def listen_for_user_input():
    print("Start listen_for_user_input")
    transcription_done = threading.Event()
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)

    def recognized_cb(evt):
        if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
            print('RECOGNIZED: {}'.format(evt))
        elif evt is not None and evt.result.reason == speechsdk.ResultReason.NoMatch:
            print('NOMATCH: {}'.format(evt))
        transcription_done.set()

    def canceled_cb(evt):
        try:
            if evt.result.reason == speechsdk.ResultReason.Canceled:
                cancellation_details = evt.result.cancellation_details
                print('CANCELED: {}'.format(cancellation_details.reason))
                if cancellation_details.reason == speechsdk.CancellationReason.Error:
                    print('Error details: {}'.format(cancellation_details.error_details))
                transcription_done.set()
        except Exception as e:
            print(e)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
    speech_recognizer.recognized.connect(recognized_cb)
    speech_recognizer.canceled.connect(canceled_cb)

    speech_recognizer.start_continuous_recognition_async()
    transcription_done.wait()
    speech_recognizer.stop_continuous_recognition_async()
    print("Stop listen_for_user_input")

def speak_phrase():
    print("Start speak_phrase")
    audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
    # Speak the defined phrase
    result = synthesizer.speak_text_async(phrase).get()
    if result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print("Azure Speech synthesis canceled: {}".format(cancellation_details.reason))
    print("Stop speak_phrase")

if __name__ == "__main__":
    print("Start main program")
    # create threads for parallel execution
    p1 = threading.Thread(target=speak_phrase)
    p2 = threading.Thread(target=listen_for_user_input)

    # start both threads
    p1.start()
    p2.start()

    # wait for both threads to finish
    p1.join()
    p2.join()
    print("Stop main program")
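The two threads above run in parallel, but nothing yet interrupts synthesis when the user speaks. A common wiring is a shared `threading.Event`: the recognizer's `recognized` callback sets it, and the speaking side checks it between sentences (with the real SDK you could additionally call `synthesizer.stop_speaking_async()` to cut playback immediately). A stdlib-only sketch where `fake_recognizer` and `speak_sentences` are hypothetical stand-ins for the Azure calls, not SDK APIs:

```python
import threading
import time

interrupted = threading.Event()

def fake_recognizer(utterances, on_recognized):
    # Stand-in for continuous recognition: delivers one "recognized"
    # event per utterance to the connected callback.
    for text in utterances:
        time.sleep(0.01)
        on_recognized(text)

def speak_sentences(sentences, spoken, stop_event):
    # Speak sentence by sentence; bail out as soon as the listener
    # signals a barge-in.
    for s in sentences:
        if stop_event.is_set():
            break
        spoken.append(s)
        time.sleep(0.02)

def on_recognized(text):
    # Any recognized speech counts as an interruption.
    interrupted.set()

spoken = []
speaker = threading.Thread(target=speak_sentences,
                           args=(["One.", "Two.", "Three.", "Four."], spoken, interrupted))
listener = threading.Thread(target=fake_recognizer, args=(["stop"], on_recognized))
speaker.start()
listener.start()
speaker.join()
listener.join()
print(spoken)  # fewer sentences than were queued: speech was cut short
```

Splitting the reply into sentences keeps the check granular; cutting playback mid-sentence would additionally need the SDK-level stop call mentioned above.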

Regarding "python - How to listen and speak at the same time with Azure Cognitive Services", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/75298275/
