
python - I want to add word-level timestamps to continuous recognition in the Azure Speech service

Reposted. Author: 行者123. Updated: 2023-12-01 06:39:20

import json
import time

import azure.cognitiveservices.speech as speechsdk

# speech_key, service_region and weatherfilename must be defined before calling this function.


def speech_recognize_continuous_from_file():
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    # Request per-word offsets and durations in the detailed JSON result
    speech_config.request_word_level_timestamps()
    audio_config = speechsdk.audio.AudioConfig(filename=weatherfilename)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
    done = False

    def stop_cb(evt):
        """callback that stops continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        speech_recognizer.stop_continuous_recognition()
        nonlocal done
        done = True

    all_results = {'output': []}

    def handle_final_result(evt):
        # start_continuous_recognition() returns None, so the detailed JSON
        # is read here, once per recognized utterance.
        if evt.result.reason != speechsdk.ResultReason.RecognizedSpeech:
            return
        stt = json.loads(evt.result.json)
        nbest = stt.get('NBest', [])
        words = []
        if nbest:
            # Keep the word list of the highest-confidence hypothesis
            best = max(nbest, key=lambda item: item['Confidence'])
            words = best.get('Words', [])
        print("Word\tOffset\tDuration")
        for word in words:
            print(f"{word['Word']}\t{word['Offset']}\t{word['Duration']}")
        all_results['output'].append({
            'result_id': evt.result.result_id,
            'text': evt.result.text,
            'words': words
        })

    speech_recognizer.recognized.connect(handle_final_result)

    # Connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
    speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
    speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
    speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
    speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
    # stop continuous recognition on either session stopped or canceled events
    speech_recognizer.session_stopped.connect(stop_cb)
    speech_recognizer.canceled.connect(stop_cb)

    # Start continuous speech recognition; results arrive through the callbacks above
    speech_recognizer.start_continuous_recognition()
    while not done:
        time.sleep(.5)

    print("Printing all results:")
    print(all_results)
    with open('response.json', 'w') as outfile:
        json.dump(all_results, outfile)
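
The Offset and Duration values printed for each word are expressed in 100-nanosecond ticks, so they usually need converting before they can be lined up with the audio. A minimal sketch of that conversion follows; the helper name `words_to_seconds` and the sample entries are hypothetical and only mimic the shape of the 'Words' list, they are not real service output.

```python
TICKS_PER_SECOND = 10_000_000  # one tick is 100 nanoseconds


def words_to_seconds(words):
    """Return (word, start_seconds, end_seconds) tuples for a 'Words' list."""
    rows = []
    for w in words:
        start = w['Offset'] / TICKS_PER_SECOND
        end = (w['Offset'] + w['Duration']) / TICKS_PER_SECOND
        rows.append((w['Word'], start, end))
    return rows


# Hypothetical sample in the shape of the 'Words' list (not real service output)
sample = [
    {'Word': 'hello', 'Offset': 5000000, 'Duration': 3000000},
    {'Word': 'world', 'Offset': 9000000, 'Duration': 4000000},
]

for word, start, end in words_to_seconds(sample):
    print(f"{word}\t{start:.2f}\t{end:.2f}")
```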

Best answer

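Another way to do this: if Azure will not accept the complete file, split the audio into a set of smaller audio files and submit each one to Azure. Then, at the end, merge the outputs of all the files into one.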
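
The merge step of this approach could look roughly like the sketch below. It assumes that each chunk was recognized separately, that the word offsets returned for a chunk are relative to the start of that chunk file, and that the position of each chunk in the original recording is known. The function name `merge_chunk_words` and the sample data are hypothetical, and splitting the audio itself is not shown.

```python
def merge_chunk_words(chunks):
    """Merge per-chunk word lists into a single timeline.

    `chunks` is a list of (chunk_start_ticks, words) pairs, where
    chunk_start_ticks is the chunk's position in the original recording
    (in 100-nanosecond ticks) and words is the 'Words' list for that chunk.
    """
    merged = []
    for chunk_start, words in sorted(chunks, key=lambda c: c[0]):
        for w in words:
            merged.append({
                'Word': w['Word'],
                # Shift the chunk-relative offset onto the original timeline
                'Offset': w['Offset'] + chunk_start,
                'Duration': w['Duration'],
            })
    return merged


# Hypothetical example: two chunks, the second starting 10 seconds in
chunks = [
    (100_000_000, [{'Word': 'again', 'Offset': 2_000_000, 'Duration': 3_000_000}]),
    (0, [{'Word': 'hello', 'Offset': 5_000_000, 'Duration': 3_000_000}]),
]
merged = merge_chunk_words(chunks)
print([(w['Word'], w['Offset']) for w in merged])
```

Words that straddle a chunk boundary may be recognized poorly, so cutting at silences is likely to give cleaner results than cutting at fixed intervals.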

Regarding "python - I want to add word-level timestamps to continuous recognition in the Azure Speech service", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59527503/
