
python - How to pass a live audio URL to the Google Speech to Text API

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:05:30

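I have a URL for a live audio recording that I am trying to transcribe with the Google Speech to Text API. I am using the sample code from the Cloud Speech to Text API. The problem is that when I pass the live URL, I get no output. The relevant part of my code is below. Any help would be greatly appreciated!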

import io
import os
import time
from datetime import datetime
from datetime import timedelta
from urllib.request import urlopen

import numpy as np
import requests
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "app_creds.json"

def get_stream():
    stream = urlopen('streamurl')

    duration = 60
    begin = datetime.now()
    duration = timedelta(seconds=duration)

    while datetime.now() - begin < duration:
        data = stream.read(8000)
        return data

def transcribe_streaming():
    """Streams transcription of the given audio file."""
    client = speech.SpeechClient()

    content = get_stream()

    # In practice, stream should be a generator yielding chunks of audio data.
    stream = [content]
    requests = (types.StreamingRecognizeRequest(audio_content=chunk)
                for chunk in stream)

    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code='en-US')

    streaming_config = types.StreamingRecognitionConfig(config=config)

    # streaming_recognize returns a generator.
    responses = client.streaming_recognize(streaming_config, requests)

    for response in responses:
        # Once the transcription has settled, the first result will contain the
        # is_final result. The other results will be for subsequent portions of
        # the audio.
        for result in response.results:
            print('Finished: {}'.format(result.is_final))
            print('Stability: {}'.format(result.stability))
            alternatives = result.alternatives
            # The alternatives are ordered from most likely to least.
            for alternative in alternatives:
                print('Confidence: {}'.format(alternative.confidence))
                print(u'Transcript: {}'.format(alternative.transcript))
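
The comment in the code above says that stream should be a generator yielding chunks of audio data, while get_stream() hands back only a single chunk. Below is a minimal sketch of such a generator, assuming the source is any file-like object with a read() method (the object returned by urlopen is one). The names iter_audio_chunks and max_bytes are hypothetical and not part of the Google client library, and io.BytesIO stands in for the live URL so the sketch runs offline.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(stream: BinaryIO,
                      chunk_size: int = 8000,
                      max_bytes: int = 16000 * 2 * 60) -> Iterator[bytes]:
    """Yield chunks from a file-like object until it ends or max_bytes is reached.

    The default max_bytes is an assumption: 60 seconds of 16 kHz, 16-bit mono audio.
    """
    sent = 0
    while sent < max_bytes:
        # Never read past the overall byte budget.
        chunk = stream.read(min(chunk_size, max_bytes - sent))
        if not chunk:
            break  # the stream is exhausted
        sent += len(chunk)
        yield chunk


# Stand-in for the live stream: 20000 bytes of silence.
fake_stream = io.BytesIO(b"\x00" * 20000)
chunks = list(iter_audio_chunks(fake_stream, chunk_size=8000))
print([len(c) for c in chunks])  # prints [8000, 8000, 4000]
```

Each yielded chunk could then be wrapped as types.StreamingRecognizeRequest(audio_content=chunk), in the same way as the single chunk in the code above. The bytes still have to be in the encoding declared in RecognitionConfig.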


Best answer

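When sending audio to the Google speech service, make sure the settings on the service object match the encoding of the audio. In your particular case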

config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US')

corresponds to mono, 16 kHz, linear 16-bit PCM audio. See the list of other supported encodings if you need to transcribe audio in a different format.
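
One way to check whether a piece of audio matches these settings is to inspect its header before sending it. The sketch below uses Python's standard wave module and assumes the audio is an uncompressed PCM WAV file; a compressed stream such as MP3 cannot be read this way and would need decoding first. The helper names describe_wav and matches_linear16_16k_mono are hypothetical, and the WAV data is generated in memory purely for illustration.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Return the channel count, sample rate and bit depth of PCM WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hertz": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
        }


def matches_linear16_16k_mono(info: dict) -> bool:
    """True if the audio is mono, 16 kHz, 16-bit, as the config above expects."""
    return (info["channels"] == 1
            and info["sample_rate_hertz"] == 16000
            and info["bits_per_sample"] == 16)


# Build a tiny in-memory WAV (10 ms of silence) to stand in for real audio.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 2 bytes = 16 bits per sample
    wav.setframerate(16000)  # 16 kHz
    wav.writeframes(b"\x00\x00" * 160)

info = describe_wav(buffer.getvalue())
print(info, matches_linear16_16k_mono(info))
```

If the check fails, either the audio has to be converted or the encoding and sample_rate_hertz values in RecognitionConfig have to be changed to describe the audio actually being sent.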

Regarding python - How to pass a live audio URL to the Google Speech to Text API, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57426836/
