
Google Speech - Streaming request returns EOF error

Reposted. Author: IT王子. Updated: 2023-10-29 01:09:02

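I'm using Go to pull an RTMP stream, transcode it to FLAC (with ffmpeg), and stream it to Google's Speech API to transcribe the audio. However, I keep getting EOF errors when sending data. I can't find anything about this error in the documentation, so I'm not sure what is causing it.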

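I split the received data into 3-second chunks (the length doesn't matter, as long as it stays below the maximum length of a streaming recognition request).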

Here is the core of my code:

func main() {
	done := make(chan os.Signal)
	received := make(chan []byte)

	go receive(received)
	go transcribe(received)

	signal.Notify(done, os.Interrupt, syscall.SIGTERM)

	select {
	case <-done:
		os.Exit(0)
	}
}

func receive(received chan<- []byte) {
	var b bytes.Buffer
	stdout := bufio.NewWriter(&b)

	cmd := exec.Command("ffmpeg", "-i", "rtmp://127.0.0.1:1935/live/key", "-f", "flac", "-ar", "16000", "-")
	cmd.Stdout = stdout

	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}

	duration, _ := time.ParseDuration("3s")
	ticker := time.NewTicker(duration)

	for {
		select {
		case <-ticker.C:
			stdout.Flush()
			log.Printf("Received %d bytes", b.Len())
			received <- b.Bytes()
			b.Reset()
		}
	}
}

func transcribe(received <-chan []byte) {
	ctx := context.TODO()

	client, err := speech.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}

	stream, err := client.StreamingRecognize(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// Send the initial configuration message.
	if err = stream.Send(&speechpb.StreamingRecognizeRequest{
		StreamingRequest: &speechpb.StreamingRecognizeRequest_StreamingConfig{
			StreamingConfig: &speechpb.StreamingRecognitionConfig{
				Config: &speechpb.RecognitionConfig{
					Encoding:        speechpb.RecognitionConfig_FLAC,
					LanguageCode:    "en-GB",
					SampleRateHertz: 16000,
				},
			},
		},
	}); err != nil {
		log.Fatal(err)
	}

	for {
		select {
		case data := <-received:
			if len(data) > 0 {
				log.Printf("Sending %d bytes", len(data))
				if err := stream.Send(&speechpb.StreamingRecognizeRequest{
					StreamingRequest: &speechpb.StreamingRecognizeRequest_AudioContent{
						AudioContent: data,
					},
				}); err != nil {
					log.Printf("Could not send audio: %v", err)
				}
			}
		}
	}
}

Running this code produces the following output:

2017/10/09 16:05:00 Received 191704 bytes
2017/10/09 16:05:00 Saving 191704 bytes
2017/10/09 16:05:00 Sending 191704 bytes
2017/10/09 16:05:00 Could not send audio: EOF

2017/10/09 16:05:03 Received 193192 bytes
2017/10/09 16:05:03 Saving 193192 bytes
2017/10/09 16:05:03 Sending 193192 bytes
2017/10/09 16:05:03 Could not send audio: EOF

2017/10/09 16:05:06 Received 193188 bytes
2017/10/09 16:05:06 Saving 193188 bytes
2017/10/09 16:05:06 Sending 193188 bytes // Notice that this doesn't error

2017/10/09 16:05:09 Received 191704 bytes
2017/10/09 16:05:09 Saving 191704 bytes
2017/10/09 16:05:09 Sending 191704 bytes
2017/10/09 16:05:09 Could not send audio: EOF

Note that not every Send call fails.

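Can anyone point me in the right direction? Is it something to do with the FLAC header? I also wonder whether resetting the buffer causes some data to be dropped (i.e. it is a non-trivial operation that actually takes some time to complete), and the API doesn't like the missing information?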

Any help would be greatly appreciated.

Best answer

So, it turns out there is a way to get more information about the state of the stream, so we don't have to rely solely on the returned error.

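if err := stream.Send(&speechpb.StreamingRecognizeRequest{
	StreamingRequest: &speechpb.StreamingRecognizeRequest_AudioContent{
		AudioContent: data,
	},
}); err != nil {
	// Send only reports EOF here; Recv exposes the status the server sent back.
	resp, recvErr := stream.Recv()
	if recvErr != nil {
		log.Printf("Could not send audio: %v (recv: %v)", err, recvErr)
	} else {
		log.Printf("Could not send audio: %v", resp.GetError())
	}
}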

This prints:

2017/10/16 17:14:53 Could not send audio: code:3 message:"Invalid audio content: too long."

That is a much more helpful error message!
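
Since the server complains that the audio content is too long, one thing that may be worth trying is to cut each buffered segment into smaller pieces and send every piece with its own stream.Send call. The following is only a minimal sketch of that idea, not a confirmed fix: splitChunks and maxChunk are hypothetical names, and the 8 KiB value is an arbitrary illustration, not the documented limit of the Speech API. The helper also copies each piece, because the slice returned by bytes.Buffer.Bytes() is only valid until the buffer is next modified, and the question's code calls b.Reset() right after handing the slice to another goroutine.

```go
package main

import "fmt"

// splitChunks copies data into pieces of at most maxChunk bytes each.
// Each piece is a fresh copy, so it does not alias the caller's buffer
// and stays intact if that buffer is reset or reused afterwards.
func splitChunks(data []byte, maxChunk int) [][]byte {
	if maxChunk <= 0 {
		return nil
	}
	var chunks [][]byte
	for len(data) > 0 {
		n := maxChunk
		if len(data) < n {
			n = len(data)
		}
		chunk := make([]byte, n)
		copy(chunk, data[:n])
		chunks = append(chunks, chunk)
		data = data[n:]
	}
	return chunks
}

func main() {
	// maxChunk is a hypothetical value chosen for illustration only;
	// check the Speech API documentation for the real request limits.
	const maxChunk = 8 * 1024

	// Same size as the first failed send in the question's log.
	data := make([]byte, 191704)

	chunks := splitChunks(data, maxChunk)
	fmt.Printf("%d chunks, last one %d bytes\n", len(chunks), len(chunks[len(chunks)-1]))
}
```

In the question's receive function this would mean looping over splitChunks(b.Bytes(), maxChunk) and pushing each piece onto the received channel before calling b.Reset(), assuming the rest of the pipeline stays as posted.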

Regarding "Google Speech - Streaming request returns EOF error", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46650835/
