I am sending the audio file from a WhatsApp voice message (WhatsApp Business Cloud API) to Google Speech-to-Text.
Oddly, recognition works for voice messages sent from the official WhatsApp client for Windows.
But it does not work when the voice message is sent from the WhatsApp app on Android.
So the two clients seem to encode the voice audio file differently.
Both files use the Opus audio codec, and both report a sample rate of 48000 Hz.
Both files can be played with VLC, but the Android file is much smaller.
Any help or ideas?
Not sure if this helps: the working audio file from WhatsApp Desktop on Windows reports 32 bits per sample. The non-working one from Android reports no bits-per-sample information. The non-working file is also much smaller.
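One way to compare the two files more directly is to read the `OpusHead` packet at the start of each Ogg stream. The sketch below assumes both files are Ogg containers whose first page holds the identification header laid out as in RFC 7845; the function name `read_opus_head` is made up for illustration. Note that the input sample rate stored there is only informational, and Opus is decoded at 48 kHz regardless, which may be why a player shows 48000 Hz for both files.

```python
import struct


def read_opus_head(data: bytes) -> dict:
    """Parse the OpusHead packet from the first Ogg page (hypothetical helper)."""
    # Every Ogg page starts with the capture pattern "OggS".
    if data[:4] != b"OggS":
        raise ValueError("not an Ogg container")

    # The fixed page header is 27 bytes; byte 26 is the number of segments,
    # and the segment table (one byte per segment) follows immediately.
    segment_count = data[26]
    payload_start = 27 + segment_count

    # OpusHead: 8-byte magic followed by 11 bytes of little-endian fields.
    payload = data[payload_start:payload_start + 19]
    if len(payload) < 19 or payload[:8] != b"OpusHead":
        raise ValueError("first packet is not an OpusHead header")

    version, channels, pre_skip, input_rate, gain, mapping = struct.unpack(
        "<BBHIhB", payload[8:19]
    )
    return {
        "version": version,
        "channels": channels,
        "pre_skip": pre_skip,
        "input_sample_rate": input_rate,  # informational, not the decode rate
        "output_gain": gain,
        "mapping_family": mapping,
    }
```

Calling `read_opus_head(open(path, "rb").read(64))` on both files and comparing `channels` and `input_sample_rate` might show where the Android recording differs.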
Try transcoding it to 16 bits per sample?
So the problem is that Google Speech-to-Text doesn't recognize the Android audio file? I recommend transcoding the files to meet Google's requirements.
It happened to me too. You just need to convert the file to WAV; then you don't need to specify sampleRate or encoding (source) when passing it to the Google Speech API.
For the conversion this answer worked for me, just change 'mp3' to 'wav'.
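As a rough sketch of the conversion step, and not the code from the linked answer, the snippet below shells out to the `ffmpeg` command-line tool. It assumes `ffmpeg` is installed and on the PATH; the function names and the 16 kHz mono default are my own choices for illustration. It writes 16-bit PCM, in line with the comment suggesting 16 bits per sample.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_wav_command(src, dst, sample_rate=16000):
    """Return the ffmpeg argument list for a mono 16-bit PCM WAV conversion."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input file (e.g. the WhatsApp .ogg)
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample (16000 is an assumed default)
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        str(dst),
    ]


def convert_to_wav(src, dst=None, sample_rate=16000):
    """Convert src to WAV next to it (or at dst) and return the output path."""
    src = Path(src)
    dst = Path(dst) if dst else src.with_suffix(".wav")
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_wav_command(src, dst, sample_rate), check=True)
    return dst
```

For example, `convert_to_wav("voice.ogg")` would produce `voice.wav`, which can then be sent to the Speech API.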
If another answer also answers this question, why not just flag or vote to close the question as a duplicate rather than post a low-quality answer?