node.js - Speech to text: Piping microphone stream to Watson STT with NodeJS

Reposted by: 太空宇宙. Updated: 2023-11-04 01:31:30

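I am trying to send a microphone stream to the Watson STT service, but for some reason the service does not seem to receive the stream (this is my guess), so I get the error "Error: No speech detected for 30s".

Note that I have successfully streamed a .wav file to Watson, and I have also tested piping micInputStream to a local file, so I know that both sides are at least set up correctly. I am fairly new to NodeJS/JavaScript, so I am hoping the error is an obvious one.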

const fs = require('fs');
const mic = require('mic');
var SpeechToTextV1 = require('watson-developer-cloud/speech-to-text/v1');

var speechToText = new SpeechToTextV1({
  iam_apikey: '{key_here}',
  url: 'https://stream.watsonplatform.net/speech-to-text/api'
});

var params = {
  content_type: 'audio/l16; rate=44100; channels=2',
  interim_results: true
};

const micParams = {
  rate: 44100,
  channels: 2,
  debug: false,
  exitOnSilence: 6
};
const micInstance = mic(micParams);
const micInputStream = micInstance.getAudioStream();

micInstance.start();
console.log('Watson is listening, you may speak now.');

// Create the stream.
var recognizeStream = speechToText.recognizeUsingWebSocket(params);

// Pipe in the audio.
var textStream = micInputStream.pipe(recognizeStream).setEncoding('utf8');

textStream.on('data', user_speech_text => console.log('Watson hears:', user_speech_text));
textStream.on('error', e => console.log(`error: ${e}`));
textStream.on('close', e => console.log(`close: ${e}`));
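
One way to test the guess that Watson is not receiving usable audio is to put a small probe between the microphone stream and the recognizer and watch how many bytes pass through and how loud they are. The following is only a minimal sketch: it assumes the stream carries 16-bit little-endian PCM, and `peakAmplitude` and `createAudioProbe` are hypothetical helper names, not part of `mic` or the Watson SDK. Because a sample can be split across two chunks, the peak is an approximation.

```javascript
const { Transform } = require('stream');

// Largest absolute sample value in a buffer of 16-bit little-endian PCM.
function peakAmplitude(chunk) {
  let peak = 0;
  // Step two bytes at a time; a trailing odd byte, if any, is ignored.
  for (let i = 0; i + 1 < chunk.length; i += 2) {
    const sample = Math.abs(chunk.readInt16LE(i));
    if (sample > peak) peak = sample;
  }
  return peak;
}

// Pass-through stream that records how much audio flowed and how loud it was.
function createAudioProbe() {
  const stats = { bytes: 0, peak: 0 };
  const probe = new Transform({
    transform(chunk, encoding, callback) {
      stats.bytes += chunk.length;
      stats.peak = Math.max(stats.peak, peakAmplitude(chunk));
      callback(null, chunk); // forward the audio unchanged
    }
  });
  probe.stats = stats;
  return probe;
}

// Hypothetical usage with the streams from the question:
// const probe = createAudioProbe();
// micInputStream.pipe(probe).pipe(recognizeStream);
// setInterval(() => console.log(probe.stats), 1000);
```

If `bytes` stays at zero, nothing is leaving the microphone stream. If `bytes` grows but `peak` stays near zero, the stream is flowing but silent, which would be consistent with a "no speech detected" error.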

Best answer

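Conclusion: In the end, I am not entirely sure what was wrong with the code. My guess is that it had something to do with the mic package. I ended up dropping that package and using "Node-audiorecorder" for my audio stream instead: https://www.npmjs.com/package/node-audiorecorder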

Note: This module requires SoX to be installed, and it must be available in your $PATH. http://sox.sourceforge.net/
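
Before running the recorder, it may help to confirm that the SoX binaries are actually reachable. The sketch below is an illustration only: `isOnPath` is a hypothetical helper that scans the directories in `PATH` for an executable file, it assumes a POSIX-style lookup (Windows extensions such as `.exe` are not handled), and it does not verify that the binary it finds really is SoX.

```javascript
const fs = require('fs');
const path = require('path');

// Return true if an executable named `command` exists in a directory on PATH.
// Assumption: POSIX-style lookup; Windows extensions such as .exe are not handled.
function isOnPath(command, envPath = process.env.PATH || '') {
  return envPath
    .split(path.delimiter)
    .filter(Boolean) // skip empty PATH entries
    .some((dir) => {
      try {
        fs.accessSync(path.join(dir, command), fs.constants.X_OK);
        return true;
      } catch (err) {
        return false;
      }
    });
}

// `rec` is the program selected in the recorder options; `sox` is the main binary.
for (const tool of ['sox', 'rec']) {
  console.log(`${tool}: ${isOnPath(tool) ? 'found' : 'not found on PATH'}`);
}
```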

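Updated code: For anyone wondering what my final code looks like, here it is. Also, a big thank-you to NikolayShmyrev for trying to help me with my code!

Sorry for the heavy commenting, but for new projects I like to make sure I know what every line is doing.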

    // Import module.
    var AudioRecorder = require('node-audiorecorder');
    var fs = require('fs');
    var SpeechToTextV1 = require('watson-developer-cloud/speech-to-text/v1');


    /******************************************************************************
    * Configuring STT
    *******************************************************************************/
    var speechToText = new SpeechToTextV1({
        iam_apikey: '{your watson key here}',
        url: 'https://stream.watsonplatform.net/speech-to-text/api'
    });

    var recognizeStream = speechToText.recognizeUsingWebSocket({
        content_type: 'audio/wav',
        interim_results: true
    });


    /******************************************************************************
    * Configuring the Recording
    *******************************************************************************/
    // Options is an optional parameter for the constructor call.
    // If an option is not given the default value, as seen below, will be used.
    const options = {
        program: 'rec',     // Which program to use, either `arecord`, `rec`, or `sox`.
        device: null,       // Recording device to use.

        bits: 16,           // Sample size. (only for `rec` and `sox`)
        channels: 2,        // Channel count.
        encoding: 'signed-integer',  // Encoding type. (only for `rec` and `sox`)
        rate: 48000,        // Sample rate.
        type: 'wav',        // Format type.

        // Following options only available when using `rec` or `sox`.
        silence: 6,         // Duration of silence in seconds before it stops recording.
        keepSilence: true   // Keep the silence in the recording.
    };

    const logger = console;

    /******************************************************************************
    * Create Streams
    *******************************************************************************/

    // Create an instance.
    let audioRecorder = new AudioRecorder(options, logger);

    // Stop recording after 10 seconds. Feel free to remove this timeout.
    setTimeout(function() {
        audioRecorder.stop();
    }, 10000);

    // Also save the audio to a local file (strongly encouraged for testing).
    const fileStream = fs.createWriteStream("test.wav", { encoding: 'binary' });

    // Start streaming to Watson STT. Remove .pipe(process.stdout) if you don't want the transcription printed to the console.
    audioRecorder.start().stream().pipe(recognizeStream).pipe(process.stdout);

    // Pipe the same audio stream to the local file.
    audioRecorder.stream().pipe(fileStream);

    // Finally, pipe the transcription to a text file.
    recognizeStream.pipe(fs.createWriteStream('./transcription.txt'));
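
Since the code above also saves the recording to test.wav, one simple sanity check is to read the header of that file and compare it with the recorder options (2 channels, 48000 Hz, 16 bits). The sketch below is a rough illustration, not part of the answer: `parseWavHeader` is a hypothetical helper, and it assumes the canonical PCM layout in which a 16-byte "fmt " chunk immediately follows the RIFF header. Files with a different chunk order are rejected or need a fuller parser.

```javascript
// Read the basic format fields from the first 44 bytes of a WAV file.
// Assumption: canonical PCM layout ("RIFF" header, then a 16-byte "fmt " chunk).
function parseWavHeader(header) {
  if (header.length < 44) {
    throw new Error('need at least 44 bytes');
  }
  if (header.toString('ascii', 0, 4) !== 'RIFF' ||
      header.toString('ascii', 8, 12) !== 'WAVE' ||
      header.toString('ascii', 12, 16) !== 'fmt ') {
    throw new Error('not a canonical RIFF/WAVE header');
  }
  return {
    audioFormat: header.readUInt16LE(20), // 1 means integer PCM
    channels: header.readUInt16LE(22),
    sampleRate: header.readUInt32LE(24),
    bitsPerSample: header.readUInt16LE(34)
  };
}

// Hypothetical usage against the file saved by the recorder:
// const fs = require('fs');
// const fd = fs.openSync('test.wav', 'r');
// const header = Buffer.alloc(44);
// fs.readSync(fd, header, 0, 44, 0);
// fs.closeSync(fd);
// console.log(parseWavHeader(header));
```

If the reported values differ from the options passed to the recorder, the audio reaching Watson may not be in the format you expect.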

Source: a similar question about "node.js - Speech to text: Piping microphone stream to Watson STT with NodeJS" can be found on Stack Overflow: https://stackoverflow.com/questions/55979104/
