
c# - Streaming input to System.Speech.Recognition.SpeechRecognitionEngine

Reposted. Author: 太空狗. Updated: 2023-10-29 19:59:12

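I am trying to do "streaming" speech recognition in C# on audio that arrives over a TCP socket. The problem is that SpeechRecognitionEngine.SetInputToAudioStream() seems to require a stream of defined length that supports seeking. The only workaround I can think of right now is to run the recognizer repeatedly on a MemoryStream as more input arrives.

Here is some code to illustrate: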

    SpeechRecognitionEngine appRecognizer = new SpeechRecognitionEngine();

    System.Speech.AudioFormat.SpeechAudioFormatInfo formatInfo = new System.Speech.AudioFormat.SpeechAudioFormatInfo(8000, System.Speech.AudioFormat.AudioBitsPerSample.Sixteen, System.Speech.AudioFormat.AudioChannel.Mono);

    NetworkStream stream = new NetworkStream(socket, true);
    appRecognizer.SetInputToAudioStream(stream, formatInfo);
    // The line above throws a NotSupportedException: "This stream does not support seek operations."

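Does anyone know how to get around this? It must support streaming input of some kind, because it works fine with a microphone via SetInputToDefaultAudioDevice().

Thanks, Sean

Best answer

I got real-time speech recognition working by overriding the Stream class: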

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Threading;

    // A never-ending stream backed by a fixed-size circular buffer.
    // One thread calls Write (the audio producer); the recognizer calls Read.
    class SpeechStreamer : Stream
    {
        private AutoResetEvent _writeEvent;
        private List<byte> _buffer;
        private int _buffersize;
        private int _readposition;
        private int _writeposition;
        private bool _reset; // true while the writer has wrapped around and the reader has not

        public SpeechStreamer(int bufferSize)
        {
            _writeEvent = new AutoResetEvent(false);
            _buffersize = bufferSize;
            _buffer = new List<byte>(_buffersize);
            for (int i = 0; i < _buffersize; i++)
                _buffer.Add(new byte());
            _readposition = 0;
            _writeposition = 0;
        }

        public override bool CanRead
        {
            get { return true; }
        }

        public override bool CanSeek
        {
            get { return false; }
        }

        public override bool CanWrite
        {
            get { return true; }
        }

        // The length is unknown, so report -1 rather than a real value.
        public override long Length
        {
            get { return -1L; }
        }

        public override long Position
        {
            get { return 0L; }
            set { }
        }

        // Seeking is a no-op; it returns 0 instead of throwing.
        public override long Seek(long offset, SeekOrigin origin)
        {
            return 0L;
        }

        public override void SetLength(long value)
        {
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            int i = 0;
            while (i < count && _writeEvent != null)
            {
                if (!_reset && _readposition >= _writeposition)
                {
                    // Nothing new to read yet: wait for the writer
                    // instead of returning a short count.
                    _writeEvent.WaitOne(100, true);
                    continue;
                }
                // offset indexes the caller's array, not the circular buffer.
                buffer[offset + i] = _buffer[_readposition];
                _readposition++;
                if (_readposition == _buffersize)
                {
                    _readposition = 0;
                    _reset = false;
                }
                i++;
            }

            // Equals count unless the stream was closed while waiting.
            return i;
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            for (int i = offset; i < offset + count; i++)
            {
                _buffer[_writeposition] = buffer[i];
                _writeposition++;
                if (_writeposition == _buffersize)
                {
                    _writeposition = 0;
                    _reset = true;
                }
            }
            // Wake up a reader that is waiting for data.
            _writeEvent.Set();
        }

        public override void Close()
        {
            _writeEvent.Close();
            _writeEvent = null;
            base.Close();
        }

        public override void Flush()
        {
        }
    }

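... and using an instance of it as the stream input to the SetInputToAudioStream method. As soon as the stream returns a length, or a read returns fewer bytes than were requested, the recognition engine considers the input finished. Because Read waits for more data rather than returning a short count, this sets up a circular buffer that never finishes.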
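
To tie this back to the TCP scenario in the question, here is a rough sketch of how the pieces might be wired together. It is not part of the original answer. The names StreamingRecognitionSketch, Pump and Start are hypothetical, as are the buffer size of 160000 bytes (about ten seconds of 8 kHz, 16-bit mono audio), the 4096-byte chunk size, and the use of a DictationGrammar. It assumes the socket is already connected and delivers raw PCM in the format from the question, and that the SpeechStreamer class above is in the same project. Shutdown and error handling are left out.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Speech.AudioFormat;
using System.Speech.Recognition;
using System.Threading;

static class StreamingRecognitionSketch
{
    // Copies bytes from source to sink in chunks until source reports end of stream.
    // Returns the total number of bytes copied.
    public static long Pump(Stream source, Stream sink, int chunkSize)
    {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        int read;
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
        {
            sink.Write(chunk, 0, read);
            total += read;
        }
        return total;
    }

    // Assumption: socket is connected and sends raw 8 kHz, 16-bit, mono PCM.
    public static SpeechRecognitionEngine Start(Socket socket)
    {
        var recognizer = new SpeechRecognitionEngine();
        recognizer.LoadGrammar(new DictationGrammar()); // assumed grammar choice
        recognizer.SpeechRecognized += (sender, e) =>
            Console.WriteLine("Recognized: " + e.Result.Text);

        var formatInfo = new SpeechAudioFormatInfo(
            8000, AudioBitsPerSample.Sixteen, AudioChannel.Mono);

        // Assumed buffer size; the writer overwrites old data once it wraps around.
        var audio = new SpeechStreamer(160000);
        recognizer.SetInputToAudioStream(audio, formatInfo);

        // Feed the circular buffer from the network on a background thread,
        // so the recognizer's Read calls always find a stream that never ends.
        var pump = new Thread(() =>
        {
            using (var network = new NetworkStream(socket, true))
            {
                Pump(network, audio, 4096);
            }
        });
        pump.IsBackground = true;
        pump.Start();

        // Keep recognizing phrase after phrase until the caller stops the engine.
        recognizer.RecognizeAsync(RecognizeMode.Multiple);
        return recognizer;
    }
}
```

The recognizer never touches the NetworkStream directly, so the "does not support seek operations" exception from the question does not come up; only the SpeechStreamer, whose Seek is a no-op, is handed to SetInputToAudioStream.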

Regarding c# - Streaming input to System.Speech.Recognition.SpeechRecognitionEngine, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/1682902/
