
c# - Ultra Fast Text to Speech (WAV -> MP3) in ASP.NET MVC

Reposted. Author: 太空狗. Updated: 2023-10-30 01:23:02

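This question is essentially about whether Microsoft's Speech API (SAPI) is suitable for server workloads and whether it can be used reliably inside w3wp for speech synthesis. We have an asynchronous controller that uses the native System.Speech assembly in .NET 4 (not the Microsoft.Speech assembly that ships as part of Microsoft Speech Platform - Runtime Version 11) and lame.exe to generate MP3s as follows: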

    [CacheFilter]
    public void ListenAsync(string url)
    {
        string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());

        try
        {
            var t = new System.Threading.Thread(() =>
            {
                using (SpeechSynthesizer ss = new SpeechSynthesizer())
                {
                    ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050, AudioBitsPerSample.Eight, AudioChannel.Mono));
                    ss.Speak("Here is a test sentence...");
                    ss.SetOutputToNull();
                    ss.Dispose();
                }

                var process = new Process() { EnableRaisingEvents = true };
                process.StartInfo.FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe");
                process.StartInfo.Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3"));
                process.StartInfo.UseShellExecute = false;
                process.StartInfo.RedirectStandardOutput = false;
                process.StartInfo.RedirectStandardError = false;
                process.Exited += (sender, e) =>
                {
                    System.IO.File.Delete(fileName);

                    AsyncManager.OutstandingOperations.Decrement();
                };

                AsyncManager.OutstandingOperations.Increment();
                process.Start();
            });

            t.Start();
            t.Join();
        }
        catch { }

        AsyncManager.Parameters["fileName"] = fileName;
    }

    public FileResult ListenCompleted(string fileName)
    {
        return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
    }

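The question is why SpeechSynthesizer needs to run on a separate thread like this in order to return (this is reported elsewhere on SO here and here), and whether implementing a STAThreadRouteHandler for this request would be more efficient or scalable than the approach above.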

Secondly, what are the options for running SpeakAsync in an ASP.NET (MVC or WebForms) context? None of the options I have tried seem to work (see the update below).

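Any other suggestions for how to improve this pattern (i.e. two dependencies that must execute serially, one after the other, but each of which has async support) are welcome. I don't feel this scheme is sustainable under load, especially considering the known memory leaks in SpeechSynthesizer. I am considering running this service on a different stack altogether.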

Update: Neither the Speak nor the SpeakAsync option appears to work under the STAThreadRouteHandler. The former produces:

    System.InvalidOperationException: Asynchronous operations are not allowed in this context. Page starting an asynchronous operation has to have the Async attribute set to true and an asynchronous operation can only be started on a page prior to PreRenderComplete event.
       at System.Web.LegacyAspNetSynchronizationContext.OperationStarted()
       at System.ComponentModel.AsyncOperationManager.CreateOperation(Object userSuppliedState)
       at System.Speech.Internal.Synthesis.VoiceSynthesis..ctor(WeakReference speechSynthesizer)
       at System.Speech.Synthesis.SpeechSynthesizer.get_VoiceSynthesizer()
       at System.Speech.Synthesis.SpeechSynthesizer.SetOutputToWaveFile(String path, SpeechAudioFormatInfo formatInfo)

The latter results in:

    System.InvalidOperationException: The asynchronous action method 'Listen' cannot be executed synchronously.
       at System.Web.Mvc.Async.AsyncActionDescriptor.Execute(ControllerContext controllerContext, IDictionary`2 parameters)

It seems that a custom STA thread pool (with ThreadStatic instances of the COM object) is a better approach: http://marcinbudny.blogspot.ca/2012/04/dealing-with-sta-coms-in-web.html
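One plausible reading of the first stack trace is that the failure is about the synchronization context rather than the COM apartment: the VoiceSynthesis constructor calls AsyncOperationManager.CreateOperation, which notifies the SynchronizationContext installed on the calling thread, and the ASP.NET context rejects it. A newly started thread has no synchronization context installed, which may be why the Start/Join pattern returns normally. The sketch below illustrates that mechanism without System.Speech. RejectingContext and ContextDemo are hypothetical names that only imitate the ASP.NET behavior; they are not part of any library.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Hypothetical stand-in for ASP.NET's request context: it refuses to start async operations.
sealed class RejectingContext : SynchronizationContext
{
    // AsyncOperation may copy the context, so the copy must reject as well.
    public override SynchronizationContext CreateCopy() { return new RejectingContext(); }

    public override void OperationStarted()
    {
        throw new InvalidOperationException("Asynchronous operations are not allowed in this context.");
    }
}

static class ContextDemo
{
    // True if AsyncOperationManager.CreateOperation succeeds on the calling thread.
    public static bool CanCreateOperation()
    {
        try
        {
            AsyncOperation op = AsyncOperationManager.CreateOperation(null);
            op.OperationCompleted();
            return true;
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }

    // Same check on a fresh thread, mirroring the Start/Join pattern from the question.
    public static bool CanCreateOperationOnNewThread()
    {
        bool result = false;
        var worker = new Thread(() => { result = CanCreateOperation(); });
        worker.Start();
        worker.Join();
        return result;
    }

    // Installs the rejecting context on the current thread, runs both checks, then restores.
    public static void Run(out bool onCallingThread, out bool onNewThread)
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        SynchronizationContext.SetSynchronizationContext(new RejectingContext());
        try
        {
            onCallingThread = CanCreateOperation();
            onNewThread = CanCreateOperationOnNewThread();
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }
}
```

Calling ContextDemo.Run reports whether the operation could be created on the thread that carries the rejecting context and on a fresh thread. If this reading is right, the apartment state of the worker thread would not matter for System.Speech, only the absence of the ASP.NET context.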

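Update #2: It seems that System.Speech.Synthesis.SpeechSynthesizer does not need STA treatment and runs fine on MTA threads, as long as you follow the Start/Join pattern. Here is a new version that uses SpeakAsync correctly (the issue was disposing of the synthesizer prematurely!) and splits WAV generation and MP3 generation into two separate requests: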

    [CacheFilter]
    [ActionName("listen-to-text")]
    public void ListenToTextAsync(string text)
    {
        AsyncManager.OutstandingOperations.Increment();

        var t = new Thread(() =>
        {
            SpeechSynthesizer ss = new SpeechSynthesizer();
            string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());

            ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050,
                                                                       AudioBitsPerSample.Eight,
                                                                       AudioChannel.Mono));
            ss.SpeakCompleted += (sender, e) =>
            {
                // Dispose only after synthesis has completed, not before.
                ss.SetOutputToNull();
                ss.Dispose();

                AsyncManager.Parameters["fileName"] = fileName;
                AsyncManager.OutstandingOperations.Decrement();
            };

            CustomPromptBuilder pb = new CustomPromptBuilder(settings.DefaultVoiceName);
            pb.AppendParagraphText(text);
            ss.SpeakAsync(pb);
        });

        t.Start();
        t.Join();
    }

    [CacheFilter]
    public ActionResult ListenToTextCompleted(string fileName)
    {
        return RedirectToAction("mp3", new { fileName = fileName });
    }

    [CacheFilter]
    [ActionName("mp3")]
    public void Mp3Async(string fileName)
    {
        var process = new Process()
        {
            EnableRaisingEvents = true,
            StartInfo = new ProcessStartInfo()
            {
                FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe"),
                Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3")),
                UseShellExecute = false,
                RedirectStandardOutput = false,
                RedirectStandardError = false
            }
        };

        process.Exited += (sender, e) =>
        {
            System.IO.File.Delete(fileName);
            AsyncManager.Parameters["fileName"] = fileName;
            AsyncManager.OutstandingOperations.Decrement();
        };

        AsyncManager.OutstandingOperations.Increment();
        process.Start();
    }

    [CacheFilter]
    public ActionResult Mp3Completed(string fileName)
    {
        return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
    }

Best answer

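I/O is very expensive on a server. How many concurrent WAV write streams do you think a server hard drive can sustain? Why not do everything in memory and write the MP3 only once it has been fully processed? MP3s are much smaller, so the I/O is engaged for only a short time. If you want, you can even change the code to return the stream directly to the user instead of saving an MP3 file.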
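A minimal sketch of the in-memory idea, assuming the encoder is still an external executable that can read its input from standard input and write its output to standard output (LAME accepts - for both file arguments). InMemoryEncoder and PipeThrough are hypothetical names introduced for illustration; the helper itself relies only on System.Diagnostics.Process stream redirection.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

static class InMemoryEncoder
{
    // Feeds `input` to the process's stdin and returns everything it writes to stdout.
    public static byte[] PipeThrough(string exePath, string arguments, byte[] input)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = arguments,
            UseShellExecute = false,          // required for stream redirection
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        using (var output = new MemoryStream())
        {
            // Drain stdout on another thread so that neither pipe buffer
            // can fill up and block both sides.
            Task reader = Task.Factory.StartNew(
                () => process.StandardOutput.BaseStream.CopyTo(output));

            // Write raw bytes; closing the stream signals end of input to the encoder.
            Stream stdin = process.StandardInput.BaseStream;
            stdin.Write(input, 0, input.Length);
            stdin.Close();

            reader.Wait();
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new InvalidOperationException("Encoder exited with code " + process.ExitCode);

            return output.ToArray();
        }
    }
}
```

Under these assumptions, the WAV bytes could come from a MemoryStream passed to SpeechSynthesizer.SetOutputToWaveStream on the worker thread, be encoded with something like PipeThrough(lamePath, "--quiet -V2 - -", wav.ToArray()), and be returned through the controller's File(byte[], string) overload, so that no WAV file ever touches the disk. This still starts one process per request, so it reduces disk I/O but not process-creation cost.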

How do can I use LAME to encode an wav to an mp3 c#

Regarding c# - Ultra Fast Text to Speech (WAV -> MP3) in ASP.NET MVC, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/12343249/
