
ios - Raise/boost the volume of text-to-speech (AVSpeechUtterance) to make it louder

Reposted. Author: 行者123. Updated: 2023-11-29 05:36:23

I have a navigation app that gives voice direction instructions (e.g. "In 200 feet, turn left") using AVSpeechUtterance. I have already set the volume to 1, like so: speechUtteranceInstance.volume = 1. But the volume is still very low compared to music or podcasts coming from the iPhone, especially when the sound goes over a Bluetooth or wired connection (e.g. connected to a car via Bluetooth).

Is there any way to boost the volume? (I know this has been asked before, but so far none of the proposed solutions have worked for me.)

Best Answer

After a lot of research and experimentation, I found a good, workable solution.

First of all, I think this is an iOS bug. When all of the conditions below are true, the voice instructions themselves are also ducked (or at least sound ducked), so they play at the same volume as the DUCKED background music (and are therefore too quiet to hear clearly):

  • Music is playing in the background
  • The audio session is configured to duck background music (category option .duckOthers)
  • The voice is spoken through AVSpeechSynthesizer
  • Audio is playing through a connected Bluetooth device (e.g. a Bluetooth headset or Bluetooth car speakers)
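For reference, the ducking configuration under which this shows up is the standard one; a minimal sketch of that session setup (matching the activateAudioSession() call in the full code below):

```swift
import AVFoundation

// Minimal sketch: the session configuration under which the bug appears.
// .duckOthers lowers background audio while the prompt plays; over Bluetooth,
// the spoken prompt itself ends up sounding ducked as well.
let session = AVAudioSession.sharedInstance()
try session.setCategory(.playback, mode: .voicePrompt, options: [.mixWithOthers, .duckOthers])
try session.setActive(true, options: .notifyOthersOnDeactivation)
```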

The solution I found is to feed the speechUtterance to an AVAudioEngine instead. This is only possible on iOS 13 or higher, because that release added the write(_:toBufferCallback:) method to AVSpeechSynthesizer.
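A minimal availability-gated sketch of that API (falling back to plain speak(_:) on older systems, where the output will still be ducked):

```swift
import AVFoundation

let synthesizer = AVSpeechSynthesizer()
let utterance = AVSpeechUtterance(string: "In 200 feet, turn left")

if #available(iOS 13.0, *) {
    // Receive the synthesized speech as PCM buffers instead of playing it directly.
    synthesizer.write(utterance) { buffer in
        guard let pcmBuffer = buffer as? AVAudioPCMBuffer, pcmBuffer.frameLength > 0 else { return }
        // Hand pcmBuffer to an AVAudioEngine player node (see the full code below).
    }
} else {
    synthesizer.speak(utterance) // pre-iOS 13: normal (ducked) playback
}
```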

In short, I use an AVAudioEngine with an AVAudioPlayerNode connected to an AVAudioUnitEQ, and set the globalGain property of the AVAudioUnitEQ to about 10 decibels. There are some quirks with this approach as well, but they can be worked around (see the code comments).
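Note that globalGain is specified in decibels, not as a linear multiplier, so +10 dB works out to roughly a 3.16x amplitude boost:

```swift
import Foundation

// globalGain is in decibels; the linear amplitude factor is 10^(dB/20).
let boostDb: Float = 10
let linearFactor = pow(10, boostDb / 20)  // ≈ 3.16x amplitude
```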

The full code is below:

import UIKit
import AVFoundation
import MediaPlayer

class ViewController: UIViewController {

    // MARK: AVAudio properties
    var engine = AVAudioEngine()
    var player = AVAudioPlayerNode()
    var eqEffect = AVAudioUnitEQ()
    var converter = AVAudioConverter(from: AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatInt16, sampleRate: 22050, channels: 1, interleaved: false)!, to: AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatFloat32, sampleRate: 22050, channels: 1, interleaved: false)!)
    let synthesizer = AVSpeechSynthesizer()
    var bufferCounter: Int = 0

    let audioSession = AVAudioSession.sharedInstance()

    override func viewDidLoad() {
        super.viewDidLoad()

        let outputFormat = AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatFloat32, sampleRate: 22050, channels: 1, interleaved: false)!
        setupAudio(format: outputFormat, globalGain: 0)
    }

    func activateAudioSession() {
        do {
            try audioSession.setCategory(.playback, mode: .voicePrompt, options: [.mixWithOthers, .duckOthers])
            try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
        } catch {
            print("An error has occurred while setting the AVAudioSession.")
        }
    }

    @IBAction func tappedPlayButton(_ sender: Any) {
        eqEffect.globalGain = 0
        play()
    }

    @IBAction func tappedPlayLoudButton(_ sender: Any) {
        eqEffect.globalGain = 10
        play()
    }

    func play() {
        let path = Bundle.main.path(forResource: "voiceStart", ofType: "wav")!
        let file = try! AVAudioFile(forReading: URL(fileURLWithPath: path))
        self.player.scheduleFile(file, at: nil, completionHandler: nil)
        let utterance = AVSpeechUtterance(string: "This is to test if iOS is able to boost the voice output above the 100% limit.")
        synthesizer.write(utterance) { buffer in
            guard let pcmBuffer = buffer as? AVAudioPCMBuffer, pcmBuffer.frameLength > 0 else {
                print("could not create buffer or buffer empty")
                return
            }

            // QUIRK: Need to convert the buffer to a different format because AVAudioEngine does not support the format returned from AVSpeechSynthesizer
            let convertedBuffer = AVAudioPCMBuffer(pcmFormat: AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatFloat32, sampleRate: pcmBuffer.format.sampleRate, channels: pcmBuffer.format.channelCount, interleaved: false)!, frameCapacity: pcmBuffer.frameCapacity)!
            do {
                try self.converter!.convert(to: convertedBuffer, from: pcmBuffer)
                self.bufferCounter += 1
                self.player.scheduleBuffer(convertedBuffer, completionCallbackType: .dataPlayedBack, completionHandler: { (type) -> Void in
                    DispatchQueue.main.async {
                        self.bufferCounter -= 1
                        print(self.bufferCounter)
                        if self.bufferCounter == 0 {
                            self.player.stop()
                            self.engine.stop()
                            try! self.audioSession.setActive(false, options: [])
                        }
                    }
                })

                self.converter!.reset()
                //self.player.prepare(withFrameCount: convertedBuffer.frameLength)
            } catch let error {
                print(error.localizedDescription)
            }
        }
        activateAudioSession()
        if !self.engine.isRunning {
            try! self.engine.start()
        }
        if !self.player.isPlaying {
            self.player.play()
        }
    }

    func setupAudio(format: AVAudioFormat, globalGain: Float) {
        // QUIRK: Connecting the equalizer to the engine somehow starts the shared audioSession, and if that audio session is not configured with .mixWithOthers and is not deactivated afterwards, this will stop any background music that was already playing. So first configure the audio session, then set up the engine, and then deactivate the session again.
        try? self.audioSession.setCategory(.playback, options: .mixWithOthers)

        eqEffect.globalGain = globalGain
        engine.attach(player)
        engine.attach(eqEffect)
        engine.connect(player, to: eqEffect, format: format)
        engine.connect(eqEffect, to: engine.mainMixerNode, format: format)
        engine.prepare()

        try? self.audioSession.setActive(false)
    }

}

Regarding "ios - Raise/boost the volume of text-to-speech (AVSpeechUtterance) to make it louder", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56999334/
