
ios - How do I encode and decode real-time audio with the Opus codec on iOS?

Reposted · Author: 行者123 · Updated: 2023-12-02 02:52:22

I am developing an application with the following requirements:

  1. Record live audio from an iOS device (iPhone)
  2. Encode this audio data to Opus and send it to a server over a WebSocket
  3. Decode the received data back to PCM
  4. Play the audio received from the WebSocket server on the iOS device (iPhone)

I used AVAudioEngine for this:

var engine = AVAudioEngine()
var input: AVAudioInputNode = engine.inputNode
var format: AVAudioFormat = input.outputFormat(forBus: AVAudioNodeBus(0))
input.installTap(onBus: AVAudioNodeBus(0), bufferSize: AVAudioFrameCount(8192), format: format, block: { buf, when in
    // 'buf' contains audio captured from the input node at time 'when'
})

// start engine
engine.prepare()
try engine.start()

I use this function to convert an AVAudioPCMBuffer to Data:

func toData(PCMBuffer: AVAudioPCMBuffer) -> Data {
    let channelCount = 1  // mono capture assumed
    let channels = UnsafeBufferPointer(start: PCMBuffer.floatChannelData, count: channelCount)
    let ch0Data = NSData(bytes: channels[0],
                         length: Int(PCMBuffer.frameLength * PCMBuffer.format.streamDescription.pointee.mBytesPerFrame))
    return ch0Data as Data
}

I found the Opus library via the libopus CocoaPod.

I searched a lot for how to use the Opus codec on iOS but found no solution.

How do I encode and decode this data with the Opus codec? Do I need a jitter buffer? If so, how do I use one on iOS?

This code works with the Opus codec, but the voice is not clear:

#import "OpusManager.h"
#import <opus/opus.h>

#define SAMPLE_RATE 16000
#define CHANNELS 1
#define BITRATE SAMPLE_RATE * CHANNELS
/**
 * Audio frame size.
 * Each call must pass exactly one frame of audio data
 * (a multiple of 2.5 ms: 2.5, 5, 10, 20, 40, or 60 ms).
 *
 * Fs/ms   2.5   5     10    20    40    60
 * 8 kHz   20    40    80    160   320   480
 * 16 kHz  40    80    160   320   640   960
 * 24 kHz  60    120   240   480   960   1440
 * 48 kHz  120   240   480   960   1920  2880
 */
#define FRAME_SIZE 320

#define APPLICATION OPUS_APPLICATION_VOIP
#define MAX_PACKET_BYTES (FRAME_SIZE * CHANNELS * sizeof(float))
#define MAX_FRAME_SIZE (FRAME_SIZE * CHANNELS * sizeof(float))

typedef opus_int16 OPUS_DATA_SIZE_T;

@implementation OpusManager {
OpusEncoder *_encoder;
OpusDecoder *_decoder;
}

- (instancetype)init {
    self = [super init];
    if (self) {
        int error = OPUS_OK;
        // opus_encoder_create allocates and initializes in one step;
        // a separate malloc + opus_encoder_init would leak
        _encoder = opus_encoder_create(SAMPLE_RATE, CHANNELS, APPLICATION, &error);
        _decoder = opus_decoder_create(SAMPLE_RATE, CHANNELS, &error);

        opus_encoder_ctl(_encoder, OPUS_SET_BITRATE(BITRATE));
        opus_encoder_ctl(_encoder, OPUS_SET_COMPLEXITY(10));
        opus_encoder_ctl(_encoder, OPUS_SET_SIGNAL(OPUS_SIGNAL_VOICE));
        opus_encoder_ctl(_encoder, OPUS_SET_VBR(0));
        opus_encoder_ctl(_encoder, OPUS_SET_DTX(1));
        // OPUS_SET_BANDWIDTH takes an OPUS_BANDWIDTH_* constant, not a Hz value
        opus_encoder_ctl(_encoder, OPUS_SET_BANDWIDTH(OPUS_BANDWIDTH_SUPERWIDEBAND)); // 12 kHz
        opus_encoder_ctl(_encoder, OPUS_SET_INBAND_FEC(1));
        opus_encoder_ctl(_encoder, OPUS_SET_PACKET_LOSS_PERC(1));
        opus_encoder_ctl(_encoder, OPUS_SET_FORCE_CHANNELS(CHANNELS));
    }
    return self;
}

- (NSData *)encode:(NSData *)PCM {

    // The tap delivers 32-bit float samples, so walk the buffer as floats
    // (mixing opus_int16 pointer arithmetic with opus_encode_float gives
    // the wrong stride and garbles the audio)
    const float *PCMPtr = (const float *)PCM.bytes;
    int PCMSize = (int)PCM.length / sizeof(float);
    const float *PCMEnd = PCMPtr + PCMSize;
    NSMutableData *mutData = [NSMutableData data];
    unsigned char encodedPacket[MAX_PACKET_BYTES];

    // Opus block size written as a prefix before each packet
    OPUS_DATA_SIZE_T encodedBytes = 0;

    while (PCMPtr + FRAME_SIZE * CHANNELS <= PCMEnd) {
        encodedBytes = opus_encode_float(_encoder, PCMPtr, FRAME_SIZE, encodedPacket, MAX_PACKET_BYTES);

        if (encodedBytes <= 0) {
            NSLog(@"ERROR: encodedBytes <= 0");
            return nil;
        }
        NSLog(@"encodedBytes: %d", encodedBytes);

        // Save the opus block size, then the opus data
        [mutData appendBytes:&encodedBytes length:sizeof(encodedBytes)];
        [mutData appendBytes:encodedPacket length:encodedBytes];

        PCMPtr += FRAME_SIZE * CHANNELS;
    }

    NSLog(@"mutData: %lu", (unsigned long)mutData.length);
    return mutData.length > 0 ? mutData : nil;
}

- (NSData *)decode:(NSData *)opus {

    const unsigned char *opusPtr = (const unsigned char *)opus.bytes;
    int opusSize = (int)opus.length;
    const unsigned char *opusEnd = opusPtr + opusSize;

    NSMutableData *mutData = [NSMutableData data];

    float decodedPacket[MAX_FRAME_SIZE];
    int decodedSamples = 0;

    // Size prefix of each opus block
    OPUS_DATA_SIZE_T nBytes = 0;

    while (opusPtr < opusEnd) {
        // Read the opus block size (memcpy avoids an unaligned read)
        memcpy(&nBytes, opusPtr, sizeof(nBytes));
        opusPtr += sizeof(nBytes);

        decodedSamples = opus_decode_float(_decoder, opusPtr, nBytes, decodedPacket, MAX_FRAME_SIZE, 0);

        if (decodedSamples <= 0) {
            NSLog(@"ERROR: decodedSamples <= 0");
            return nil;
        }
        NSLog(@"decodedSamples: %d", decodedSamples);

        // The decoder produced floats, so append float-sized samples,
        // not opus_int16-sized ones
        [mutData appendBytes:decodedPacket length:decodedSamples * CHANNELS * sizeof(float)];

        opusPtr += nBytes;
    }
    NSLog(@"mutData: %lu", (unsigned long)mutData.length);
    return mutData.length > 0 ? mutData : nil;
}

@end

Best Answer

Try lowering the bandwidth or setting a higher bitrate. I think 16 kbit/s for mono audio at 12 kHz bandwidth may be too low. It would probably be better to leave the bandwidth at auto with the VOIP application setting. There may be other problems too, but "doesn't sound good" is not enough to analyze.

Regarding "ios - How do I encode and decode real-time audio with the Opus codec on iOS?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55692517/
