
ios - How can I send audio to Microsoft Translator using a websocket


I have created an application that translates text to text as well as speech to text. Text-to-text and text-to-speech already work; what I cannot do is translate speech to text.

I am using this demo: https://github.com/bitmapdata/MSTranslateVendor but it only does text-to-text and text-to-speech.

I searched on Stack Overflow and the suggested solution is to send the audio over a websocket, but I don't know how to do that. I am new to websocket programming.

Please help me with how to send audio over a websocket.

I am recording the audio as shown below, but I don't know how to send it.

- (void)viewDidLoad {
    [super viewDidLoad];

    // Recording settings: 16-bit little-endian linear PCM, 44.1 kHz, 2 channels
    settings = [[NSMutableDictionary alloc] init];
    [settings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM] forKey:AVFormatIDKey];
    [settings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
    [settings setValue:[NSNumber numberWithInt:2] forKey:AVNumberOfChannelsKey];
    [settings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
    [settings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
    [settings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
    [settings setValue:[NSNumber numberWithInt:AVAudioQualityHigh] forKey:AVEncoderAudioQualityKey];

    // Record to <Documents>/Sohil.wav
    NSArray *pathComponents = [NSArray arrayWithObjects:
                               [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject],
                               @"Sohil.wav",
                               nil];
    outputFileURL = [NSURL fileURLWithPathComponents:pathComponents];
    NSLog(@"Record URL : %@", outputFileURL);

    // Setup audio session
    AVAudioSession *session = [AVAudioSession sharedInstance];
    [session setCategory:AVAudioSessionCategoryPlayAndRecord error:nil];

    // Initiate and prepare the recorder
    recorder = [[AVAudioRecorder alloc] initWithURL:outputFileURL settings:settings error:nil];
    recorder.delegate = self;
    recorder.meteringEnabled = YES;
    [recorder prepareToRecord];
}

- (IBAction)recordStart:(id)sender {
    AVAudioSession *session = [AVAudioSession sharedInstance];
    [session setActive:YES error:nil];
    [recorder record];
}

- (IBAction)recordStop:(id)sender {
    [recorder stop];
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];
    [audioSession setActive:NO error:nil];
}
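
A minimal sketch of the AVAudioRecorderDelegate callback that pairs with the `recorder.delegate = self;` line above, assuming the same `outputFileURL` ivar and the conversion method shown next; it simply loads the finished file into memory so it can be sent:

- (void)audioRecorderDidFinishRecording:(AVAudioRecorder *)aRecorder successfully:(BOOL)flag {
    if (!flag) {
        NSLog(@"Recording did not finish successfully");
        return;
    }
    // Read the finished WAV file into memory so it can be re-headered and streamed.
    NSData *recordedWav = [NSData dataWithContentsOfURL:outputFileURL];
    NSData *payload = [self stripAndAddWavHeader:recordedWav];
    NSLog(@"Prepared %lu bytes of audio to send", (unsigned long)[payload length]);
}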

And to convert it:

-(NSData *)stripAndAddWavHeader:(NSData *)wav {
    // Drop the 44-byte header written by AVAudioRecorder, then prepend a new one
    unsigned long wavDataSize = [wav length] - 44;
    NSData *waveFile = [NSMutableData dataWithData:[wav subdataWithRange:NSMakeRange(44, wavDataSize)]];

    NSMutableData *newWavData = [self addWavHeader:waveFile];
    return newWavData;
}

- (NSMutableData *)addWavHeader:(NSData *)wavNoheader {
    int headerSize = 44;
    long totalAudioLen = [wavNoheader length];
    long totalDataLen = [wavNoheader length] + headerSize - 8;
    // NOTE: the header below describes 22050 Hz, 16-bit, mono audio, which does not
    // match the recording settings above (44100 Hz, stereo); these must agree for
    // the resulting WAV to be valid.
    long longSampleRate = 22050;
    int channels = 1;
    long byteRate = longSampleRate * channels * (16 / 8); // sampleRate * channels * bytesPerSample

    Byte *header = (Byte *)malloc(44);
    header[0] = 'R'; // RIFF/WAVE header
    header[1] = 'I';
    header[2] = 'F';
    header[3] = 'F';
    header[4] = (Byte)(totalDataLen & 0xff);
    header[5] = (Byte)((totalDataLen >> 8) & 0xff);
    header[6] = (Byte)((totalDataLen >> 16) & 0xff);
    header[7] = (Byte)((totalDataLen >> 24) & 0xff);
    header[8] = 'W';
    header[9] = 'A';
    header[10] = 'V';
    header[11] = 'E';
    header[12] = 'f'; // 'fmt ' chunk
    header[13] = 'm';
    header[14] = 't';
    header[15] = ' ';
    header[16] = 16; // 4 bytes: size of 'fmt ' chunk
    header[17] = 0;
    header[18] = 0;
    header[19] = 0;
    header[20] = 1; // audio format = 1 (PCM)
    header[21] = 0;
    header[22] = (Byte)channels;
    header[23] = 0;
    header[24] = (Byte)(longSampleRate & 0xff);
    header[25] = (Byte)((longSampleRate >> 8) & 0xff);
    header[26] = (Byte)((longSampleRate >> 16) & 0xff);
    header[27] = (Byte)((longSampleRate >> 24) & 0xff);
    header[28] = (Byte)(byteRate & 0xff);
    header[29] = (Byte)((byteRate >> 8) & 0xff);
    header[30] = (Byte)((byteRate >> 16) & 0xff);
    header[31] = (Byte)((byteRate >> 24) & 0xff);
    header[32] = (Byte)(channels * 16 / 8); // block align = channels * bytesPerSample
    header[33] = 0;
    header[34] = 16; // bits per sample
    header[35] = 0;
    header[36] = 'd';
    header[37] = 'a';
    header[38] = 't';
    header[39] = 'a';
    header[40] = (Byte)(totalAudioLen & 0xff);
    header[41] = (Byte)((totalAudioLen >> 8) & 0xff);
    header[42] = (Byte)((totalAudioLen >> 16) & 0xff);
    header[43] = (Byte)((totalAudioLen >> 24) & 0xff);

    NSMutableData *newWavData = [NSMutableData dataWithBytes:header length:44];
    [newWavData appendBytes:[wavNoheader bytes] length:[wavNoheader length]];
    free(header);
    return newWavData;
}
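
As for actually sending the prepared audio over a websocket, here is a rough sketch using the third-party SocketRocket library (SRWebSocket). The endpoint URL, query parameters, and Bearer-token authorization are assumptions modelled on the Microsoft Translator speech websocket API of that time and should be checked against the current documentation; the `socket` and `pendingAudio` properties are likewise assumed to exist on the view controller:

#import <SocketRocket/SRWebSocket.h>

// Sketch only: open a websocket to the (assumed) translation endpoint and stream the WAV.
- (void)sendAudio:(NSData *)wavData withToken:(NSString *)accessToken {
    NSURL *url = [NSURL URLWithString:
        @"wss://dev.microsofttranslator.com/speech/translate?from=en-US&to=fr-FR&api-version=1.0"];
    NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:url];
    [request setValue:[@"Bearer " stringByAppendingString:accessToken]
        forHTTPHeaderField:@"Authorization"];

    self.socket = [[SRWebSocket alloc] initWithURLRequest:request];
    self.socket.delegate = self;
    self.pendingAudio = wavData;   // hold the audio until the socket is open
    [self.socket open];
}

// SRWebSocketDelegate
- (void)webSocketDidOpen:(SRWebSocket *)webSocket {
    // Binary frames carry the audio; here the whole WAV (header + samples) is sent at once.
    // A streaming client would instead send the header first and then small PCM chunks.
    [webSocket send:self.pendingAudio];
}

- (void)webSocket:(SRWebSocket *)webSocket didReceiveMessage:(id)message {
    // Text frames carry the JSON recognition/translation results.
    NSLog(@"Received: %@", message);
}

- (void)webSocket:(SRWebSocket *)webSocket didFailWithError:(NSError *)error {
    NSLog(@"WebSocket error: %@", error);
}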

Best Answer

You can use Microsoft Cognitive-Speech-STT-iOS; it works perfectly for speech to text.

1) First, register your application (Register App).

2) Now subscribe to Bing Speech Key - Preview and use that key in the setting.plist file of the demo project; it works fine. You will get two keys and can use either one. A rough usage sketch follows below.
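
For reference, a rough usage sketch modelled on the Cognitive-Speech-STT-iOS sample. The factory method, protocol callbacks, and the `micClient` ivar follow that sample project and may differ between SDK versions, so treat the exact names as assumptions:

#import "SpeechRecognitionService.h" // header shipped with the Cognitive-Speech-STT-iOS sample

// Sketch only: the view controller is assumed to adopt SpeechRecognitionProtocol
// and to declare `MicrophoneRecognitionClient *micClient;` as an ivar.
- (void)startSpeechToText {
    micClient = [SpeechRecognitionServiceFactory createMicrophoneClient:SpeechRecognitionMode_ShortPhrase
                                                           withLanguage:@"en-US"
                                                                withKey:@"YOUR_BING_SPEECH_KEY"
                                                           withProtocol:self];
    [micClient startMicAndRecognition];
}

// Partial results arrive while the user is still speaking.
- (void)onPartialResponseReceived:(NSString *)partialResult {
    NSLog(@"Partial: %@", partialResult);
}

// The final result arrives once the utterance is complete.
- (void)onFinalResponseReceived:(RecognitionResult *)response {
    [micClient endMicAndRecognition];
    NSLog(@"Final recognition result: %@", response);
}

- (void)onError:(NSString *)errorMessage withErrorCode:(int)errorCode {
    NSLog(@"Speech error %d: %@", errorCode, errorMessage);
}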

Regarding "ios - How can I send audio to Microsoft Translator using a websocket", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39725025/
