
iphone - Drawing a waveform with AVAssetReader


I'm reading a song from the iPod library using an asset URL (in the code it is named audioUrl). I can play it in several ways, I can cut it, I can do some processing with it, but... I really don't understand what I'm supposed to do with this CMSampleBufferRef to get the data for drawing a waveform! I need information about the peak values. How can I get it this way (or maybe another way)?

    NSError *error = nil;
    AVAssetReader *reader = [[AVAssetReader alloc] initWithAsset:audioUrl error:&error];

    AVAssetTrack *songTrack = [audioUrl.tracks objectAtIndex:0];
    AVAssetReaderTrackOutput *output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:nil];
    [reader addOutput:output];
    [output release];

    NSMutableData *fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    while (reader.status == AVAssetReaderStatusReading) {

        AVAssetReaderTrackOutput *trackOutput =
            (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];

        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef) { /* what am I supposed to do with this? */ }
    }

Please help me!

Best Answer

I was looking for something similar and decided to "roll my own". I realize this is an old post, but in case anyone else is looking for this, here is my solution. It is relatively quick and dirty, and normalizes the image to "full scale". The images it creates are "wide", i.e. you need to put them in a UIScrollView or otherwise manage the display.

This is based on some of the answers to this question.

Sample output

[image: sample waveform]

EDIT: I have added a logarithmic version of the averaging and render methods; see the end of this message for the alternate version and comparison outputs. I personally prefer the original linear version, but have decided to post it in case someone can improve on the algorithm used.

You will need these imports:

#import <MediaPlayer/MediaPlayer.h>
#import <AVFoundation/AVFoundation.h>

First, a generic render method that takes a pointer to averaged sample data
and returns a UIImage. Note that these samples are not playable audio samples.

-(UIImage *) audioImageGraph:(SInt16 *) samples
                normalizeMax:(SInt16) normalizeMax
                 sampleCount:(NSInteger) sampleCount
                channelCount:(NSInteger) channelCount
                 imageHeight:(float) imageHeight {

    CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
    UIGraphicsBeginImageContext(imageSize);
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
    CGContextSetAlpha(context, 1.0);
    CGRect rect;
    rect.size = imageSize;
    rect.origin.x = 0;
    rect.origin.y = 0;

    CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
    CGColorRef rightcolor = [[UIColor redColor] CGColor];

    CGContextFillRect(context, rect);

    CGContextSetLineWidth(context, 1.0);

    float halfGraphHeight = (imageHeight / 2) / (float) channelCount;
    float centerLeft = halfGraphHeight;
    float centerRight = (halfGraphHeight * 3);
    float sampleAdjustmentFactor = (imageHeight / (float) channelCount) / (float) normalizeMax;

    for (NSInteger intSample = 0; intSample < sampleCount; intSample++) {
        // Draw one vertical line per averaged sample, mirrored around the channel's center line.
        SInt16 left = *samples++;
        float pixels = (float) left;
        pixels *= sampleAdjustmentFactor;
        CGContextMoveToPoint(context, intSample, centerLeft - pixels);
        CGContextAddLineToPoint(context, intSample, centerLeft + pixels);
        CGContextSetStrokeColorWithColor(context, leftcolor);
        CGContextStrokePath(context);

        if (channelCount == 2) {
            SInt16 right = *samples++;
            float pixels = (float) right;
            pixels *= sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerRight - pixels);
            CGContextAddLineToPoint(context, intSample, centerRight + pixels);
            CGContextSetStrokeColorWithColor(context, rightcolor);
            CGContextStrokePath(context);
        }
    }

    // Create new image
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // Tidy up
    UIGraphicsEndImageContext();

    return newImage;
}
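
For illustration, here is one way the method above might be exercised in isolation, feeding it a buffer of fake averaged samples from within the same class. The sawtooth data and the call site are my own sketch, not part of the original answer:

    // Hypothetical smoke test: render 400 fake "averaged" mono samples.
    NSInteger sampleCount = 400;
    SInt16 *samples = malloc(sampleCount * sizeof(SInt16));
    for (NSInteger i = 0; i < sampleCount; i++) {
        samples[i] = (SInt16)((i % 200) * 327 - 32700); // synthetic sawtooth, near full scale
    }
    UIImage *waveform = [self audioImageGraph:samples
                                 normalizeMax:32767
                                  sampleCount:sampleCount
                                 channelCount:1
                                  imageHeight:100];
    free(samples);
    // waveform is now a 400 x 100 image of the sawtooth envelope.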

Next, a method that takes an AVURLAsset and returns PNG image data

- (NSData *) renderPNGAudioPictogramForAsset:(AVURLAsset *)songAsset {

    NSError *error = nil;
    AVAssetReader *reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
    AVAssetTrack *songTrack = [songAsset.tracks objectAtIndex:0];

    NSDictionary *outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
        //    [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
        //    [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
        [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
        nil];

    AVAssetReaderTrackOutput *output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
    [outputSettingsDict release];

    [reader addOutput:output];
    [output release];

    UInt32 sampleRate = 0, channelCount = 0;

    NSArray *formatDesc = songTrack.formatDescriptions;
    for (unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription *fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription(item);
        if (fmtDesc) {
            sampleRate = fmtDesc->mSampleRate;
            channelCount = fmtDesc->mChannelsPerFrame;
            //    NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket, fmtDesc->mSampleRate);
        }
    }

    UInt32 bytesPerSample = 2 * channelCount; // 16-bit interleaved samples
    SInt16 normalizeMax = 0;

    NSMutableData *fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    UInt64 totalBytes = 0;
    SInt64 totalLeft = 0;
    SInt64 totalRight = 0;
    NSInteger sampleTally = 0;

    // Average blocks of samples down to 50 output samples (pixels) per second of audio.
    NSInteger samplesPerPixel = sampleRate / 50;

    while (reader.status == AVAssetReaderStatusReading) {

        AVAssetReaderTrackOutput *trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef) {
            CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

            size_t length = CMBlockBufferGetDataLength(blockBufferRef);
            totalBytes += length;

            NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

            NSMutableData *data = [NSMutableData dataWithLength:length];
            CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);

            SInt16 *samples = (SInt16 *) data.mutableBytes;
            int sampleCount = length / bytesPerSample;
            for (int i = 0; i < sampleCount; i++) {

                SInt16 left = *samples++;
                totalLeft += left;

                SInt16 right;
                if (channelCount == 2) {
                    right = *samples++;
                    totalRight += right;
                }

                sampleTally++;

                if (sampleTally > samplesPerPixel) {

                    left = totalLeft / sampleTally;

                    SInt16 fix = abs(left);
                    if (fix > normalizeMax) {
                        normalizeMax = fix;
                    }

                    [fullSongData appendBytes:&left length:sizeof(left)];

                    if (channelCount == 2) {
                        right = totalRight / sampleTally;

                        SInt16 fix = abs(right);
                        if (fix > normalizeMax) {
                            normalizeMax = fix;
                        }

                        [fullSongData appendBytes:&right length:sizeof(right)];
                    }

                    totalLeft = 0;
                    totalRight = 0;
                    sampleTally = 0;
                }
            }

            [wader drain];

            CMSampleBufferInvalidate(sampleBufferRef);
            CFRelease(sampleBufferRef);
        }
    }

    NSData *finalData = nil;

    if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown) {
        // Something went wrong. Return nil.
        [fullSongData release];
        [reader release];
        return nil;
    }

    if (reader.status == AVAssetReaderStatusCompleted) {

        NSLog(@"rendering output graphics using normalizeMax %d", normalizeMax);

        // Assumes stereo, 16-bit data (4 bytes per averaged sample pair), as set up above.
        UIImage *test = [self audioImageGraph:(SInt16 *) fullSongData.bytes
                                 normalizeMax:normalizeMax
                                  sampleCount:fullSongData.length / 4
                                 channelCount:2
                                  imageHeight:100];

        finalData = imageToData(test);
    }

    [fullSongData release];
    [reader release];

    return finalData;
}
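
If you do not need the caching machinery below, the render method can be driven directly. A minimal caller might look like this (the file path is a placeholder, and since the reader loop blocks, you would normally run this off the main thread):

    // Hypothetical caller: render a waveform PNG for a local audio file.
    NSURL *audioFileURL = [NSURL fileURLWithPath:@"/tmp/example.m4a"]; // placeholder path
    AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:audioFileURL options:nil];
    NSData *pngData = [self renderPNGAudioPictogramForAsset:asset];
    [asset release];

    if (pngData) {
        UIImage *waveform = [UIImage imageWithData:pngData];
        // display waveform, e.g. inside a UIScrollView
    }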

Advanced option: Finally, if you want to be able to play the audio with AVAudioPlayer, you will need to cache it to your app's caches folder. Since I was doing that anyway, I decided to cache the image data as well, and wrapped the whole thing into a UIImage category. You need to include this open source offering to extract the audio from the library, and some code from here to handle some background threading features.
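
The answer does not show the category declaration itself. Assuming the method names used below, a plausible header for the category (the category name here is my own invention) would be:

    // UIImage+AudioPictogram.h -- hypothetical header for this category
    #import <UIKit/UIKit.h>
    #import <MediaPlayer/MediaPlayer.h>

    @interface UIImage (AudioPictogram)

    + (NSString *) assetCacheFolder;
    + (NSString *) cachedAudioPictogramPathForMPMediaItem:(MPMediaItem *)item;
    + (NSString *) cachedAudioFilepathForMPMediaItem:(MPMediaItem *)item;
    + (NSURL *) cachedAudioURLForMPMediaItem:(MPMediaItem *)item;

    - (id) initWithMPMediaItem:(MPMediaItem *)item
               completionBlock:(void (^)(UIImage *delayedImagePreparation))completionBlock;

    @end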

First, some #defines, and some generic class methods for handling path names, etc.

//#define imgExt @"jpg"
//#define imageToData(x) UIImageJPEGRepresentation(x,4)

#define imgExt @"png"
#define imageToData(x) UIImagePNGRepresentation(x)

+ (NSString *) assetCacheFolder {
    NSArray *assetFolderRoot = NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES);
    return [NSString stringWithFormat:@"%@/audio", [assetFolderRoot objectAtIndex:0]];
}

+ (NSString *) cachedAudioPictogramPathForMPMediaItem:(MPMediaItem *)item {
    NSString *assetFolder = [[self class] assetCacheFolder];
    NSNumber *libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];
    NSString *assetPictogramFilename = [NSString stringWithFormat:@"asset_%@.%@", libraryId, imgExt];
    return [NSString stringWithFormat:@"%@/%@", assetFolder, assetPictogramFilename];
}

+ (NSString *) cachedAudioFilepathForMPMediaItem:(MPMediaItem *)item {
    NSString *assetFolder = [[self class] assetCacheFolder];

    NSURL *assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
    NSNumber *libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];

    NSString *assetFileExt = [[[assetURL path] lastPathComponent] pathExtension];
    NSString *assetFilename = [NSString stringWithFormat:@"asset_%@.%@", libraryId, assetFileExt];
    return [NSString stringWithFormat:@"%@/%@", assetFolder, assetFilename];
}

+ (NSURL *) cachedAudioURLForMPMediaItem:(MPMediaItem *)item {
    NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];
    return [NSURL fileURLWithPath:assetFilepath];
}

Now, the init method that does "the business":

- (id) initWithMPMediaItem:(MPMediaItem *)item
           completionBlock:(void (^)(UIImage *delayedImagePreparation))completionBlock {

    NSFileManager *fman = [NSFileManager defaultManager];
    NSString *assetPictogramFilepath = [[self class] cachedAudioPictogramPathForMPMediaItem:item];

    if ([fman fileExistsAtPath:assetPictogramFilepath]) {
        // The pictogram image is already cached; load it synchronously.
        NSLog(@"Returning cached waveform pictogram: %@", [assetPictogramFilepath lastPathComponent]);
        self = [self initWithContentsOfFile:assetPictogramFilepath];
        return self;
    }

    NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];
    NSURL *assetFileURL = [NSURL fileURLWithPath:assetFilepath];

    if ([fman fileExistsAtPath:assetFilepath]) {
        // The audio is cached but the pictogram is not; render it on a background thread.
        NSLog(@"scanning cached audio data to create UIImage file: %@", [assetFilepath lastPathComponent]);

        [assetFileURL retain];
        [assetPictogramFilepath retain];

        [NSThread MCSM_performBlockInBackground: ^{

            AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
            NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];
            [asset release];

            [waveFormData writeToFile:assetPictogramFilepath atomically:YES];

            [assetFileURL release];
            [assetPictogramFilepath release];

            if (completionBlock) {

                [waveFormData retain];
                [NSThread MCSM_performBlockOnMainThread:^{

                    UIImage *result = [UIImage imageWithData:waveFormData];

                    NSLog(@"returning rendered pictogram on main thread (%d bytes %@ data in UIImage %0.0f x %0.0f pixels)", waveFormData.length, [imgExt uppercaseString], result.size.width, result.size.height);

                    completionBlock(result);

                    [waveFormData release];
                }];
            }
        }];

        return nil;

    } else {
        // Neither the audio nor the pictogram is cached; import the audio first, then render.
        NSString *assetFolder = [[self class] assetCacheFolder];

        [fman createDirectoryAtPath:assetFolder withIntermediateDirectories:YES attributes:nil error:nil];

        NSLog(@"Preparing to import audio asset data %@", [assetFilepath lastPathComponent]);

        [assetPictogramFilepath retain];
        [assetFileURL retain];

        TSLibraryImport *import = [[TSLibraryImport alloc] init];
        NSURL *assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];

        [import importAsset:assetURL toURL:assetFileURL completionBlock:^(TSLibraryImport *import) {
            // Check the status and error properties of TSLibraryImport.

            if (import.error) {

                NSLog(@"audio data import failed:%@", import.error);

            } else {
                NSLog(@"Creating waveform pictogram file: %@", [assetPictogramFilepath lastPathComponent]);
                AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
                NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];
                [asset release];

                [waveFormData writeToFile:assetPictogramFilepath atomically:YES];

                if (completionBlock) {
                    [waveFormData retain];
                    [NSThread MCSM_performBlockOnMainThread:^{

                        UIImage *result = [UIImage imageWithData:waveFormData];
                        NSLog(@"returning rendered pictogram on main thread (%d bytes %@ data in UIImage %0.0f x %0.0f pixels)", waveFormData.length, [imgExt uppercaseString], result.size.width, result.size.height);

                        completionBlock(result);

                        [waveFormData release];
                    }];
                }
            }

            [assetPictogramFilepath release];
            [assetFileURL release];
        }];

        return nil;
    }
}

An example of invoking this:

-(void) importMediaItem {

    MPMediaItem *item = [self mediaItem];

    // Since we will be needing this for playback, save the URL to the cached audio.
    [url release];
    url = [[UIImage cachedAudioURLForMPMediaItem:item] retain];

    [waveFormImage release];

    waveFormImage = [[UIImage alloc] initWithMPMediaItem:item completionBlock:^(UIImage *delayedImagePreparation){

        // Rendering finished asynchronously; keep the image and display it.
        waveFormImage = [delayedImagePreparation retain];
        [self displayWaveFormImage];
    }];

    if (waveFormImage) {
        [waveFormImage retain];
        [self displayWaveFormImage];
    }
}
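
displayWaveFormImage is not shown in the answer. Since the rendered images are as wide as the sample count, a minimal sketch of my own (assuming a scrollView outlet exists on this class) would place the image in the UIScrollView mentioned at the top:

    // Hypothetical implementation; the original answer only references this method.
    -(void) displayWaveFormImage {
        UIImageView *imageView = [[UIImageView alloc] initWithImage:waveFormImage];
        scrollView.contentSize = imageView.frame.size; // the image may be thousands of pixels wide
        [scrollView addSubview:imageView];
        [imageView release];
    }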

Logarithmic versions of the averaging and render methods

#define absX(x) (x<0?0-x:x)
#define minMaxX(x,mn,mx) (x<=mn?mn:(x>=mx?mx:x))
#define noiseFloor (-90.0)
#define decibel(amplitude) (20.0 * log10(absX(amplitude)/32767.0))
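
To make the dB mapping concrete: a full-scale 16-bit sample of 32767 gives decibel(32767) = 20 * log10(32767/32767) = 0 dB, a sample of about 328 gives roughly 20 * log10(0.01) = -40 dB, and averaged values at or below the -90 dB noiseFloor are clamped to it by minMaxX.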

-(UIImage *) audioImageLogGraph:(Float32 *) samples
                   normalizeMax:(Float32) normalizeMax
                    sampleCount:(NSInteger) sampleCount
                   channelCount:(NSInteger) channelCount
                    imageHeight:(float) imageHeight {

    CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
    UIGraphicsBeginImageContext(imageSize);
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
    CGContextSetAlpha(context, 1.0);
    CGRect rect;
    rect.size = imageSize;
    rect.origin.x = 0;
    rect.origin.y = 0;

    CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
    CGColorRef rightcolor = [[UIColor redColor] CGColor];

    CGContextFillRect(context, rect);

    CGContextSetLineWidth(context, 1.0);

    float halfGraphHeight = (imageHeight / 2) / (float) channelCount;
    float centerLeft = halfGraphHeight;
    float centerRight = (halfGraphHeight * 3);
    float sampleAdjustmentFactor = (imageHeight / (float) channelCount) / (normalizeMax - noiseFloor) / 2;

    for (NSInteger intSample = 0; intSample < sampleCount; intSample++) {
        Float32 left = *samples++;
        float pixels = (left - noiseFloor) * sampleAdjustmentFactor;
        CGContextMoveToPoint(context, intSample, centerLeft - pixels);
        CGContextAddLineToPoint(context, intSample, centerLeft + pixels);
        CGContextSetStrokeColorWithColor(context, leftcolor);
        CGContextStrokePath(context);

        if (channelCount == 2) {
            Float32 right = *samples++;
            float pixels = (right - noiseFloor) * sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerRight - pixels);
            CGContextAddLineToPoint(context, intSample, centerRight + pixels);
            CGContextSetStrokeColorWithColor(context, rightcolor);
            CGContextStrokePath(context);
        }
    }

    // Create new image
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // Tidy up
    UIGraphicsEndImageContext();

    return newImage;
}

- (NSData *) renderPNGAudioPictogramLogForAsset:(AVURLAsset *)songAsset {

    NSError *error = nil;
    AVAssetReader *reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
    AVAssetTrack *songTrack = [songAsset.tracks objectAtIndex:0];

    NSDictionary *outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
        //    [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
        //    [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
        [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
        nil];

    AVAssetReaderTrackOutput *output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
    [outputSettingsDict release];

    [reader addOutput:output];
    [output release];

    UInt32 sampleRate = 0, channelCount = 0;

    NSArray *formatDesc = songTrack.formatDescriptions;
    for (unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription *fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription(item);
        if (fmtDesc) {
            sampleRate = fmtDesc->mSampleRate;
            channelCount = fmtDesc->mChannelsPerFrame;
            //    NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket, fmtDesc->mSampleRate);
        }
    }

    UInt32 bytesPerSample = 2 * channelCount;
    Float32 normalizeMax = noiseFloor;
    NSLog(@"normalizeMax = %f", normalizeMax);
    NSMutableData *fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    UInt64 totalBytes = 0;
    Float64 totalLeft = 0;
    Float64 totalRight = 0;
    Float32 sampleTally = 0;

    // As in the linear version: 50 averaged output samples per second of audio.
    NSInteger samplesPerPixel = sampleRate / 50;

    while (reader.status == AVAssetReaderStatusReading) {

        AVAssetReaderTrackOutput *trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef) {
            CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

            size_t length = CMBlockBufferGetDataLength(blockBufferRef);
            totalBytes += length;

            NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

            NSMutableData *data = [NSMutableData dataWithLength:length];
            CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);

            SInt16 *samples = (SInt16 *) data.mutableBytes;
            int sampleCount = length / bytesPerSample;
            for (int i = 0; i < sampleCount; i++) {

                // Convert each sample to dB and clamp to the noise floor before averaging.
                Float32 left = (Float32) *samples++;
                left = decibel(left);
                left = minMaxX(left, noiseFloor, 0);
                totalLeft += left;

                Float32 right;
                if (channelCount == 2) {
                    right = (Float32) *samples++;
                    right = decibel(right);
                    right = minMaxX(right, noiseFloor, 0);
                    totalRight += right;
                }

                sampleTally++;

                if (sampleTally > samplesPerPixel) {

                    left = totalLeft / sampleTally;
                    if (left > normalizeMax) {
                        normalizeMax = left;
                    }

                    //    NSLog(@"left average = %f, normalizeMax = %f",left,normalizeMax);

                    [fullSongData appendBytes:&left length:sizeof(left)];

                    if (channelCount == 2) {
                        right = totalRight / sampleTally;

                        if (right > normalizeMax) {
                            normalizeMax = right;
                        }

                        [fullSongData appendBytes:&right length:sizeof(right)];
                    }

                    totalLeft = 0;
                    totalRight = 0;
                    sampleTally = 0;
                }
            }

            [wader drain];

            CMSampleBufferInvalidate(sampleBufferRef);
            CFRelease(sampleBufferRef);
        }
    }

    NSData *finalData = nil;

    if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown) {
        // Something went wrong. Handle it.
    }

    if (reader.status == AVAssetReaderStatusCompleted) {
        // You're done. It worked.

        NSLog(@"rendering output graphics using normalizeMax %f", normalizeMax);

        UIImage *test = [self audioImageLogGraph:(Float32 *) fullSongData.bytes
                                    normalizeMax:normalizeMax
                                     sampleCount:fullSongData.length / (sizeof(Float32) * 2)
                                    channelCount:2
                                     imageHeight:100];

        finalData = imageToData(test);
    }

    [fullSongData release];
    [reader release];

    return finalData;
}

Comparison outputs

[image: linear waveform]
Linear plot of the opening of "Warm It Up" by Acme Swing Company

[image: logarithmic waveform]
Logarithmic plot of the opening of "Warm It Up" by Acme Swing Company

Regarding "iphone - Drawing a waveform with AVAssetReader", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/5032775/
