
objective-c - Split text into words, numbers, and punctuation

Reposted. Author: 搜寻专家. Updated: 2023-10-30 20:21:49

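I need to split a phrase into words, numbers, punctuation marks, and spaces/tabs. I also want to preserve the original order of the pieces.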

NSString *text = [NSString stringWithFormat:@"The 3 quick:\"brown fox, jump's\" over."];

This is the kind of list I need to produce:

['The', ' ', '3', ' ', 'quick', ':', '"', 'brown', ' ', 'fox', ',', ' ', "jump's", '"', ' ', 'over', '.']

Thanks!!

Best answer

Try this category, which I wrote using NSScanner and NSCharacterSet:

#import <Foundation/Foundation.h>

@interface NSString (Splitting)

- (NSArray *)arrayBySeparatingComponentsInCharacterSet:(NSCharacterSet *)charSet;

@end

@implementation NSString (Splitting)

// Scans exactly one character from charSet at the scanner's current location.
// Returns NO (and leaves the scanner untouched) if there is no such character.
static BOOL scanOneCharacterFromSetIntoString(NSScanner *scanner, NSCharacterSet *charSet, NSString **outStr)
{
    // check for index out of bounds
    NSString *inStr = scanner.string;

    if (scanner.scanLocation >= inStr.length)
    {
        return NO;
    }

    unichar ch = [inStr characterAtIndex:scanner.scanLocation];

    if (![charSet characterIsMember:ch])
    {
        return NO;
    }

    scanner.scanLocation++;
    if (outStr)
    {
        *outStr = [NSString stringWithCharacters:&ch length:1];
    }

    return YES;
}

- (NSArray *)arrayBySeparatingComponentsInCharacterSet:(NSCharacterSet *)charSet
{
    NSScanner *scanner = [NSScanner scannerWithString:self];
    // NSScanner skips whitespace and newlines by default; disable that so
    // consecutive spaces or tabs are returned as tokens instead of being dropped.
    scanner.charactersToBeSkipped = nil;
    NSMutableArray *result = [NSMutableArray array];

    NSString *temp = nil;
    // Each pass takes either a run of non-separator characters or one separator.
    while ([scanner scanUpToCharactersFromSet:charSet intoString:&temp] || scanOneCharacterFromSetIntoString(scanner, charSet, &temp)) {
        [result addObject:temp];

        if ([scanner scanLocation] >= [self length])
        {
            break;
        }

        unichar temp2 = [self characterAtIndex:[scanner scanLocation]];

        if ([charSet characterIsMember:temp2])
        {
            // %C formats a unichar; %c would truncate it to an 8-bit char
            [result addObject:[NSString stringWithFormat:@"%C", temp2]];
            // only update the scan location if the scan was successful
            [scanner setScanLocation:[scanner scanLocation] + 1];
        }
    }

    return result;
}

@end

int main (int argc, const char * argv[])
{
    @autoreleasepool {

        NSString *str = @"The 3 quick:\"brown fox, jump's\" over.";
        NSArray *array = [str arrayBySeparatingComponentsInCharacterSet:[NSCharacterSet characterSetWithCharactersInString:@" :\",'."]];
        NSLog(@"%@", array);
    }
    return 0;
}

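This should be what you need; just change the character set to suit your input. With the set used in main, the apostrophe counts as a separator, so "jump's" comes out as three tokens; remove ' from the set to keep it whole. Also note that this was compiled with ARC enabled, so it may or may not handle memory management correctly in a manual reference-counting environment.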
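
An alternative to listing every separator is to define what counts as a word character and treat everything else as a one-character token. The following is a minimal sketch of that idea using a plain index loop instead of NSScanner; it is not part of the original answer. The function name TokenizePreservingSeparators is hypothetical, and the choice to treat the apostrophe as a word character is an assumption made so that "jump's" stays whole. It also assumes the text has no characters outside the Basic Multilingual Plane, since it works one unichar at a time.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): splits `text` into runs
// of word characters and single-character separator tokens, preserving order.
static NSArray<NSString *> *TokenizePreservingSeparators(NSString *text)
{
    // Assumption: word characters are letters, digits, and the apostrophe.
    NSMutableCharacterSet *wordSet = [NSMutableCharacterSet alphanumericCharacterSet];
    [wordSet addCharactersInString:@"'"];

    NSMutableArray<NSString *> *tokens = [NSMutableArray array];
    NSUInteger length = text.length;
    NSUInteger i = 0;

    while (i < length) {
        if ([wordSet characterIsMember:[text characterAtIndex:i]]) {
            // Extend the run for as long as we keep seeing word characters.
            NSUInteger start = i;
            while (i < length && [wordSet characterIsMember:[text characterAtIndex:i]]) {
                i++;
            }
            [tokens addObject:[text substringWithRange:NSMakeRange(start, i - start)]];
        } else {
            // Any other character (space, tab, punctuation) is its own token.
            [tokens addObject:[text substringWithRange:NSMakeRange(i, 1)]];
            i++;
        }
    }
    return tokens;
}

int main(int argc, const char *argv[])
{
    @autoreleasepool {
        NSString *str = @"The 3 quick:\"brown fox, jump's\" over.";
        NSLog(@"%@", TokenizePreservingSeparators(str));
    }
    return 0;
}
```

For the sample phrase this yields 17 tokens in the order the question asks for, with "jump's" kept as one token. If the apostrophe should be split out instead, drop the addCharactersInString: call.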

Regarding "objective-c - Split text into words, numbers, and punctuation", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/8870032/
