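I have a Chinese news feed, and I want to split its sentences into smaller blocks to pass to an API.
How can I do this on iOS? For English I have set a block length of 50 characters.
Currently I am using the rangeOfString: method to find periods, commas, and other sentence breaks.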
    NSString *str = nil, *rem = nil;
    str = [final substringToIndex:MAX_CHAR_Private];
    rem = [final substringFromIndex:MAX_CHAR_Private];
    NSRange rng = [rem rangeOfString:@"?"];
    if (rng.location == NSNotFound) {
        rng = [rem rangeOfString:@"!"];
        if (rng.location == NSNotFound) {
            rng = [rem rangeOfString:@","];
            if (rng.location == NSNotFound) {
                rng = [rem rangeOfString:@"."];
                if (rng.location == NSNotFound) {
                    rng = [rem rangeOfString:@" "];
                }
            }
        }
    }
    if (rng.location+1 + MAX_CHAR_Private > MAXIMUM_LIMIT_Private) {
        rng = [rem rangeOfString:@" "];
    }
    if (rng.location == NSNotFound) {
        remaining = [[final substringFromIndex:MAX_CHAR_Private] retain];
    }
    else {
        //NSRange rng = [rem rangeOfString:@" "];
        str = [str stringByAppendingString:[rem substringToIndex:rng.location]];
        remaining = [[final substringFromIndex:MAX_CHAR_Private + rng.location+1] retain];
    }
This does not work correctly for Chinese and Japanese characters.
Best answer
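Take a look at NSLinguisticTagger; it should work for Chinese.
From Apple: "The NSLinguisticTagger class is used to automatically segment natural-language text and tag it with information, such as parts of speech. It can also tag languages, scripts, stem forms of words, etc."
Apple documentation: NSLinguisticTagger Class Reference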
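
As a rough illustration of that suggestion, here is a minimal sketch that lets NSLinguisticTagger find token boundaries and then packs the tokens into blocks of limited length. The helper name ChunksFromText and its maxLength parameter are hypothetical and not part of any Apple API. The sketch assumes ARC (unlike the manual retain calls in the question) and counts length in UTF-16 units, as NSString does. A single token longer than maxLength ends up in an oversized block of its own.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not an Apple API): splits `text` into blocks of at most
// `maxLength` UTF-16 units, breaking only at token boundaries reported by
// NSLinguisticTagger. Assumes ARC.
static NSArray<NSString *> *ChunksFromText(NSString *text, NSUInteger maxLength) {
    NSMutableArray<NSString *> *chunks = [NSMutableArray array];
    NSMutableString *current = [NSMutableString string];

    NSLinguisticTagger *tagger =
        [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeTokenType]
                                               options:0];
    tagger.string = text;

    // With options:0 every token is enumerated, including punctuation and
    // whitespace, so concatenating the blocks reproduces the original text.
    [tagger enumerateTagsInRange:NSMakeRange(0, text.length)
                          scheme:NSLinguisticTagSchemeTokenType
                         options:0
                      usingBlock:^(NSString *tag, NSRange tokenRange,
                                   NSRange sentenceRange, BOOL *stop) {
        NSString *token = [text substringWithRange:tokenRange];
        // Start a new block if appending this token would exceed the limit.
        if (current.length > 0 && current.length + token.length > maxLength) {
            [chunks addObject:[current copy]];
            [current setString:@""];
        }
        [current appendString:token];
    }];

    // Flush whatever is left after the last token.
    if (current.length > 0) {
        [chunks addObject:[current copy]];
    }
    return chunks;
}
```

With the variables from the question, a call might look like `NSArray<NSString *> *blocks = ChunksFromText(final, MAX_CHAR_Private);`. Because the tagger decides the boundaries, the code does not depend on ASCII punctuation or spaces, which Chinese and Japanese text often lacks.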
Source: the Stack Overflow question "iPhone SDK: Break chinese sentence into words and letters": https://stackoverflow.com/questions/24530762/