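I want to split a text into an array while keeping the punctuation marks as separate elements alongside the words, so that a string like:
Hello, I am Albert Einstein.
becomes an array like this:
["Hello", ",", "I", "am", "Albert", "Einstein", "."]
I have tried string.components(separatedBy: CharacterSet.init(charactersIn: ",;;:"))
but this method removes all the punctuation and returns an array like this:
["Hello", "I", "am", "Albert", "Einstein"]
So, how can I get an array like the one in my first example?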
Best answer
It's not a pretty solution, but you can try this (Swift 3):
import Foundation

let str = "Hello, I am Albert Einstein."
var list = [String]()
var currentSubString = ""

// Enumerate every character, including ".", ",", ";" and " ".
str.enumerateSubstrings(in: str.startIndex..<str.endIndex, options: .byComposedCharacterSequences) { substring, _, _, _ in
    // Each substring is a one-character String.
    guard let character = substring else { return }
    let isTerminal = [" ", ",", ".", ";"].contains(character)
    if !currentSubString.isEmpty && isTerminal {
        // A terminal character ends the word under construction.
        list.append(currentSubString)
        // Punctuation starts the next element; a space is trimmed away to "".
        currentSubString = character.trimmingCharacters(in: .whitespaces)
    } else if character != " " {
        // Otherwise append any non-space character to the current word.
        currentSubString += character
    }
}

// Append the last word (or punctuation mark), if any.
if !currentSubString.isEmpty {
    list.append(currentSubString)
}
// list is now ["Hello", ",", "I", "am", "Albert", "Einstein", "."]
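The idea is to loop over all the characters and build the words as we go. A word is a run of consecutive characters that are not a space, ",", "." or ";". So, while looping, whenever we see one of those characters and the word under construction is not empty, we finish the current word.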
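The same idea can also be sketched as a small reusable function. This is a variation, not the answer's own code: the name `tokenize`, the `terminals` parameter and the nested `flushWord` helper are all hypothetical, and unlike the loop above it emits every punctuation mark as its own element right away, so a letter that directly follows a punctuation mark (as in "a,b") should not get glued to it.

```swift
// Hypothetical helper (not from the original answer): splits text into words
// and single punctuation marks, dropping spaces.
func tokenize(_ text: String, terminals: Set<Character> = [",", ".", ";"]) -> [String] {
    var tokens = [String]()
    var currentWord = ""

    // Append the word under construction, if any, and start a new one.
    func flushWord() {
        if !currentWord.isEmpty {
            tokens.append(currentWord)
            currentWord = ""
        }
    }

    for character in text {
        if character == " " {
            // A space only ends the current word.
            flushWord()
        } else if terminals.contains(character) {
            // Punctuation ends the current word and becomes its own element.
            flushWord()
            tokens.append(String(character))
        } else {
            currentWord.append(character)
        }
    }
    flushWord() // last word, if the text does not end with a terminal
    return tokens
}

print(tokenize("Hello, I am Albert Einstein."))
// prints ["Hello", ",", "I", "am", "Albert", "Einstein", "."]
```

Passing the punctuation set as a parameter is just one possible design; it makes it easy to add marks such as "!" or "?" without touching the loop.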
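Breaking down the steps with your input:
- get "H" (not a space or other terminal character) -> currentSubString = "H"
- get "e" (not a space or other terminal character) -> currentSubString = "He"
- get "l" (not a space or other terminal character) -> currentSubString = "Hel"
- get "l" (not a space or other terminal character) -> currentSubString = "Hell"
- get "o" (not a space or other terminal character) -> currentSubString = "Hello"
- get "," (a terminal character)
  - -> currentSubString is not empty, so add it to list and start constructing the next word: list = ["Hello"]
  - -> currentSubString = "," (the trimming is only there to drop a space when that is the character I get; any other terminal character has to be kept as the start of the next element)
- get " " (a space character)
  - -> currentSubString is not empty, so add it to list and start over: list = ["Hello", ","]
  - -> currentSubString = "" (trimmed)
...and so on.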
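To watch these steps happen, here is a minimal sketch that replays the same rules on a short prefix of the input and prints the state after every character. It is an illustration only: it iterates over the string's characters directly instead of calling enumerateSubstrings, and the names `input`, `terminals` and `piece` are made up for this example.

```swift
import Foundation

let input = "Hello, I"
var list = [String]()
var currentSubString = ""
let terminals = [" ", ",", ".", ";"]

for character in input {
    let piece = String(character)
    if !currentSubString.isEmpty && terminals.contains(piece) {
        // A terminal character finishes the current word.
        list.append(currentSubString)
        // Punctuation is kept for the next element; a space is trimmed to "".
        currentSubString = piece.trimmingCharacters(in: .whitespaces)
    } else if piece != " " {
        currentSubString += piece
    }
    // Show the state after handling this character.
    print("get \"\(piece)\" -> currentSubString = \"\(currentSubString)\", list = \(list)")
}
```

After the loop, list holds "Hello" and "," while currentSubString still holds "I", which is why the original code appends whatever is left in currentSubString once the enumeration ends.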
Regarding "ios - Split a text into an array while maintaining the punctuation in Swift", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39834953/