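I am able to break up paragraphs of text into substrings based on a given character limit. The problem is that my algorithm does exactly that, and so it breaks up words. This is where I am stuck. If the character limit falls in the middle of a word, how can I backtrack to a space so that all my substrings contain whole words?
This is the algorithm I am using: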
int arrayLength = 0;
arrayLength = (int) Math.ceil(((mText.length() / (double) charLimit)));
String[] result = new String[arrayLength];
int j = 0;
int lastIndex = result.length - 1;
for (int i = 0; i < lastIndex; i++) {
result[i] = mText.substring(j, j + charLimit);
j += charLimit;
}
result[lastIndex] = mText.substring(j);
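I am setting the charLimit variable to an arbitrary integer value. mText is a String containing a paragraph of text. Any suggestions on how to improve this? Thanks in advance.
I have received some good responses. Just so you know what I did to determine whether I had landed on a space, I used this while loop. I just don't know how to correct the string from there.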
while (!strTemp.substring(strTemp.length() - 1).equalsIgnoreCase(" ")) {
// somehow refine string before added to array
}
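Best answer
I'm not sure I understood exactly what you want, but here is an answer to my interpretation:
You can use lastIndexOf to find the last space before the character limit, then check whether that space is close enough to the limit (to handle text without spaces), i.e.: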
int arrayLength = 0;
arrayLength = (int) Math.ceil(((mText.length() / (double) charLimit)));
String[] result = new String[arrayLength];
int j = 0;
int tolerance = 10;
int splitpoint;
int lastIndex = result.length - 1;
for (int i = 0; i < lastIndex; i++) {
splitpoint = mText.lastIndexOf(' ' ,j+charLimit);
splitpoint = splitpoint > j+charLimit-tolerance ? splitpoint:j+charLimit;
result[i] = mText.substring(j, splitpoint).trim();
j = splitpoint;
}
result[lastIndex] = mText.substring(j).trim();
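This searches for the last space at or before j + charLimit and splits the string there if the space is fewer than tolerance characters before that limit (10 is just an example value); otherwise it splits exactly at the limit.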
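The only problem with this solution is that the last string token may be longer than charLimit, because every chunk that is cut early pushes the leftover characters to the end. You may therefore need to adjust arrayLength and loop with while (mText.length() - j > charLimit) instead.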
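As a rough sketch of that while-loop variant: the version below collects chunks in a List instead of a pre-sized array, so no chunk (including the last) exceeds charLimit. The class name WordChunker, the method name split, and the argument check are my own choices for illustration, not part of the code above. It also skips the space it splits on, which is a small deviation from the original loop.

```java
import java.util.ArrayList;
import java.util.List;

class WordChunker {

    // Splits text into chunks of at most charLimit characters. A chunk ends at
    // the last space at or before the limit if that space is fewer than
    // `tolerance` characters before the limit; otherwise it is cut at the limit.
    static List<String> split(String text, int charLimit, int tolerance) {
        if (charLimit <= 0 || tolerance < 0 || tolerance > charLimit) {
            // Assumption: tolerance <= charLimit guarantees the loop advances.
            throw new IllegalArgumentException("need 0 <= tolerance <= charLimit, charLimit > 0");
        }
        List<String> chunks = new ArrayList<>();
        int j = 0;
        // Keep cutting while more than one chunk's worth of text remains.
        while (text.length() - j > charLimit) {
            int hardLimit = j + charLimit;
            int splitpoint = text.lastIndexOf(' ', hardLimit);
            if (splitpoint <= hardLimit - tolerance) {
                splitpoint = hardLimit; // no space close enough: cut mid-word
            }
            String chunk = text.substring(j, splitpoint).trim();
            if (!chunk.isEmpty()) {
                chunks.add(chunk);
            }
            j = splitpoint;
            // Skip the space(s) we split on so they do not count toward the next chunk.
            while (j < text.length() && text.charAt(j) == ' ') {
                j++;
            }
        }
        String tail = text.substring(j).trim();
        if (!tail.isEmpty()) {
            chunks.add(tail);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String mText = "If the character limit occurs in the middle of a word, "
                + "how can I back track to a space?";
        for (String chunk : split(mText, 40, 10)) {
            System.out.println(chunk);
        }
    }
}
```

Using a List avoids having to guess arrayLength up front; call toArray(new String[0]) on the result if a String[] is needed.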
Edit
Runnable example code:
public static void main(String[] args) {
String mText = "I am able to break up paragraphs of text into substrings based upon nth given character limit. The conflict I have is that my algorithm is doing exactly this, and is breaking up words. This is where I am stuck. If the character limit occurs in the middle of a word, how can I back track to a space so that all my substrings have entire words?";
int charLimit = 40;
int arrayLength = 0;
arrayLength = (int) Math.ceil(((mText.length() / (double) charLimit)));
String[] result = new String[arrayLength];
int j = 0;
int tolerance = 10;
int splitpoint;
int lastIndex = result.length - 1;
for (int i = 0; i < lastIndex; i++) {
splitpoint = mText.lastIndexOf(' ' ,j+charLimit);
splitpoint = splitpoint > j+charLimit-tolerance ? splitpoint:j+charLimit;
result[i] = mText.substring(j, splitpoint);
j = splitpoint;
}
result[lastIndex] = mText.substring(j);
for (int i = 0; i<arrayLength; i++) {
System.out.println(result[i]);
}
}
Output:
I am able to break up paragraphs of text
into substrings based upon nth given
character limit. The conflict I have is
that my algorithm is doing exactly
this, and is breaking up words. This is
where I am stuck. If the character
limit occurs in the middle of a word,
how can I back track to a space so that
all my substrings have entire words?
Additional edit: added trim() as suggested by curiosu. It removes the whitespace around the string tokens.
Source: "java - Breaking up a paragraph into string tokens" on Stack Overflow: https://stackoverflow.com/questions/25411319/