java - Find all occurrences of the string "the" in a .txt file

Here is my code:

// Import io so we can use file objects
import java.io.*;

public class SearchThe {
    public static void main(String args[]) {
        try {
            String stringSearch = "the";
            // Open the file test.txt (relative to the working directory) as a buffered reader
            BufferedReader bf = new BufferedReader(new FileReader("test.txt"));

            // Start a line count and declare a string to hold our current line.
            int linecount = 0;
            String line;

            // Let the user know what we are searching for
            System.out.println("Searching for " + stringSearch + " in file...");

            // Loop through each line, stashing the line into our line variable.
            while (( line = bf.readLine()) != null){
                // Increment the count and find the index of the word
                linecount++;
                int indexfound = line.indexOf(stringSearch);

                // If greater than -1, means we found the word
                if (indexfound > -1) {
                    System.out.println("Word was found at position " + indexfound + " on line " + linecount);
                }
            }

            // Close the file after done searching
            bf.close();
        }
        catch (IOException e) {
            System.out.println("IO Error Occurred: " + e.toString());
        }
    }
}

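I want to find every occurrence of the word "the" in the file test.txt. The problem is that once my program finds the first "the", it stops looking for more.

Also, when it meets a word like "then", my program treats it as "the".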

Best answer

Use a case-insensitive regular expression with word boundaries to find all instances and variants of "the".

indexOf("the") cannot tell "the" apart from "then", because both start with "the". Likewise, "the" appears in the middle of "anathema".
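To make the limitation concrete, here is a minimal sketch (the class and method names are hypothetical, not from the question). It calls indexOf(String, int) repeatedly so that it no longer stops at the first hit, yet it still reports the substring inside "then" and "anathema":

```java
import java.util.ArrayList;
import java.util.List;

class SubstringScan {
    // Collects every index at which needle occurs in line by restarting
    // indexOf one character past the previous hit.
    static List<Integer> allIndexesOf(String line, String needle) {
        List<Integer> hits = new ArrayList<>();
        int from = 0;
        while (true) {
            int idx = line.indexOf(needle, from);
            if (idx < 0) {
                break; // no further occurrences
            }
            hits.add(idx);
            from = idx + 1;
        }
        return hits;
    }

    public static void main(String[] args) {
        // Only index 0 is the standalone word; 4 is inside "then"
        // and 12 is inside "anathema".
        System.out.println(allIndexesOf("the then anathema", "the")); // [0, 4, 12]
    }
}
```

So looping over indexOf addresses the "stops after the first match" symptom, but not the false positives.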

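To avoid this, use a regular expression and search for "the" with a word boundary (\b) on both sides. Use word boundaries rather than splitting on " " or just calling indexOf(" the ") (a space on each side), which would miss "the." and other instances next to punctuation. You can also make the search case-insensitive so that it finds "The" as well.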
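As a rough illustration of the difference, the sketch below runs both approaches on one sample sentence. The class name, helper method, and sample text are made up for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // \b on both sides restricts matches to the whole word "the";
    // CASE_INSENSITIVE also accepts "The" and "THE".
    private static final Pattern THE =
            Pattern.compile("\\bthe\\b", Pattern.CASE_INSENSITIVE);

    // Returns the start index of every whole-word match in the line.
    static List<Integer> wordPositions(String line) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = THE.matcher(line);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        String line = "The cat, then the dog; the.";
        // Space-padded search: one hit, at the leading space, and it
        // misses both "The" at the start and "the." at the end.
        System.out.println(line.indexOf(" the "));  // 13
        // Word-boundary regex: all three words, and "then" is skipped.
        System.out.println(wordPositions(line));    // [0, 14, 23]
    }
}
```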

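// Required at the top of the file:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Compile the pattern once, before the read loop
Pattern p = Pattern.compile("\\bthe\\b", Pattern.CASE_INSENSITIVE);

while ((line = bf.readLine()) != null) {
    linecount++;

    Matcher m = p.matcher(line);

    // Report every match on the line, not just the first
    while (m.find()) {
        System.out.println("Word was found at position " +
                           m.start() + " on line " + linecount);
    }
}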
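For completeness, here is one way the loop above might be folded into the original program. This is a sketch, not the answerer's code: the class name SearchTheAll and the search helper are assumptions made for the example, and try-with-resources replaces the explicit bf.close() so the reader is closed even if reading fails.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SearchTheAll {
    private static final Pattern THE =
            Pattern.compile("\\bthe\\b", Pattern.CASE_INSENSITIVE);

    // Reads the input line by line and returns one message per
    // whole-word match, in the same wording as the original program.
    static List<String> search(BufferedReader reader) throws IOException {
        List<String> results = new ArrayList<>();
        int linecount = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            linecount++;
            Matcher m = THE.matcher(line);
            while (m.find()) {
                results.add("Word was found at position " + m.start()
                        + " on line " + linecount);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // test.txt is the file name used in the question.
        try (BufferedReader bf = new BufferedReader(new FileReader("test.txt"))) {
            for (String result : search(bf)) {
                System.out.println(result);
            }
        } catch (IOException e) {
            System.out.println("IO Error Occurred: " + e);
        }
    }
}
```

Taking a BufferedReader instead of a file name keeps the matching logic separate from file handling, so it can be exercised with a StringReader as well.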

For more on java - Find all occurrences of the string "the" in a .txt file, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/3697833/
