java - Duplicates are not being removed

Tags: java file arraylist

import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class Test {

    List<String> knownWordsArrayList = new ArrayList<String>();
    List<String> wordsArrayList = new ArrayList<String>();

    public void readKnownWordsFile()
    {
        try {
            FileInputStream fstream2 = new FileInputStream("knownWords.txt");
            BufferedReader br2 = new BufferedReader(new InputStreamReader(fstream2));
            String strLine;
            while ((strLine = br2.readLine()) != null) {
                knownWordsArrayList.add(strLine);
            }
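        } catch (Exception e) {
            // Report the failure instead of silently swallowing it
            System.err.println("Error: " + e.getMessage());
        }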

    }

    public void readFile() {
        try {
            FileInputStream fstream = new FileInputStream("newWords.txt");
            BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
            String strLine;
            String numberedLineRemoved = "";
            String strippedInput = "";
            String[] words;
            String trimmedString = "";
            while ((strLine = br.readLine()) != null) {
                numberedLineRemoved = numberedLine(strLine);
                strippedInput = numberedLineRemoved.replaceAll("\\p{Punct}", "");
                if ((strippedInput.trim().length() != 0) || (!strippedInput.contains("")) || (strippedInput.contains(" "))) {
                    words = strippedInput.split("\\s+");
                    for (int i = 0; i < words.length; i++) {
                        if (words[i].trim().length() != 0) {
                            wordsArrayList.add(words[i]);
                        }
                    }
                }
            }

            for (int i = 0; i < knownWordsArrayList.size(); i++) {
                wordsArrayList.add(knownWordsArrayList.get(i));
            }
            Set<String> h = new HashSet<String>(wordsArrayList);
            wordsArrayList.clear();
            wordsArrayList.addAll(h);
            for (int i = 0; i < wordsArrayList.size(); i++) {
                System.out.println(wordsArrayList.get(i));
            }
            System.out.println(wordsArrayList.size());
            br.close();
        } catch (Exception e) {// Catch exception if any
            System.err.println("Error: " + e.getMessage());
        }
    }

    public String numberedLine(String string) {
        if (string.matches(".*\\d.*")) {
            return "";
        } else {
            return string;
        }
    }

    public static void main(String[] args) {
        Test test = new Test();
        test.readKnownWordsFile();
        test.readFile();

    }

}

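I read words from a file into knownWordsArrayList. Then I go to another file and put its words into wordsArrayList. Then I make a hashSet to remove the duplicated words, but they are still there. For example, "Mrs" is in the knownWordsArrayList but when I print wordsArrayList, I still see "Mrs". I don't understand why the duplicated words are not removed. Could it have something to do with the character set?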

Best answer

For example, "Mrs" is in the knownWordsArrayList but when I print wordsArrayList, I still see "Mrs".

Well yes, it would. You are explicitly adding all the values from knownWordsArrayList to wordsArrayList:

for (int i = 0; i < knownWordsArrayList.size(); i++) {
    wordsArrayList.add(knownWordsArrayList.get(i));
}

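It's not clear what your code is meant to do (using whole-collection operations and enhanced for loops would help clarity), but that is why everything in knownWordsArrayList is also in wordsArrayList.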
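As a rough illustration of what "whole-collection operations and enhanced for loops" could look like here, the sketch below replaces the index-based copy loop with a single `addAll` call and prints with a for-each loop. The class name `MergeDemo`, the method `merge`, and the sample words are hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MergeDemo {
    // Returns a new list holding all of words followed by all of knownWords.
    // Hypothetical helper; the original code mutates wordsArrayList in place.
    static List<String> merge(List<String> words, List<String> knownWords) {
        List<String> merged = new ArrayList<String>(words);
        merged.addAll(knownWords); // same effect as the index-based add loop
        return merged;
    }

    public static void main(String[] args) {
        List<String> merged = merge(Arrays.asList("Hello", "Smith"),
                                    Arrays.asList("Mrs", "Mr"));
        // Enhanced for loop instead of for (int i = 0; i < size; i++)
        for (String word : merged) {
            System.out.println(word);
        }
    }
}
```

Written this way, it is easier to see that every known word ends up in the merged list.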

Importantly, this statement:

Then I make a hashSet to remove the duplicated words

... just means that each word will only occur once. That is all it does.
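A small sketch of that behaviour, using made-up sample words: copying a list into a set collapses repeated entries to a single entry, but a word that occurs at all is still present afterwards. The class `DedupDemo` and method `dedupe` are hypothetical, and `LinkedHashSet` is used here instead of the question's `HashSet` only so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DedupDemo {
    // Collapses repeated words to one occurrence each, keeping first-seen order.
    static Set<String> dedupe(List<String> words) {
        return new LinkedHashSet<String>(words);
    }

    public static void main(String[] args) {
        // "Mrs" appears twice: once from the new words, once from the known words
        List<String> combined = Arrays.asList("Mrs", "Smith", "Mrs", "Jones");
        Set<String> unique = dedupe(combined);
        System.out.println(unique);                 // prints [Mrs, Smith, Jones]
        System.out.println(unique.contains("Mrs")); // prints true
    }
}
```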

I suspect you should remove the code I quoted above, and instead use:

Set<String> h = new HashSet<String>(wordsArrayList);
h.removeAll(knownWordsArrayList);
wordsArrayList = new ArrayList<String>(h);
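For reference, here is a self-contained sketch of that suggestion with in-memory sample data instead of the two files. It assumes the goal is "words from the new file that are not already known", which the question does not state outright. The names `NewWordsDemo` and `newWords` are hypothetical, and `LinkedHashSet` is swapped in for `HashSet` purely to keep the output order stable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class NewWordsDemo {
    // Returns each word from words once, excluding anything in knownWords.
    static List<String> newWords(List<String> words, List<String> knownWords) {
        Set<String> h = new LinkedHashSet<String>(words); // drop repeats
        h.removeAll(new HashSet<String>(knownWords));     // drop known words
        return new ArrayList<String>(h);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Mrs", "Smith", "Mrs", "Jones");
        List<String> known = Arrays.asList("Mrs", "Mr");
        System.out.println(newWords(words, known)); // prints [Smith, Jones]
    }
}
```

Note that the comparison is exact: "Mrs" and "mrs", or "Mrs" with trailing whitespace, count as different strings, so the known words may need the same cleanup as the new ones.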

Regarding "java - Duplicates are not being removed", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15976234/
