java - HashMap MySQL - Best practices

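I ran into a scenario where the task was to read a file containing 3 million IP addresses.
There is a MySQL table with the columns Id and PrimaryIP. PrimaryIP can hold multiple IPs separated by #, and, more importantly, it can also contain CIDR blocks.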
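
To make the column format concrete, here is a minimal sketch of splitting one PrimaryIP value into its entries and telling single addresses from CIDR blocks. The class name PrimaryIpField, its methods, and the sample value are all hypothetical; real rows may look different.

```java
import java.util.ArrayList;
import java.util.List;

final class PrimaryIpField {

    /** Splits a '#'-separated PrimaryIP value into its trimmed, non-empty entries. */
    static List<String> entries(String primaryIp) {
        List<String> result = new ArrayList<>();
        for (String part : primaryIp.split("#")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    /** An entry is treated as a CIDR block if it carries a "/prefix" suffix. */
    static boolean isCidr(String entry) {
        return entry.indexOf('/') >= 0;
    }

    public static void main(String[] args) {
        // Hypothetical sample value, made up for illustration only.
        String sample = "10.0.0.1#192.168.1.0/24#172.16.0.5";
        for (String entry : entries(sample)) {
            System.out.println(entry + (isCidr(entry) ? " (CIDR block)" : " (single address)"));
        }
    }
}
```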

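There are 8,000 records in total, and each record has multiple IPs and CIDR blocks.

My task is to read the file, check each IP against the database, and write the matching IP,ID pairs to a file.

When I first ran my program, it failed with java.lang.OutOfMemoryError: Java heap space. I increased the heap by 3 GB, but it still failed, so I then split the file into 6 subfiles of 0.5 million IPs each.

To expand each CIDR block into its list of IPs, I used Apache SubnetUtils.

Here is my code: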

public static void main(String[] args) {

        String sqlQuery = "SELECT id,PrimaryIP from IPTable where PrimaryIP != '' limit 100000;";
        Connection connection = null;
        Statement statement = null;
        File oFile = new File("output.txt");
        System.out.println(new Date());
        try{
            List<String> fileData = FileUtils.readLines(new File("input.txt"));
            System.out.println("File Data Size : "+fileData.size());

            Class.forName("com.mysql.jdbc.Driver");
            connection = DriverManager.getConnection("jdbc:mysql://localhost/db?user=root&password=pwd");

            statement = connection.createStatement();
            ResultSet resultSet = statement.executeQuery(sqlQuery);

            System.out.println("Started with MySQL Querying");

            Map<String, Integer> primaryIPIDMap = new HashMap<String, Integer>();

            while (resultSet.next()) {
                primaryIPIDMap.clear();
                int recordID = resultSet.getInt(1);

                if (resultSet.getString(2).contains("#")) {
                    String primaryIP[] = resultSet.getString(2).split("#");

                    for (int i = 0; i < primaryIP.length; i++) {
                        if (primaryIP[i].contains("/")) {
                            String allIP[] = getAllIP(primaryIP[i]);
                            for (int allIPi = 0; allIPi < allIP.length; allIPi++) {
                                primaryIPIDMap.put(allIP[allIPi].intern(), recordID);
                            }
                        } else {
                            primaryIPIDMap.put(primaryIP[i].intern(), recordID);
                        }
                    }
                } else {
                    primaryIPIDMap.put(resultSet.getString(2).intern(), recordID);
                }

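                // Typed for-each loop instead of a raw Iterator.
                for (String t : fileData) {
                    if (primaryIPIDMap.containsKey(t)) {
                        // Append one "id,ip" line per match; the two-argument overload overwrites the file on every call.
                        FileUtils.writeStringToFile(oFile, recordID + "," + t + System.lineSeparator(), "UTF-8", true);
                    }
                }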
                primaryIPIDMap.clear();
            }

            resultSet.close();
            statement.close();
            connection.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (statement != null)
                    statement.close();
            } catch (Exception se2) {
            }
            try {
                if (connection != null)
                    connection.close();
            } catch (Exception se) {
                se.printStackTrace();
            }
        }

        System.out.println("Finished");
        System.out.println("End Time : "+new Date());
    }

    private static String[] getAllIP(String ip) {
        return new SubnetUtils(ip).getInfo().getAllAddresses();
    }
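
A side note on getAllIP above: listing every address of a block and putting each one into a HashMap grows quickly, since a single /16 covers 65,536 addresses. The following is a minimal sketch, assuming IPv4 only, of treating a CIDR block as a numeric range instead. CidrRange and its methods are hypothetical names and not part of SubnetUtils. The range here includes the network and broadcast addresses, which, as far as I know, differs from the SubnetUtils default.

```java
final class CidrRange {
    final long low;   // first address of the block, as a value in [0, 2^32)
    final long high;  // last address of the block

    CidrRange(long low, long high) {
        this.low = low;
        this.high = high;
    }

    /** Converts dotted-quad IPv4 text to a number in [0, 2^32). */
    static long toLong(String ip) {
        String[] octets = ip.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String octet : octets) {
            int part = Integer.parseInt(octet);
            if (part < 0 || part > 255) {
                throw new IllegalArgumentException("Octet out of range: " + ip);
            }
            value = (value << 8) | part;
        }
        return value;
    }

    /** Parses "a.b.c.d/n"; a bare address is treated as a /32 block. */
    static CidrRange parse(String entry) {
        int slash = entry.indexOf('/');
        String address = slash < 0 ? entry : entry.substring(0, slash);
        int prefix = slash < 0 ? 32 : Integer.parseInt(entry.substring(slash + 1).trim());
        if (prefix < 0 || prefix > 32) {
            throw new IllegalArgumentException("Bad prefix length: " + entry);
        }
        long size = 1L << (32 - prefix);           // number of addresses in the block
        long low = toLong(address) & ~(size - 1);  // clear the host bits
        return new CidrRange(low, low + size - 1);
    }

    /** Range check: two comparisons, no per-address storage. */
    boolean contains(String ip) {
        long value = toLong(ip);
        return value >= low && value <= high;
    }

    long size() {
        return high - low + 1;
    }

    public static void main(String[] args) {
        CidrRange block = parse("10.0.0.0/16");
        System.out.println(block.size());                  // 65536
        System.out.println(block.contains("10.0.200.7"));  // true
    }
}
```

With this representation each table entry costs two long values regardless of the prefix length.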

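Can someone tell me the best practice for solving this problem?
Today it is only 3 million; tomorrow it may be 5 million. I cannot keep creating subfiles.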

Best answer

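I solved the problem as follows:

I read the input file line by line.
I did not change the MySQL table structure, because it has dependencies in many places and the table was not designed by me.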
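
The answer gives no code, so the following is only a sketch of what reading line by line could look like, not the author's actual implementation. It assumes IPv4 input with one address per line, loads the table rows into numeric ranges once, and then streams the input so that only the current line is held in memory. LineByLineMatcher, Range, addRow, and match are hypothetical names, and the rows in main are made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

final class LineByLineMatcher {

    /** One table entry: a record id plus the inclusive numeric range it covers. */
    static final class Range {
        final int id;
        final long low;
        final long high;

        Range(int id, long low, long high) {
            this.id = id;
            this.low = low;
            this.high = high;
        }
    }

    /** Converts dotted-quad IPv4 text to a number in [0, 2^32). */
    static long toLong(String ip) {
        String[] octets = ip.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String octet : octets) {
            value = (value << 8) | Integer.parseInt(octet);
        }
        return value;
    }

    /** Turns one row (id plus '#'-separated PrimaryIP) into ranges without listing addresses. */
    static void addRow(List<Range> ranges, int id, String primaryIp) {
        for (String part : primaryIp.split("#")) {
            String entry = part.trim();
            if (entry.isEmpty()) {
                continue;
            }
            int slash = entry.indexOf('/');
            int prefix = slash < 0 ? 32 : Integer.parseInt(entry.substring(slash + 1));
            long size = 1L << (32 - prefix);
            long low = toLong(slash < 0 ? entry : entry.substring(0, slash)) & ~(size - 1);
            ranges.add(new Range(id, low, low + size - 1));
        }
    }

    /** Streams the input one line at a time and writes "id,ip" for every matching range. */
    static void match(Reader input, List<Range> ranges, Writer output) throws IOException {
        BufferedReader reader = new BufferedReader(input);
        String line;
        while ((line = reader.readLine()) != null) {
            String ip = line.trim();
            if (ip.isEmpty()) {
                continue;
            }
            long value = toLong(ip);
            for (Range range : ranges) {  // linear scan; kept simple for the sketch
                if (value >= range.low && value <= range.high) {
                    output.write(range.id + "," + ip + "\n");
                }
            }
        }
        output.flush();
    }

    public static void main(String[] args) throws IOException {
        List<Range> ranges = new ArrayList<>();
        addRow(ranges, 1, "10.0.0.1#192.168.1.0/24");  // hypothetical rows
        addRow(ranges, 2, "172.16.0.0/16");
        StringWriter out = new StringWriter();
        match(new StringReader("192.168.1.77\n8.8.8.8\n172.16.5.9\n"), ranges, out);
        System.out.print(out);  // prints "1,192.168.1.77" then "2,172.16.5.9"
    }
}
```

For real files, the StringReader and StringWriter could be swapped for Files.newBufferedReader and Files.newBufferedWriter on the input and output paths. Memory use then depends on the number of table entries rather than on the size of the input file, so growing from 3 million to 5 million lines should not require splitting. The linear scan over all ranges is the simplest choice; sorting the ranges would be a possible further step if lookups turn out to be too slow.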

Regarding java - HashMap MySQL - Best practices, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27408806/
