java - The most efficient way to fetch a large amount of data from a database

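In my application I have to read a large amount of data. Once I have all of it, I put it in a list, process it, and work accordingly.

Now I am wondering whether there is anything I can do to speed up fetching the data from the database. My database sits on a different server, and I am using Java to interact with it.

I don't have a fixed data size, i.e. a specific number of rows that I need to process. I have also heard that I can use multithreading, but how do I go about it? I don't know how to partition the data, since its size is not known in advance. That is, if I were to apply the following pseudocode: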

for(i=0 to number of partition) // Not certain on the number of partitions
    create new thread and get data.

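Or maybe I can hash the data on the basis of some attribute and then tell each thread to fetch a particular index of the map, but how do I map the data before fetching it?

What possible solutions can I look into, and how should I proceed? Let me know if you need more information.

Thanks.

Best answer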

I hear I can go for multithreading, but then how do I go about it?

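This is definitely a good option for speeding up queries against a remote server.
In tasks like this, IO with the server is usually the main bottleneck, and with multithreading several rows can be "requested" concurrently, which effectively reduces the time spent waiting on IO.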

but then how do I go about it?

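The idea is to split the work into smaller tasks. See the Java high-level concurrency API for more details.
One solution is to have each thread read a block of size M from the server, and to repeat this for each thread for as long as the server still has data. Something like this (for each thread):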

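Object data = "start";          // any non-null value, so the loop is entered
int chunk = threadNumber;       // each thread starts at its own chunk index
while (data != null) {
  data = requestChunk(chunk);   // null once there is no more data
  chunk += numberOfThreads;     // skip the chunks owned by the other threads
}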

I am assuming here that once you go "out of bounds", the server returns null (or requestChunk() handles it and returns null).
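As a rough illustration of the per-thread loop, here is a minimal sketch built on an ExecutorService. Everything besides the loop is an assumption made for the example: the class name ChunkedFetchSketch, the constants CHUNK_SIZE and TOTAL_ROWS, and the in-memory requestChunk() that stands in for the real remote query. In a real application requestChunk() would issue a query for block number `chunk` and return null when the block is empty.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedFetchSketch {
    static final int CHUNK_SIZE = 10;   // M: rows per request (assumed value)
    static final int TOTAL_ROWS = 103;  // size of the fake table; the workers never read this

    // Hypothetical stand-in for the remote query: returns one block of rows,
    // or null once the requested block is out of bounds.
    static List<Integer> requestChunk(int chunk) {
        int from = chunk * CHUNK_SIZE;
        if (from >= TOTAL_ROWS) return null;
        int to = Math.min(from + CHUNK_SIZE, TOTAL_ROWS);
        List<Integer> rows = new ArrayList<>();
        for (int i = from; i < to; i++) rows.add(i);
        return rows;
    }

    // Worker t reads chunks t, t + n, t + 2n, ... until requestChunk() returns null.
    static List<Integer> fetchAll(int numberOfThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads);
        try {
            List<Future<List<Integer>>> futures = new ArrayList<>();
            for (int t = 0; t < numberOfThreads; t++) {
                final int threadNumber = t;
                futures.add(pool.submit(() -> {
                    List<Integer> local = new ArrayList<>();
                    int chunk = threadNumber;
                    List<Integer> data;
                    while ((data = requestChunk(chunk)) != null) {
                        local.addAll(data);
                        chunk += numberOfThreads;
                    }
                    return local;
                }));
            }
            // Merge the per-thread results; sorting only restores row order for the demo.
            List<Integer> all = new ArrayList<>();
            for (Future<List<Integer>> f : futures) all.addAll(f.get());
            Collections.sort(all);
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> rows = fetchAll(4);
        System.out.println("Fetched " + rows.size() + " rows");
    }
}
```

Each worker collects into its own list, so no locking is needed until the results are merged. The total number of rows never has to be known up front, which matches the situation in the question.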

Or maybe I can hash data on the basis of some attribute and later tell each thread to fetch a particular index of the map

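If you need to iterate over the data and retrieve all of it, hashing is usually a bad solution. It is very cache-inefficient, and the overhead is too high for this case.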

Regarding "java - The most efficient way to fetch a large amount of data from a database", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/12085730/