我试图找到三个不同大小的哈希集的交集。通过改变集合相交的顺序,找到交集的速度是否有任何差异?示例程序如下:
public class RetainTest {
static Set<Integer> large =new HashSet<>();
static Set<Integer> medium =new HashSet<>();
static Set<Integer> small =new HashSet<>();
static int largeSize=10000;
static int midSize=5000;
static int smallSize=1000;
public static void main(String[] args){
preamble()
large.retainAll(medium);
large.retainAll(small);
System.out.println(large.size());
}
public static void preamble(){
large =new HashSet<>();
medium =new HashSet<>();
small =new HashSet<>();
Random rnd=new Random(15);
for(int i=0;i<largeSize;i++){
large.add(rnd.nextInt(largeSize*10));
}
for(int i=0;i<midSize;i++){
medium.add(rnd.nextInt(largeSize*10));
}
for(int i=0;i<smallSize;i++){
small.add(rnd.nextInt(largeSize*10));
}
}
}
最佳答案
分析表明,组合多个集合的最快方法是将较大的集合retainAll
合并到较小的集合中。此外,这些保留的顺序也应该从小到大。所以
small.retainAll(medium);
small.retainAll(large);
分析表明差异很显着:对于该数据集,最慢的订单花费的时间大约是最慢订单的 10 倍
测试程序
这些结果是使用以下测试程序创建的,该程序运行了 20 分钟
public class RetainTest {
static Set<Integer> large =new HashSet<>();
static Set<Integer> medium =new HashSet<>();
static Set<Integer> small =new HashSet<>();
static int largeSize=10000;
static int midSize=5000;
static int smallSize=1000;
public static void main(String[] args){
while(true){
preamble();
int size1=largeMediumSmall().size();
preamble();
int size2=largeSmallMedium().size();
preamble();
int size3=smallMediumLarge().size();
preamble();
int size4=smallLargeMedium().size();
preamble();
int size5=mediumSmallLarge().size();
preamble();
int size6=mediumLargeSmall().size();
//sanity check + ensuring the JIT can't optimise out
if (size1!=size2 || size1!=size3 || size1!=size4 || size1!=size5 || size1!=size6){
System.out.println("bad");
}
}
}
public static Set<Integer> largeMediumSmall(){
large.retainAll(medium);
large.retainAll(small);
return large;
}
public static Set<Integer> smallMediumLarge(){
small.retainAll(medium);
small.retainAll(large);
return small;
}
public static Set<Integer> smallLargeMedium(){
small.retainAll(large);
small.retainAll(medium);
return small;
}
public static Set<Integer> mediumSmallLarge(){
medium.retainAll(small);
medium.retainAll(large);
return medium;
}
public static Set<Integer> mediumLargeSmall(){
medium.retainAll(large);
medium.retainAll(small);
return medium;
}
public static Set<Integer> largeSmallMedium(){
large.retainAll(small);
large.retainAll(medium);
return large;
}
public static void preamble(){
large =new HashSet<>();
medium =new HashSet<>();
small =new HashSet<>();
Random rnd=new Random(15);
for(int i=0;i<largeSize;i++){
large.add(rnd.nextInt(largeSize*10));
}
for(int i=0;i<midSize;i++){
medium.add(rnd.nextInt(largeSize*10));
}
for(int i=0;i<smallSize;i++){
small.add(rnd.nextInt(largeSize*10));
}
}
}
关于java - 当找到多个集合的交集时,使用retainAll()的最快顺序,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/21431281/