我有一个相当大的List命名项(> = 1,000,000个项),并且用
我的目标是有效地删除由
这是我的测试代码: System.out.println("preparing items");
List<Integer> items = new ArrayList<Integer>(); // Integer is for demo
for (int i = 0; i < 1000000; i++) {
items.add(i * 3); // just for demo
}
System.out.println("deleting items");
long startMillis = System.currentTimeMillis();
items = removeMany(items);
long endMillis = System.currentTimeMillis();
System.out.println("after remove: items.size=" + items.size() +
" and it took " + (endMillis - startMillis) + " milli(s)");
和天真的实现:public static <T> List<T> removeMany(List<T> items) {
int i = 0;
Iterator<T> iter = items.iterator();
while (iter.hasNext()) {
T item = iter.next();
// <cond> goes here
if (/*<cond>: */i % 2 == 0) {
iter.remove();
}
i++;
}
return items;
}
如您所见,我使用项索引模2 == 0作为删除条件(
可以提供什么更好的removeMany
版本,为什么这个更好的版本实际上更好呢?
最佳答案
好的,现在该是提出的方法的测试结果了。这里是我测试过的方法(每种方法的名称在我的资源中也是类名):
NaiveRemoveManyPerformer
-带有迭代器和remove的ArrayList
-我的问题中给出的第一个天真的实现。 BetterNaiveRemoveManyPerformer
-带有向后迭代并从头到尾删除的ArrayList
。 LinkedRemoveManyPerformer
-天真的迭代器,可以删除但可以使用LinkedList
。缺点:仅适用于LinkedList
。 CreateNewRemoveManyPerformer
-ArrayList
作为副本制作(仅添加保留的元素),迭代器用于遍历输入的ArrayList
。 SmartCreateNewRemoveManyPerformer
-更好的CreateNewRemoveManyPerformer
-结果ArrayList
的初始大小(容量)设置为最终列表大小。缺点:启动时必须知道列表的最终大小。 FasterSmartCreateNewRemoveManyPerformer
-更好(?)SmartCreateNewRemoveManyPerformer
-使用项目索引(items.get(idx)
)代替迭代器。 MagicRemoveManyPerformer
-就ArrayList
而言就地工作(无列表副本),并从列表末尾开始压缩孔(已删除项目)。缺点:此方法更改列表中项目的顺序。 ForwardInPlaceRemoveManyPerformer
-适用于ArrayList
-移动保留项以填充孔,最后返回subList(没有最终移除或清除)。 GuavaArrayListRemoveManyPerformer
-用于Iterables.removeIf
的Google Guava ArrayList
-与ForwardInPlaceRemoveManyPerformer
几乎相同,但最终删除列表末尾的项目。 完整的源代码在此答案的末尾给出。
使用不同的列表大小(从10,000到10,000,000个项目)和不同的删除因子(指定必须从列表中删除多少个项目)执行测试。
正如我在评论中发布的其他答案一样,我认为将项目从
ArrayList
复制到第二个ArrayList
的过程比迭代LinkedList
并删除项目要快。 Sun的Java文档指出,与ArrayList
实现相比,LinkedList
的常数因子较低,但是令人惊讶的是,我的问题并非如此。实际上,在大多数情况下,具有简单迭代和删除功能的
LinkedList
具有最佳性能(此方法在LinkedRemoveManyPerformer
中实现)。通常,只有MagicRemoveManyPerformer
的性能可与LinkedRemoveManyPerformer
相提并论,其他方法则要慢得多。 Google Guava GuavaArrayListRemoveManyPerformer
比手工制作的类似代码慢(因为我的代码不会删除列表末尾的不必要项目)。从1,000,000个源项目中删除500,000个项目的示例结果:
NaiveRemoveManyPerformer
:未执行测试-我不是那个病人,但是它的表现比BetterNaiveRemoveManyPerformer
差。 BetterNaiveRemoveManyPerformer
:226080毫升LinkedRemoveManyPerformer
:69毫升CreateNewRemoveManyPerformer
:246毫升SmartCreateNewRemoveManyPerformer
:112毫升FasterSmartCreateNewRemoveManyPerformer
:202毫升MagicRemoveManyPerformer
:74毫升ForwardInPlaceRemoveManyPerformer
:69毫升GuavaArrayListRemoveManyPerformer
:118毫升从1,000,000个源项目中删除1个项目的示例结果(已删除第一个项目):
从1,000,000个源项目中删除333,334个项目的示例结果:
从1,000,000个源项目中删除1,000,000个(所有)项目的示例结果(所有项目都被删除,但经过一个接一个的处理,如果您先验地知道要删除所有项目,则应该简单地清除列表):
我的最终结论是:使用混合方法-如果处理LinkedList-最好进行简单的迭代和删除,如果处理ArrayList-取决于项目顺序是否重要-然后使用ForwardInPlaceRemoveManyPerformer-如果可以更改项目顺序-最佳选择是MagicRemoveManyPerformer。如果先验地知道移除因子(您知道将要移除多少个项目还是保留了多少个项目),那么在特定情况下可以采用更多条件来选择效果更好的方法。但是已知的去除因子不是通常的情况... Google Guava
Iterables.removeIf
是这样一种混合解决方案,但是假设稍有不同(必须更改原始列表,不能创建新列表,并且始终按项目顺序排列)-这些是最常见的假设,因此removeIf
在大多数现实生活中都是最佳选择。还要注意,所有好的方法(天真的不好!)都足够好-在实际应用中,其中任何一个都做得很好,但是必须避免使用天真的方法。
最后-我的测试源代码。
package WildWezyrListRemovalTesting;
import com.google.common.base.Predicate;
import com.google.common.collect.Iterables;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
public class RemoveManyFromList {
public static abstract class BaseRemoveManyPerformer {
protected String performerName() {
return getClass().getSimpleName();
}
protected void info(String msg) {
System.out.println(performerName() + ": " + msg);
}
protected void populateList(List<Integer> items, int itemCnt) {
for (int i = 0; i < itemCnt; i++) {
items.add(i);
}
}
protected boolean mustRemoveItem(Integer itemVal, int itemIdx, int removeFactor) {
if (removeFactor == 0) {
return false;
}
return itemIdx % removeFactor == 0;
}
protected abstract List<Integer> removeItems(List<Integer> items, int removeFactor);
protected abstract List<Integer> createInitialList();
public void testMe(int itemCnt, int removeFactor) {
List<Integer> items = createInitialList();
populateList(items, itemCnt);
long startMillis = System.currentTimeMillis();
items = removeItems(items, removeFactor);
long endMillis = System.currentTimeMillis();
int chksum = 0;
for (Integer item : items) {
chksum += item;
}
info("removing took " + (endMillis - startMillis)
+ " milli(s), itemCnt=" + itemCnt
+ ", removed items: " + (itemCnt - items.size())
+ ", remaining items: " + items.size()
+ ", checksum: " + chksum);
}
}
private List<BaseRemoveManyPerformer> rmps =
new ArrayList<BaseRemoveManyPerformer>();
public void addPerformer(BaseRemoveManyPerformer rmp) {
rmps.add(rmp);
}
private Runtime runtime = Runtime.getRuntime();
private void runGc() {
for (int i = 0; i < 5; i++) {
runtime.gc();
}
}
public void testAll(int itemCnt, int removeFactor) {
runGc();
for (BaseRemoveManyPerformer rmp : rmps) {
rmp.testMe(itemCnt, removeFactor);
}
runGc();
System.out.println("\n--------------------------\n");
}
public static class NaiveRemoveManyPerformer
extends BaseRemoveManyPerformer {
@Override
public List<Integer> removeItems(List<Integer> items, int removeFactor) {
if (items.size() > 300000 && items instanceof ArrayList) {
info("this removeItems is too slow, returning without processing");
return items;
}
int i = 0;
Iterator<Integer> iter = items.iterator();
while (iter.hasNext()) {
Integer item = iter.next();
if (mustRemoveItem(item, i, removeFactor)) {
iter.remove();
}
i++;
}
return items;
}
@Override
public List<Integer> createInitialList() {
return new ArrayList<Integer>();
}
}
public static class BetterNaiveRemoveManyPerformer
extends NaiveRemoveManyPerformer {
@Override
public List<Integer> removeItems(List<Integer> items, int removeFactor) {
// if (items.size() > 300000 && items instanceof ArrayList) {
// info("this removeItems is too slow, returning without processing");
// return items;
// }
for (int i = items.size(); --i >= 0;) {
Integer item = items.get(i);
if (mustRemoveItem(item, i, removeFactor)) {
items.remove(i);
}
}
return items;
}
}
public static class LinkedRemoveManyPerformer
extends NaiveRemoveManyPerformer {
@Override
public List<Integer> createInitialList() {
return new LinkedList<Integer>();
}
}
public static class CreateNewRemoveManyPerformer
extends NaiveRemoveManyPerformer {
@Override
public List<Integer> removeItems(List<Integer> items, int removeFactor) {
List<Integer> res = createResultList(items, removeFactor);
int i = 0;
for (Integer item : items) {
if (mustRemoveItem(item, i, removeFactor)) {
// no-op
} else {
res.add(item);
}
i++;
}
return res;
}
protected List<Integer> createResultList(List<Integer> items, int removeFactor) {
return new ArrayList<Integer>();
}
}
public static class SmartCreateNewRemoveManyPerformer
extends CreateNewRemoveManyPerformer {
@Override
protected List<Integer> createResultList(List<Integer> items, int removeFactor) {
int newCapacity = removeFactor == 0 ? items.size()
: (int) (items.size() * (removeFactor - 1L) / removeFactor + 1);
//System.out.println("newCapacity=" + newCapacity);
return new ArrayList<Integer>(newCapacity);
}
}
public static class FasterSmartCreateNewRemoveManyPerformer
extends SmartCreateNewRemoveManyPerformer {
@Override
public List<Integer> removeItems(List<Integer> items, int removeFactor) {
List<Integer> res = createResultList(items, removeFactor);
for (int i = 0; i < items.size(); i++) {
Integer item = items.get(i);
if (mustRemoveItem(item, i, removeFactor)) {
// no-op
} else {
res.add(item);
}
}
return res;
}
}
public static class ForwardInPlaceRemoveManyPerformer
extends NaiveRemoveManyPerformer {
@Override
public List<Integer> removeItems(List<Integer> items, int removeFactor) {
int j = 0; // destination idx
for (int i = 0; i < items.size(); i++) {
Integer item = items.get(i);
if (mustRemoveItem(item, i, removeFactor)) {
// no-op
} else {
if (j < i) {
items.set(j, item);
}
j++;
}
}
return items.subList(0, j);
}
}
public static class MagicRemoveManyPerformer
extends NaiveRemoveManyPerformer {
@Override
public List<Integer> removeItems(List<Integer> items, int removeFactor) {
for (int i = 0; i < items.size(); i++) {
if (mustRemoveItem(items.get(i), i, removeFactor)) {
Integer retainedItem = removeSomeFromEnd(items, removeFactor, i);
if (retainedItem == null) {
items.remove(i);
break;
}
items.set(i, retainedItem);
}
}
return items;
}
private Integer removeSomeFromEnd(List<Integer> items, int removeFactor, int lowerBound) {
for (int i = items.size(); --i > lowerBound;) {
Integer item = items.get(i);
items.remove(i);
if (!mustRemoveItem(item, i, removeFactor)) {
return item;
}
}
return null;
}
}
public static class GuavaArrayListRemoveManyPerformer
extends BaseRemoveManyPerformer {
@Override
protected List<Integer> removeItems(List<Integer> items, final int removeFactor) {
Iterables.removeIf(items, new Predicate<Integer>() {
public boolean apply(Integer input) {
return mustRemoveItem(input, input, removeFactor);
}
});
return items;
}
@Override
protected List<Integer> createInitialList() {
return new ArrayList<Integer>();
}
}
public void testForOneItemCnt(int itemCnt) {
testAll(itemCnt, 0);
testAll(itemCnt, itemCnt);
testAll(itemCnt, itemCnt - 1);
testAll(itemCnt, 3);
testAll(itemCnt, 2);
testAll(itemCnt, 1);
}
public static void main(String[] args) {
RemoveManyFromList t = new RemoveManyFromList();
t.addPerformer(new NaiveRemoveManyPerformer());
t.addPerformer(new BetterNaiveRemoveManyPerformer());
t.addPerformer(new LinkedRemoveManyPerformer());
t.addPerformer(new CreateNewRemoveManyPerformer());
t.addPerformer(new SmartCreateNewRemoveManyPerformer());
t.addPerformer(new FasterSmartCreateNewRemoveManyPerformer());
t.addPerformer(new MagicRemoveManyPerformer());
t.addPerformer(new ForwardInPlaceRemoveManyPerformer());
t.addPerformer(new GuavaArrayListRemoveManyPerformer());
t.testForOneItemCnt(1000);
t.testForOneItemCnt(10000);
t.testForOneItemCnt(100000);
t.testForOneItemCnt(200000);
t.testForOneItemCnt(300000);
t.testForOneItemCnt(500000);
t.testForOneItemCnt(1000000);
t.testForOneItemCnt(10000000);
}
}
关于java - 如何有效(性能)从Java中的列表中删除许多项?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/2043783/