java - 如何有效(性能)从Java中的列表中删除许多项?

标签 java performance list collections

我有一个相当大的List命名项(> = 1,000,000个项),并且用表示的某些条件选择要删除的项,并且对于列表中的许多(也许一半)项都是正确的。

我的目标是有效地删除由选择的项目并保留所有其他项目,可以修改源列表,可以创建新列表-应该根据性能选择最佳方法。

这是我的测试代码:

    System.out.println("preparing items");
    List<Integer> items = new ArrayList<Integer>(); // Integer is for demo
    for (int i = 0; i < 1000000; i++) {
        items.add(i * 3); // just for demo
    }

    System.out.println("deleting items");
    long startMillis = System.currentTimeMillis();
    items = removeMany(items);
    long endMillis = System.currentTimeMillis();

    System.out.println("after remove: items.size=" + items.size() + 
            " and it took " + (endMillis - startMillis) + " milli(s)");

和天真的实现:
public static <T> List<T> removeMany(List<T> items) {
    int i = 0;
    Iterator<T> iter = items.iterator();
    while (iter.hasNext()) {
        T item = iter.next();
        // <cond> goes here
        if (/*<cond>: */i % 2 == 0) {
            iter.remove();
        }
        i++;
    }
    return items;
}

如您所见,我使用项索引模2 == 0作为删除条件()-仅用于演示目的。

可以提供什么更好的removeMany版本,为什么这个更好的版本实际上更好呢?

最佳答案

好的,现在该是提出的方法的测试结果了。这里是我测试过的方法(每种方法的名称在我的资源中也是类名):

  • NaiveRemoveManyPerformer-带有迭代器和remove的ArrayList-我的问题中给出的第一个天真的实现。
  • BetterNaiveRemoveManyPerformer-带有向后迭代并从头到尾删除的ArrayList
  • LinkedRemoveManyPerformer-天真的迭代器,可以删除但可以使用LinkedList。缺点:仅适用于LinkedList
  • CreateNewRemoveManyPerformer-ArrayList作为副本制作(仅添加保留的元素),迭代器用于遍历输入的ArrayList
  • SmartCreateNewRemoveManyPerformer-更好的CreateNewRemoveManyPerformer-结果ArrayList的初始大小(容量)设置为最终列表大小。缺点:启动时必须知道列表的最终大小。
  • FasterSmartCreateNewRemoveManyPerformer-更好(?)SmartCreateNewRemoveManyPerformer-使用项目索引(items.get(idx))代替迭代器。
  • MagicRemoveManyPerformer-就ArrayList而言就地工作(无列表副本),并从列表末尾开始压缩孔(已删除项目)。缺点:此方法更改列表中项目的顺序。
  • ForwardInPlaceRemoveManyPerformer-适用于ArrayList-移动保留项以填充孔,最后返回subList(没有最终移除或清除)。
  • GuavaArrayListRemoveManyPerformer-用于Iterables.removeIf的Google Guava ArrayList-与ForwardInPlaceRemoveManyPerformer几乎相同,但最终删除列表末尾的项目。

  • 完整的源代码在此答案的末尾给出。

    使用不同的列表大小(从10,000到10,000,000个项目)和不同的删除因子(指定必须从列表中删除多少个项目)执行测试。

    正如我在评论中发布的其他答案一样,我认为将项目从ArrayList复制到第二个ArrayList的过程比迭代LinkedList并删除项目要快。 Sun的Java文档指出,与ArrayList实现相比,LinkedList的常数因子较低,但是令人惊讶的是,我的问题并非如此。

    实际上,在大多数情况下,具有简单迭代和删除功能的LinkedList具有最佳性能(此方法在LinkedRemoveManyPerformer中实现)。通常,只有MagicRemoveManyPerformer的性能可与LinkedRemoveManyPerformer相提并论,其他方法则要慢得多。 Google Guava GuavaArrayListRemoveManyPerformer比手工制作的类似代码慢(因为我的代码不会删除列表末尾的不必要项目)。

    从1,000,000个源项目中删除500,000个项目的示例结果:
  • NaiveRemoveManyPerformer:未执行测试-我不是那个病人,但是它的表现比BetterNaiveRemoveManyPerformer差。
  • BetterNaiveRemoveManyPerformer:226080毫升
  • LinkedRemoveManyPerformer:69毫升
  • CreateNewRemoveManyPerformer:246毫升
  • SmartCreateNewRemoveManyPerformer:112毫升
  • FasterSmartCreateNewRemoveManyPerformer:202毫升
  • MagicRemoveManyPerformer:74毫升
  • ForwardInPlaceRemoveManyPerformer:69毫升
  • GuavaArrayListRemoveManyPerformer:118毫升

  • 从1,000,000个源项目中删除1个项目的示例结果(已删除第一个项目):
  • BetterNaiveRemoveManyPerformer:34毫升
  • LinkedRemoveManyPerformer:41毫升
  • CreateNewRemoveManyPerformer:253毫升
  • SmartCreateNewRemoveManyPerformer:108毫升
  • FasterSmartCreateNewRemoveManyPerformer:71毫升
  • MagicRemoveManyPerformer:43毫升
  • ForwardInPlaceRemoveManyPerformer:73毫升
  • GuavaArrayListRemoveManyPerformer:78毫升

  • 从1,000,000个源项目中删除333,334个项目的示例结果:
  • BetterNaiveRemoveManyPerformer:253206毫升
  • LinkedRemoveManyPerformer:69毫升
  • CreateNewRemoveManyPerformer:245毫升
  • SmartCreateNewRemoveManyPerformer:111毫米
  • FasterSmartCreateNewRemoveManyPerformer:203毫升
  • MagicRemoveManyPerformer:69毫升
  • ForwardInPlaceRemoveManyPerformer:72毫升
  • GuavaArrayListRemoveManyPerformer:102毫升

  • 从1,000,000个源项目中删除1,000,000个(所有)项目的示例结果(所有项目都被删除,但经过一个接一个的处理,如果您先验地知道要删除所有项目,则应该简单地清除列表):
  • BetterNaiveRemoveManyPerformer:58毫升
  • LinkedRemoveManyPerformer:88毫升
  • CreateNewRemoveManyPerformer:95毫升
  • SmartCreateNewRemoveManyPerformer:91毫升
  • FasterSmartCreateNewRemoveManyPerformer:48毫升
  • MagicRemoveManyPerformer:61毫升
  • ForwardInPlaceRemoveManyPerformer:49毫升
  • GuavaArrayListRemoveManyPerformer:133毫米

  • 我的最终结论是:使用混合方法-如果处理LinkedList-最好进行简单的迭代和删除,如果处理ArrayList-取决于项目顺序是否重要-然后使用ForwardInPlaceRemoveManyPerformer-如果可以更改项目顺序-最佳选择是MagicRemoveManyPerformer。如果先验地知道移除因子(您知道将要移除多少个项目还是保留了多少个项目),那么在特定情况下可以采用更多条件来选择效果更好的方法。但是已知的去除因子不是通常的情况... Google Guava Iterables.removeIf是这样一种混合解决方案,但是假设稍有不同(必须更改原始列表,不能创建新列表,并且始终按项目顺序排列)-这些是最常见的假设,因此removeIf在大多数现实生活中都是最佳选择。

    还要注意,所有好的方法(天真的不好!)都足够好-在实际应用中,其中任何一个都做得很好,但是必须避免使用天真的方法。

    最后-我的测试源代码。
    package WildWezyrListRemovalTesting;
    
    import com.google.common.base.Predicate;
    import com.google.common.collect.Iterables;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.LinkedList;
    import java.util.List;
    
    public class RemoveManyFromList {
    
        public static abstract class BaseRemoveManyPerformer {
    
            protected String performerName() {
                return getClass().getSimpleName();
            }
    
            protected void info(String msg) {
                System.out.println(performerName() + ": " + msg);
            }
    
            protected void populateList(List<Integer> items, int itemCnt) {
                for (int i = 0; i < itemCnt; i++) {
                    items.add(i);
                }
            }
    
            protected boolean mustRemoveItem(Integer itemVal, int itemIdx, int removeFactor) {
                if (removeFactor == 0) {
                    return false;
                }
                return itemIdx % removeFactor == 0;
            }
    
            protected abstract List<Integer> removeItems(List<Integer> items, int removeFactor);
    
            protected abstract List<Integer> createInitialList();
    
            public void testMe(int itemCnt, int removeFactor) {
                List<Integer> items = createInitialList();
                populateList(items, itemCnt);
                long startMillis = System.currentTimeMillis();
                items = removeItems(items, removeFactor);
                long endMillis = System.currentTimeMillis();
                int chksum = 0;
                for (Integer item : items) {
                    chksum += item;
                }
                info("removing took " + (endMillis - startMillis)
                        + " milli(s), itemCnt=" + itemCnt
                        + ", removed items: " + (itemCnt - items.size())
                        + ", remaining items: " + items.size()
                        + ", checksum: " + chksum);
            }
        }
        private List<BaseRemoveManyPerformer> rmps =
                new ArrayList<BaseRemoveManyPerformer>();
    
        public void addPerformer(BaseRemoveManyPerformer rmp) {
            rmps.add(rmp);
        }
        private Runtime runtime = Runtime.getRuntime();
    
        private void runGc() {
            for (int i = 0; i < 5; i++) {
                runtime.gc();
            }
        }
    
        public void testAll(int itemCnt, int removeFactor) {
            runGc();
            for (BaseRemoveManyPerformer rmp : rmps) {
                rmp.testMe(itemCnt, removeFactor);
            }
            runGc();
            System.out.println("\n--------------------------\n");
        }
    
        public static class NaiveRemoveManyPerformer
                extends BaseRemoveManyPerformer {
    
            @Override
            public List<Integer> removeItems(List<Integer> items, int removeFactor) {
                if (items.size() > 300000 && items instanceof ArrayList) {
                    info("this removeItems is too slow, returning without processing");
                    return items;
                }
                int i = 0;
                Iterator<Integer> iter = items.iterator();
                while (iter.hasNext()) {
                    Integer item = iter.next();
                    if (mustRemoveItem(item, i, removeFactor)) {
                        iter.remove();
                    }
                    i++;
                }
                return items;
            }
    
            @Override
            public List<Integer> createInitialList() {
                return new ArrayList<Integer>();
            }
        }
    
        public static class BetterNaiveRemoveManyPerformer
                extends NaiveRemoveManyPerformer {
    
            @Override
            public List<Integer> removeItems(List<Integer> items, int removeFactor) {
    //            if (items.size() > 300000 && items instanceof ArrayList) {
    //                info("this removeItems is too slow, returning without processing");
    //                return items;
    //            }
    
                for (int i = items.size(); --i >= 0;) {
                    Integer item = items.get(i);
                    if (mustRemoveItem(item, i, removeFactor)) {
                        items.remove(i);
                    }
                }
                return items;
            }
        }
    
        public static class LinkedRemoveManyPerformer
                extends NaiveRemoveManyPerformer {
    
            @Override
            public List<Integer> createInitialList() {
                return new LinkedList<Integer>();
            }
        }
    
        public static class CreateNewRemoveManyPerformer
                extends NaiveRemoveManyPerformer {
    
            @Override
            public List<Integer> removeItems(List<Integer> items, int removeFactor) {
                List<Integer> res = createResultList(items, removeFactor);
                int i = 0;
    
                for (Integer item : items) {
                    if (mustRemoveItem(item, i, removeFactor)) {
                        // no-op
                    } else {
                        res.add(item);
                    }
                    i++;
                }
    
                return res;
            }
    
            protected List<Integer> createResultList(List<Integer> items, int removeFactor) {
                return new ArrayList<Integer>();
            }
        }
    
        public static class SmartCreateNewRemoveManyPerformer
                extends CreateNewRemoveManyPerformer {
    
            @Override
            protected List<Integer> createResultList(List<Integer> items, int removeFactor) {
                int newCapacity = removeFactor == 0 ? items.size()
                        : (int) (items.size() * (removeFactor - 1L) / removeFactor + 1);
                //System.out.println("newCapacity=" + newCapacity);
                return new ArrayList<Integer>(newCapacity);
            }
        }
    
        public static class FasterSmartCreateNewRemoveManyPerformer
                extends SmartCreateNewRemoveManyPerformer {
    
            @Override
            public List<Integer> removeItems(List<Integer> items, int removeFactor) {
                List<Integer> res = createResultList(items, removeFactor);
    
                for (int i = 0; i < items.size(); i++) {
                    Integer item = items.get(i);
                    if (mustRemoveItem(item, i, removeFactor)) {
                        // no-op
                    } else {
                        res.add(item);
                    }
                }
    
                return res;
            }
        }
    
        public static class ForwardInPlaceRemoveManyPerformer
                extends NaiveRemoveManyPerformer {
    
            @Override
            public List<Integer> removeItems(List<Integer> items, int removeFactor) {
                int j = 0; // destination idx
                for (int i = 0; i < items.size(); i++) {
                    Integer item = items.get(i);
                    if (mustRemoveItem(item, i, removeFactor)) {
                        // no-op
                    } else {
                        if (j < i) {
                            items.set(j, item);
                        }
                        j++;
                    }
                }
    
                return items.subList(0, j);
            }
        }
    
        public static class MagicRemoveManyPerformer
                extends NaiveRemoveManyPerformer {
    
            @Override
            public List<Integer> removeItems(List<Integer> items, int removeFactor) {
                for (int i = 0; i < items.size(); i++) {
                    if (mustRemoveItem(items.get(i), i, removeFactor)) {
                        Integer retainedItem = removeSomeFromEnd(items, removeFactor, i);
                        if (retainedItem == null) {
                            items.remove(i);
                            break;
                        }
                        items.set(i, retainedItem);
                    }
                }
    
                return items;
            }
    
            private Integer removeSomeFromEnd(List<Integer> items, int removeFactor, int lowerBound) {
                for (int i = items.size(); --i > lowerBound;) {
                    Integer item = items.get(i);
                    items.remove(i);
                    if (!mustRemoveItem(item, i, removeFactor)) {
                        return item;
                    }
                }
                return null;
            }
        }
    
        public static class GuavaArrayListRemoveManyPerformer
                extends BaseRemoveManyPerformer {
    
            @Override
            protected List<Integer> removeItems(List<Integer> items, final int removeFactor) {
                Iterables.removeIf(items, new Predicate<Integer>() {
    
                    public boolean apply(Integer input) {
                        return mustRemoveItem(input, input, removeFactor);
                    }
                });
    
                return items;
            }
    
            @Override
            protected List<Integer> createInitialList() {
                return new ArrayList<Integer>();
            }
        }
    
        public void testForOneItemCnt(int itemCnt) {
            testAll(itemCnt, 0);
            testAll(itemCnt, itemCnt);
            testAll(itemCnt, itemCnt - 1);
            testAll(itemCnt, 3);
            testAll(itemCnt, 2);
            testAll(itemCnt, 1);
        }
    
        public static void main(String[] args) {
            RemoveManyFromList t = new RemoveManyFromList();
            t.addPerformer(new NaiveRemoveManyPerformer());
            t.addPerformer(new BetterNaiveRemoveManyPerformer());
            t.addPerformer(new LinkedRemoveManyPerformer());
            t.addPerformer(new CreateNewRemoveManyPerformer());
            t.addPerformer(new SmartCreateNewRemoveManyPerformer());
            t.addPerformer(new FasterSmartCreateNewRemoveManyPerformer());
            t.addPerformer(new MagicRemoveManyPerformer());
            t.addPerformer(new ForwardInPlaceRemoveManyPerformer());
            t.addPerformer(new GuavaArrayListRemoveManyPerformer());
    
            t.testForOneItemCnt(1000);
            t.testForOneItemCnt(10000);
            t.testForOneItemCnt(100000);
            t.testForOneItemCnt(200000);
            t.testForOneItemCnt(300000);
            t.testForOneItemCnt(500000);
            t.testForOneItemCnt(1000000);
            t.testForOneItemCnt(10000000);
        }
    }
    

    关于java - 如何有效(性能)从Java中的列表中删除许多项?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/2043783/

    相关文章:

    java - Web 服务中错误代码的最佳实践

    java - getColumnHeaders()、getColumnNames() 和 getColumnTitles() 之间的区别

    php - 如何使sql查询按日期对结果进行分组和求和?

    python - 当我尝试从 python 列表中删除元素(pydot 对象)时出现错误

    python - 你如何比较 2 个列表并返回差异? (python 中的差异函数不返回我需要的东西)

    JAVA-从txt文件中读取整数并计算整数

    java - 动态调度任务的异步执行

    wpf - WPF (Silverlight) 布局 (Render)Transform 对应用性能的影响

    c# - 我应该重复使用点和矩形还是创建新的?

    python - 不使用 zip 创建嵌套列表