java - Many short-lived tasks run slower when multithreaded, even with a thread pool

Tags: java multithreading performance threadpool

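Background

I currently have a linear physics engine (no knowledge of physics engines is needed for this question), and I am trying to multithread parts of it to improve efficiency.

One such part is the broad phase, which in this case involves sweeping all objects along all 3 axes to check which ones overlap (anything that overlaps on all axes is considered to collide in the broad phase). The 3 axis sweeps are completely independent apart from using the same objects, so they seemed like a good place for multithreading.

To avoid any possibility of blocking between threads, each of the 3 processes takes a local copy of all the data it will use before going multithreaded (where applicable).

Although these sweeps are a significant bottleneck, they are very short-lived; one sweep typically lasts 1-4 ms. This is a real-time application in which the code runs 60 times per second, so the total tick time is at most 17 ms, which makes 1-4 ms a long time for me. Because the sweeps are short-lived, I used a thread pool, specifically Executors.newFixedThreadPool(3), with 3 threads for the 3 axes.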

My test computer is a dual core with hyperthreading, so up to 4 threads should be comfortable. I checked this using Runtime.getRuntime().availableProcessors();
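As a minimal sketch of that check, the snippet below reads the logical processor count and caps the pool size at the number of axes. The class name PoolSizing and the helper sweepPoolSize are hypothetical names introduced here for illustration; they are not part of the original code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {

    // Hypothetical helper: use one thread per axis, but never more threads
    // than there are logical processors, and never fewer than one.
    static int sweepPoolSize(int axes, int logicalProcessors) {
        return Math.max(1, Math.min(axes, logicalProcessors));
    }

    public static void main(String[] args) {
        // availableProcessors() reports logical processors, so a dual core
        // with hyperthreading is expected to report 4.
        int logical = Runtime.getRuntime().availableProcessors();
        int size = sweepPoolSize(3, logical);

        ExecutorService pool = Executors.newFixedThreadPool(size);
        System.out.println("logical processors=" + logical + ", pool size=" + size);
        pool.shutdown(); // let the JVM exit once the pool is idle
    }
}
```

Note that the logical processor count only bounds how many threads can run at once; it says nothing about whether handing a 1-4 ms task to another thread pays off.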

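Problem

When I run the test code below, which runs a large number of short-lived tasks either single-threaded or multithreaded using a thread pool, the multithreaded version is much slower; see the profile data. This is the case even though the multithreaded parts share no common objects. Why is this, and is there any way to run many short-lived (1-4 ms) tasks concurrently?

Even making the tasks much larger only brings the multithreaded version close to single-threaded performance rather than exceeding it as I expected, which makes me think I am doing something seriously wrong.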
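One way to cross-check a comparison like this outside a profiler is to time both variants with System.nanoTime() after a warm-up phase, keeping console output outside the timed region. The following is only a sketch under assumptions: SweepTiming, task, runSequential and runPooled are hypothetical names, and the loop inside task is a stand-in workload, not the real axis sweep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SweepTiming {

    // Stand-in for one axis sweep (hypothetical CPU-bound workload).
    static Callable<Long> task(int n) {
        return () -> {
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += (i ^ (i >>> 3));
            }
            return sum;
        };
    }

    // Runs three tasks one after another on the calling thread.
    static long runSequential(int n) throws Exception {
        long total = 0;
        for (int k = 0; k < 3; k++) {
            total += task(n).call();
        }
        return total;
    }

    // Submits the same three tasks to the pool and waits for all results.
    static long runPooled(ExecutorService pool, int n) throws Exception {
        List<Future<Long>> futures = new ArrayList<>();
        for (int k = 0; k < 3; k++) {
            futures.add(pool.submit(task(n)));
        }
        long total = 0;
        for (Future<Long> f : futures) {
            total += f.get();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        int n = 200_000, warmup = 200, measured = 200;
        long sink = 0; // accumulated so the work cannot be optimised away

        // Warm-up: give the JIT a chance to compile both code paths first.
        for (int i = 0; i < warmup; i++) {
            sink += runSequential(n);
            sink += runPooled(pool, n);
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < measured; i++) sink += runSequential(n);
        long t1 = System.nanoTime();
        for (int i = 0; i < measured; i++) sink += runPooled(pool, n);
        long t2 = System.nanoTime();

        System.out.printf("sequential: %.3f ms/iter, pooled: %.3f ms/iter (sink=%d)%n",
                (t1 - t0) / 1e6 / measured, (t2 - t1) / 1e6 / measured, sink);
        pool.shutdown();
    }
}
```

The numbers this prints depend entirely on the machine and JVM, so treat them as a sanity check alongside the profiler rather than as a replacement for it.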

[Image: profile data]

Test code

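import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;

public class BroadPhaseAxisSweep implements Callable<Set<PotentialCollisionPrecursor>>  {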

    static final int XAXIS=0;
    static final int YAXIS=1;
    static final int ZAXIS=2;

    int axis; 
    int[] axisIndicies;
    boolean[] isStatic;
    boolean[] isLightWeight; 
    boolean[] isCollidable; 

    //ordered the same as axisIndicies
    double[] starts;
    double[] ends;

    private static ExecutorService sweepPool = Executors.newFixedThreadPool(3);

    public BroadPhaseAxisSweep(int axis, List<TestObject> allObjects) {
        //all data that will be used by the thread is cached internally to avoid 
        //any concurrent access issues

        this.axis = axis;

        //allObjects is in reality unsorted, axisIndicies holds sorted indices
        //in this case allObjects just "happens" to be already sorted
        this.axisIndicies =new int[allObjects.size()];
        for(int i=0;i<allObjects.size();i++){
            axisIndicies[i]=i;
        }
        isStatic=new boolean[allObjects.size()];
        for(int i=0;i<allObjects.size();i++){
            isStatic[i]=allObjects.get(i).isStatic();
        }
        isLightWeight=new boolean[allObjects.size()];
        for(int i=0;i<allObjects.size();i++){
            isLightWeight[i]=allObjects.get(i).isLightWeightPhysicsObject();
        }
        isCollidable=new boolean[allObjects.size()];
        for(int i=0;i<allObjects.size();i++){
            isCollidable[i]=allObjects.get(i).isCollidable();
        }

        starts=new double[allObjects.size()];
        for(int i=0;i<allObjects.size();i++){
            starts[i]=allObjects.get(i).getStartPoint();
        }
        ends=new double[allObjects.size()];
        for(int i=0;i<allObjects.size();i++){
            ends[i]=allObjects.get(i).getEndPoint();
        }
    }


    @Override
    public Set<PotentialCollisionPrecursor> call() throws Exception {
        return axisSweep_simple(axisIndicies);
    }

    private Set<PotentialCollisionPrecursor> axisSweep_simple(int[] axisIndicies){

        Set<PotentialCollisionPrecursor> thisSweep = new HashSet<>();


        for(int i=0;i<starts.length;i++){
            if (isCollidable[axisIndicies[i]]){
                double activeObjectEnd=ends[i];
                //sweep forwards until an object's start is no longer before our end
                for(int j=i+1;j<starts.length;j++){
                    //j<starts.length is the bare minimum constraint; most js won't get that far
                    if ((isStatic[axisIndicies[i]]&& isStatic[axisIndicies[j]]) || ((isLightWeight[axisIndicies[i]]&& isLightWeight[axisIndicies[j]]))){
                        //if both objects are static or both are lightweight then by definition they cannot collide, so we can skip
                        continue;
                    }


                    if (activeObjectEnd>starts[j]){
                        PotentialCollisionPrecursor potentialCollision=new PotentialCollisionPrecursor(getObjectNumberFromAxisNumber(i),getObjectNumberFromAxisNumber(j));
                        thisSweep.add(potentialCollision);
                    }else{
                        break; //this is as far as this active object goes

                    }

                }
            }
        }

        return thisSweep;
    }


    private int getObjectNumberFromAxisNumber(int number){
        return axisIndicies[number];
    }


     public static void main(String[] args){
         int noOfObjectsUnderTest=250;

         List<TestObject> testObjects=new ArrayList<>();

         Random rnd=new Random();
         double runningStartPosition=0;
         for(int i=0;i<noOfObjectsUnderTest;i++){
             runningStartPosition+=rnd.nextDouble()*0.01;
             testObjects.add(new TestObject(runningStartPosition));
         }

         while(true){
             runSingleTreaded(testObjects);
             runMultiThreadedTreaded(testObjects);
         }

     }

    private static void runSingleTreaded(List<TestObject> testObjects) {
        try {
            //XAXIS used over and over again just for test
            Set<PotentialCollisionPrecursor> xSweep=(new BroadPhaseAxisSweep(XAXIS,testObjects)).call();
            Set<PotentialCollisionPrecursor> ySweep=(new BroadPhaseAxisSweep(XAXIS,testObjects)).call();
            Set<PotentialCollisionPrecursor> zSweep=(new BroadPhaseAxisSweep(XAXIS,testObjects)).call();

            System.out.println(xSweep.size()); //just so JIT can't possibly optimise out
            System.out.println(ySweep.size()); //just so JIT can't possibly optimise out
            System.out.println(zSweep.size()); //just so JIT can't possibly optimise out
        } catch (Exception ex) {
            //bad practice, example only
            Logger.getLogger(BroadPhaseAxisSweep.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

    private static void runMultiThreadedTreaded(List<TestObject> testObjects) {
        try {
            //XAXIS used over and over again just for test
            Future<Set<PotentialCollisionPrecursor>> futureX=sweepPool.submit(new BroadPhaseAxisSweep(XAXIS,testObjects));
            Future<Set<PotentialCollisionPrecursor>> futureY=sweepPool.submit(new BroadPhaseAxisSweep(XAXIS,testObjects));
            Future<Set<PotentialCollisionPrecursor>> futureZ=sweepPool.submit(new BroadPhaseAxisSweep(XAXIS,testObjects));

            Set<PotentialCollisionPrecursor> xSweep=futureX.get();
            Set<PotentialCollisionPrecursor> ySweep=futureY.get();
            Set<PotentialCollisionPrecursor> zSweep=futureZ.get();

            System.out.println(xSweep.size()); //just so JIT can't possibly optimise out
            System.out.println(ySweep.size()); //just so JIT can't possibly optimise out
            System.out.println(zSweep.size()); //just so JIT can't possibly optimise out
        } catch (Exception ex) {
            //bad practice, example only
            Logger.getLogger(BroadPhaseAxisSweep.class.getName()).log(Level.SEVERE, null, ex);
        }
    }


    public static class TestObject{

        final boolean isStatic;
        final boolean isLightWeight;
        final boolean isCollidable;
        final double startPointOnAxis;
        final double endPointOnAxis; 

        public TestObject(double startPointOnAxis) {
            Random rnd=new Random();
            this.isStatic = rnd.nextBoolean();
            this.isLightWeight =  rnd.nextBoolean();
            this.isCollidable =  rnd.nextBoolean();
            this.startPointOnAxis = startPointOnAxis;
            this.endPointOnAxis =startPointOnAxis+0.2*rnd.nextDouble();
        }

        public boolean isStatic() {
            return isStatic;
        }

        public boolean isLightWeightPhysicsObject() {
            return isLightWeight;
        }

        public boolean isCollidable() {
            return isCollidable;
        }

        public double getStartPoint() {
            return startPointOnAxis;
        }

        public double getEndPoint() {
            return endPointOnAxis;
        }
    }

}

//separate file: PotentialCollisionPrecursor.java
public class PotentialCollisionPrecursor {
    //holds the object numbers of a potential collision; can be converted to a real PotentialCollision using a list of those objects
    private final int rigidBodyNumber1;
    private final int rigidBodyNumber2; 


    public PotentialCollisionPrecursor(int rigidBodyNumber1, int rigidBodyNumber2) {
        if (rigidBodyNumber1<rigidBodyNumber2){
            this.rigidBodyNumber1 = rigidBodyNumber1;
            this.rigidBodyNumber2 = rigidBodyNumber2;
        }else{
            this.rigidBodyNumber1 = rigidBodyNumber2;
            this.rigidBodyNumber2 = rigidBodyNumber1;
        }
    }

    public int getRigidBodyNumber1() {
        return rigidBodyNumber1;
    }

    public int getRigidBodyNumber2() {
        return rigidBodyNumber2;
    }

    @Override
    public int hashCode() {
        int hash = 7;
        hash = 67 * hash + this.rigidBodyNumber1;
        hash = 67 * hash + this.rigidBodyNumber2;
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        final PotentialCollisionPrecursor other = (PotentialCollisionPrecursor) obj;
        if (this.rigidBodyNumber1 != other.rigidBodyNumber1) {
            return false;
        }
        if (this.rigidBodyNumber2 != other.rigidBodyNumber2) {
            return false;
        }
        return true;
    }

}

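Different thread pool sizes

After single-threaded, the next fastest is a thread pool of 2 or 3 threads, and the slowest is a single thread in a thread pool (unsurprisingly, since it has all the overhead and none of the gain).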


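Unusually large task sizes

To test whether the problem was simply that the threaded tasks were too small, I increased the task size to around 100 ms. This gave even more confusing results: any number of threads between 1 and 3 runs at roughly the same speed, and all are slower than single-threaded.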


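Best answer

If your large sweeps only take a few milliseconds, you are better off doing everything synchronously. Just reserving a thread-pool thread (on Windows) takes more than 5 ms of work. On top of that, you still need to move the data between threads and wait for context switches, and then finally put the threads back where you found them.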
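The hand-off cost varies by machine and operating system, so it may be worth measuring rather than assuming a figure. The sketch below times the round trip of submitting a task that does nothing and waiting for it; HandoffCost and averageRoundTripNanos are hypothetical names introduced for this illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HandoffCost {

    // Average nanoseconds for one submit() + get() round trip of an empty
    // task, i.e. the pure cost of handing work to a pool thread and waiting.
    static double averageRoundTripNanos(ExecutorService pool, int rounds) throws Exception {
        Runnable noop = () -> { };
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            pool.submit(noop).get(); // blocks until the pool thread has run the task
        }
        return (System.nanoTime() - start) / (double) rounds;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);

        averageRoundTripNanos(pool, 10_000); // warm-up, result discarded
        double nanos = averageRoundTripNanos(pool, 10_000);

        System.out.printf("average submit/get round trip: %.1f microseconds%n", nanos / 1000.0);
        pool.shutdown();
    }
}
```

If the measured round trip is small compared with a 1-4 ms sweep, the slowdown probably has another cause, such as the copying and allocation done per task.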

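The whole process will probably only degrade performance, especially since you are taking local copies of the data. If each sweep were naturally independent and took 500 ms or more, you might benefit from some concurrency model like the one you have already implemented.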
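If the copying is part of the cost, one option is to take a single snapshot per tick and let all tasks read it, instead of copying once per task. This is a sketch of that idea, not the asker's design: SharedSnapshot, Snapshot, countOverlaps and sweepAll are hypothetical names, and it assumes the arrays are never written after the snapshot is built.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedSnapshot {

    // Read-only snapshot: copied once, then only read by the tasks.
    static final class Snapshot {
        final double[] starts;
        final double[] ends;

        Snapshot(double[] starts, double[] ends) {
            this.starts = starts.clone();
            this.ends = ends.clone();
        }
    }

    // Stand-in for a sweep (assumes starts are sorted ascending):
    // counts pairs (i, j) with j > i whose intervals overlap.
    static int countOverlaps(Snapshot s) {
        int count = 0;
        for (int i = 0; i < s.starts.length; i++) {
            for (int j = i + 1; j < s.starts.length && s.ends[i] > s.starts[j]; j++) {
                count++;
            }
        }
        return count;
    }

    // Runs `tasks` sweeps over the same snapshot; nothing is copied per task.
    static List<Integer> sweepAll(ExecutorService pool, Snapshot s, int tasks) throws Exception {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (int k = 0; k < tasks; k++) {
            jobs.add(() -> countOverlaps(s));
        }
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : pool.invokeAll(jobs)) { // waits for every task
            results.add(f.get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        Snapshot s = new Snapshot(new double[] {0, 1, 2}, new double[] {1.5, 2.5, 3});
        System.out.println(sweepAll(pool, s, 3));
        pool.shutdown();
    }
}
```

Sharing is safe here only because the tasks never write to the arrays; data that is mutated during the tick would still need a copy or synchronization.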

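One interesting thing to note is that graphics processors these days come with embedded coprocessors dedicated to physics calculations. The reason they are so good at this kind of thing is that they sometimes have thousands of processor cores, all running at relatively low clock rates. This makes them very well suited to processing many small tasks simultaneously. You might want to try interfacing directly with the graphics processor to offload the physics processing into that environment instead of doing it on a general-purpose CPU.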

Regarding "java - Many short-lived tasks run slower when multithreaded, even with a thread pool", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21410988/
