java - Finding a mode with decreasing precision

Tags: java algorithm statistics mode

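I feel there should be a library that makes two things simpler: A) finding the mode of an array, in this case of doubles, and B) gracefully degrading the precision until a particular frequency is reached.

Imagine an array like this: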

double[] a = {1.12, 1.15, 1.13, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4};

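If I were looking for a frequency of 3, it would go from 2 decimal places to 1 decimal place and finally return 1.1 as my mode. If I had a frequency requirement of 4, it would return 4 as my mode.

I do have code that works the way I want and returns what I expect, but I feel there should be a more efficient way to accomplish this, or an existing library that would help me do the same. My code is below, and I am interested in thoughts and comments on different approaches I should have taken. The iterations variable limits how far the precision can be reduced.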

import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.util.HashMap;

public static double findMode(double[] r, int frequencyReq)
{
    double mode = 0d;
    int frequency = 0;
    int iterations = 4; // number of decimal places to start from

    // Drop one decimal place per pass, down to whole numbers.
    while(frequency < frequencyReq && iterations >= 0){
        String roundFormatString = "#.";
        for(int j=0; j<iterations; j++){
            roundFormatString += "#";
        }
        DecimalFormat roundFormat = new DecimalFormat(roundFormatString);

        // Count afresh at each precision so that counts from
        // finer-grained passes do not leak into this one.
        HashMap<Double, BigDecimal> counter = new HashMap<Double, BigDecimal>();
        frequency = 0;

        for(int i=0; i<r.length; i++){

            double element = Double.valueOf(roundFormat.format(r[i]));

            if(!counter.containsKey(element))
                counter.put(element, new BigDecimal(0));

            counter.put(element,counter.get(element).add(new BigDecimal(1)));
        }

        // Keep the most frequent rounded value seen at this precision.
        for(Double key : counter.keySet()){

            if(counter.get(key).compareTo(new BigDecimal(frequency))>0){
                mode = key;
                frequency = counter.get(key).intValue();
                log.debug("key: " + key + " Count: " + counter.get(key));
            }
        }
        iterations--;
    }

    return mode;
}

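Edit

Based on Paulo's comment, another way to phrase the question: the goal is to locate a number whose neighborhood contains at least frequency array elements, with the radius of the neighborhood being as small as possible.

Best answer

Here is a solution to the reformulated question: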

The goal is to locate a number where in the neighborhood are at least frequency array elements, with the radius of the neighborhood being as small as possible.

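(I took the liberty of swapping the order of 1.15 and 1.13 in your input array.)

The basic idea is: the input is already sorted (so elements that are close in value are adjacent in the array), and we know how many elements we want in our neighborhood. So we loop once over the array, measuring the distance between the left element and the element frequency - 1 positions further to the right. Those two elements, together with everything between them, are exactly frequency elements, so they form a neighborhood. Then we simply take the smallest such distance. (My method has a complicated way of returning the result; you may want to do that better.)

This is not quite equivalent to your original question (it does not work with fixed decimal steps), but maybe this is what you really want :-)

You will have to find a better way of formatting the results, though.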

package de.fencing_game.paul.examples;

import java.util.Arrays;

/**
 * Searches for dense points in a distribution.
 *
 * Inspired by http://stackoverflow.com/questions/5329628/finding-a-mode-with-decreasing-precision.
 */
public class InpreciseMode {

    /** our input data, should be sorted ascending. */
    private double[] data;

    public InpreciseMode(double ... data) {
        this.data = data;
    }


    /**
     * Searches for the smallest neighbourhood (by diameter) which
     * contains at least minSize elements.
     *
     * @return an array of two arrays:
     *     {   { the middle point of the neighbourhood,
     *           the diameter of the neighbourhood  },
     *        all the elements of the neighbourhood }
     *
     * TODO: better return an object of a class encapsulating these.
     */
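    public double[][] findSmallNeighbourhood(int minSize) {
        // Without this check an out-of-range minSize leaves the indices at -1.
        if(minSize < 1 || minSize > data.length) {
            throw new IllegalArgumentException(
                "minSize must be between 1 and " + data.length);
        }
        int currentLeft = -1;
        int currentRight = -1;
        double currentMinDiameter = Double.POSITIVE_INFINITY;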

        for(int i = 0; i + minSize-1 < data.length; i++) {
            double diameter = data[i+minSize-1] - data[i];
            if(diameter < currentMinDiameter) {
                currentMinDiameter = diameter;
                currentLeft = i;
                currentRight = i + minSize-1;
            }
        }
        return
            new double[][] {
            { 
                (data[currentRight] + data[currentLeft])/2.0,
                currentMinDiameter
            },
            Arrays.copyOfRange(data, currentLeft, currentRight+1)
        };
    }

    public void printSmallNeighbourhoods() {
        for(int frequency = 2; frequency <= data.length; frequency++) {
            double[][] found = findSmallNeighbourhood(frequency);

            System.out.printf("There are %d elements in %f radius "+
                              "around %f:%n     %s.%n",
                              frequency, found[0][1]/2, found[0][0],
                              Arrays.toString(found[1]));
        }
    }


    public static void main(String[] params) {
        InpreciseMode m =
            new InpreciseMode(1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1,
                              4.2, 4.3, 4.4);
        m.printSmallNeighbourhoods();
    }

}
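
The TODO in the code above asks for a proper result type instead of the nested array. Below is a minimal sketch of what that could look like. The names NeighbourhoodSketch, Neighbourhood and findSmallest are hypothetical and not part of the original answer. The sketch also sorts a copy of the input, on the assumption that callers may pass unsorted data such as the array from the question.

```java
import java.util.Arrays;

/** Hypothetical variant of the class above; all names here are illustrative. */
final class NeighbourhoodSketch {

    /** Hypothetical result holder replacing the double[][] return value. */
    static final class Neighbourhood {
        final double center;
        final double radius;
        final double[] elements;

        Neighbourhood(double center, double radius, double[] elements) {
            this.center = center;
            this.radius = radius;
            this.elements = elements;
        }

        @Override
        public String toString() {
            return elements.length + " elements within radius " + radius
                    + " of " + center + ": " + Arrays.toString(elements);
        }
    }

    /** Finds the tightest window of minSize values; the input need not be sorted. */
    static Neighbourhood findSmallest(double[] values, int minSize) {
        if (minSize < 1 || minSize > values.length) {
            throw new IllegalArgumentException(
                    "minSize must be between 1 and " + values.length);
        }
        double[] sorted = values.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);

        // Slide a window of exactly minSize elements over the sorted data
        // and remember the window with the smallest spread.
        int bestLeft = 0;
        double bestDiameter = Double.POSITIVE_INFINITY;
        for (int i = 0; i + minSize - 1 < sorted.length; i++) {
            double diameter = sorted[i + minSize - 1] - sorted[i];
            if (diameter < bestDiameter) {
                bestDiameter = diameter;
                bestLeft = i;
            }
        }
        int bestRight = bestLeft + minSize - 1;
        return new Neighbourhood(
                (sorted[bestLeft] + sorted[bestRight]) / 2.0,
                bestDiameter / 2.0,
                Arrays.copyOfRange(sorted, bestLeft, bestRight + 1));
    }

    public static void main(String[] args) {
        double[] a = {1.12, 1.15, 1.13, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4};
        System.out.println(findSmallest(a, 3));
        System.out.println(findSmallest(a, 4));
    }
}
```

Sorting costs O(n log n) and the scan itself is a single linear pass, so this stays cheap even for large arrays. Whether the centre of the window or one of its elements should be reported as the "mode" is a design choice this sketch leaves open.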

The output is (printf uses the default locale, which here prints a decimal comma):

There are 2 elements in 0,005000 radius around 1,125000:
     [1.12, 1.13].
There are 3 elements in 0,015000 radius around 1,135000:
     [1.12, 1.13, 1.15].
There are 4 elements in 0,150000 radius around 4,250000:
     [4.1, 4.2, 4.3, 4.4].
There are 5 elements in 0,450000 radius around 3,850000:
     [3.4, 3.44, 4.1, 4.2, 4.3].
There are 6 elements in 0,500000 radius around 3,900000:
     [3.4, 3.44, 4.1, 4.2, 4.3, 4.4].
There are 7 elements in 1,200000 radius around 3,200000:
     [2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4].
There are 8 elements in 1,540000 radius around 2,660000:
     [1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2].
There are 9 elements in 1,590000 radius around 2,710000:
     [1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3].
There are 10 elements in 1,640000 radius around 2,760000:
     [1.12, 1.13, 1.15, 2.0, 3.4, 3.44, 4.1, 4.2, 4.3, 4.4].
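
The decimal commas in this output come from the default locale of the machine that ran the program. If a dot is wanted regardless of where the code runs, one option is to pass an explicit locale to the formatter. The following is only a sketch: the class LocaleFormatSketch and the method describe are hypothetical names, and using Locale.ROOT is an assumption about the desired format.

```java
import java.util.Locale;

/** Hypothetical helper; not part of the original answer. */
final class LocaleFormatSketch {

    /** Formats one report line with a fixed locale so the decimal separator is a dot. */
    static String describe(int count, double radius, double center) {
        return String.format(Locale.ROOT,
                "There are %d elements in %f radius around %f",
                count, radius, center);
    }

    public static void main(String[] args) {
        System.out.println(describe(2, 0.005, 1.125));
    }
}
```

The same Locale argument is accepted by System.out.printf, so the printSmallNeighbourhoods method could be adapted in the same way.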

Original question on Stack Overflow: https://stackoverflow.com/questions/5329628/
