Given an array of N numbers (not necessarily sorted), we can merge any two numbers into one, and the cost of a merge is the sum of the two values. The task is to find the minimum total cost of merging all the numbers into a single one.

Example:
Let the array A = [1,2,3,4]

Then we can remove 1 and 2, add them, and put the sum back into the array. The cost of this step is (1+2) = 3.

Now, A = [3,3,4], Cost = 3

In the second step, we can remove 3 and 3, add them, and put the sum back into the array. The cost of this step is (3+3) = 6.

Now, A = [4,6], Cost = 6

In the third step, we can remove both remaining elements and put their sum back into the array. The cost of this step is (4+6) = 10.

Now, A = [10], Cost = 10

So the total cost turns out to be 19 (10+6+3).

To minimize the total cost we always have to pick the two smallest elements. A simple way to do this is a min heap: reading the minimum is O(1), while removing it and inserting the new sum are each O(log n).

The time complexity of this approach is O(n log n).
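
For reference, a minimal sketch of the heap approach described above, using std::priority_queue as the min heap. The function name minMergeCostHeap is made up for illustration, and the use of long long for the sums is an assumption on my part so that large inputs do not overflow.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper (name not from the original post): repeatedly merges
// the two smallest values and accumulates the cost of each merge.
long long minMergeCostHeap(const std::vector<long long>& values) {
    // std::greater turns the default max heap into a min heap.
    std::priority_queue<long long, std::vector<long long>, std::greater<long long>>
        heap(values.begin(), values.end());

    long long cost = 0;
    while (heap.size() > 1) {
        long long a = heap.top(); heap.pop();  // smallest element
        long long b = heap.top(); heap.pop();  // second smallest element
        cost += a + b;
        heap.push(a + b);                      // the merged value goes back in
    }
    return cost;
}
```

Calling minMergeCostHeap({1, 2, 3, 4}) reproduces the walkthrough above: the merges cost 3, 6 and 10.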

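But I tried another approach and could not find a case where it fails. The basic idea is that the sum of the two smallest elements we pick at any point is never smaller than the sum of the pair picked before it. So the "temp" array will always stay sorted, and we can access its smallest element in O(1).

Since I sort the input array and then simply traverse it, the complexity of my approach is O(n log n).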

int minCost(vector<int>& arr) {
    sort(arr.begin(), arr.end());
    // temp array will contain the sum of all the pairs of minimum elements
    vector<int> temp;
    // index for arr
    int i = 0;
    // index for temp
    int j = 0;
    int cost = 0;

    // while we have more than 1 element combined in both the input and temp array
    while(arr.size() - i + temp.size() - j > 1) {
        int num1, num2;
        // selecting num1 (minimum element)
        if(i < arr.size() && j < temp.size()) {
            if(arr[i] <= temp[j])
                num1 = arr[i++];
            else
                num1 = temp[j++];
        }
        else if(i < arr.size())
            num1 = arr[i++];
        else if(j < temp.size())
            num1 = temp[j++];

        // selecting num2 (second minimum element)
        if(i < arr.size() && j < temp.size()) {
            if(arr[i] <= temp[j])
                num2 = arr[i++];
            else
                num2 = temp[j++];
        }
        else if(i < arr.size())
            num2 = arr[i++];
        else if(j < temp.size())
            num2 = temp[j++];

        // appending the sum of the minimum elements in the temp array
        int sum = num1 + num2;
        temp.push_back(sum);
        cost += sum;
    }
    return cost;
}

Is this approach correct? If not, please tell me what I am missing and give a test case on which the algorithm fails.

SPOJ Link for the same problem

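Best answer

The logic looks very solid to me... the computed sums never decrease, so at each step you only need to add either the two oldest computed sums, the next two elements, or the oldest sum and the next element.

I would just simplify the code: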

#include <vector>
#include <algorithm>
#include <stdio.h>

int hsum(std::vector<int> arr) {
    int ni = arr.size(), nj = 0, i = 0, j = 0, res = 0;
    std::sort(arr.begin(), arr.end());
    std::vector<int> temp;
    auto get = [&]()->int {
        if (j == nj || (i < ni && arr[i] < temp[j])) return arr[i++];
        return temp[j++];
    };
    while ((ni-i)+(nj-j)>1) {
        int a = get(), b = get();
        res += a+b;
        temp.push_back(a + b); nj++;
    }
    return res;
}

int main() {
    fprintf(stderr, "%i\n", hsum(std::vector<int>{1,4,2,3}));
    return  0;
}

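Nice idea!

Another improvement comes from noting that the combined length of the two arrays being processed (the original one and the temp one holding the sums) decreases at every step. The first step consumes two input elements, and from then on the temp array grows by only one element per step, so a "walking queue" stored at the front of the input array can never catch up with the read pointer. This means the temp array is not needed: the space for the sums can be found in the array itself...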

int hsum(std::vector<int> arr) {
    int ni = arr.size(), nj = 0, i = 0, j = 0, res = 0;
    std::sort(arr.begin(), arr.end());
    auto get = [&]()->int {
        if (j == nj || (i < ni && arr[i] < arr[j])) return arr[i++];
        return arr[j++];
    };
    while ((ni-i)+(nj-j)>1) {
        int a = get(), b = get();
        res += a+b;
        arr[nj++] = a + b;
    }
    return res;
}

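About the failure on SPOJ... I tried a quick search for the problem but had no luck. However, I generated random arrays of random length and checked this solution against a "brute force" solution written directly from the specification, and I am reasonably confident the algorithm is correct.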
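
A sketch of that kind of randomized check, not the harness actually used for this answer. The names twoQueueCost, bruteForceCost and countMismatches are made up, and the array lengths (0 to 6), value range (1 to 50) and seed are arbitrary assumptions chosen to keep the exhaustive search cheap.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <random>
#include <vector>

// The in-place two-queue method from this answer, with long long sums.
long long twoQueueCost(std::vector<long long> arr) {
    std::sort(arr.begin(), arr.end());
    int ni = static_cast<int>(arr.size()), nj = 0, i = 0, j = 0;
    long long res = 0;
    auto get = [&]() -> long long {
        if (j == nj || (i < ni && arr[i] < arr[j])) return arr[i++];
        return arr[j++];
    };
    while ((ni - i) + (nj - j) > 1) {
        long long a = get(), b = get();
        res += a + b;
        arr[nj++] = a + b;
    }
    return res;
}

// Brute force straight from the problem statement: try every pair at every
// step and keep the cheapest total. Only usable for very small arrays.
long long bruteForceCost(const std::vector<long long>& v) {
    if (v.size() <= 1) return 0;
    long long best = LLONG_MAX;
    for (std::size_t a = 0; a < v.size(); ++a) {
        for (std::size_t b = a + 1; b < v.size(); ++b) {
            std::vector<long long> next;
            for (std::size_t k = 0; k < v.size(); ++k)
                if (k != a && k != b) next.push_back(v[k]);
            long long sum = v[a] + v[b];
            next.push_back(sum);
            best = std::min(best, sum + bruteForceCost(next));
        }
    }
    return best;
}

// Runs `trials` random arrays through both functions and returns how many
// times the results disagree.
int countMismatches(int trials, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> lengthDist(0, 6);
    std::uniform_int_distribution<int> valueDist(1, 50);
    int mismatches = 0;
    for (int t = 0; t < trials; ++t) {
        std::vector<long long> arr(lengthDist(rng));
        for (auto& x : arr) x = valueDist(rng);
        if (twoQueueCost(arr) != bruteForceCost(arr)) ++mismatches;
    }
    return mismatches;
}
```

A mismatch count of zero over a few hundred trials is evidence rather than a proof, but it makes a plain logic error in the two-queue method unlikely.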

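I know at least one programming arena (Topcoder) where problems are sometimes carefully crafted so that the computation gives the correct result with unsigned but not with int (or with unsigned long long but not with long long) because of integer overflow. I don't know whether SPOJ also does this kind of nonsense (1)... that could be the reason some hidden test cases fail...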
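
To make the overflow concern concrete, here is a small sketch that computes the total cost in long long and reports whether it would fit in an int. The helper name costOverflowsInt is hypothetical, and the example values assume a 32-bit int; I do not know the actual limits of the SPOJ problem.

```cpp
#include <climits>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: returns true when the minimum total merge cost of
// `values` is larger than INT_MAX, i.e. an int accumulator would overflow.
bool costOverflowsInt(const std::vector<int>& values) {
    // Sums are kept in long long so the check itself cannot overflow
    // for inputs of realistic size.
    std::priority_queue<long long, std::vector<long long>, std::greater<long long>>
        heap(values.begin(), values.end());

    long long cost = 0;
    while (heap.size() > 1) {
        long long a = heap.top(); heap.pop();
        long long b = heap.top(); heap.pop();
        cost += a + b;
        heap.push(a + b);
    }
    return cost > INT_MAX;
}
```

With three inputs of 1000000000 each, every input fits in a 32-bit int, yet the merges cost 2000000000 and 3000000000, so the total of 5000000000 does not.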

EDIT

I checked with SPOJ that the algorithm passes if long long values are used... this is the entry I submitted:

#include <stdio.h>
#include <algorithm>
#include <vector>

int main(int argc, const char *argv[]) {
    int n;
    scanf("%i", &n);
    for (int testcase=0; testcase<n; testcase++) {
        int sz; scanf("%i", &sz);
        std::vector<long long> arr(sz);
        for (int i=0; i<sz; i++) scanf("%lli", &arr[i]);

        int ni = arr.size(), nj = 0, i = 0, j = 0;
        long long res = 0;
        std::sort(arr.begin(), arr.end());
        auto get = [&]() -> long long {
            if (j == nj || (i < ni && arr[i] < arr[j])) return arr[i++];
            return arr[j++];
        };
        while ((ni-i)+(nj-j)>1) {
            long long a = get(), b = get();
            res += a+b;
            arr[nj++] = a + b;
        }
        printf("%lli\n", res);
    }
    return 0;
}

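PS: This computation is also what is needed to build a Huffman tree for entropy coding given a table of symbol frequencies, so it is not just a random exercise but has practical applications.

(1) I say "nonsense" because Topcoder never gives problems that need 65 bits; so it is not about genuinely caring about overflow, just about setting traps for novices. Another thing I consider bad practice that I have seen on TC is that some problems are crafted so that the correct algorithm barely fits within the time limit when written in C++: use another language (and get, for example, a 2x slowdown) and you cannot solve the problem.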

A similar question about the minimum cost of reducing an array to size 1 in C++ can be found on Stack Overflow: https://stackoverflow.com/questions/68857405/
