C++ Intel TBB and Microsoft PPL: how to use next_permutation in a parallel loop?

Tags: c++ algorithm visual-c++ parallel-processing permutation

I have Visual Studio 2012 and Intel Parallel Studio 2013 installed, so I have Intel TBB.

Say I have the following code:

const int cardsCount = 12; // will be READ by all threads
// the required number of cards of each colour to complete its set:
// NOTE that the required number of cards of each colour is not the same as the total number of cards of this colour available
int required[] = {2,3,4}; // will be READ by all threads
Card cards[cardsCount]; // will be READ by all threads
int cardsIndices[cardsCount];// this will be permuted, permutations need to be split among threads !

// set "cards" to 4 cards of each colour (3 colours total = 12 cards)
// set cardsIndices to {0,1,2,3...,11}

 // this variable will be written to by all threads, maybe have one for each thread and combine them later?? or can I use concurrent_vector<int> instead !?
int logColours[] = {0,0,0};

int permutationsCount = fact(cardsCount);

for (int pNum=0; pNum<permutationsCount; pNum++) // I want to make this loop parallel !!
{
    int countColours[3] = {0,0,0}; // local loop variable, no problem with multithreading
    for (int i=0; i<cardsCount; i++)
    {
        Card c = cards[cardsIndices[i]]; // accessed "cards"

        countColours[c.Colour]++; // local loop variable, np.
        // we got the required number of cards of this colour to complete it
        if (countColours[c.Colour] == required[c.Colour]) // read global variable "required" !
        {
            // log that we completed this colour and go to next permutation
            logColours[c.Colour]++; // should I use a concurrent_vector<int> for this shared variable?
            break;
        }
    }
    std::next_permutation(cardsIndices, cardsIndices+cardsCount); // !! this is my main issue
}

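What I am computing is how many times each colour would be completed if we picked cards at random from those available. This is done exhaustively by going through every possible permutation and picking the cards in order; as soon as a colour is "complete" we break and move on to the next permutation. Note that we have 4 cards of each colour, but the number of cards needed to complete each colour is {2,3,4} (red, green, blue). 2 red cards are enough to complete red and 4 are available, so red is more likely to be completed than blue, which needs all 4 of its cards to be picked.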

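I want to make this for loop parallel, but my main problem is how to handle the permutations of "cards". There are roughly half a billion permutations here (12! = 479,001,600). If I have 4 threads, how can I split them into 4 different parts and have each thread go through one part?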

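What if I don't know how many cores the machine has and want the program to choose the right number of concurrent threads automatically? Surely there is a way to do this with the Intel or Microsoft tools?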

Here is my Card struct, just in case:

struct Card
{
public:
    int Colour;
    int Symbol;
};

Best answer

Let N = cardsCount and M = required[0] * required[1] * ... * required[maxColor - 1]. Then your problem can actually be solved easily in O(N * M) time. In your case that is 12 * 2 * 3 * 4 = 288 operations. :)

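One possible way to do this is with a recurrence relation. Consider a function logColours f(n, required). Let n be the number of cards considered so far; required is the vector from your example. The function returns the answer as the vector logColours. You are interested in f(0, {2,3,4}), the state before any card has been placed. A short recursive computation of f could be written like this: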

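#include <map>
#include <utility>
#include <vector>

// Setup for the question's example. The names `available` and `initial` are
// illustrative assumptions; the original sketch left the availability test open.
const int maxColor = 3;                        // number of colours
const std::vector<int> available = {4, 4, 4};  // cards of each colour in the deck
const std::vector<int> initial = {2, 3, 4};    // cards needed to complete each colour
// std::map is used here because std::unordered_map would need a custom hash
// for std::vector<int> keys.
std::map<int, std::map<std::vector<int>, std::vector<int>>> cache;

// n: number of cards already placed; require: cards still needed per colour
std::vector<int> f(int n, std::vector<int> require) {
    if (cache[n].count(require)) {
        // we have already calculated the function for these arguments, do not recalculate it
        return cache[n][require];
    }

    std::vector<int> logColours(maxColor, 0); // maxColor = 3 in your example

    for (int putColor = 0; putColor < maxColor; ++putColor) {
        // cards of this colour placed so far = initial[putColor] - require[putColor]
        int left = available[putColor] - (initial[putColor] - require[putColor]);
        if (left > 0) { // there is still at least one card of colour 'putColor'
            // put a card of colour 'putColor' at position 'n'
            if (require[putColor] == 1) {
                // we have reached the needed number of cards of colour 'putColor'
                ++logColours[putColor];
            } else {
                --require[putColor];
                std::vector<int> logColoursRec = f(n + 1, require);
                ++require[putColor];
                // merge the child's array into our own
                for (int i = 0; i < maxColor; ++i)
                    logColours[i] += logColoursRec[i];
            }
        }
    }

    // store logColours in the cache under this function's arguments
    cache[n][require] = std::move(logColours);
    return cache[n][require];
}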
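
Note that f above counts each sequence of colours once, while the loop in the question visits every ordering of the 12 distinct cards. One way to reconcile the two, sketched here as an assumption rather than as part of the original answer, is to multiply each step by the number of cards of that colour still available and, once a colour is completed, by the number of orderings of the leftover cards. The names countOrderings, Memo, State, left and need are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

typedef std::vector<std::uint64_t> Counts;
typedef std::pair<std::vector<int>, std::vector<int>> State;  // (left, need)
typedef std::map<State, Counts> Memo;

// n! for small n (20! still fits in 64 bits).
std::uint64_t factorial(int n) {
    std::uint64_t r = 1;
    for (int i = 2; i <= n; ++i) r *= static_cast<std::uint64_t>(i);
    return r;
}

// left[c]: cards of colour c not yet placed.
// need[c]: cards of colour c still required to complete that colour.
// Returns, per colour, the number of orderings of the remaining distinct
// cards in which that colour is the first one to be completed.
Counts countOrderings(std::vector<int> left, std::vector<int> need, Memo& memo) {
    assert(left.size() == need.size());
    const State key(left, need);
    Memo::const_iterator hit = memo.find(key);
    if (hit != memo.end()) return hit->second;

    int remaining = 0;
    for (std::size_t c = 0; c < left.size(); ++c) remaining += left[c];

    Counts result(left.size(), 0);
    for (std::size_t c = 0; c < left.size(); ++c) {
        if (left[c] == 0) continue;  // no card of this colour left to place
        const std::uint64_t ways = static_cast<std::uint64_t>(left[c]);
        if (need[c] == 1) {
            // Colour c is completed here; the other cards may follow in any order.
            result[c] += ways * factorial(remaining - 1);
        } else {
            --left[c]; --need[c];
            const Counts sub = countOrderings(left, need, memo);
            ++left[c]; ++need[c];
            for (std::size_t i = 0; i < sub.size(); ++i) result[i] += ways * sub[i];
        }
    }
    memo[key] = result;
    return result;
}
```

For the question's deck, countOrderings({4, 4, 4}, {2, 3, 4}, memo) returns one count per colour, and the three counts add up to 12! = 479001600 because every ordering completes exactly one colour first.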

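The cache used by f can be implemented as std::map<int, std::map<std::vector<int>, std::vector<int>>>. The std::unordered_map equivalent additionally needs a user-supplied hash functor, because the standard library provides no std::hash specialization for std::vector<int>.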
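
A std::unordered_map keyed by std::vector<int> only compiles when it is given a hash for that key type, which the standard library does not supply. A minimal sketch of such a functor follows; VectorHash, InnerCache and Cache are hypothetical names, and the mixing step follows the widely used boost::hash_combine pattern.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical hasher: folds the hash of every element into one seed.
struct VectorHash {
    std::size_t operator()(const std::vector<int>& v) const {
        std::size_t seed = v.size();
        for (std::size_t i = 0; i < v.size(); ++i) {
            seed ^= std::hash<int>()(v[i]) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

// Inner level: remaining requirements -> per-colour counts.
typedef std::unordered_map<std::vector<int>, std::vector<int>, VectorHash> InnerCache;
// Outer level: number of cards placed so far -> inner cache.
typedef std::unordered_map<int, InnerCache> Cache;
```

With these typedefs, a declaration such as Cache cache; supports the same cache[n].count(require) and cache[n][require] accesses as the recursive function. For a state space as small as this one, std::map is likely just as fast and needs no hasher.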

Once you understand the main idea, you will be able to implement it with more efficient code.
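
Separately from the recurrence, the literal question of splitting next_permutation across threads can be sketched as follows. This is an illustration under assumptions, not part of the original answer: each task fixes the first card and runs std::next_permutation over the remaining cards, every thread keeps private totals, and the totals are summed at the end. It uses plain std::thread so that it is self-contained; the function name countFirstCompleted and its parameters are hypothetical. In TBB or PPL the outer loop over the first card could presumably be handed to tbb::parallel_for or concurrency::parallel_for instead, with the per-thread totals held in a combinable object, and those libraries choose the number of worker threads themselves.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// colourOf[i]: colour of card i; required[c]: cards needed to complete colour c.
// Returns, per colour, how many permutations of the cards complete it first.
std::vector<std::uint64_t> countFirstCompleted(const std::vector<int>& colourOf,
                                               const std::vector<int>& required,
                                               unsigned threadCount) {
    const int n = static_cast<int>(colourOf.size());
    const std::size_t colours = required.size();
    if (threadCount == 0) threadCount = 1;  // hardware_concurrency() may report 0

    std::vector<std::vector<std::uint64_t>> partial(
        threadCount, std::vector<std::uint64_t>(colours, 0));

    auto worker = [&](unsigned t) {
        std::vector<std::uint64_t> mine(colours, 0);  // private to this thread
        std::vector<int> seen(colours);
        // Thread t takes the first cards t, t + threadCount, t + 2 * threadCount, ...
        for (int first = static_cast<int>(t); first < n;
             first += static_cast<int>(threadCount)) {
            // Remaining cards in ascending order: the lexicographically first
            // arrangement, which is where next_permutation has to start.
            std::vector<int> rest;
            for (int i = 0; i < n; ++i)
                if (i != first) rest.push_back(i);
            do {
                std::fill(seen.begin(), seen.end(), 0);
                int c = colourOf[first];
                bool done = (++seen[c] == required[c]);
                for (std::size_t i = 0; !done && i < rest.size(); ++i) {
                    c = colourOf[rest[i]];
                    done = (++seen[c] == required[c]);
                }
                if (done) ++mine[c];  // c is the colour that was completed first
            } while (std::next_permutation(rest.begin(), rest.end()));
        }
        partial[t] = mine;  // each thread writes only its own slot
    };

    std::vector<std::thread> pool;
    for (unsigned t = 1; t < threadCount; ++t) pool.emplace_back(worker, t);
    worker(0);  // the calling thread handles the first slice itself
    for (std::size_t i = 0; i < pool.size(); ++i) pool[i].join();

    // Combine the per-thread totals.
    std::vector<std::uint64_t> total(colours, 0);
    for (std::size_t t = 0; t < partial.size(); ++t)
        for (std::size_t c = 0; c < colours; ++c) total[c] += partial[t][c];
    return total;
}
```

A call such as countFirstCompleted(colourOf, required, std::thread::hardware_concurrency()) lets the machine decide the thread count. With 12 cards there are only 12 slices of 11! permutations each, so the load is uneven when the thread count does not divide 12; fixing the first two cards instead would give 132 smaller slices.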

A similar question about using next_permutation in a parallel loop with Intel TBB and Microsoft PPL can be found on Stack Overflow: https://stackoverflow.com/questions/16250302/
