c++ - 如何有效地只保留重复项？

给定一个 STL vector ，仅按排序顺序输出重复项，例如，

INPUT : { 4, 4, 1, 2, 3, 2, 3 }
OUTPUT: { 2, 3, 4 }

该算法是微不足道的，但目标是使其与 std::unique() 一样高效。我的幼稚实现就地修改了容器:

我的幼稚实现:

void not_unique(vector<int>* pv)
{
    if (!pv)
        return;

 // Sort (in-place) so we can find duplicates in linear time
 sort(pv->begin(), pv->end());

 vector<int>::iterator it_start = pv->begin();
 while (it_start != pv->end())
 {
  size_t nKeep = 0;

  // Find the next different element
  vector<int>::iterator it_stop = it_start + 1;
  while (it_stop != pv->end() && *it_start == *it_stop)
  {
   nKeep = 1; // This gets set redundantly
   ++it_stop;
  }

  // If the element is a duplicate, keep only the first one (nKeep=1).
  // Otherwise, the element is not duplicated so erase it (nKeep=0).
  it_start = pv->erase(it_start + nKeep, it_stop);
 }
}

如果您可以使其更高效、更优雅或更通用，请告诉我。例如，自定义排序算法，或在第二个循环中复制元素以消除erase() 调用。

最佳答案

我的第一次尝试失败了，假设 std::unique 将所有重复项移动到范围的末尾(它没有)。哎呀。这是另一个尝试:

这是 not_unique 的一个实现。它删除在排序范围内只出现一次的任何元素和重复出现多次的任何元素。因此，结果范围是出现多次的唯一元素列表。

该函数具有线性复杂度，并在范围内进行单次传递(std::unique 具有线性复杂度)。 It 必须满足前向迭代器的要求。返回结果范围的结尾。

template <typename It>
It not_unique(It first, It last)
{
    if (first == last) { return last; }

    It new_last = first;
    for (It current = first, next = ++first; next != last; ++current, ++next)
    {
        if (*current == *next)
        {
            if (current == new_last)
            {
                ++new_last;
            }
            else
            {
                *new_last++ = *current;
                while (next != last && *current == *next)
                {
                    ++current;
                    ++next;
                }
                if (next == last)
                    return new_last;
            }
        }
    }
    return new_last;
}

关于c++ - 如何有效地只保留重复项？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/2550229/

c++ - 如何有效地只保留重复项？

上一篇：带有 ICU 的 C++ UTF-8 输出

下一篇：c++ - 命名空间范围内的 const 变量不是隐式静态的吗？