c++ - How to quickly input millions of integers in C++?

I'm working on a data structures programming assignment about stacks in C++.

In this assignment I have to read a lot of integers (1,600,000 in the worst case) and finally output some strings.

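As a student, I submit my cpp source file and the website judges and scores it. I got 100%, but I want to do better. The time limit for this assignment is 2 seconds, and my code runs in 128 ms. However, the top student finished the task in only 52 ms, so I want to know how to make my code faster.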

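My source code has three main parts:

Use cin to read a large number of integers (up to 1,600,000) from the OnlineJudge system.
Find the solution and store it in a char array.
Use cout to output the char array.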

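OnlineJudge reports the execution time of my code. The first part takes 100 ms, the second part 20 ms, and the third part 12 ms. So if I want to make my code faster, I should speed up the input.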

The OnlineJudge input looks like this:

5 2
1 2 3 5 4

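The first line contains two integers n and m; the second line contains n integers separated by spaces. The constraints are 1<=n<=1,600,000 and 0<=m<=1,600,000. To read more than a million integers, my code looks like this: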

#include <iostream>
using namespace std;
int main()
{
    std::ios::sync_with_stdio(false);
    cin.tie(NULL);
    int n, m;
    int *exit = new int[1600000];
    cin>>n>>m;
    for (int i=0;i<n;++i)
        cin>>exit[i];
    return 0;
}

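If n is small, OnlineJudge reports an execution time of 0 ms. If n is very large, e.g. 1,600,000, OnlineJudge says this code takes 100 ms. If I remove

std::ios::sync_with_stdio(false);
cin.tie(NULL);

the code takes 424 ms. However, reading the integers is required in this assignment, so I'm curious how the top student can finish "cin, find the solution, cout" in only 52 ms.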

Do you have any ideas for speeding up the input?

2019.4.17: Someone suggested using vector or std::from_chars, but these are banned in this assignment. If I write

#include <vector>

or

#include <charconv>

or

#include <array>

then OnlineJudge reports "Compilation Error".

Someone suggested using scanf. My code looks like this:

for (int i=0;i<n;++i)
    scanf("%d", &exit[i]);

But the execution time is 120 ms. By the way, I don't think scanf is faster than cin; see "Using scanf() in C++ programs is faster than using cin?".

Someone suggested using getline. I rarely use this function. My code looks like this:

stringstream ss;
string temp;
getline(cin, temp);
ss<<temp;ss>>n;ss>>m;
ss.clear();temp.clear();
getline(cin, temp);ss<<temp;
for (int i=0;i<n;++i)
    ss>>exit[i];

The execution time is also 120 ms.

Someone suggested using mmap. I had never heard of this function before. It seems to be available only on Unix? But I am using Visual Studio 2010. My code looks like this:

#include <unistd.h>
#include <sys/mman.h>
    //to load 1,600,000 integers
    int *exit = static_cast<int*>(mmap(NULL,1600*getpagesize(),PROT_READ,MAP_ANON|MAP_SHARED,0,0));
    for (int i=0;i<n;++i)
        cin>>*(exit+i);

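OnlineJudge reports "Runtime Error (signal 11)" rather than "Compilation Error". Signal 11 means "invalid memory reference": it is sent to a process when it makes an invalid virtual memory reference, i.e. a segmentation fault. I don't know whether something is wrong with my mmap call; any pointers would be appreciated.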

2019.4.22: Thanks for all your help. I have now solved the problem. The key function is mmap, and the code is as follows:

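#include <iostream>
#include <sys/mman.h>
using namespace std;

int main()
{
    cin.tie(NULL);
    std::ios::sync_with_stdio(false);

    int n,m;
    int *exit = new int[1600000];

    // Map standard input (file descriptor 0) into memory. This only works
    // when stdin is redirected from a regular file.
    const int input_size = 13000000;
    void *mmap_void = mmap(0,input_size,PROT_READ,MAP_PRIVATE,0,0);
    if (mmap_void == MAP_FAILED) return 1;
    char *mmap_input = (char *)mmap_void;
    int r=0,s=0;
    // Parse n: skip non-digits, then accumulate digits.
    while (mmap_input[s]<'0' || mmap_input[s]>'9') ++s;
    while (mmap_input[s]>='0' && mmap_input[s]<='9')
    { r=r*10+(mmap_input[s]-'0');++s; }
    n=r;r=0;
    // Parse m the same way.
    while (mmap_input[s]<'0' || mmap_input[s]>'9') ++s;
    while (mmap_input[s]>='0' && mmap_input[s]<='9')
    { r=r*10+(mmap_input[s]-'0');++s; }
    m=r;r=0;
    while (mmap_input[s]<'0' || mmap_input[s]>'9') ++s;
    // Parse the n elements; assumes exactly one separator between numbers.
    for (int i=0;i<n;++i)
    {
        while (mmap_input[s]>='0' && mmap_input[s]<='9')
        { r=r*10+(mmap_input[s]-'0');++s; }
        ++s;
        exit[i]=r;r=0;
    }
    // (the rest of the program solves the problem and writes the output)
    delete[] exit;
    return 0;
}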

The mmap call plus converting the characters to integers takes 8 ms. The total execution time for this assignment is now 40 ms, which is faster than 52 ms.
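The fixed input_size of 13,000,000 bytes above is a hand-picked upper bound, and touching mapped pages that lie wholly beyond the end of the file raises SIGBUS. As a sketch only, assuming a POSIX system and that the judge redirects standard input from a regular file, the real size can be queried with fstat before mapping. MappedInput and map_fd are hypothetical names, not part of the original solution.

```cpp
#include <cstddef>
#include <sys/mman.h>
#include <sys/stat.h>

// Hypothetical helper type: a read-only view of a mapped file.
struct MappedInput {
    char const* data;
    std::size_t size;
};

// Maps the regular file behind `fd` read-only, using its real size.
// Returns {nullptr, 0} if fd is not a regular file, is empty, or mmap fails.
inline MappedInput map_fd(int fd) {
    struct stat st;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode) || st.st_size == 0)
        return {nullptr, 0};
    std::size_t size = static_cast<std::size_t>(st.st_size);
    void* p = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return {nullptr, 0};
    return {static_cast<char const*>(p), size};
}
```

Calling map_fd(0) maps standard input. With the size known, the parsing loops can stop at data + size instead of relying on a non-digit byte following the last number.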

Best answer

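A few ideas:

Read the integers with std::scanf rather than std::istream. The latter is often slower for a number of reasons, even with the std::ios::sync_with_stdio(false) call, although the benchmarks at the end of this answer show that this depends on the compiler and standard library.
Read the file by mapping it into memory.
Parse the integers faster than scanf and strtol do.

Example: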

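#include <cstdio>

// Static storage: 1,600,000 ints (about 6.4 MB) may overflow the stack as a local.
static int a[1600000];

int main() {
    int n, m;
    if(2 != std::scanf("%d %d", &n, &m))
        return 1; // malformed input
    for(int i = 0; i < n; ++i)
        if(1 != std::scanf("%d", a + i))
            return 1;
}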

You can also unroll the scanf loop to read several integers per call. For example:

#include <cstdio>

constexpr int step = 64;
char const fmt[step * 3] =
    "%d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d "
    "%d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d "
    "%d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d "
    "%d %d %d %d %d %d %d %d %d %d %d %d %d %d %d %d"
    ;
// Static storage: 1,600,000 ints (about 6.4 MB) may overflow the stack as a local.
static int a[1600000];

int main() {
    int n, m;
    if(2 != std::scanf("%d %d", &n, &m))
        return 1; // malformed input

    for(int i = 0; i < n; i += step) {
        int expected = step < n - i ? step : n - i;
        int* b = a + i;
        // Skip the first (step - expected) "%d " groups for the final partial batch.
        int read = std::scanf(fmt + 3 * (step - expected),
                         b + 0x00, b + 0x01, b + 0x02, b + 0x03, b + 0x04, b + 0x05, b + 0x06, b + 0x07,
                         b + 0x08, b + 0x09, b + 0x0a, b + 0x0b, b + 0x0c, b + 0x0d, b + 0x0e, b + 0x0f,
                         b + 0x10, b + 0x11, b + 0x12, b + 0x13, b + 0x14, b + 0x15, b + 0x16, b + 0x17,
                         b + 0x18, b + 0x19, b + 0x1a, b + 0x1b, b + 0x1c, b + 0x1d, b + 0x1e, b + 0x1f,
                         b + 0x20, b + 0x21, b + 0x22, b + 0x23, b + 0x24, b + 0x25, b + 0x26, b + 0x27,
                         b + 0x28, b + 0x29, b + 0x2a, b + 0x2b, b + 0x2c, b + 0x2d, b + 0x2e, b + 0x2f,
                         b + 0x30, b + 0x31, b + 0x32, b + 0x33, b + 0x34, b + 0x35, b + 0x36, b + 0x37,
                         b + 0x38, b + 0x39, b + 0x3a, b + 0x3b, b + 0x3c, b + 0x3d, b + 0x3e, b + 0x3f);
        if(read != expected)
            return 1;
    }
}
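The expression fmt + 3 * (step - expected) in the loop above works because every "%d " group is exactly three characters long, so advancing the pointer by a multiple of three leaves a shorter but still valid format string, and scanf ignores the surplus pointer arguments. The following is a scaled-down sketch of the same trick with four conversions, using std::sscanf so it can run on an in-memory string. kStep, kFmt and parse_some are hypothetical names chosen for illustration.

```cpp
#include <cstdio>

// Four conversions; each "%d " group is 3 characters, the last one has no
// trailing space, so the literal plus its terminating NUL is 12 bytes.
constexpr int kStep = 4;
char const kFmt[kStep * 3] = "%d %d %d %d";

// Parses `expected` (1..kStep) integers from `text` into `out`.
// Skipping 3 * (kStep - expected) characters drops the leading conversions;
// the pointer arguments beyond `expected` are passed but never used.
int parse_some(char const* text, int expected, int* out) {
    return std::sscanf(text, kFmt + 3 * (kStep - expected),
                       out + 0, out + 1, out + 2, out + 3);
}
```

For example, parse_some("7 8", 2, out) uses the format "%d %d" and fills only out[0] and out[1].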

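Another option is to parse the integers by hand (mapping the file into memory would help here, and there are much faster integer-parsing algorithms than this one and the standard atoi/strtol; see Fastware - Andrei Alexandrescu):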

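#include <cctype>
#include <cstdio>

// Static storage: 1,600,000 ints (about 6.4 MB) may overflow the stack as a local.
static int a[1600000];

int main() {
    int n, m;
    if(2 != std::scanf("%d %d", &n, &m))
        return 1; // malformed input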

    for(int i = 0; i < n; ++i) {
        int r = std::getchar();
        while(std::isspace(r))
            r = std::getchar();
        bool neg = false;
        if('-' == r) {
            neg = true;
            r = std::getchar();
        }
        r -= '0';
        for(;;) {
            int s = std::getchar();
            if(!std::isdigit(s))
                break;
            r = r * 10 + (s - '0');
        }
        a[i] = neg ? -r : r;
    }
}
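A portable variation on the loop above, offered as a sketch rather than as part of the original answer: read the whole input with a few std::fread calls and parse it from memory. It avoids mmap and uses only the C standard library, so it should also build with Visual Studio. slurp and parse_unsigned are hypothetical names, and the parser assumes non-negative integers, as the question's own solution does (a '-' would be treated as a separator).

```cpp
#include <cstddef>
#include <cstdio>

// Reads from `in` into `buf` until EOF or until `cap` bytes are stored.
// Returns the number of bytes stored.
std::size_t slurp(std::FILE* in, char* buf, std::size_t cap) {
    std::size_t used = 0;
    while (used < cap) {
        std::size_t got = std::fread(buf + used, 1, cap - used, in);
        if (got == 0)
            break; // EOF or read error
        used += got;
    }
    return used;
}

// Parses up to `max_count` non-negative decimal integers from [p, end).
// Any non-digit byte counts as a separator. Returns how many were parsed.
std::size_t parse_unsigned(char const* p, char const* end,
                           int* out, std::size_t max_count) {
    std::size_t count = 0;
    while (count < max_count) {
        while (p != end && (*p < '0' || *p > '9'))
            ++p; // skip separators
        if (p == end)
            break;
        int r = 0;
        while (p != end && *p >= '0' && *p <= '9') {
            r = r * 10 + (*p - '0');
            ++p;
        }
        out[count++] = r;
    }
    return count;
}
```

In the assignment, slurp(stdin, buf, sizeof buf) with a static buffer large enough for the whole input would be followed by one parse_unsigned call; the first two values parsed are n and m.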

Yet another approach is to map the file into memory and parse it faster:

#include <boost/iostreams/device/mapped_file.hpp>
#include <cctype>
#include <stdexcept>

inline int find_and_parse_int(char const*& begin, char const* end) {
    // Cast to unsigned char: passing a negative char to std::isspace is undefined.
    while(begin != end && std::isspace(static_cast<unsigned char>(*begin)))
        ++begin;
    if(begin == end)
        throw std::runtime_error("unexpected end of input");
    bool neg = *begin == '-';
    begin += neg;
    int r = 0;
    while(begin != end) {
        unsigned c = *begin - '0'; // non-digits wrap around to values >= 10
        if(c >= 10)
            break;
        r = r * 10 + static_cast<int>(c);
        ++begin;
    }
    return neg ? -r : r;
}

// Static storage: 1,600,000 ints (about 6.4 MB) may overflow the stack as a local.
static int a[1600000];

int main() {
    boost::iostreams::mapped_file f("random-1600000.txt", boost::iostreams::mapped_file::readonly);
    char const* begin = f.const_data();
    char const* end = begin + f.size();
    int n = find_and_parse_int(begin, end);
    int m = find_and_parse_int(begin, end);

    for(int i = 0; i < n; ++i)
        a[i] = find_and_parse_int(begin, end);
}

Benchmark source code.

Note that results can vary considerably across versions of the compiler and standard library:

CentOS release 6.10, g++-6.3.0, Intel Core i7-4790 CPU @ 3.60GHz

---- Best times ----
seconds,    percent, method
0.167985515,  100.0, getchar
0.147258495,   87.7, scanf
0.137161991,   81.7, iostream
0.118859546,   70.8, scanf-multi
0.034033769,   20.3, mmap-parse-faster

Ubuntu 18.04.2 LTS, g++-8.2.0, Intel Core i7-7700K CPU @ 4.20GHz

---- Best times ----
seconds,    percent, method
0.133155952,  100.0, iostream
0.102128208,   76.7, scanf
0.082469185,   61.9, scanf-multi
0.048661004,   36.5, getchar
0.025320109,   19.0, mmap-parse-faster
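
The tables above report the best time per method. As an illustration of how such a figure can be collected, and not the answer's actual benchmark code, here is a minimal sketch that runs a callable several times and keeps the fastest wall-clock time. best_seconds is a hypothetical name.

```cpp
#include <chrono>

// Runs `f` `runs` times and returns the fastest wall-clock duration in
// seconds, or -1.0 if `runs` is not positive.
template<class F>
double best_seconds(F f, int runs) {
    double best = -1.0;
    for (int i = 0; i < runs; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        f();
        auto t1 = std::chrono::steady_clock::now();
        double s = std::chrono::duration<double>(t1 - t0).count();
        if (best < 0.0 || s < best)
            best = s;
    }
    return best;
}
```

Taking the minimum rather than the mean reduces the influence of one-off disturbances such as a cold file cache on the first run.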

Regarding "c++ - How to quickly input millions of integers in C++?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55713589/
