c++ - How to read a file into a vector elegantly and efficiently?

#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>

using namespace std;

vector<char> f1()
{
    ifstream fin{ "input.txt", ios::binary };
    return
    {
        istreambuf_iterator<char>(fin),
        istreambuf_iterator<char>()
    };
}

vector<char> f2()
{
    vector<char> coll;
    ifstream fin{ "input.txt", ios::binary };
    char buf[1024];
    while (fin.read(buf, sizeof(buf)))
    {
        copy(begin(buf), end(buf),
            back_inserter(coll));
    }

    copy(begin(buf), begin(buf) + fin.gcount(),
        back_inserter(coll));

    return coll;
}

int main()
{
    f1();
    f2();
}

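Clearly, f1() is more concise than f2(), so I prefer f1() over f2(). However, I worry that f1() is less efficient than f2().

So, my question is:

Will mainstream C++ compilers optimize f1() so that it is as fast as f2()?

Update:

I tested with a 130M file in release mode (Visual Studio 2015 with Clang 3.8):

f1() takes 1614 ms, while f2() takes 616 ms.

f2() is faster than f1().

What a sad result!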

Best answer

I tested your code on my side with mingw482. Out of curiosity, I added an additional function f3 with the following implementation:

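inline vector<char> f3()
{
    // filepath is assumed to be defined elsewhere (the path of the input file).
    ifstream fin{ filepath, ios::binary };
    // Seek to the end to learn the file size, then rewind to the beginning.
    fin.seekg (0, fin.end);
    size_t len = fin.tellg();
    fin.seekg (0, fin.beg);

    // Allocate the whole buffer once and fill it with a single read.
    vector<char> coll(len);
    fin.read(coll.data(), len);
    return coll;
}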

I tested with a file of about 90M. On my platform, the results differ somewhat from yours.

f1() ~850ms
f2() ~600ms
f3() ~70ms

The results are the average of 10 consecutive file reads.
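The answer does not show how the timings were taken. A minimal sketch of one way to average several runs, assuming std::chrono::steady_clock is acceptable as the timer; average_ms is a hypothetical helper name, not part of the original code:

```cpp
#include <chrono>

// Hypothetical helper (not from the answer): calls f() `runs` times and
// returns the average wall-clock time per call in milliseconds.
template <typename F>
double average_ms(F f, int runs = 10)
{
    using clock = std::chrono::steady_clock;

    const auto start = clock::now();
    for (int i = 0; i < runs; ++i)
        f();
    const std::chrono::duration<double, std::milli> total = clock::now() - start;

    return total.count() / runs;
}
```

It could be called as average_ms([] { f1(); }), and likewise for f2 and f3. Repeated reads of the same file are usually served from the operating system's cache after the first run, so numbers obtained this way mostly reflect the cost of the C++ code rather than the disk.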

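The f3 function takes the least time because vector<char> coll(len); allocates all the required memory up front, so no further reallocation is needed. back_inserter, on the other hand, requires the type to have a push_back member function, which for vector reallocates whenever the capacity is exceeded. As stated in the documentation: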

push_back

This effectively increases the container size by one, which causes an automatic reallocation of the allocated storage space if -and only if- the new vector size surpasses the current vector capacity.
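The reallocation behaviour described in the quote can be observed directly. A small sketch, where count_reallocations is a hypothetical helper written only for this illustration; it counts how often the capacity changes while appending n characters:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the answer): returns how many times the
// vector's capacity changes during n push_back calls.
std::size_t count_reallocations(std::size_t n, bool reserve_first)
{
    std::vector<char> v;
    if (reserve_first)
        v.reserve(n);               // one allocation up front

    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back('x');
        if (v.capacity() != last_capacity) {   // capacity grew: storage was reallocated
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set to true the count is zero. Without it the count depends on the growth factor of the standard library in use, so no specific number is claimed here.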

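Of f1 and f2, the latter is somewhat faster even though both append through back_inserter. f2 is probably faster because it reads the file in blocks, which allows some buffering to take place.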
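For completeness, here is a sketch of the f3 approach with basic error handling added. The name read_whole_file, the path parameter, and the exception messages are assumptions for illustration and do not appear in the answer. It also relies on the common practice of using seekg/tellg on a binary stream to obtain the file size, which works on mainstream platforms for regular files:

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical variant of f3 (not from the answer): same single-allocation,
// single-read strategy, plus checks for open and size-query failures.
std::vector<char> read_whole_file(const std::string& path)
{
    std::ifstream fin{ path, std::ios::binary };
    if (!fin)
        throw std::runtime_error("cannot open " + path);

    fin.seekg(0, std::ios::end);
    const std::streamoff len = fin.tellg();    // -1 if the position is unavailable
    if (len < 0)
        throw std::runtime_error("cannot determine size of " + path);
    fin.seekg(0, std::ios::beg);

    std::vector<char> coll(static_cast<std::size_t>(len));
    fin.read(coll.data(), static_cast<std::streamsize>(len));
    coll.resize(static_cast<std::size_t>(fin.gcount()));   // keep only the bytes actually read
    return coll;
}
```

The final resize only matters if the read comes up short, for example when the file shrinks between the size query and the read.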

Regarding "c++ - How to read a file into a vector elegantly and efficiently?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/41139764/
