c++ - Performance measurement: time vs tick?

For a 2-thread program running on 1 or 2 cores, what is the best way to make sure real-time performance is achieved? boost::timer or RDTSC?

We started from this code:

boost::timer t;
p.f(frame);
max_time_per_frame = std::max(max_time_per_frame, t.elapsed());

... where p is an instance of Proc.

class Proc {
public:
    Proc() : _frame_counter(0) {}

    // This function must be called for each video frame and take less than 1/fps seconds.
    // 24 fps => 1/24 s => < 0.04 seconds.
    void f(unsigned char * const frame) 
    {
        processFrame(frame); //that's the most important part

        // This part runs every 240 frames and should not affect
        // the processFrame flow!
        if(_frame_counter % 240 == 0) 
        {
            do_something_more();
        }
        _frame_counter++;
    }

private:
    unsigned int _frame_counter;
};
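For comparison, the same "slowest frame so far" bookkeeping can be sketched with the standard library only. This is a minimal sketch assuming C++11; `FrameTimer`, `measure`, and `within_budget` are hypothetical names that are not part of the original code, and `std::chrono::steady_clock` is swapped in for boost::timer.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical helper (not in the original code): remembers the slowest
// call seen so far, timed with a monotonic clock.
class FrameTimer {
public:
    // Runs `work` once and returns how long it took, in seconds.
    template <typename Work>
    double measure(Work&& work) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        max_seconds_ = std::max(max_seconds_, elapsed.count());
        return elapsed.count();
    }

    // Longest duration recorded so far, in seconds.
    double max_seconds() const { return max_seconds_; }

    // True while the slowest frame so far still fits in 1/fps seconds.
    bool within_budget(double fps) const { return max_seconds_ < 1.0 / fps; }

private:
    double max_seconds_ = 0.0;
};
```

A call such as `timer.measure([&] { p.f(frame); });` per frame, followed by `timer.within_budget(24.0)`, would reproduce the check described above.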

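So it runs single-threaded on a single core, and we observed that max_time_per_frame exceeds the target time because of the do_something_more processing. To remove these processing-time spikes, we run each do_something_more in a separate thread, as in the pseudocode below.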

class Proc {
public:
    Proc() : _frame_counter(0) {
        t = start_thread ( do_something_more_thread );
    }

    // This function must be called for each video frame and take less than 1/fps seconds.
    // 24 fps => 1/24 s => < 0.04 seconds.
    void f(unsigned char * const frame) 
    {
        processFrame(frame); //that's the most important part

        // This part runs every 240 frames and should not affect
        // the processFrame flow!
        if(_frame_counter % 240 == 0) 
        {
            sem.up();
        }
        _frame_counter++;
    }

    void do_something_more_thread()
    {
       while(1)
       {
            sem.down();
            do_something_more();
       }
    }

private:
    unsigned int _frame_counter;
    semaphore sem; 
    thread t;
};
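The pseudocode above could be made concrete roughly as follows. This is only a sketch assuming C++11: `BackgroundWorker` and `trigger` are hypothetical names, and a mutex plus condition variable stand in for the `semaphore` type, which the pseudocode does not define. `trigger()` plays the role of `sem.up()` and the wait inside `run()` plays the role of `sem.down()`.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper: runs `job` on its own thread once per trigger() call.
class BackgroundWorker {
public:
    explicit BackgroundWorker(std::function<void()> job)
        : job_(std::move(job)), thread_([this] { run(); }) {}

    // Finishes any pending jobs, then stops and joins the thread.
    ~BackgroundWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        thread_.join();
    }

    // Equivalent of sem.up(): cheap for the caller, the job runs elsewhere.
    void trigger() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++pending_;
        }
        wake_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // Equivalent of sem.down(): sleep until there is work or a stop request.
            wake_.wait(lock, [this] { return pending_ > 0 || stopping_; });
            if (pending_ == 0) return;  // stop requested and nothing left to do
            --pending_;
            lock.unlock();              // do not hold the lock while working
            job_();
            lock.lock();
        }
    }

    std::function<void()> job_;
    std::mutex mutex_;
    std::condition_variable wake_;
    unsigned pending_ = 0;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the members above exist before it starts
};
```

With such a helper, `f` would call `worker.trigger()` every 240 frames instead of calling `do_something_more()` directly.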

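I always launch my program on both 1 core and 2 cores, so I use start /AFFINITY 1 prog.exe or start /AFFINITY 3 prog.exe. From a time point of view everything is fine: max_time_per_frame stays below our target, close to the average of 0.02 seconds per frame.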
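The value passed to start /AFFINITY is a hexadecimal bitmask in which bit n enables core n, so 1 means core 0 only and 3 means cores 0 and 1. A small sketch that decodes such a mask (`cores_in_mask` is a hypothetical helper, not a Windows API):

```cpp
#include <cstdint>
#include <vector>

// Returns the zero-based indices of the cores enabled in an affinity mask,
// where a set bit n means "core n is allowed".
std::vector<int> cores_in_mask(std::uint64_t mask) {
    std::vector<int> cores;
    for (int core = 0; mask != 0; ++core, mask >>= 1) {
        if (mask & 1u) {
            cores.push_back(core);
        }
    }
    return cores;
}
```

Here `cores_in_mask(0x1)` yields core 0 alone and `cores_in_mask(0x3)` yields cores 0 and 1, matching the two launch commands.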

But if I use RDTSC to dump the number of ticks spent in f:

#include <intrin.h>
...
unsigned long long getTick()
{
    return __rdtsc();
}

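void f(unsigned char * const frame) 
{
    // Tick count on entry.
    const unsigned long long s = getTick();

    processFrame(frame); // that's the most important part

    // This part runs every 240 frames and should not affect
    // the processFrame flow!
    if(_frame_counter % 240 == 0) 
    {
        sem.up();
    }
    _frame_counter++;

    // Tick count on exit; the difference is the number of ticks spent in f.
    const unsigned long long e = getTick();
    dump(e - s);
}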

With start /AFFINITY 3 prog.exe, max_tick_per_frame is stable and, as I expected, I see one thread using 100% of one core while the second thread runs at normal speed on the second core.

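With start /AFFINITY 1 prog.exe, I see only one core at 100% (as expected), but the do_something_more computation does not seem to be spread out over time through interleaved thread execution. In fact, at regular intervals I see a huge peak in the tick count.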

So the question is: why? Is time the only interesting measure? Does the tick count make sense when the software runs on 1 core (frequency boost)?

Best answer

Although you will never get true real-time performance out of Windows, you can use the Windows API to work around some of RDTSC's shortcomings.

Here is a small code block that makes use of the API.

#include <Windows.h>
#include <stdio.h>

int
main(int argc, char* argv[])
{
    double timeTaken;
    LARGE_INTEGER frequency;
    LARGE_INTEGER firstCount;
    LARGE_INTEGER endCount;
    /*-- give us the highest priority available --*/
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
    /*-- get the frequency of the timer we are using --*/
    QueryPerformanceFrequency(&frequency);  
    /*-- get the timers current tick --*/
    QueryPerformanceCounter(&firstCount);
    /*-- some pause --*/
    Sleep(1);
    /*-- get the timers current tick --*/
    QueryPerformanceCounter(&endCount);
    /*-- calculate time passed --*/
    /*-- (ticks * 1000 / ticks-per-second) gives milliseconds without integer truncation --*/
    timeTaken = (double)(endCount.QuadPart - firstCount.QuadPart) * 1000.0 / (double)frequency.QuadPart;

    printf("Time: %f ms\n", timeTaken);

    return 0;
}
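The tick-to-time conversion is the step that is easiest to get subtly wrong, so it may help to look at it in isolation. The sketch below is portable and does not call the Windows API; `ticks_to_ms` and `ticks_to_ms_truncating` are hypothetical helpers, and 3579545 is only an illustrative counter frequency. It shows that dividing the frequency by 1000 in integer arithmetic drops the remainder and skews the result.

```cpp
#include <cassert>
#include <cstdint>

// Converts a counter delta to milliseconds. Doing the division in floating
// point keeps the full precision of the frequency.
double ticks_to_ms(std::int64_t start, std::int64_t end, std::int64_t frequency) {
    assert(frequency > 0 && end >= start);
    return static_cast<double>(end - start) * 1000.0 / static_cast<double>(frequency);
}

// Truncating variant, shown for comparison only: `frequency / 1000` is an
// integer division, so the remainder is lost before the conversion.
double ticks_to_ms_truncating(std::int64_t start, std::int64_t end, std::int64_t frequency) {
    assert(frequency >= 1000 && end >= start);
    return static_cast<double>(end - start) / static_cast<double>(frequency / 1000);
}
```

For one full second of ticks at the illustrative frequency, the first function returns 1000 ms while the truncating one overshoots by roughly 0.15 ms, an error that grows with the measured interval.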

You can also use:

#include <Windows.h>
#include <Mmsystem.h>  /* timeBeginPeriod / timeEndPeriod; link with winmm.lib */
if(timeBeginPeriod(1) == TIMERR_NOCANDO) {
    printf("TIMER could not be set to 1ms\n");
}
/*-- your code here --*/
timeEndPeriod(1);

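But this changes the global Windows timer resolution to whatever interval you set (or at least tries to), so I would not recommend this approach unless you are 100% sure you are the only one who will use this program, because it can have unintended side effects on other programs.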

Regarding c++ - Performance measurement: time vs tick?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/18089646/
