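c++ - How do I get the PERF_COUNT_SW_CONTEXT_SWITCHES config in perf_event_open() to work?

I am setting up profiling for software I am writing, but I cannot get a context-switch count with perf_event_open.

To test the problem, I also tried the example code from the perf_event_open man page, using sched_yield and running a parallel process on the same core with taskset to force context switches. The context-switch count from perf_event_open() is still 0. (With perf stat I get non-zero numbers: thousands in a large loop.) I also tried reading files and using mmap to force page faults.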

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>
#include <asm/unistd.h>
#include <iostream>
#include <sys/mman.h>
using namespace std;
int buf_size_shift = 8;

static unsigned perf_mmap_size(int buf_size_shift)
{
    return ((1U << buf_size_shift) + 1) * sysconf(_SC_PAGESIZE);
}

static long
perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
            int cpu, int group_fd, unsigned long flags)
{
        int ret;

        ret = syscall(__NR_perf_event_open, hw_event, pid, cpu,
                      group_fd, flags);
        return ret;
}


int main(int argc, char **argv)
{

       struct perf_event_attr pe;
       long long count;
       int fd;

       memset(&pe, 0, sizeof(struct perf_event_attr));
       pe.type = PERF_TYPE_SOFTWARE;
       //pe.sample_type = PERF_SAMPLE_CALLCHAIN; /* this is what allows you to obtain callchains */

       pe.size = sizeof(struct perf_event_attr);
       pe.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
       pe.disabled = 1;
       pe.exclude_kernel = 1;
       pe.sample_period = 1000;
       pe.exclude_hv = 1;

       fd = perf_event_open(&pe, 0, -1, -1, 0); 
       if (fd == -1) {
          fprintf(stderr, "Error opening leader %llx\n", pe.config);
          exit(EXIT_FAILURE);
       }

       /* associate a buffer with the file */
       struct perf_event_mmap_page *mpage;
       mpage = (perf_event_mmap_page*) mmap(NULL,  perf_mmap_size(buf_size_shift),
        PROT_READ|PROT_WRITE, MAP_SHARED,
       fd, 0);
       if (mpage == (struct perf_event_mmap_page *)-1L) {
        close(fd);
        return -1;
       }

       ioctl(fd, PERF_EVENT_IOC_RESET, 0);
       ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

       printf("Measuring instruction count for this printf\n");
       long long sum = 0;
       for (long long i = 0; i < 10000000000; i++) {
           sum += i;
           if (i%1000000 == 0)
               cout << i << " : " << sum << endl;
       } 

       ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
       read(fd, &count, sizeof(long long));

       printf("Used %lld cs\n", count);

       close(fd);
}

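The code with type = PERF_TYPE_SOFTWARE and config = PERF_COUNT_SW_CONTEXT_SWITCHES prints a count of 0 even when context switches are forced, whereas other metrics work.

When I use the mmap ring buffer, I see PERF_RECORD_SWITCH records when reading it, which as far as I understand means context-switch events are being recorded.

Any information on how the perf counts and the data in the ring buffer are related would also be welcome.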

Best answer

The events are not counted because you excluded events from the kernel (exclude_kernel = 1;), and PERF_TYPE_SOFTWARE events are generally provided by the kernel.

If you remove exclude_kernel, the events are counted.
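As a rough sketch of the fix, the counter setup could look like the following. The helper names make_context_switch_attr and count_switches_while_sleeping are made up for this illustration, and the nanosleep loop is only a stand-in workload that should make the thread block and be switched out. Note that on systems where /proc/sys/kernel/perf_event_paranoid is above 1, opening an event that includes the kernel may fail with EACCES for an unprivileged process, in which case the helper returns -1.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>
#include <ctime>

// Counting-only attribute for context switches. exclude_kernel is left at 0
// on purpose, because the kernel is what generates this software event.
static perf_event_attr make_context_switch_attr() {
    perf_event_attr pe;
    std::memset(&pe, 0, sizeof(pe));
    pe.type = PERF_TYPE_SOFTWARE;
    pe.size = sizeof(pe);
    pe.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
    pe.disabled = 1;   // start stopped, enabled explicitly below
    pe.exclude_hv = 1;
    return pe;
}

// Hypothetical helper: counts context switches of the calling thread while
// it sleeps `naps` times for 1 ms. Returns -1 if the counter cannot be
// opened or read (for example because of missing permissions).
static long long count_switches_while_sleeping(int naps) {
    perf_event_attr pe = make_context_switch_attr();
    int fd = static_cast<int>(
        syscall(SYS_perf_event_open, &pe, 0, -1, -1, 0UL));
    if (fd == -1) return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    timespec ts{0, 1000000};  // 1 ms
    for (int i = 0; i < naps; ++i) nanosleep(&ts, nullptr);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    std::uint64_t count = 0;
    ssize_t n = read(fd, &count, sizeof(count));
    close(fd);
    return n == static_cast<ssize_t>(sizeof(count))
               ? static_cast<long long>(count)
               : -1;
}
```

Calling count_switches_while_sleeping(100) from main and printing the result should then show a non-zero count when the counter can be opened.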

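The link between the count and the events recorded in the ring buffer is sample_period. Your setting pe.sample_period = 1000; means that for every 1000 switch events, one PERF_RECORD_SAMPLE event is written to the ring buffer.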
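To make that relation concrete, here is a small sketch. Both function names are invented for this example, and the sample_type flags are an assumption (the question's code leaves sample_type at 0). The estimate ignores records lost to buffer overflow.

```cpp
#include <linux/perf_event.h>
#include <cstdint>
#include <cstring>

// Sampling attribute: one PERF_RECORD_SAMPLE per `period` context switches.
// The chosen sample_type (TID and TIME) is an assumption for illustration.
static perf_event_attr make_sampling_attr(std::uint64_t period) {
    perf_event_attr pe;
    std::memset(&pe, 0, sizeof(pe));
    pe.type = PERF_TYPE_SOFTWARE;
    pe.size = sizeof(pe);
    pe.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
    pe.sample_period = period;
    pe.sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_TIME;
    pe.disabled = 1;
    return pe;
}

// Rough number of sample records to expect for a given counter value:
// one record each time the period is reached.
static std::uint64_t expected_sample_records(std::uint64_t count,
                                             std::uint64_t period) {
    return period == 0 ? 0 : count / period;
}
```

With a count of 4500 and a period of 1000, this gives 4 sample records; a count below 1000 gives none, which is why a short run can show a non-zero count and still leave no samples in the buffer.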

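The following example of reading the buffer only illustrates the general approach. In practice, you need to handle events that wrap around the end of the buffer and do more consistency checks.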

   // Walk the records between data_tail and data_head (no wrap-around handling).
   auto tail = mpage->data_tail;
   const auto head = mpage->data_head;
   const auto size = mpage->data_size;
   // The data area starts one page after the metadata page.
   char* data = reinterpret_cast<char*>(mpage) + sysconf(_SC_PAGESIZE);
   int events = 0;
   while (tail < head) {
       auto event_header_p = reinterpret_cast<perf_event_header*>(data + (tail % size));
       std::cout << "event type: " << event_header_p->type << ", size: " << event_header_p->size << "\n";
       tail += event_header_p->size;  // each record carries its own length
       events++;
   }
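One possible way to handle a record that wraps around the end of the buffer is to copy it out in two pieces. This is only a sketch: ring_copy and record_types are hypothetical helpers, and `ring` stands for the data area after the metadata page. In real use you would also need a read barrier after loading data_head and would write the new tail back to data_tail.

```cpp
#include <linux/perf_event.h>
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `len` bytes starting at logical position `pos` out of a ring of
// `size` bytes, splitting the copy in two when it crosses the end.
static void ring_copy(const char* ring, std::uint64_t size, std::uint64_t pos,
                      void* out, std::size_t len) {
    const std::uint64_t start = pos % size;
    const std::size_t first = static_cast<std::size_t>(
        std::min<std::uint64_t>(len, size - start));
    std::memcpy(out, ring + start, first);
    std::memcpy(static_cast<char*>(out) + first, ring, len - first);
}

// Collect the type of every record between `tail` and `head`.
static std::vector<std::uint32_t> record_types(const char* ring,
                                               std::uint64_t size,
                                               std::uint64_t tail,
                                               std::uint64_t head) {
    std::vector<std::uint32_t> types;
    while (tail < head) {
        perf_event_header h;
        ring_copy(ring, size, tail, &h, sizeof(h));
        if (h.size == 0) break;  // consistency check: avoid looping forever
        types.push_back(h.type);
        tail += h.size;
    }
    return types;
}
```

Copying the header before looking at it also avoids reading a header that is itself split across the end of the buffer.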

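You should find a corresponding number of events of type PERF_RECORD_SAMPLE == 9 in the buffer (unless there was an overflow). If you want to read them, you need to cast the pointer to the appropriate struct. The actual layout of a PERF_RECORD_SAMPLE event, or any other event, depends on your perf_event_attr configuration and is documented in the perf_event_open man page.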
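For example, assuming sample_type is set to exactly PERF_SAMPLE_TID | PERF_SAMPLE_TIME (which the question's code does not do; it leaves sample_type at 0, so its sample records carry only the header), a sample record could be read roughly like this. TidTimeSample and parse_tid_time_sample are names made up for this sketch.

```cpp
#include <linux/perf_event.h>
#include <cstdint>
#include <cstring>

// Layout of a PERF_RECORD_SAMPLE under the assumption that
// sample_type == PERF_SAMPLE_TID | PERF_SAMPLE_TIME and nothing else.
// Fields appear in the order listed in the perf_event_open man page.
struct TidTimeSample {
    perf_event_header header;
    std::uint32_t pid;
    std::uint32_t tid;
    std::uint64_t time;
};

// Copies the record into `out` if it is a sample that is large enough.
// Returns false for any other record type or a too-short record.
static bool parse_tid_time_sample(const void* record, TidTimeSample* out) {
    perf_event_header h;
    std::memcpy(&h, record, sizeof(h));
    if (h.type != PERF_RECORD_SAMPLE || h.size < sizeof(TidTimeSample)) {
        return false;
    }
    std::memcpy(out, record, sizeof(TidTimeSample));
    return true;
}
```

Using memcpy instead of a plain pointer cast keeps the read well-defined regardless of alignment. With other sample_type flags the struct would need different fields.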

Regarding "c++ - How do I get the PERF_COUNT_SW_CONTEXT_SWITCHES config in perf_event_open() to work?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57709699/
