I am running an MPI job and I get this warning message:
[comet-05-08.sdsc.edu:mpi_rank_10][async_thread] Got unknown event 17 ... continuing ...
I am using MVAPICH 2.1, compiled with icc (ICC) 15.0.2 20150121.
What does this message mean? Is it harmful?
Best answer
From this mailing list:
this error message is being printed by the asynchronous progress thread because of receiving an IBV_EVENT_CLIENT_REREGISTER event (event #17).
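You are advised to update to the latest version. The mail I linked to suggests MVAPICH2 1.4 (newer than yours), although that mail dates from 2009.
The code that probably generates the message is: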
switch (event.event_type) {
    ...
    break;
default:
    NEM_IB_ERR("Got unknown event %d ... continuing ...",
               event.event_type);
}
The full code can be found here.
As stated in the comments section:
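To make the role of the default branch concrete, here is a minimal, self-contained sketch of the same pattern. It is not the MVAPICH2 source: the enum and the function name `sketch_handle_event` are hypothetical, and the numeric codes are assumed to mirror `enum ibv_event_type` from libibverbs (only the value 17 is confirmed by the mail quoted above). It shows that once event 17 has its own case, it no longer falls through to the "unknown event" message.

```c
#include <stdio.h>

/* Hypothetical local mirror of a few libibverbs async event codes.
 * Real code would include <infiniband/verbs.h> and use enum ibv_event_type;
 * the values 9 and 10 are an assumption, 17 comes from the quoted mail. */
enum sketch_event_type {
    SKETCH_EVENT_PORT_ACTIVE       = 9,
    SKETCH_EVENT_PORT_ERR          = 10,
    SKETCH_EVENT_CLIENT_REREGISTER = 17
};

/* Writes a description of the event into msg.
 * Returns 1 if the event was recognized, 0 if it hit the default branch. */
int sketch_handle_event(int event_type, char *msg, size_t len)
{
    switch (event_type) {
    case SKETCH_EVENT_PORT_ACTIVE:
        snprintf(msg, len, "port active");
        return 1;
    case SKETCH_EVENT_PORT_ERR:
        snprintf(msg, len, "port error");
        return 1;
    case SKETCH_EVENT_CLIENT_REREGISTER:
        snprintf(msg, len, "client reregister requested by the SM");
        return 1;
    default:
        /* Same wording as the warning in the question. */
        snprintf(msg, len, "Got unknown event %d ... continuing ...",
                 event_type);
        return 0;
    }
}
```

Calling `sketch_handle_event(17, buf, sizeof buf)` with a `char buf[128]` takes the dedicated case, while any code without a case (for example 99) reproduces the warning text.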
IBV_EVENT_CLIENT_REREGISTER
The SM requests that the client will reregister to all subscriptions previously requested from this port, for example (but not limited to) join a multicast group. This event may be generated when the SM suffered from a failure, which caused it to lose his records or when there is new SM in the subnet.
This event will be generated by the device only if the bit that indicates that client reregister is supported is set in port_attr.port_cap_flags.
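The capability check described in that quote can be sketched as a simple bit test. In real code `port_cap_flags` would come from a `struct ibv_port_attr` filled in by `ibv_query_port()`. The snippet below avoids the libibverbs dependency so it compiles anywhere: the macro `SKETCH_PORT_CLIENT_REG_SUP` and the function name are hypothetical, and the bit position is an assumption meant to mirror `IBV_PORT_CLIENT_REG_SUP`; check it against your own `<infiniband/verbs.h>`.

```c
#include <stdint.h>

/* Assumed to mirror IBV_PORT_CLIENT_REG_SUP from <infiniband/verbs.h>.
 * Verify the bit position against your installed headers. */
#define SKETCH_PORT_CLIENT_REG_SUP (UINT32_C(1) << 25)

/* Returns 1 if the port advertises client reregister support, else 0.
 * port_cap_flags is the value a real program would read from
 * struct ibv_port_attr after calling ibv_query_port(). */
int sketch_supports_client_reregister(uint32_t port_cap_flags)
{
    return (port_cap_flags & SKETCH_PORT_CLIENT_REG_SUP) != 0;
}
```

If this returns 0 for a port, the quoted description implies the device would not generate event 17 on that port at all.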
I would not be comfortable with this event, so if I were you I would update. If the problem persists, I would contact the MVAPICH2 people.
Regarding multithreading - What does "Got unknown event 17 ... continuing ..." mean for MPI, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/37979144/