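Background
I have been trying to write a reliable timer in Python (3.7) with a resolution of at least one microsecond. The goal is to run specific tasks at fixed intervals, continuously, over a long period of time.
After some research, I settled on perf_counter_ns, because it was more consistent and showed better resolution in my tests than the alternatives (monotonic_ns, time_ns, process_time_ns, and thread_time_ns). Details can be found in the time module documentation and in PEP 564.
Test
To check the precision (and accuracy) of perf_counter_ns, I set up a test that collects the delays between consecutive timestamps, as shown below.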
import time
import statistics as stats
# import resource

def practical_res_test(clock_timer_ns, count, expected_res):
    counter = 0
    diff = 0
    timestamp = clock_timer_ns()  # initial timestamp
    diffs = []
    while counter < count:
        new_timestamp = clock_timer_ns()
        diff = new_timestamp - timestamp
        if diff > 0:
            diffs.append(diff)
            timestamp = new_timestamp
            counter += 1
    print('Mean: ', stats.mean(diffs))
    print('Mode: ', stats.mode(diffs))
    print('Min: ', min(diffs))
    print('Max: ', max(diffs))
    outliers = list(filter(lambda diff: diff >= expected_res, diffs))
    print('Outliers Total: ', len(outliers))

if __name__ == '__main__':
    count = 10000000
    # ideally, resolution of at least 1 us is expected
    # but let's just do 10 us for the sake of this test
    expected_res = 10000
    practical_res_test(time.perf_counter_ns, count, expected_res)
    # other method benchmarks
    # practical_res_test(time.time_ns, count, expected_res)
    # practical_res_test(time.process_time_ns, count, expected_res)
    # practical_res_test(time.thread_time_ns, count, expected_res)
    # practical_res_test(
    #     lambda: int(resource.getrusage(resource.RUSAGE_SELF).ru_stime * 10**9),
    #     count,
    #     expected_res
    # )
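Issue and question
Question: why are there occasional large jumps between timestamps? Several runs of 10,000,000 samples on my Raspberry Pi 3 Model B V1.2 gave similar results; one of them is shown below (all times are in nanoseconds):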
Mean: 2440.1013097
Mode: 2396
Min: 1771
Max: 1450832 # huge skip as I mentioned
Outliers Total: 8724 # delays that are more than 10 us
Another run, on my Windows desktop:
Mean: 271.05812 # higher end machine - better resolution
Mode: 200
Min: 200
Max: 30835600 # but there're still skips, even more significant
Outliers Total: 49021
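While I understand that resolution differs between systems, it is easy to see that the resolution in my tests is far worse than the figures reported in PEP 564. More importantly, occasional jumps are observed.
If you have any insight into why this happens, please let me know. Is it related to my test, or is perf_counter_ns bound to fail in this kind of use case? If so, do you have suggestions for a better solution? Let me know if I need to provide any other information.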
Additional information
For completeness, here is the clock information from time.get_clock_info().
On my Raspberry Pi:
Clock: perf_counter
Adjustable: False
Implementation: clock_gettime(CLOCK_MONOTONIC)
Monotonic: True
Resolution(ns): 1
On my Windows desktop:
Clock: perf_counter
Adjustable: False
Implementation: QueryPerformanceCounter()
Monotonic: True
Resolution(ns): 100
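Listings like the two above can be produced with time.get_clock_info(). The following is a minimal sketch, not necessarily the code that was used here: describe_clocks is a hypothetical helper name, and the list of clock names is an assumption based on the clocks mentioned earlier. Clocks that the platform does not support are skipped.

```python
import time

def describe_clocks(names=("perf_counter", "monotonic", "time",
                           "process_time", "thread_time")):
    """Return {clock name: info dict} for every clock this platform supports."""
    report = {}
    for name in names:
        try:
            info = time.get_clock_info(name)
        except ValueError:
            # The clock is not available on this platform; skip it.
            continue
        report[name] = {
            "adjustable": info.adjustable,
            "implementation": info.implementation,
            "monotonic": info.monotonic,
            # info.resolution is a float in seconds; convert to nanoseconds.
            "resolution_ns": round(info.resolution * 1e9),
        }
    return report

if __name__ == "__main__":
    for name, fields in describe_clocks().items():
        print("Clock:", name)
        print("Adjustable:", fields["adjustable"])
        print("Implementation:", fields["implementation"])
        print("Monotonic:", fields["monotonic"])
        print("Resolution(ns):", fields["resolution_ns"])
```

Note that the reported resolution is what the underlying clock advertises, not the delay that can actually be achieved between two consecutive calls from Python.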
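It is also worth mentioning that I am aware of time.sleep(), but from my tests and for my use case it is not particularly reliable, as others have already discussed here.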
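Best answer
If you plot the list of time differences, you will see a fairly low baseline with spikes that grow larger over time.
This is caused by the append() operation, which occasionally has to reallocate the underlying array (that is how Python lists are implemented). Preallocating the array improves the results: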
import time
import statistics as stats
import gc
import matplotlib.pyplot as plt

def practical_res_test(clock_timer_ns, count, expected_res):
    counter = 0
    # preallocate so that no append() reallocation happens inside the loop
    diffs = [0] * count
    # keep garbage-collector pauses out of the measurement
    gc.disable()
    timestamp = clock_timer_ns()  # initial timestamp
    while counter < count:
        new_timestamp = clock_timer_ns()
        diff = new_timestamp - timestamp
        if diff > 0:
            diffs[counter] = diff
            timestamp = new_timestamp
            counter += 1
    gc.enable()
    print('Mean: ', stats.mean(diffs))
    print('Mode: ', stats.mode(diffs))
    print('Min: ', min(diffs))
    print('Max: ', max(diffs))
    outliers = list(filter(lambda diff: diff >= expected_res, diffs))
    print('Outliers Total: ', len(outliers))
    plt.plot(diffs)
    plt.show()

if __name__ == '__main__':
    count = 10000000
    # ideally, resolution of at least 1 us is expected
    # but let's just do 10 us for the sake of this test
    expected_res = 10000
    practical_res_test(time.perf_counter_ns, count, expected_res)
These are the results I got:
Mean: 278.6002
Mode: 200
Min: 200
Max: 1097700
Outliers Total: 3985
For comparison, these are the results of the original code on my system:
Mean: 333.92254
Mode: 300
Min: 200
Max: 50507300
Outliers Total: 2590
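The reallocation caused by append() can be observed directly. The following is a minimal sketch, assuming CPython, where sys.getsizeof() on a list reflects the capacity currently allocated for it. count_reallocations is a hypothetical helper name and is not part of the code above.

```python
import sys

def count_reallocations(n):
    """Append n items to an empty list and count how often its allocated size changes."""
    items = []
    last_size = sys.getsizeof(items)
    reallocations = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The list outgrew its capacity and was given a larger block.
            reallocations += 1
            last_size = size
    return reallocations

if __name__ == "__main__":
    n = 100000
    print("appends:", n)
    print("reallocations:", count_reallocations(n))
```

Only a small fraction of the appends triggers a reallocation, but each one may copy the whole array, so its cost grows with the list. That is consistent with spikes that become larger over time. A list created as [0] * n is allocated once and never resized by item assignment.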
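For better performance you would probably need to run on Linux and use SCHED_FIFO. But always keep in mind that real-time tasks with microsecond precision are not done in Python. If your problem is soft real-time you may get away with it, but that depends entirely on the penalty for missing a deadline and on how well you understand the time complexity of both your code and the Python interpreter.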
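As a rough illustration of the SCHED_FIFO suggestion, the sketch below asks the Linux kernel to schedule the current process under that policy using the os module. It is a minimal sketch under stated assumptions: try_enable_fifo is a hypothetical helper name, the lowest valid priority is used by default, and the call normally requires root or the CAP_SYS_NICE capability. On platforms without these functions it simply reports failure.

```python
import os

def try_enable_fifo(priority=None):
    """Try to switch the current process to SCHED_FIFO; return True on success."""
    if not hasattr(os, "sched_setscheduler") or not hasattr(os, "SCHED_FIFO"):
        # Scheduling policies are not exposed on this platform (e.g. Windows).
        return False
    if priority is None:
        # Assumption: the lowest real-time priority is enough for a test run.
        priority = os.sched_get_priority_min(os.SCHED_FIFO)
    try:
        # pid 0 means "the calling process".
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically PermissionError when run without sufficient privileges.
        return False
    return True
```

Calling try_enable_fifo() once before the measurement loop and checking its return value is enough to see whether the policy was applied. A busy loop running under SCHED_FIFO can starve other processes on the same core, so it is best tried on a machine where that is acceptable.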
Source: a similar question on Stack Overflow, "python - Why is the resolution of Python's time functions (such as perf_counter_ns) inconsistent?": https://stackoverflow.com/questions/57248652/