python - 光束 : AfterProcessingTime cause 'NoneType' object has no attribute 'time'

标签 python google-cloud-dataflow apache-beam

在 beam 2.14.0

下有以下代码
| "FixedWindow" >> beam.WindowInto(beam.window.FixedWindows(4 * 60),
                      trigger=beam.trigger.Repeatedly(
                              beam.trigger.AfterProcessingTime(delay=1 * 60)
                          ),
                      accumulation_mode=beam.trigger.AccumulationMode.DISCARDING)

出现如下错误

Traceback (most recent call last):
  File "beam_home.py", line 287, in <module>
    run()
  File "beam_home.py", line 282, in run
    p.run().wait_until_finish()
  File "/usr/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 406, in run
    self._options).run(False)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 419, in run
    return self.runner.run_pipeline(self, self._options)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/direct/direct_runner.py", line 128, in run_pipeline
    return runner.run_pipeline(pipeline, options)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 294, in run_pipeline
    default_environment=self._default_environment))
  File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 301, in run_via_runner_api
    return self.run_stages(stage_context, stages)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 383, in run_stages
    stage_context.safe_coders)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 655, in _run_stage
    result, splits = bundle_manager.process_bundle(data_input, data_output)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 1460, in process_bundle
    process_bundle_id, transform_id, elements)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 1356, in _send_input_to_worker
    for byte_stream in byte_streams:
  File "/usr/local/lib/python2.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 186, in __iter__
    for wkvs in windowed_key_values(key, windowed_values):
  File "/usr/local/lib/python2.7/site-packages/apache_beam/transforms/trigger.py", line 966, in process_entire_key
    state, windowed_values, output_watermark):
  File "/usr/local/lib/python2.7/site-packages/apache_beam/transforms/trigger.py", line 1130, in process_elements
    self.trigger_fn.on_element(value, window, context)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/transforms/trigger.py", line 515, in on_element
    self.underlying.on_element(element, window, context)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/transforms/trigger.py", line 373, in on_element
    self.early.on_element(element, window, NestedContext(context, 'early'))
  File "/usr/local/lib/python2.7/site-packages/apache_beam/transforms/trigger.py", line 515, in on_element
    self.underlying.on_element(element, window, context)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/transforms/trigger.py", line 307, in on_element
    '', TimeDomain.REAL_TIME, context.get_current_time() + self.delay)
  File "/usr/local/lib/python2.7/site-packages/apache_beam/transforms/trigger.py", line 759, in get_current_time
    return self._outer.get_current_time()
  File "/usr/local/lib/python2.7/site-packages/apache_beam/transforms/trigger.py", line 733, in get_current_time
    return self._clock.time()
AttributeError: 'NoneType' object has no attribute 'time'

我错过了什么吗?

最佳答案

看起来你没有遗漏任何东西。 这显然是一个已知问题。请看BEAM-5132 .

我认为最好的解决方法是避免使用 AfterProcessingTime,这是根本原因。这很烦人,但您可以在 ParDo 类中模仿它的效果。

关于python - 光束 : AfterProcessingTime cause 'NoneType' object has no attribute 'time' ,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57676311/

相关文章:

java - @OnTimer 在窗口后不触发

java - GCP 数据流 - SSLHandshakeException

google-cloud-dataflow - 诊断失败的 Cloud Dataflow 流水线

python - 关闭到 Google 云存储的 SSL

python - GMM - 对数似然不是单调的

python - 如何使用 'add_value_provider_argument'初始化运行时参数?

python - 使用 Python SDK 在 Spark 上运行 Apache Beam wordcount 管道时并行度低

apache-beam - 使用 tensorflow transform 的 writeTransform 函数时出错

python - 导入错误:没有名为 cStringIO 的模块

python - Python 3.5.3 中相同的字符串怎么可能是不同的对象?