I uploaded the following input (testing in the Azure portal):
[
{"engineid":"engine001","eventtime":1,"tmp":19.3,"hum":0.22},
{"engineid":"engine001","eventtime":2,"tmp":19.7,"hum":0.21},
{"engineid":"engine002","eventtime":3,"tmp":20.4,"hum":0.25},
{"engineid":"engine001","eventtime":4,"tmp":19.6,"hum":0.24}
]
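I then tried to group the records so that I get the last 2 rows for each engine. As the sample shows, there are only 2 distinct engines, so I expected the output to contain two records, each holding the ranked rows for one engine, but I got 4 output records.
Here is my query: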
-- Taking relevant fields from the input stream
WITH RelevantTelemetry AS
(
SELECT engineid, tmp, hum, eventtime
FROM [engine-telemetry]
WHERE engineid IS NOT NULL
),
-- Grouping by engineid in TimeWindows
TimeWindows AS
(
SELECT engineid,
CollectTop(2) OVER (ORDER BY eventtime DESC) as TimeWindow
FROM
[RelevantTelemetry]
WHERE engineid IS NOT NULL
GROUP BY SlidingWindow(hour, 24), engineid
)
--Output timewindows for verification purposes
SELECT TimeWindow
INTO debug
FROM TimeWindows
I have tried adding a TIMESTAMP BY clause, changing the order of the GROUP BY terms, and so on, but I still get the 4 records below instead of the 2 I expect.
Any ideas?
[
{"TimeWindow":
[
{"rank":1,"value":{"engineid":"engine001","tmp":0.0003,"hum":-0.0002,"eventtime":1}}
]},
{"TimeWindow":
[
{"rank":1,"value":{"engineid":"engine001","tmp":-0.0019,"hum":-0.0002,"eventtime":4}},
{"rank":2,"value":{"engineid":"engine001","tmp":-0.0026,"hum":-0.0002,"eventtime":2}},
{"rank":3,"value":{"engineid":"engine001","tmp":0.0003,"hum":-0.0002,"eventtime":1}}
]},
{"TimeWindow":
[
{"rank":1,"value":{"engineid":"engine002","tmp":0.0017,"hum":0.0003,"eventtime":3}}
]},
{"TimeWindow":
[
{"rank":1,"value":{"engineid":"engine001","tmp":-0.0019,"hum":-0.0002,"eventtime":4}},
{"rank":2,"value":{"engineid":"engine001","tmp":-0.0026,"hum":-0.0002,"eventtime":2}}
]}
]
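Best answer
As @SteveZhao suggested, you need to use GROUP BY TumblingWindow(hour, 24), engineid
instead of GROUP BY SlidingWindow(hour, 24), engineid. A sliding window produces output every time an event enters or leaves the window, so a single engine can emit several results; a tumbling window emits one result per group per window.
For more information, see: https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions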
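Applied to the query from the question, the change might look like the sketch below. It assumes the same input alias ([engine-telemetry]) and output alias (debug) as the original, and that all test events fall inside a single 24-hour window. The only functional change is the window function. The second WHERE clause is dropped because the first step already filters out NULL engine IDs.

```sql
-- Taking relevant fields from the input stream
WITH RelevantTelemetry AS
(
    SELECT engineid, tmp, hum, eventtime
    FROM [engine-telemetry]
    WHERE engineid IS NOT NULL
),
-- One result per engine per 24-hour tumbling window
TimeWindows AS
(
    SELECT engineid,
           CollectTop(2) OVER (ORDER BY eventtime DESC) AS TimeWindow
    FROM RelevantTelemetry
    GROUP BY TumblingWindow(hour, 24), engineid
)
-- Output time windows for verification purposes
SELECT TimeWindow
INTO debug
FROM TimeWindows
```

With the four sample events this should give one record for engine001 (its two most recent rows) and one for engine002, provided the events are not split across a window boundary.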
Source: Stack Overflow question "azure - CollectTop returns more rows than I expected in Azure Stream Analytics": https://stackoverflow.com/questions/63267072/