
azure-blob-storage - Custom timestamp is not respected in the Stream Analytics blob path

Reposted. Author: 行者123. Updated: 2023-12-04 13:37:22

Given a query like the following:

SELECT
EventDate,
system.Timestamp as test
INTO
[azuretableoutput]
FROM
[csvdata] TIMESTAMP BY EventDate

According to the documentation, EventDate should now be used as the timestamp.
However, when storing data to blob storage with this path pattern:
sadata/Y={datetime:yyyy}/M={datetime:MM}/D={datetime:dd}

I still seem to get the ingestion time. In my case the ingestion time is meaningless, and I need EventDate to drive the path. Is this possible?

When inspecting the data in Visual Studio, test and EventDate should be equal, but the result looks like this:
EventDate                   ;Test
2020-04-03T11:13:07.3670000Z;2020-04-09T02:16:15.5390000Z
2020-04-03T11:13:07.0460000Z;2020-04-09T02:16:15.5390000Z
2020-04-03T11:13:07.0460000Z;2020-04-09T02:16:15.5390000Z
2020-04-03T11:13:07.3670000Z;2020-04-09T02:16:15.5390000Z
2020-04-03T11:13:08.1470000Z;2020-04-09T02:16:15.5390000Z

The late arrival tolerance window is set to 99:23:59:59.
The out-of-order tolerance is set to 00:00:00:00, and the out-of-order action is set to Adjust.

When running the same query in Stream Analytics on Azure, I get the following result:
[{"eventdate":"2020-04-03T11:13:20.1060000Z","test":"2020-04-03T11:13:20.1060000Z"},
{"eventdate":"2020-04-03T11:13:20.1060000Z","test":"2020-04-03T11:13:20.1060000Z"},
{"eventdate":"2020-04-03T11:13:20.1060000Z","test":"2020-04-03T11:13:20.1060000Z"}]

So far so good. However, when the query runs against live data on Azure, it produces this path:
Y=2020/M=04/D=09

It should produce a path like this:
Y=2020/M=04/D=03
Interestingly, when inspecting the data actually stored in blob storage, I found:
EventDate,test
2020-04-03T11:20:39.3100000Z,2020-04-09T19:33:35.3870000Z,
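
To make the symptom concrete, here is a minimal Python sketch of how a path pattern like the one above resolves for a given timestamp. It is only an illustration, not how Stream Analytics implements path rendering: `render_path` and the specifier table are hypothetical helpers, and only the three specifiers used in this question are handled. The two timestamps are the EventDate and test values from the stored row above.

```python
import re
from datetime import datetime, timezone

# Hypothetical mapping from the .NET-style specifiers used in the question
# to strftime codes. Only yyyy, MM and dd are covered in this sketch.
_SPECIFIERS = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def render_path(pattern: str, ts: datetime) -> str:
    """Replace each {datetime:<specifier>} token with the matching part of ts."""
    def substitute(match: re.Match) -> str:
        return ts.strftime(_SPECIFIERS[match.group(1)])

    return re.sub(r"\{datetime:(\w+)\}", substitute, pattern)


pattern = "sadata/Y={datetime:yyyy}/M={datetime:MM}/D={datetime:dd}"

# Values taken from the stored row: EventDate and the test (System.Timestamp) column.
event_date = datetime(2020, 4, 3, 11, 20, 39, 310000, tzinfo=timezone.utc)
system_ts = datetime(2020, 4, 9, 19, 33, 35, 387000, tzinfo=timezone.utc)

print(render_path(pattern, event_date))  # sadata/Y=2020/M=04/D=03
print(render_path(pattern, system_ts))   # sadata/Y=2020/M=04/D=09
```

The observed path (D=09) matches the second call, which is consistent with the path being derived from the adjusted System.Timestamp rather than from EventDate.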

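System.Timestamp seems to be set to EventDate only when the query is tested against sampled data; when the job runs normally and receives data, it is not actually changed.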

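I have tested with late arrival set to both 0 days and 20 days. In fact, I need to disable the late arrival adjustment altogether, because I may receive events through the pipeline that are many years old.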

Best answer

This issue was raised and closed on the MicrosoftDocs GitHub.

Microsoft's response:

The maximum late arrival window is 20 days, so if the policy is set to 99:23:59:59 (99 days), the adjustment could be causing a discrepancy in System.Timestamp.

By definition of late arrival tolerance window, for each incoming event, Azure Stream Analytics compares the event time with the arrival time; if the event time is outside of the tolerance window, you can configure the system to either drop the event or adjust the event’s time to be within the tolerance.

Consider that after watermarks are generated, the service can potentially receive events with event time lower than the watermark. You can configure the service to either drop those events, or adjust the event’s time to the watermark value.

As a part of the adjustment, the event’s System.Timestamp is set to the new value, but the event time field itself is not changed. This adjustment is the only situation where an event’s System.Timestamp can be different from the value in the event time field, and may cause unexpected results to be generated.

For more information, please see Understand time handling in Azure Stream Analytics.

Unfortunately, testing with sample data in Azure portal doesn't take policies into account at this time.
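
The late arrival adjustment described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the service's actual implementation: `effective_timestamp` is a hypothetical helper, the adjusted value is assumed to be the arrival time minus the tolerance, and early-arriving events and watermarks are not modeled. For illustration only, the arrival time is assumed to equal the test value seen in the stored blob row.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def effective_timestamp(
    event_time: datetime,
    arrival_time: datetime,
    tolerance: timedelta,
    action: str = "adjust",
) -> Optional[datetime]:
    """Return the timestamp an event would carry after the late-arrival check.

    Assumption: an event older than (arrival_time - tolerance) is either
    dropped (None) or moved forward to that boundary. The event's own
    time field is never modified, only the returned timestamp differs.
    """
    earliest_allowed = arrival_time - tolerance
    if event_time >= earliest_allowed:
        return event_time  # inside the window: timestamp equals event time
    if action == "drop":
        return None
    return earliest_allowed  # outside the window: adjusted timestamp


event_time = datetime(2020, 4, 3, 11, 20, 39, 310000, tzinfo=timezone.utc)
# Assumed arrival time, taken from the stored "test" column for illustration.
arrival_time = datetime(2020, 4, 9, 19, 33, 35, 387000, tzinfo=timezone.utc)

# Zero tolerance: the event is about six days late, so it is moved to the arrival time.
print(effective_timestamp(event_time, arrival_time, timedelta(0)))
# 20-day tolerance: the event is inside the window and keeps its event time.
print(effective_timestamp(event_time, arrival_time, timedelta(days=20)))
```

Under this model, an event keeps EventDate as its timestamp only while it is no older than the tolerance. Because the tolerance is capped at 20 days, events that are years old would always be adjusted or dropped.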



Other resources that may be useful:
  • System.Timestamp()
  • TIMESTAMP BY
  • Event ordering policies
  • Time handling
  • Job monitoring

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/61011072/