json - U-SQL script for JSON transformation

Tags: json azure azure-data-lake azure-data-factory u-sql

I have a multi-level JSON file that I need to transform in Azure Data Lake Analytics, extracting all the data from the "vin" level (inclusive) downward.

A trimmed excerpt of the JSON:

"vehicleStatusResponse": {
    "vehicleStatuses": [
        {
            "vin": "ABC1234567890",
            "triggerType": {
                "triggerType": "TIMER",
                "context": "RFMS",
                "driverId": {
                    "tachoDriverIdentification": {
                        "driverIdentification": "123456789",
                        "cardIssuingMemberState": "BRA",
                        "driverAuthenticationEquipment": "CARD",
                        "cardReplacementIndex": "0",
                        "cardRenewalIndex": "1"
                    }
                }
            },
            "receivedDateTime": "2020-02-12T04:11:19.221Z",
            "hrTotalVehicleDistance": 103306960,
            "totalEngineHours": 3966.6216666666664,
            "driver1Id": {
                "tachoDriverIdentification": {
                    "driverIdentification": "BRA1234567"
                }
            },
            "engineTotalFuelUsed": 48477520,
            "accumulatedData": {
                "durationWheelbaseSpeedOverZero": 8309713,
                "distanceCruiseControlActive": 8612200,
                "durationCruiseControlActive": 366083,
                "fuelConsumptionDuringCruiseActive": 3064170,
                "durationWheelbaseSpeedZero": 5425783,
                "fuelWheelbaseSpeedZero": 3332540,
                "fuelWheelbaseSpeedOverZero": 44709670,
                "ptoActiveClass": [
                    {
                        "label": "wheelbased speed >0",
                        "seconds": 16610,
                        "meters": 29050,
                        "milliLitres": 26310
                    },
                    {
                        "label": "wheelbased speed =0",
                        "seconds": 457344,
                        "milliLitres": 363350

This is the script I am using for this case:

CREATE ASSEMBLY IF NOT EXISTS [Newtonsoft.Json] FROM "adl://#####.azuredatalakestore.net/Newtonsoft.Json.dll";
CREATE ASSEMBLY IF NOT EXISTS [Microsoft.Analytics.Samples.Formats] FROM "adl://#####.azuredatalakestore.net/Microsoft.Analytics.Samples.Formats.dll";


REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

DECLARE @InputFile string = "response.json";
DECLARE @OutputFile string = "response.csv";


@json =
EXTRACT
    vin string,
    triggerType string,
    driverIdentification string,
    receivedDateTime DateTime,
    hrTotalVehicleDistance int,
    totalEngineHours float,
    engineTotalFuelUsed int,
    durationWheelbaseSpeedOverZero int,
    distanceCruiseControlActive int,
    durationCruiseControlActive int,
    fuelConsumptionDuringCruiseActive int,
    durationWheelbaseSpeedZero int,
    fuelWheelbaseSpeedZero int,
    ptoActiveClass string,
    label string,
    seconds int,
    meters int,
    millilitres int


FROM
    @InputFile
USING new MultiLevelJsonExtractor("vehicleStatusResponse.vehicleStatuses.vin[*]",
    true,
    "vin",
    "triggerType",
    "driverIdentification",
    "receivedDateTime",
    "hrTotalVehicleDistance",
    "totalEngineHours",
    "engineTotalFuelUsed",
    "durationWheelbaseSpeedOverZero",
    "distanceCruiseControlActive",
    "durationCruiseControlActive",
    "fuelConsumptionDuringCruiseActive",
    "durationWheelbaseSpeedZero",
    "fuelWheelbaseSpeedZero", 
    "ptoActiveClass",
    "label",
    "seconds",
    "meters",
    "millilitres"
    );
@new =
SELECT
    vin,
    triggerType,
    driverIdentification,
    receivedDateTime,
    hrTotalVehicleDistance,
    totalEngineHours,
    engineTotalFuelUsed,
    durationWheelbaseSpeedOverZero,
    distanceCruiseControlActive,
    durationCruiseControlActive,
    fuelConsumptionDuringCruiseActive,
    durationWheelbaseSpeedZero,
    fuelWheelbaseSpeedZero,
    ptoActiveClass,
    label,
    seconds,
    meters,
    millilitres
FROM @json;
OUTPUT @new
TO @OutputFile
USING Outputters.Csv();

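Whatever I try, I only get an empty response.csv with no data. What is wrong with my script? If you know another way to transform hierarchical JSON data, I would be interested in that too.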

Best answer

You are not extracting the JSON correctly. You need to use the extractor like this:

FROM
    @InputFile
USING new MultiLevelJsonExtractor("vehicleStatusResponse.vehicleStatuses.vin[*]",
    true,
    "tachoDriverIdentification.driverIdentification"
    );

You can read more about advanced JSON manipulation with U-SQL here.
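To make the idea concrete, here is a minimal sketch of a fuller script. It is not taken from the original answer and rests on several assumptions. In the sample JSON, "vin" is a plain string inside each element of "vehicleStatuses", so the sketch points the row path at the array elements ("vehicleStatusResponse.vehicleStatuses[*]") rather than at "vin". It also assumes the complete file wraps the excerpt in an outer object. As far as I recall, MultiLevelJsonExtractor matches each JSON path to the EXTRACT column at the same position and returns nested objects or arrays as JSON text when the column type is string. The rowset name @classes, the aliases c(item) and cls, and the reduced column list are chosen for illustration only.

```usql
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

DECLARE @InputFile string = "response.json";
DECLARE @OutputFile string = "response.csv";

// One row per element of the vehicleStatuses array.
// Assumption: each path below fills the EXTRACT column at the same position.
@json =
    EXTRACT vin string,
            triggerType string,
            driverIdentification string,
            hrTotalVehicleDistance long,
            ptoActiveClass string   // nested array, kept as raw JSON text
    FROM @InputFile
    USING new MultiLevelJsonExtractor(
        "vehicleStatusResponse.vehicleStatuses[*]",
        true,
        "vin",
        "triggerType.triggerType",
        "driver1Id.tachoDriverIdentification.driverIdentification",
        "hrTotalVehicleDistance",
        "accumulatedData.ptoActiveClass");

// Turn each element of the ptoActiveClass array into its own row.
@classes =
    SELECT vin,
           hrTotalVehicleDistance,
           JsonFunctions.JsonTuple(item) AS cls
    FROM @json
         CROSS APPLY
             EXPLODE(JsonFunctions.JsonTuple(ptoActiveClass).Values) AS c(item);

// Values come back as strings; type conversions are left out of this sketch.
@new =
    SELECT vin,
           hrTotalVehicleDistance,
           cls["label"] AS label,
           cls["seconds"] AS seconds,
           cls["milliLitres"] AS milliLitres
    FROM @classes;

OUTPUT @new
TO @OutputFile
USING Outputters.Csv(outputHeader : true);
```

Note that the key in the sample data is spelled "milliLitres", while the script in the question uses "millilitres". If the lookup is case-sensitive, which I would expect but have not verified, that mismatch alone could leave the column empty.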

Regarding "json - U-SQL script for JSON transformation", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61439034/
