avro - Can I split an Apache Avro schema across multiple files?

Tags: avro

I can write the following schema:

{
    "type": "record",
    "name": "Foo",
    "fields": [
        {"name": "bar", "type": {
            "type": "record",
            "name": "Bar",
            "fields": [ ]
        }}
    ]
}

This works fine, but suppose I want to split the schema across two files, for example:
{
    "type": "record",
    "name": "Foo",
    "fields": [
        {"name": "bar", "type": "Bar"}
    ]
}

{
    "type": "record",
    "name": "Bar",
    "fields": [ ]
}

Does Avro have the capability to do this?

Best answer

Yes, it's possible.

I've done this in a Java project by declaring the common schema files as imports in the avro-maven-plugin configuration.
Example:

search_result.avro:

{"namespace": "com.myorg.other",
 "type": "record",
 "name": "SearchResult",
 "fields": [
     {"name": "type", "type": "SearchResultType"},
     {"name": "keyWord",  "type": "string"},
     {"name": "searchEngine", "type": "string"},
     {"name": "position", "type": "int"},
     {"name": "userAction", "type": "UserAction"}
 ]
}

search_suggest.avro:
{"namespace": "com.myorg.other",
 "type": "record",
 "name": "SearchSuggest",
 "fields": [
     {"name": "suggest", "type": "string"},
     {"name": "request",  "type": "string"},
     {"name": "searchEngine", "type": "string"},
     {"name": "position", "type": "int"},
     {"name": "userAction", "type": "UserAction"},
     {"name": "timestamp", "type": "long"}
 ]
}

user_action.avro:
{"namespace": "com.myorg.other",
 "type": "enum",
 "name": "UserAction",
 "symbols": ["S", "V", "C"]
}

search_result_type.avro:
{"namespace": "com.myorg.other",
 "type": "enum",
 "name": "SearchResultType",
 "symbols": ["O", "S", "A"]
}

avro-maven-plugin configuration:
<plugin>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-maven-plugin</artifactId>
    <version>1.7.4</version>
    <executions>
        <execution>
            <phase>generate-sources</phase>
            <goals>
                <goal>schema</goal>
            </goals>
            <configuration>
                <sourceDirectory>${project.basedir}/src/main/resources/avro</sourceDirectory>
                <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
                <includes>
                    <include>**/*.avro</include>
                </includes>
                <imports>
                    <import>${project.basedir}/src/main/resources/avro/user_action.avro</import>
                    <import>${project.basedir}/src/main/resources/avro/search_result_type.avro</import>
                </imports>
            </configuration>
        </execution>
    </executions>
</plugin>
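
The imports matter because, as I understand the plugin, the files listed there are compiled first, so schemas compiled afterwards can refer to their types by name. The Python sketch below is not Avro code. It is a hand-rolled illustration of that ordering rule, and `full_name`, `register`, and `parse_in_order` are hypothetical helpers invented for this example. It walks each schema in turn, remembers every named type it sees, and rejects a reference to a name that has not been defined yet.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
NAMED_KINDS = {"record", "enum", "fixed"}


def full_name(name, namespace):
    """Qualify a short name with the enclosing namespace, if there is one."""
    return name if "." in name or not namespace else f"{namespace}.{name}"


def register(schema, known, namespace=None):
    """Record named types found in `schema`; fail on references to unknown names."""
    if isinstance(schema, str):
        if schema not in PRIMITIVES and full_name(schema, namespace) not in known:
            raise KeyError(f"undefined type: {full_name(schema, namespace)}")
        return
    if isinstance(schema, list):  # a union: check every branch
        for branch in schema:
            register(branch, known, namespace)
        return
    kind = schema["type"]
    if kind in NAMED_KINDS:
        name = full_name(schema["name"], schema.get("namespace", namespace))
        known.add(name)  # registered before its fields, so self-references resolve
        namespace = name.rpartition(".")[0] or None
        for field in schema.get("fields", []):
            register(field["type"], known, namespace)
    elif kind == "array":
        register(schema["items"], known, namespace)
    elif kind == "map":
        register(schema["values"], known, namespace)
    else:  # e.g. {"type": "string"} or a nested type definition
        register(kind, known, namespace)


def parse_in_order(schemas):
    """Process schemas one after another, sharing a single set of known names."""
    known = set()
    for schema in schemas:
        register(schema, known)
    return known


USER_ACTION = json.loads("""{"namespace": "com.myorg.other", "type": "enum",
 "name": "UserAction", "symbols": ["S", "V", "C"]}""")
SEARCH_RESULT_TYPE = json.loads("""{"namespace": "com.myorg.other", "type": "enum",
 "name": "SearchResultType", "symbols": ["O", "S", "A"]}""")
SEARCH_RESULT = json.loads("""{"namespace": "com.myorg.other", "type": "record",
 "name": "SearchResult", "fields": [
   {"name": "type", "type": "SearchResultType"},
   {"name": "keyWord", "type": "string"},
   {"name": "userAction", "type": "UserAction"}]}""")

# Shared enums first, as with <imports>: every reference resolves.
print(sorted(parse_in_order([USER_ACTION, SEARCH_RESULT_TYPE, SEARCH_RESULT])))

# Record first: its references point at names that are not known yet.
try:
    parse_in_order([SEARCH_RESULT, USER_ACTION, SEARCH_RESULT_TYPE])
except KeyError as err:
    print("failed:", err)
```

The same reasoning applies to the Foo/Bar split in the question: the file defining Bar has to be processed before the file that refers to "Bar" by name.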

Regarding "avro - Can I split an Apache Avro schema across multiple files?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21539113/
