I am trying to get the Avro schema at runtime with the following method:
private Schema getSchema(Class clazz) {
    Schema s = ReflectData.get().getSchema(clazz);
    AvroSchema avroSchema = new AvroSchema(s);
    return avroSchema.getAvroSchema();
}
However, my POJO class uses a generic type, as shown here:
public abstract class Data<T> implements Serializable {
    private static final long serialVersionUID = 1L;
    private String dataType;
    private T id;

    public Data() {
    }

    public Data(String dataType) {
        this.dataType = dataType;
    }

    public Data(String dataType, T id) {
        this.dataType = dataType;
        this.id = id;
    }
}
so I get the following exception:
Exception in thread "main" org.apache.avro.AvroRuntimeException: avro.shaded.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.avro.AvroTypeException: Unknown type: T
at org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:227)
I understand that Avro does not support generic types. Is there a way to omit certain class fields when the schema is generated at runtime?
Best answer
private <T> String writePojoToParquet(List<T> pojos, String fileKey) {
    String fileName = fileKey + ".parquet";
    Path path = new Path(fileName.replace("/", "_"));
    // Always delete any existing file (and its .crc sidecar) before writing.
    String strPath = path.toString();
    FileUtils.delete(strPath);
    FileUtils.delete(strPath + ".crc");
    logger.debug("Writing data to parquet file {}", strPath);
    Configuration conf = new Configuration();
    // The schema is derived from the runtime class of the first element,
    // so the list must not be empty.
    try (ParquetWriter<T> writer =
            AvroParquetWriter.<T>builder(path)
                .withSchema(ReflectData.AllowNull.get().getSchema(pojos.get(0).getClass()))
                .withDataModel(ReflectData.get())
                .withConf(conf)
                .withCompressionCodec(CompressionCodecName.SNAPPY)
                .withWriteMode(ParquetFileWriter.Mode.OVERWRITE)
                .enableValidation()
                .enableDictionaryEncoding()
                .build()) {
        for (T p : pojos) {
            writer.write(p);
        }
        return strPath;
    } catch (IOException e) {
        logger.error("Error while writing data to parquet file {}.", strPath, e);
    }
    // Reached only when writing failed.
    return null;
}
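As a side note on the exception itself: Java reflection reports the declared type of the `id` field as the type variable `T`, not as a concrete class, which is consistent with the `Unknown type: T` message in the question. The following is a minimal, JDK-only sketch (no Avro involved) that shows this, and how a concrete subclass carries the actual type argument. The names `GenericFieldProbe`, `UserData`, `declaredFieldType` and `resolveFirstTypeArgument` are hypothetical and appear neither in the question nor in Avro; `Data<T>` here is a trimmed stand-in for the class in the question.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

public class GenericFieldProbe {

    // Trimmed stand-in for the Data<T> class from the question (assumption).
    static abstract class Data<T> {
        private String dataType;
        private T id;
    }

    // Hypothetical concrete subclass that fixes T to Long.
    static class UserData extends Data<Long> {
    }

    // Returns the generic type of a field exactly as it is declared on clazz.
    static Type declaredFieldType(Class<?> clazz, String fieldName) {
        try {
            Field field = clazz.getDeclaredField(fieldName);
            return field.getGenericType();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No field named " + fieldName, e);
        }
    }

    // Returns the first type argument a concrete class passes to its generic
    // superclass, or null if the superclass is not parameterized.
    static Type resolveFirstTypeArgument(Class<?> concrete) {
        Type superType = concrete.getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[0];
        }
        return null;
    }

    public static void main(String[] args) {
        Type idType = declaredFieldType(Data.class, "id");
        // The declared type of "id" is a TypeVariable, not a Class.
        System.out.println("id is a type variable: " + (idType instanceof TypeVariable));
        // The concrete subclass records which type was substituted for T.
        System.out.println("T resolved via UserData: " + resolveFirstTypeArgument(UserData.class));
    }
}
```

This only illustrates where the concrete type information lives; whether it helps depends on how the schema is built. For the "omit a field" part of the question, Avro's reflect package has an `@AvroIgnore` field annotation, which may be worth checking against the Avro version in use.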
This question, about generating an Avro schema for a Java POJO with generic types, is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/59091375/