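Writing a Parquet / Avro GenericRecord to JSON while preserving LogicalTypes

Problem description

I am trying to write some Parquet records that contain LogicalTypes to JSON. I read them through AvroParquetReader, which gives me an Avro GenericRecord: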

GenericData.get().addLogicalTypeConversion(new TimeConversions.TimeMillisConversion());

try (ParquetReader<GenericRecord> parquetReader =
    AvroParquetReader.<GenericRecord>builder(new LocalInputFile(this.path))
        .withDataModel(GenericData.get())
        .build()) {
    GenericRecord record = parquetReader.read();
    String json = record.toString(); // the rendered string is shown below
}

record.toString() produces:

{"universe_member_id": 94639,"member_from_dt": 2001-08-31T00:00:00Z,"member_to_dt": 2200-01-01T00:00:00Z}

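Note that this is not valid JSON: the dates were converted correctly according to their LogicalType, but they are not wrapped in quotes.

So I tried JsonEncoder instead: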

GenericData.get().addLogicalTypeConversion(new TimeConversions.TimeMillisConversion()); //etc
OutputStream stringOutputStream = new StringOutputStream();

try (ParquetReader<GenericRecord> parquetReader =
    AvroParquetReader.<GenericRecord>builder(new LocalInputFile(this.path))
        .withDataModel(GenericData.get())
        .build()) {
    GenericRecord record = parquetReader.read();
    DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(record.getSchema());
    JsonEncoder encoder = EncoderFactory.get().jsonEncoder(record.getSchema(),stringOutputStream);
    writer.write(record,encoder);
    encoder.flush();
}

But this does not convert the date fields at all, and it bakes the data type into every record:

{"universe_member_id":{"long":94639},"member_from_dt":{"long":999216000000000},"member_to_dt":{"long":7258118400000000}}
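The {"long": ...} wrappers are how Avro's JSON encoding tags a non-null value of a union type with the name of its branch. The raw numbers look like microseconds since the Unix epoch, which would be consistent with a timestamp-micros logical type (an assumption, since the schema is not shown here). A minimal sketch of that arithmetic using only java.time follows; the class and method names are illustrative and not part of Avro or Parquet.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Avro or Parquet.
class MicrosToInstant {
    // Assumes the value counts microseconds since 1970-01-01T00:00:00Z.
    static Instant fromMicros(long micros) {
        return Instant.EPOCH.plus(micros, ChronoUnit.MICROS);
    }

    public static void main(String[] args) {
        // The two raw values from the JsonEncoder output above.
        System.out.println(fromMicros(999216000000000L));  // 2001-08-31T00:00:00Z
        System.out.println(fromMicros(7258118400000000L)); // 2200-01-01T00:00:00Z
    }
}
```

Both values map back to the timestamps that record.toString() printed, so the data itself is intact; only the rendering differs.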

The output I am looking for is:

{"universe_member_id": 94639,"member_from_dt": "2001-08-31T00:00:00Z","member_to_dt": "2200-01-01T00:00:00Z"}

How do I correctly write a GenericRecord to JSON?
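One possible direction, sketched under assumptions rather than offered as a confirmed fix: walk the record field by field and emit the JSON yourself, quoting every value that is not a number, a boolean, or null. The sketch below uses a plain Map<String, Object> as a stand-in for the record so that it runs without Avro on the classpath. With a real GenericRecord you would presumably fill that map by looping over record.getSchema().getFields() and calling record.get(field.name()), relying on the registered conversions so that timestamps arrive as java.time.Instant. FlatJsonWriter and its methods are hypothetical names, and nested records, arrays, and maps are not handled.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the map stands in for the fields of a GenericRecord.
class FlatJsonWriter {
    // Wraps a string in quotes and escapes the characters JSON requires.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Renders one flat record. Numbers, booleans, and null are written bare;
    // everything else (Instant, String, ...) is written as a quoted string.
    // Non-finite doubles (NaN, Infinity) are not handled in this sketch.
    static String toJson(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append(quote(e.getKey())).append(": ");
            Object v = e.getValue();
            if (v == null || v instanceof Number || v instanceof Boolean) {
                sb.append(v);
            } else {
                sb.append(quote(v.toString()));
            }
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("universe_member_id", 94639L);
        fields.put("member_from_dt", Instant.parse("2001-08-31T00:00:00Z"));
        fields.put("member_to_dt", Instant.parse("2200-01-01T00:00:00Z"));
        System.out.println(toJson(fields));
    }
}
```

Writing the values by hand avoids the union type tags that JsonEncoder emits, at the cost of no longer producing Avro's own JSON encoding. For anything beyond flat records, a JSON library would be a safer choice than hand-built strings.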
