Is there a way to convert a String to a Java type using Jackson and/or one of its associated libraries (csv, json, etc.)?

Problem description

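Is there a mechanism that applies a standard set of checks to detect a type and then converts a String to the detected type, using one of Jackson's standard text-related libraries (csv, json, or even jackson-core)? I can imagine using it together with a label associated with the value (a CSV header, for example) to do something like the following: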

JavaTypeAndValue typeAndValue = StringToJavaType.fromValue(Object x,String label);  
typeAndValue.type() // FQN of Java type,maybe
typeAndValue.label() // where label might be a column header value,for example
typeAndValue.value() // returns Object  of typeAndValue.type()

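A set of "extractors" would be needed to apply the conversion, and consumers of the class would have to be aware of the "ambiguity" of the Object return type, but they would still be able to consume and use the information, given its purpose.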

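The example I'm currently thinking about involves constructing SQL DDL or DML, such as a CREATE TABLE statement, from the information in a List derived from evaluating the rows of a CSV file.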
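As a rough illustration of that goal, the sketch below turns a map from column label to detected Java type into a CREATE TABLE statement. The class name DdlSketch, the method createTable, and the Java-to-SQL type mapping are all assumptions made for this example; they are not part of Jackson or of any particular SQL dialect.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds a CREATE TABLE statement from detected column types.
class DdlSketch {

    // Assumed mapping from Java type to a generic SQL type; adjust per dialect.
    static String sqlType(Class<?> type) {
        if (type == Integer.class) return "INTEGER";
        if (type == Long.class) return "BIGINT";
        if (type == BigInteger.class || type == BigDecimal.class) return "NUMERIC";
        if (type == Float.class || type == Double.class) return "DOUBLE PRECISION";
        if (type == Boolean.class) return "BOOLEAN";
        return "VARCHAR(255)"; // fallback for String and anything unrecognized
    }

    // columns maps a label (for example a CSV header) to its detected type,
    // in the order the columns should appear in the table.
    static String createTable(String table, Map<String, Class<?>> columns) {
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, Class<?>> e : columns.entrySet()) {
            cols.add(e.getKey() + " " + sqlType(e.getValue()));
        }
        return "CREATE TABLE " + table + " " + cols;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> columns = new LinkedHashMap<>();
        columns.put("id", Long.class);
        columns.put("name", String.class);
        columns.put("active", Boolean.class);
        System.out.println(createTable("people", columns));
    }
}
```

A real implementation would also need to quote or validate the identifiers and to decide on one type per column when different rows yield different detected types.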

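After some more digging, hoping to find something out there, I wrote up a start of what I had in mind.

Please keep in mind that my intent is not to present something "complete"; I'm sure there are things missing, edge cases not addressed, and so on.

The idea behind parse(List<Map<String,String>> rows,List<String> headers) is that this could be, for example, a sample of rows from a CSV file read with Jackson.

Again, this is not complete, so I'm not looking for every mistake in the following to be picked out. The question is not "how would we write this?" but "is anyone familiar with something that already exists and does something like the following?".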

import gms.labs.cassandra.sandBox.extractors.Extractor;
import gms.labs.cassandra.sandBox.extractors.Extractors;
import lombok.Builder;
import lombok.Getter;
import lombok.Setter;
import lombok.experimental.Accessors;

@Accessors(fluent=true,chain=true)
public class TypeAndValue
{

    @Builder
    TypeAndValue(Class<?> type,String rawValue){
        this.type = type;
        this.rawValue = rawValue;
        label = DEFAULT_LABEL;
    }

    @Getter
    final Class<?> type;

    @Getter
    final String rawValue;

    @Setter
    @Getter
    String label;

    public Object value(){
        return Extractors.extractorFor(this).value(rawValue);
    }

    static final String DEFAULT_LABEL = "NONE";

}

A simple parser, where parse comes from a context in which I have a List<Map<String,String>> from a CSVReader.

import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.math.NumberUtils;

import java.util.*;
import java.util.function.BiFunction;

public class JavaTypeParser
{
public static final List<TypeAndValue> parse(List<Map<String,String>> rows,List<String> headers)
{
    List<TypeAndValue> typesAndVals = new ArrayList<TypeAndValue>();
    for (Map<String,String> row : rows) {
        for (String header : headers) {
            String val = row.get(header);
            // Checks run in priority order: isNull, isBoolean, isNumber, then the String fallback.
            // orElseGet keeps the later checks lazy, so isBoolean is never called with a null value.
            TypeAndValue typeAndValue =
                    isNull(val).orElseGet(() -> isBoolean(val).orElseGet(() -> isNumber(val).orElseGet(
                            () -> _typeAndValue.apply(String.class,val).get())));
            typesAndVals.add(typeAndValue.label(header));
        }
    }
    return typesAndVals;
}

public static Optional<TypeAndValue> isNumber(String val)
{
    if (!NumberUtils.isCreatable(val)) {
        return Optional.empty();
    } else {
        return _typeAndValue.apply(NumberUtils.createNumber(val).getClass(),val);
    }
}

public static Optional<TypeAndValue> isBoolean(String val)
{
    boolean bool = (val.equalsIgnoreCase("true") || val.equalsIgnoreCase("false"));
    if (bool) {
        return _typeAndValue.apply(Boolean.class,val);
    } else {
        return Optional.empty();
    }
}

public static Optional<TypeAndValue> isNull(String val){
    if(Objects.isNull(val) || val.equals("null")){
        return _typeAndValue.apply(ObjectUtils.Null.class,val);
    }
    else{
        return Optional.empty();
    }
}

static final BiFunction<Class<?>,String,Optional<TypeAndValue>> _typeAndValue = (type,value) -> Optional.of(
        TypeAndValue.builder().type(type).rawValue(value).build());

}

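Extractors. This is just an example of how the "extractors" for values (contained in strings) might be registered somewhere for lookup. They could be referenced in a number of other ways, too.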

import gms.labs.cassandra.sandBox.TypeAndValue;
import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.math.NumberUtils;

import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;

public class Extractors
{

private static final List<Class<?>> NUMS = Arrays.asList(
        BigInteger.class,BigDecimal.class,Long.class,Integer.class,Double.class,Float.class);

public static final Extractor<?> extractorFor(TypeAndValue typeAndValue)
{
    if (NUMS.contains(typeAndValue.type())) {
        return (Extractor<Number>) value -> NumberUtils.createNumber(value);
    } else if(typeAndValue.type().equals(Boolean.class)) {
        return  (Extractor<Boolean>) value -> Boolean.valueOf(value);
    } else if(typeAndValue.type().equals(ObjectUtils.Null.class)) {
        return  (Extractor<ObjectUtils.Null>) value -> null; // should this return the raw value instead? some frameworks coerce to null.
    } else if(typeAndValue.type().equals(String.class)) {
        return  (Extractor<String>) value -> typeAndValue.rawValue(); // just return the raw value.
    }
    else{
        throw new RuntimeException("unsupported");
    }
}
}
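The Extractor type imported above is not shown in the question. A minimal sketch of what it might look like, inferred only from how Extractors uses it (a lambda target with a value(String) method), is given below. Its exact shape is an assumption, and ExtractorDemo is a hypothetical class that exists only to exercise it.

```java
import java.math.BigInteger;

// Assumed shape of the Extractor type used above: a single abstract method,
// so it can be the target of a lambda or a method reference.
@FunctionalInterface
interface Extractor<T> {
    // Converts the raw text into a value of the detected type.
    T value(String rawValue);
}

// Hypothetical demo class, not part of the question's code.
class ExtractorDemo {
    public static void main(String[] args) {
        Extractor<Boolean> bool = Boolean::valueOf;
        Extractor<BigInteger> big = BigInteger::new;
        System.out.println(bool.value("FaLse"));
        System.out.println(big.value("-123098098097987"));
    }
}
```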

For reference, I ran this from within the JavaTypeParser class.

public static void main(String[] args)
{

    Optional<TypeAndValue> num = isNumber("-1230980980980980980980980980980988009808989080989809890808098292");
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass());  // BigInteger
    });
    num = isNumber("-123098098097987");
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass()); // Long
    });
    num = isNumber("-123098.098097987"); // Double
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass());
    });
    num = isNumber("-123009809890898.0980979098098908080987"); // BigDecimal
    num.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass());
    });

    Optional<TypeAndValue> bool = isBoolean("FaLse");
    bool.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        System.out.println(typeAndVal.value().getClass()); // Boolean
    });

    Optional<TypeAndValue> nulll = isNull("null");
    nulll.ifPresent(typeAndVal -> {
        System.out.println(typeAndVal.value());
        //System.out.println(typeAndVal.value().getClass());  would throw null pointer exception
        System.out.println(typeAndVal.type()); // ObjectUtils.Null (from apache commons lang3)
    });

}

Solution

I don't know of any library that does this, and I have never seen anything that works this way on an open set of possible types.

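For a closed set of types (you know all the possible output types), the easier way is to write the class FQN into the string itself (from your description I could not tell whether you control how the string is written): either the complete FQN or an alias for it.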
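To make the closed-set idea concrete, here is a minimal sketch in which each value carries a type alias as a prefix, for example int:42. The alias:value format, the TaggedValues class, and the set of aliases are all assumptions chosen for illustration.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical decoder for values written as "<alias>:<text>", e.g. "int:42".
class TaggedValues {

    // Closed set of supported aliases, each with its converter.
    private static final Map<String, Function<String, Object>> CONVERTERS = new HashMap<>();
    static {
        CONVERTERS.put("int", Integer::valueOf);
        CONVERTERS.put("decimal", BigDecimal::new);
        CONVERTERS.put("bool", Boolean::valueOf);
        CONVERTERS.put("string", s -> s);
    }

    static Object decode(String tagged) {
        int sep = tagged.indexOf(':');
        if (sep < 0) {
            throw new IllegalArgumentException("missing type alias: " + tagged);
        }
        String alias = tagged.substring(0, sep);
        Function<String, Object> converter = CONVERTERS.get(alias);
        if (converter == null) {
            throw new IllegalArgumentException("unknown type alias: " + alias);
        }
        // Everything after the first ':' is the raw value, so it may contain ':' itself.
        return converter.apply(tagged.substring(sep + 1));
    }
}
```

With this format no detection is needed at all: TaggedValues.decode("int:42") yields an Integer and TaggedValues.decode("string:42") yields a String, because the writer has already stated the type.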

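Otherwise, I think there is no escape from writing all the checks.

Furthermore, it gets very delicate once you start thinking about edge cases.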

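Suppose you use JSON as the serialization format in the string. How would you distinguish a String value such as Hello World from a Date written in some ISO format (e.g. 2020-09-22)? To do that you need to introduce some priority among the checks you perform (first try a regex to see whether it is a date, if it is not, move on to the next check, and keep the plain-string check as the last one).

What if you have two objects?

class Person {   // class name assumed
   String name;
   String surname;
}

class Employee {
   String name;
   String surname;
   Integer salary;
}

And you receive a serialized value of the second type, but with an empty salary (null, or the property missing completely).

How can you tell the difference between a set and a list?

I don't know whether what you intend is that dynamic, or whether you already know all the possible deserializable types; maybe some more detail in the question would help.

Update

Having just looked at the code, it is clearer now. If you know all the possible outputs, that is the way to go.
The only change I would make is to abstract the extraction process, so that adding more types to manage becomes easier.
To do that, I think a small change is needed, like:

interface Extractor {
    Boolean match(String value);
    Object extract(String value);
}

Then you can define one extractor per type, register them, and let the checks run automatically.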
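Putting these pieces together, a minimal sketch of per-type extractors and an ordered registry might look like the following. It mirrors the match/extract interface above under the hypothetical name ValueExtractor (to avoid clashing with the question's Extractor). The concrete extractors, the regular expression, and the registration order are assumptions. The list order encodes the priority discussed earlier: the date check runs before the plain-string fallback. Input values are assumed to be non-null.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;

// Mirrors the match/extract interface above, under a hypothetical name.
interface ValueExtractor {
    boolean match(String value);
    Object extract(String value);
}

class BooleanExtractor implements ValueExtractor {
    public boolean match(String v) { return v.equalsIgnoreCase("true") || v.equalsIgnoreCase("false"); }
    public Object extract(String v) { return Boolean.valueOf(v); }
}

class DecimalExtractor implements ValueExtractor {
    // Plain decimal notation only; exponents and suffixes are deliberately not handled.
    public boolean match(String v) { return v.matches("-?\\d+(\\.\\d+)?"); }
    public Object extract(String v) { return new BigDecimal(v); }
}

class IsoDateExtractor implements ValueExtractor {
    public boolean match(String v) {
        try { LocalDate.parse(v); return true; } catch (DateTimeParseException e) { return false; }
    }
    public Object extract(String v) { return LocalDate.parse(v); }
}

class StringExtractor implements ValueExtractor {
    public boolean match(String v) { return true; } // catch-all, must be registered last
    public Object extract(String v) { return v; }
}

class ExtractorRegistry {
    // List order is the priority: the first extractor that matches wins.
    private static final List<ValueExtractor> EXTRACTORS = Arrays.asList(
            new BooleanExtractor(), new DecimalExtractor(), new IsoDateExtractor(), new StringExtractor());

    static Object convert(String value) {
        for (ValueExtractor e : EXTRACTORS) {
            if (e.match(value)) {
                return e.extract(value);
            }
        }
        throw new IllegalStateException("unreachable while StringExtractor is registered");
    }
}
```

Supporting a new type then means writing one more ValueExtractor and placing it at the right position in the list, without touching the lookup loop.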

To parse CSV, I would suggest https://commons.apache.org/proper/commons-csv, because CSV parsing can get troublesome.

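What you are actually trying to do is write a parser. You translate a fragment into a parse tree. The parse tree captures the type as well as the value. For hierarchical types such as arrays and objects, each tree node contains child nodes.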
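As a minimal sketch of such a tree, the node type below stores a detected kind, an optional label, a scalar value, and child nodes. ParseNode and its members are hypothetical names used only for illustration; real parsers define their own node classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical parse-tree node: scalars carry a value, arrays and objects carry children.
class ParseNode {
    enum Kind { OBJECT, ARRAY, STRING, NUMBER, BOOLEAN, NULL }

    final Kind kind;
    final String label;   // field name or column header; null for array elements and the root
    final Object value;   // converted scalar value; null for OBJECT, ARRAY and NULL nodes
    private final List<ParseNode> children = new ArrayList<>();

    ParseNode(Kind kind, String label, Object value) {
        this.kind = kind;
        this.label = label;
        this.value = value;
    }

    // Appends a child and returns this node so calls can be chained.
    ParseNode add(ParseNode child) {
        children.add(child);
        return this;
    }

    List<ParseNode> children() {
        return Collections.unmodifiableList(children);
    }

    // Counts this node plus all of its descendants.
    int size() {
        int n = 1;
        for (ParseNode c : children) {
            n += c.size();
        }
        return n;
    }
}
```

For example, an object with a name field and a two-element tags array becomes an OBJECT root with a STRING child and an ARRAY child, the latter holding two STRING children.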

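Antlr is one of the most commonly used parser generators (though it is a bit overkill for your use case). Antlr provides out-of-the-box support for JSON.

I recommend taking the time to absorb all the concepts involved. Even though it may seem like overkill at first, it quickly pays off as soon as you make any kind of extension. Changing a grammar is relatively easy, whereas the generated code is quite complex. In addition, all parser generators validate your grammar and report logic errors.

Of course, if you limit yourself to parsing only CSV or only JSON (and not both at the same time), you should use the parser of an existing library instead. For example, Jackson has ObjectMapper.readTree to get the parse tree. You can also use ObjectMapper.readValue(<fragment>,Object.class) to simply get the canonical Java classes.

Try this: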

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.Iterator;
import java.util.Map;

String j = ...; // the JSON string

JsonFactory jsonFactory = new JsonFactory();
ObjectMapper jsonMapper = new ObjectMapper(jsonFactory);
// readTree parses the whole document into a tree of JsonNode objects
JsonNode jsonRootNode = jsonMapper.readTree(j);
Iterator<Map.Entry<String,JsonNode>> jsonIterator = jsonRootNode.fields();

// walk the top-level fields of the root object
while (jsonIterator.hasNext()) {
    Map.Entry<String,JsonNode> jsonField = jsonIterator.next();
    String k = jsonField.getKey();
    // getValue().getNodeType() reports the JSON type of the node (STRING, NUMBER, BOOLEAN, ...)
    String v = jsonField.getValue().toString();
    ...

}
