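H2O target encoder MOJO model throws a NullPointerException

Problem description

I am building a target encoder model using the Python example given in the H2O documentation, and I am trying to use the model's MOJO to predict target encodings from Java. However, the MOJO prediction fails on categories that are present in the test data but not in the training data, with the following error: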

Exception in thread "main" java.lang.NullPointerException
    at hex.genmodel.algos.targetencoder.TargetEncoderMojoModel.computeEncodings(TargetEncoderMojoModel.java:87)
    at hex.genmodel.algos.targetencoder.TargetEncoderMojoModel.score0(TargetEncoderMojoModel.java:72)
    at hex.genmodel.easy.EasyPredictModelWrapper.predict(EasyPredictModelWrapper.java:889)
    at hex.genmodel.easy.EasyPredictModelWrapper.transformWithTargetEncoding(EasyPredictModelWrapper.java:618)
    at main.main(main.java:26)

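After digging into the target encoder MOJO, I found that domains.txt contains the categories that are present only in the test data, so the target encoder does not treat them as missing categories. However, encoding_map.ini has no target encodings for these categories, so the model throws a NullPointerException when it tries to access the encoding for such a category.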
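To make the mismatch concrete, here is a minimal sketch in plain Python. It is a simplified model of the situation described above, not H2O's actual implementation: the names `domain`, `encoding_map`, `PRIOR_MEAN` and `encode` are hypothetical, and the (numerator, denominator) layout of each entry is an assumption made for illustration. It shows a level that is in the domain but has no encoding entry, and a guarded lookup that falls back to a prior instead of failing.

```python
# Simplified model of the mismatch described above; not H2O's actual code.
# All names here (domain, encoding_map, PRIOR_MEAN, encode) are hypothetical.

# Domain of the column: assumed to include a level that only occurs
# in the test split.
domain = ["Albany  NY", "Auburn  NY", "Austria-Hungary"]

# Encoding map: built from the training split only, so one level is absent.
# Each entry is assumed to be (sum of target values, number of rows).
encoding_map = {
    "Albany  NY": (1.0, 2),
    "Auburn  NY": (0.0, 1),
}

PRIOR_MEAN = 0.4  # assumed global mean of the target in the training data


def encode(level, fallback=PRIOR_MEAN):
    """Return the mean target for a level, or the fallback if it has no entry."""
    if level not in domain:
        return fallback  # level unknown to the model: treated as missing
    entry = encoding_map.get(level)
    if entry is None:
        return fallback  # in the domain but never seen in training
    numerator, denominator = entry
    return numerator / denominator


encoded = [encode(level) for level in domain]
print(encoded)
```

Without the `entry is None` check, the third level would pass the domain test and then fail on the missing entry, which is the Python analogue of the NullPointerException in the stack trace.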

Code to train the model:

import h2o
from h2o.estimators import H2OTargetEncoderEstimator

h2o.init()

titanic = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/gbm_test/titanic.csv")
titanic['survived'] = titanic['survived'].asfactor()
response='survived'

train,test = titanic.split_frame(ratios = [.5],seed = 1234)

encoded_columns = ["home.dest","cabin","embarked"]

blended_avg= True
inflection_point = 3
smoothing = 10
noise = 0.15
data_leakage_handling = "k_fold"
fold_column = "kfold_column"
train[fold_column] = train.kfold_column(n_folds=5,seed=3456)

titanic_te = H2OTargetEncoderEstimator(fold_column=fold_column,data_leakage_handling=data_leakage_handling,blending=blended_avg,k=inflection_point,f=smoothing)

titanic_te.train(x=encoded_columns,y=response,training_frame=train)

titanic_te.download_mojo(get_genmodel_jar=True)
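One way to confirm which categories trigger the failure is to compare the distinct values in the two splits. The following is a minimal sketch on plain Python lists; `unseen_levels` is a hypothetical helper, the sample values are taken from the Java snippet in this question, and which split each value falls into is assumed for illustration. Extracting the column values from the H2O frames is left out.

```python
def unseen_levels(train_values, test_values):
    """Return the sorted levels that occur in test_values but not in train_values.

    None is treated as a missing value and ignored on both sides.
    """
    train_seen = {v for v in train_values if v is not None}
    test_seen = {v for v in test_values if v is not None}
    return sorted(test_seen - train_seen)


# Illustrative values only; the split assignment is assumed.
train_home_dest = ["Albany  NY", "Auburn  NY", None, "Albany  NY"]
test_home_dest = ["Auburn  NY", "Austria-Hungary", "Amenia  ND", None]

missing = unseen_levels(train_home_dest, test_home_dest)
print(missing)
```

Any level this reports is a candidate for the failing lookup, because it would have no entry in an encoding map built from the training split alone.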

Code to get the encodings:

import java.io.*;
import java.util.Arrays;
import hex.genmodel.easy.RowData;
import hex.genmodel.easy.EasyPredictModelWrapper;
import hex.genmodel.easy.prediction.*;
import hex.genmodel.MojoModel;
import hex.genmodel.algos.targetencoder.TargetEncoderMojoModel;

public class main {

    public static void main(String[] args) throws Exception {

        EasyPredictModelWrapper model = new EasyPredictModelWrapper(MojoModel.load("TargetEncoder_model_python_1599838802418_2.zip"));

        String[] temp_home = { "?Havana  Cuba","Aberdeen / Portland  OR","Albany  NY","Altdorf  Switzerland","Amenia  ND","Antwerp  Belgium / Stanton  OH","Asarum  Sweden Brooklyn  NY","Ascot  Berkshire / Rochester  NY","Auburn  NY","Aughnacliff  Co Longford  Ireland New York  NY","Australia Fingal  ND","Austria Niagara Falls  NY","Austria-Hungary","Austria-Hungary / Germantown  Philadelphia  PA"};

        for(int j=0; j<temp_home.length; j++){
            RowData row = new RowData();
            row.put("cabin","D43");
            row.put("embarked","C");
            row.put("home.dest",temp_home[j]);

            TargetEncoderPrediction p = model.transformWithTargetEncoding(row);
            System.out.println(Arrays.toString(p.transformations));
        }
    }
}

Compile command: javac -cp h2o-genmodel.jar -J-Xmx2g main.java

Run command: java -cp .:h2o-genmodel.jar main
