Question about parallelism in the h2o.grid function

Problem description

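I am trying to use the h2o.grid() function from the h2o package to do some tuning in R. When I set the parameter parallelism to a value larger than 1, it always shows the warning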

Some models were not built due to a failure, for more details run `summary(grid_object, show_stack_traces = TRUE)`

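The model_ids in the final grid object also include many models ending with _cv_1, _cv_2, etc., and the number of models does not equal the max_models setting in the search_criteria list. I think they are just the models from the CV process, not the final models.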

[Screenshot: grid results when "parallelism" is set larger than 1]

When I leave parallelism at its default or set it to 1, the result is normal: all models end with _model_1, _model_2, etc.

[Screenshot: grid results when "parallelism" is left at its default or set to 1]

Here is my code:

    # set the grid
    rf_h2o_grid <- list(mtries = seq(3, ncol(train_h2o), 4),
                        max_depth = c(5, 10, 15, 20))

    # set the search_criteria
    sc <- list(strategy = "Randomdiscrete", seed = 100, max_models = 5)

    # random grid tuning
    rf_h2o_grid_tune_random <- h2o.grid(
      algorithm = "randomForest",
      x = x,
      y = y,
      training_frame = train_h2o,
      nfolds = 5,  # use cv to validate the parameters
      fold_assignment = "Stratified",
      ntrees = 100,
      hyper_params = rf_h2o_grid,
      search_criteria = sc
      # parallelism = 6  # when I set it larger than 1, the result always includes some "cv_" models
    )

So how can I use parallelism correctly in h2o.grid()? Thanks for your help!

Solution

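This is a real bug in grid search related to parallelism. It had been noticed before but was never properly reported. Thanks for raising it; we will fix it as soon as possible. If you want to track progress, see https://h2oai.atlassian.net/browse/PUBDEV-7886.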

Until a proper fix is available, you must avoid using CV and parallelism together in your grids.
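As a minimal sketch of that workaround in R: the hypothetical helper `safe_grid_settings()` below (not part of h2o) resets parallelism to 1 whenever cross-validation is enabled, and the hypothetical wrapper `run_rf_grid()` passes the result on to `h2o.grid()`. The wrapper assumes the h2o package is installed, a cluster is running, and it is called with the `x`, `y`, `train_h2o`, `rf_h2o_grid` and `sc` objects from the question.

```r
# Hypothetical helper (not part of h2o): never combine CV with parallelism > 1.
safe_grid_settings <- function(nfolds = 0, parallelism = 1) {
  if (nfolds > 1 && parallelism > 1) {
    message("CV is enabled, so parallelism is reset to 1 (see PUBDEV-7886)")
    parallelism <- 1
  }
  list(nfolds = nfolds, parallelism = parallelism)
}

# Hypothetical wrapper around h2o.grid(), intended for cross-validated runs
# (nfolds >= 2). Requires the h2o package and a running H2O cluster.
run_rf_grid <- function(x, y, train, hyper_params, search_criteria,
                        nfolds = 5, parallelism = 1) {
  s <- safe_grid_settings(nfolds, parallelism)
  h2o::h2o.grid(
    algorithm = "randomForest",
    x = x, y = y, training_frame = train,
    nfolds = s$nfolds,
    fold_assignment = "Stratified",
    ntrees = 100,
    hyper_params = hyper_params,
    search_criteria = search_criteria,
    parallelism = s$parallelism
  )
}
```

With these definitions, `run_rf_grid(x, y, train_h2o, rf_h2o_grid, sc, nfolds = 5, parallelism = 6)` would build the models sequentially. The other side of the same workaround, keeping parallelism and giving up CV, would mean dropping `nfolds` and evaluating on a separate validation frame instead.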

Regarding the following error:

Some models were not built due to a failure, for more details run `summary(grid_object, show_stack_traces = TRUE)`

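If the error is reproducible, you should be able to get more details by running the grid with verbose=True. Adding the full error message to the ticket above might also help.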
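To collect those details for the ticket, a sketch along the following lines may help. `summary(..., show_stack_traces = TRUE)` is the call suggested by the warning itself. `split_model_ids()` and `report_grid()` are hypothetical helpers, and the `_cv_<n>` suffix pattern is an assumption based on the model ids described in the question.

```r
# Hypothetical helper: separate per-fold CV models from final grid models,
# using the "_cv_<n>" suffix described in the question.
split_model_ids <- function(model_ids) {
  ids <- unlist(model_ids)
  is_cv <- grepl("_cv_[0-9]+$", ids)
  list(final = ids[!is_cv], cv = ids[is_cv])
}

# Hypothetical helper: show failure details, then count both kinds of model.
# `grid` is assumed to be the object returned by h2o.grid().
report_grid <- function(grid) {
  summary(grid, show_stack_traces = TRUE)
  ids <- split_model_ids(grid@model_ids)
  cat("final models:", length(ids$final), " CV models:", length(ids$cv), "\n")
  invisible(ids)
}
```

Calling `report_grid(rf_h2o_grid_tune_random)` after a failing run would print the stack traces and show how many of the listed ids are CV models rather than final models.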

This is because you set max_models = 5: your grid will only build 5 models and then stop.

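There are three ways to set early stopping criteria:

"max_models": the maximum number of models to build
"max_runtime_secs": the maximum running time in seconds
Metric-based early stopping, by setting "stopping_rounds", "stopping_metric" and "stopping_tolerance"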
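As an illustration, each of the three options could be written as a search_criteria list like the ones below. The numeric values are arbitrary examples rather than recommendations, and "RandomDiscrete" is the spelling used in the H2O documentation.

```r
# 1. Stop after a fixed number of models.
sc_max_models <- list(strategy = "RandomDiscrete", seed = 100,
                      max_models = 5)

# 2. Stop after a time budget, given in seconds.
sc_max_runtime <- list(strategy = "RandomDiscrete", seed = 100,
                       max_runtime_secs = 600)

# 3. Metric-based early stopping: all three stopping_* settings together.
sc_metric <- list(strategy = "RandomDiscrete", seed = 100,
                  stopping_rounds = 5,
                  stopping_metric = "AUTO",
                  stopping_tolerance = 0.001)
```

Any one of these lists can be passed as the `search_criteria` argument of `h2o.grid()`. The criteria can also be combined in a single list, in which case the search should stop as soon as the first one is met.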