How to run a grid search for a neural network in R

Problem description

I am trying to find the best parameters for a neural network I want to build in R. I am using the h2o package and following the tutorial at https://www.kaggle.com/wti200/deep-neural-network-parameter-search-r/comments

My code seems to finish within a minute, but as far as I understand, a grid search should run many models until it identifies the best parameters, which should take a while. Please let me know where I am going wrong and how to run a grid search that optimizes my parameters.

h2o.init(nthreads = -1, max_mem_size = '6G')
testHex  = as.h2o(test)
trainHex = as.h2o(training)

predictors <- colnames(training)[!(colnames(training) %in% c("responseVar"))]
response = "responseVar"

hyper_params <- list(
  activation = c("Rectifier", "Tanh", "Maxout", "RectifierWithDropout",
                 "TanhWithDropout", "MaxoutWithDropout"),
  hidden = list(c(20, 20), c(50, 50), c(75, 75), c(100, 100),
                c(30, 30, 30), c(25, 25, 25)),
  input_dropout_ratio = c(0, 0.03, 0.05)
  # rate = c(0.01, 0.02), l1 = seq(0, 1e-4, 1e-6), l2 = seq(0, 1e-6)
)
h2o.rm("dl_grid_random")

search_criteria = list(
  strategy           = "RandomDiscrete",
  max_runtime_secs   = 360,
  max_models         = 100,
  seed               = 1234567,
  stopping_rounds    = 5,
  stopping_tolerance = 1e-2
)

dl_random_grid <- h2o.grid(
  algorithm          = "deeplearning",
  grid_id            = "dl_grid_random",
  training_frame     = trainHex,
  x                  = predictors,
  y                  = response,
  epochs             = 1,
  stopping_metric    = "RMSE",
  stopping_tolerance = 1e-2,   ## stop when RMSE does not improve by >= 1% for 2 scoring events
  stopping_rounds    = 2,
  score_validation_samples = 10000,   ## downsample validation set for faster scoring
  score_duty_cycle   = 0.025,  ## don't score more than 2.5% of the wall time
  max_w2             = 10,     ## can help improve stability for Rectifier
  hyper_params       = hyper_params,
  search_criteria    = search_criteria
)

grid <- h2o.getGrid("dl_grid_random", sort_by = "mae", decreasing = FALSE)
grid

grid@summary_table[1,]
best_model <- h2o.getModel(grid@model_ids[[1]]) ## model with lowest MAE
best_model

Solution

No confirmed solution to this problem has been posted yet.

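One plausible explanation, offered as a guess rather than a confirmed fix: `max_runtime_secs = 360` caps the entire search at six minutes, and with `epochs = 1` plus early stopping, each model finishes one pass over the data almost immediately, so the grid can exhaust its model or time budget within a minute. A hedged sketch of a looser configuration is below; it assumes `trainHex`, `predictors`, `response`, and `hyper_params` are defined as in the question, and the specific budget values are illustrative, not tuned:

```r
library(h2o)

## Give the random search a larger budget so more models get trained,
## and let each model run more than a single epoch.
search_criteria <- list(
  strategy           = "RandomDiscrete",
  max_runtime_secs   = 3600,     # one hour for the whole search (illustrative)
  max_models         = 100,
  seed               = 1234567,
  stopping_metric    = "RMSE",
  stopping_rounds    = 5,
  stopping_tolerance = 1e-2
)

dl_random_grid <- h2o.grid(
  algorithm          = "deeplearning",
  grid_id            = "dl_grid_random",
  training_frame     = trainHex,
  x                  = predictors,
  y                  = response,
  epochs             = 10,       # multiple passes per model instead of 1
  stopping_metric    = "RMSE",
  stopping_rounds    = 2,
  stopping_tolerance = 1e-2,
  hyper_params       = hyper_params,
  search_criteria    = search_criteria
)
```

With a larger `max_runtime_secs` and `epochs`, the search should visibly take longer and train more models; `h2o.getGrid("dl_grid_random", sort_by = "mae", decreasing = FALSE)` can then be used as in the question to rank them.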