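Error in model_function with XGBoost in R

Problem description

I am trying to apply XGBoost to a time series by following the tutorial here: https://cran.r-project.org/web/packages/forecastML/vignettes/grouped_forecast.html

Copying the included code works fine up to the point where model_function is created.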

# The value of outcome_col can also be set in train_model() with train_model(outcome_col = 1).
model_function <- function(data,outcome_col = 1) {
  
  # xgboost cannot handle missing outcomes data.
  data <- data[!is.na(data[,outcome_col]),]

  indices <- 1:nrow(data)
  
  set.seed(224)
  train_indices <- sample(1:nrow(data),ceiling(nrow(data) * .8),replace = FALSE)
  test_indices <- indices[!(indices %in% train_indices)]

  data_train <- xgboost::xgb.DMatrix(data = as.matrix(data[train_indices,-(outcome_col),drop = FALSE]),label = as.matrix(data[train_indices,outcome_col,drop = FALSE]))

  data_test <- xgboost::xgb.DMatrix(data = as.matrix(data[test_indices,-(outcome_col),drop = FALSE]),label = as.matrix(data[test_indices,outcome_col,drop = FALSE]))

  params <- list("objective" = "reg:linear")
  watchlist <- list(train = data_train,test = data_test)
  
  set.seed(224)
  model <- xgboost::xgb.train(data = data_train,params = params,max.depth = 8,nthread = 2,nrounds = 30,metrics = "rmse",verbose = 0,early_stopping_rounds = 5,watchlist = watchlist)

  return(model)
}

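When I use the function in the call below, I get an error from the xgb.DMatrix step saying that the data is of class character rather than a matrix.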

model_results_cv <- forecastML::train_model(lagged_df = data_train,windows = windows,model_name = "xgboost",model_function = model_function,use_future = FALSE)

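Error in xgboost::xgb.DMatrix(data = as.matrix(data[train_indices, -(outcome_col),  :
  'data' has class 'character' and length 2099328.
  'data' accepts either a numeric matrix or a single filename.
A model returned class 'try-error' for validation window 1
Error in xgboost::xgb.DMatrix(data = as.matrix(data[train_indices, -(outcome_col),  :
  'data' has class 'character' and length 2078336.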

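I tried pulling the as.matrix() computations out of the xgb.DMatrix call, assigning them to variables, and passing those variables as the data = argument: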

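model_function <- function(data,outcome_col = 1) {

  # xgboost cannot handle missing outcomes data.
  data <- data[!is.na(data[,outcome_col]),]

  indices <- 1:nrow(data)

  set.seed(224)
  train_indices <- sample(1:nrow(data),ceiling(nrow(data) * .8),replace = FALSE)
  test_indices <- indices[!(indices %in% train_indices)]
  
  # Build the feature and label matrices before calling xgb.DMatrix.
  data_matrix_train <- as.matrix(data[train_indices,-outcome_col,drop = FALSE])
  label_matrix_train <- as.matrix(data[train_indices,outcome_col,drop = FALSE])
  
  data_matrix_test <- as.matrix(data[test_indices,-outcome_col,drop = FALSE])
  label_matrix_test <- as.matrix(data[test_indices,outcome_col,drop = FALSE])

  # Debugging output: class of the training feature matrix.
  print(class(data_matrix_train))

  data_train <- xgboost::xgb.DMatrix(data = data_matrix_train,label = label_matrix_train)

  data_test <- xgboost::xgb.DMatrix(data = data_matrix_test,label = label_matrix_test)

  params <- list("objective" = "reg:linear")
  watchlist <- list(train = data_train,test = data_test)

  set.seed(224)
  model <- xgboost::xgb.train(data = data_train,params = params,max.depth = 8,nthread = 2,nrounds = 30,metrics = "rmse",verbose = 0,early_stopping_rounds = 5,watchlist = watchlist)

  return(model)
}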

With this version, the function prints the class of data_matrix_train before xgb.DMatrix is called. Running the following code returns:

model_function(data_train$horizon_1)

[1] "matrix" "array"

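But I still get the same error. I also tried calling xgb.DMatrix through do.call(), with the same result. I tried to inspect the variables that are created and passed around inside the function, but without success.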
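One thing the class() check above cannot show is the storage type of the matrix. The sketch below uses a made-up three-row data frame (the column names outcome, lag_1, and group are illustrative assumptions, not columns from the tutorial data). It shows that as.matrix() on a data frame with any non-numeric column produces a character matrix whose class is still reported as a matrix. Whether this is what happens with the lagged data frames here is an assumption, although it would be consistent with the wording of the error message.

```r
# Hypothetical toy data: an outcome, one numeric feature, one factor column.
toy <- data.frame(
  outcome = c(1.5, 2.0, 3.2),
  lag_1   = c(0.1, 0.4, 0.9),
  group   = factor(c("a", "b", "a"))
)

# Same pattern as in model_function: drop the outcome column, then convert.
m <- as.matrix(toy[, -1, drop = FALSE])

print(class(m))       # reports a matrix, regardless of what it stores
print(typeof(m))      # "character": the factor column forced a character matrix
print(is.numeric(m))  # FALSE
print(length(m))      # 6, i.e. nrow * ncol
```

If this is the situation, typeof() or is.numeric() would be more informative than class() as a debugging print inside model_function.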

Is there something about the way xgb.DMatrix is called inside the function that affects the result of the as.matrix() statements?
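If a non-numeric column turns out to be the cause, one way to confirm it is to list the columns that are not numeric before the matrix is built. The following is a minimal sketch in base R under that assumption. non_numeric_cols is a hypothetical helper, not part of forecastML or xgboost, the toy data frame is made up, and replacing a factor with its integer codes is only one possible conversion.

```r
# Hypothetical helper: names of the columns that are not numeric.
non_numeric_cols <- function(df) {
  names(df)[!vapply(df, is.numeric, logical(1))]
}

# Made-up data with one factor column among the features.
toy <- data.frame(
  outcome = c(1.5, 2.0, 3.2),
  lag_1   = c(0.1, 0.4, 0.9),
  group   = factor(c("a", "b", "a"))
)

bad <- non_numeric_cols(toy)
print(bad)  # the column(s) that would force a character matrix

# One possible conversion (an assumption, not the tutorial's method):
# replace each non-numeric column with its integer factor codes.
toy_num <- toy
toy_num[bad] <- lapply(toy_num[bad], function(x) as.numeric(as.factor(x)))

m <- as.matrix(toy_num[, -1, drop = FALSE])
print(typeof(m))  # now a numeric storage type
```

Whether integer codes are a sensible encoding for a given column depends on the data and the model, so this is a diagnostic starting point rather than a recommended preprocessing step.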
