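Problem description
I have a large dataset with 300,000 variables and about 5,000 rows. Using RSpectra for a singular value decomposition, I kept 300 singular values, which gives 300 derived variables. Running an SVM with hyperparameter tuning on these 300 variables became extremely slow: it took more than 17 hours on a machine with 24 GB of RAM. When I ran the same algorithm on a document-feature matrix (dfm) with 60,000 variables and 5,000 rows, it ran much faster.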
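To make the reduction step concrete, here is a minimal sketch of how k singular values turn into k derived columns. It uses base R's `svd()` on a small random matrix as a stand-in. The matrix `A`, its size, and `k = 3` are illustrative assumptions, not the data from the question. On data of the size described, a truncated solver such as `RSpectra::svds(A, k)` would compute only the leading `k` triplets.

```r
# Toy stand-in for the 5,000 x 300,000 matrix (sizes are assumptions).
set.seed(1)
A <- matrix(rnorm(20 * 8), nrow = 20, ncol = 8)
k <- 3  # number of singular values to keep (300 in the question)

s   <- svd(A)                    # full SVD: A = U diag(d) V^T
U_k <- s$u[, 1:k, drop = FALSE]  # leading left singular vectors
V_k <- s$v[, 1:k, drop = FALSE]  # leading right singular vectors
d_k <- s$d[1:k]                  # leading singular values

# Row scores in the reduced space: one row per observation, k dense columns.
scores <- U_k %*% diag(d_k, nrow = k)

# Projecting A onto the right singular vectors gives the same scores.
stopifnot(isTRUE(all.equal(scores, A %*% V_k)))
dim(scores)
```

Note that `scores` is an ordinary dense matrix, whereas a dfm (as produced by quanteda, for example) is typically stored as a sparse matrix.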
library(doMC)
start_time <- Sys.time()
registerDoMC(cores = 5)
library(e1071)
set.seed(123)  # for reproducibility
svm_tuned_upsample <- tune(
  svm, train.x = train_svd_df[, -1], train.y = as.factor(train_svd_df$Include),
  kernel = "radial", type = "C-classification", parallel = TRUE,
  ranges = list(cost = c(0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 1, 5, 6, 7, 8, 10, 15),
                gamma = c(0.0009, 0.001, 0.002, 0.003, 0.0035, 0.004, 0.0045, 0.005)),
  # tune.control() belongs in `tunecontrol`, not `validation.x`
  tunecontrol = tune.control(sampling = "cross", cross = 10)
)
Sys.time() - start_time  # elapsed time
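A rough way to see where the time goes is to count how many SVM fits the grid above implies. The sketch below uses only base R and the values from the call. The 17-hour figure comes from the description, and the per-fit time is only an average under the assumption that the fits run one after another. As far as I can tell, `e1071::tune()` evaluates the grid in a plain loop and does not use a registered foreach backend, so `registerDoMC()` may not speed it up; this is worth checking against the package documentation.

```r
# Grid copied from the tune() call in the question.
cost  <- c(0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 1, 5, 6, 7, 8, 10, 15)
gamma <- c(0.0009, 0.001, 0.002, 0.003, 0.0035, 0.004, 0.0045, 0.005)
folds <- 10  # cross = 10 in tune.control()

grid <- expand.grid(cost = cost, gamma = gamma)
n_combinations <- nrow(grid)        # 14 * 8 = 112 parameter pairs
n_fits <- n_combinations * folds    # 1120 SVM fits across the CV folds

# Average wall-clock time per fit if the whole run takes 17 hours
# (assumes sequential fits; ignores the final refit of the best model).
secs_per_fit <- 17 * 3600 / n_fits
cat(n_combinations, "combinations,", n_fits, "fits,",
    round(secs_per_fit, 1), "s per fit on average\n")
```

Seen this way, trimming the grid (for instance a coarse pass followed by a finer one around the best region) or using fewer folds reduces the run time roughly in proportion to the number of fits removed.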
Solution
No effective solution to this problem has been found yet.