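Problem description
I have a time series dataset (see below) that contains many zero values. It represents a customer's historical sales by month, with zeros filled in for months without sales. I am trying to forecast the next value in the series.
I am using the timeseriesForecasting plugin, version 1.0.27 (the latest), in Weka 3.8.4.
The settings are very basic: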
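- Target selection: omzet
- Number of time units to forecast: 1
- Time stamp: maand_global
- Perform evaluation: yes
- Base learner (just the default SMOreg, no configuration): weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I "weka.classifiers.functions.supportVector.RegSMOImproved -T 0.001 -V -P 1.0E-12 -L 0.001 -W 1" -K "weka.classifiers.functions.supportVector.polyKernel -E 1.0 -C 250007"
- Lag lengths: custom, minimum = 1, maximum = 11 (half of my dataset)
- "More options" left at their defaults: "powers of time" and "products of time" are both true
- Evaluation: RMSE; evaluate on training: true; evaluate on held out training: 0.1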
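With these settings, the next predicted value is 69288, which makes no sense because it is huge compared with every other value in the series.
Am I doing something wrong, or is this algorithm faulty or unsuitable for this data?
(Note: I tried the same thing with the default MultilayerPerceptron. It outputs 3636, which is lower but still unreasonably high.)
Data (ARFF file):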
@relation QueryResult-weka.filters.unsupervised.attribute.Remove-R2-4
@attribute omzet numeric
@attribute maand_global numeric
@data
85.120003,231
0,232
0,233
0,234
0,235
0,236
0,237
0,238
0,239
0,240
0,241
2354.120117,242
0,243
0,244
1760.160034,245
0,246
0,247
0,248
0,249
0,250
0,251
0,252
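To make the lag settings concrete, here is a minimal Python sketch that rebuilds the same kind of columns the plugin reports under "Transformed training data" in the run log (the time stamp, lags 1 to 11, time squared and cubed, and time multiplied by each lag). It is a hand-rolled reconstruction for illustration only, not the plugin's code. The helper name build_rows and the 20/2 train/test split (taken from the instance counts in the log) are my own assumptions.

```python
# Monthly sales (omzet) and time stamps (maand_global) from the ARFF data above.
omzet = [85.120003] + [0.0] * 10 + [2354.120117, 0.0, 0.0, 1760.160034] + [0.0] * 7
maand = list(range(231, 253))

def build_rows(values, times, min_lag=1, max_lag=11):
    """Hypothetical helper: build lagged and derived inputs for each time step."""
    rows = []
    for i, (y, t) in enumerate(zip(values, times)):
        # Lag k is the target value k steps earlier; None when no history exists yet.
        lags = [values[i - k] if i - k >= 0 else None
                for k in range(min_lag, max_lag + 1)]
        powers = [t ** 2, t ** 3]                       # "powers of time"
        products = [None if v is None else t * v for v in lags]  # time * lag
        rows.append({"target": y, "inputs": [t] + lags + powers + products})
    return rows

train_size = 20  # assumption: 22 instances with 2 held out, as in the log
rows = build_rows(omzet[:train_size], maand[:train_size])

# Rows whose full 11-month lag history is available.
complete = [r for r in rows if None not in r["inputs"]]

n_inputs = len(rows[0]["inputs"])
n_complete = len(complete)
n_nonzero = sum(1 for r in complete if r["target"] != 0)
print(n_inputs, n_complete, n_nonzero)
```

Under these assumptions there are 25 input columns but only 9 training rows with a complete lag history, and only 2 of those rows have a non-zero target. The 9 agrees with the "N 9" reported in the training evaluation of the log, though I have not confirmed that the plugin counts rows in exactly this way.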
Full run log:
=== Run information ===
Scheme:
SMOreg -C 1.0 -N 0 -I "RegSMOImproved -T 0.001 -V -P 1.0E-12 -L 0.001 -W 1" -K "polyKernel -E 1.0 -C 250007"
Lagged and derived variable options:
-F omzet -L 1 -M 11 -G maand_global
Relation: QueryResult-weka.filters.unsupervised.attribute.Remove-R2-4
Instances: 22
Attributes: 2
omzet
maand_global
Transformed training data:
omzet
maand_global
Lag_omzet-1
Lag_omzet-2
Lag_omzet-3
Lag_omzet-4
Lag_omzet-5
Lag_omzet-6
Lag_omzet-7
Lag_omzet-8
Lag_omzet-9
Lag_omzet-10
Lag_omzet-11
maand_global^2
maand_global^3
maand_global*Lag_omzet-1
maand_global*Lag_omzet-2
maand_global*Lag_omzet-3
maand_global*Lag_omzet-4
maand_global*Lag_omzet-5
maand_global*Lag_omzet-6
maand_global*Lag_omzet-7
maand_global*Lag_omzet-8
maand_global*Lag_omzet-9
maand_global*Lag_omzet-10
maand_global*Lag_omzet-11
omzet:
SMOreg
weights (not support vectors):
+ 0.0392 * (normalized) maand_global
+ 0.0165 * (normalized) Lag_omzet-1
+ 0.0135 * (normalized) Lag_omzet-2
+ 0.3874 * (normalized) Lag_omzet-3
- 0.0052 * (normalized) Lag_omzet-4
- 0.005 * (normalized) Lag_omzet-5
- 0.2884 * (normalized) Lag_omzet-6
+ 0.002 * (normalized) Lag_omzet-7
- 0.003 * (normalized) Lag_omzet-8
- 0.0282 * (normalized) Lag_omzet-9
- 0.0351 * (normalized) Lag_omzet-10
+ 0.5198 * (normalized) Lag_omzet-11
+ 0.0401 * (normalized) maand_global^2
+ 0.0409 * (normalized) maand_global^3
+ 0.0185 * (normalized) maand_global*Lag_omzet-1
+ 0.0134 * (normalized) maand_global*Lag_omzet-2
+ 0.3817 * (normalized) maand_global*Lag_omzet-3
- 0.0071 * (normalized) maand_global*Lag_omzet-4
- 0.0058 * (normalized) maand_global*Lag_omzet-5
- 0.287 * (normalized) maand_global*Lag_omzet-6
+ 0.0035 * (normalized) maand_global*Lag_omzet-7
- 0.0014 * (normalized) maand_global*Lag_omzet-8
- 0.0282 * (normalized) maand_global*Lag_omzet-9
- 0.0351 * (normalized) maand_global*Lag_omzet-10
+ 0.5198 * (normalized) maand_global*Lag_omzet-11
- 0.1092
Number of kernel evaluations: 210 (97.266% cached)
=== Future predictions from end of training data ===
inst# omzet
231 85.12
232 0
233 0
234 0
235 0
236 0
237 0
238 0
239 0
240 0
241 0
242 2354.1201
243 0
244 0
245 1760.16
246 0
247 0
248 0
249 0
250 0
251* -4728.9192
=== Future predictions from end of test data ===
inst# omzet
251 0
252 0
253* 69288.7472
=== Evaluation on training data ===
Target 1-step-ahead
========================================
omzet
N 9
Root mean squared error 2.2847
Total number of instances: 20
=== Evaluation on test data ===
Target 1-step-ahead
========================================
omzet
N 2
Root mean squared error 4670.1525
Total number of instances: 2
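As a sanity check on the test-set figure, the RMSE above can be partly reconstructed by hand. This is a minimal Python sketch under one assumption that the log does not state explicitly: that the first test-set prediction (instance 251, actual value 0) is the same 251* forecast of -4728.9192 made from the end of the training data. The function name rmse is my own.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error over paired actual/predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

reported_rmse = 4670.1525   # "Evaluation on test data" in the log
n = 2                       # N reported for the test evaluation

# Assumption: the error on instance 251 comes from the 251* forecast.
first_error = 0.0 - (-4728.9192)

# RMSE^2 * N = first_error^2 + second_error^2, so solve for the unknown magnitude.
second_error = math.sqrt(reported_rmse ** 2 * n - first_error ** 2)

# Plugging both errors back in should reproduce the reported RMSE.
check = rmse([0.0, 0.0], [-first_error, second_error])
print(round(second_error, 1), round(check, 4))
```

If that assumption holds, the second test prediction (for instance 252) must also have been off by several thousand, even though the actual value was 0. The sign of that second error cannot be recovered from the RMSE alone.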