Weka forecast gives wrong results

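Problem description

I have a time series dataset (see below) with many zero values. It represents a customer's historical sales by month, with zeros filled in for the months without sales. I am trying to forecast the next item in the series.

I am using the timeseriesForecasting plugin, version 1.0.27 (the latest), in Weka 3.8.4.

The settings are very basic: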

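Target selection: omzet
Number of time units to forecast: 1
Time stamp: maand_global
Perform evaluation: yes
Base learner (just the default SMOreg, not configured): weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I "weka.classifiers.functions.supportVector.RegSMOImproved -T 0.001 -V -P 1.0E-12 -L 0.001 -W 1" -K "weka.classifiers.functions.supportVector.polyKernel -E 1.0 -C 250007"
Lag length: custom, minimum = 1, maximum = 11 (half of my dataset); under "More options", the defaults for "powers of time" and "products of time and lagged variables" are left enabled
Evaluation: RMSE; evaluate on training: true; evaluate on held-out training: 0.1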
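
To make the lag settings above concrete, here is a minimal Python sketch of how a lag range of 1 to 11 plus the two "More options" expands the two original attributes. It is not Weka's implementation: `build_features` is a hypothetical helper, and dropping rows that lack a full lag history is an assumption, although it is consistent with the N = 9 reported for the 20 training instances in the run log.

```python
# Sketch only: mimics the kind of derived attributes listed in the run log,
# not the plugin's actual code.

def build_features(values, timestamps, min_lag=1, max_lag=11):
    """Return (rows, names); rows without a complete lag history are skipped."""
    lags = range(min_lag, max_lag + 1)
    names = (
        ["time"]
        + [f"lag_{k}" for k in lags]
        + ["time^2", "time^3"]              # powers of time
        + [f"time*lag_{k}" for k in lags]   # products of time and lagged values
    )
    rows = []
    for i in range(max_lag, len(values)):
        t = timestamps[i]
        lagged = [values[i - k] for k in lags]
        predictors = [t] + lagged + [t ** 2, t ** 3] + [t * v for v in lagged]
        rows.append((predictors, values[i]))
    return rows, names


# The 20 training months (231..250) from the data in the question.
sales = [85.120003] + [0.0] * 10 + [2354.120117, 0.0, 0.0, 1760.160034] + [0.0] * 5
months = list(range(231, 251))

rows, names = build_features(sales, months)
print(len(names), "predictors,", len(rows), "complete rows")
```

Under these assumptions the sketch yields 25 predictors (the same count as the weights in the log) for only 9 complete rows, which may be worth keeping in mind when reading the forecast.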

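With these settings, the next predicted value is 69288, which makes no sense, because it is huge compared with the other values.

Am I doing something wrong, or is this algorithm wrong or unsuitable?

(Note: I tried the same with the default MultilayerPerceptron, which outputs 3636; that is lower, but still unreasonably high.)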

Data (ARFF file):

@relation QueryResult-weka.filters.unsupervised.attribute.Remove-R2-4

@attribute omzet numeric
@attribute maand_global numeric

@data
85.120003,231
0,232
0,233
0,234
0,235
0,236
0,237
0,238
0,239
0,240
0,241
2354.120117,242
0,243
0,244
1760.160034,245
0,246
0,247
0,248
0,249
0,250
0,251
0,252
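
As a quick way to quantify "many zero values", the following Python sketch rebuilds the series above from its three non-zero months and compares the reported forecast with the largest observed value. The variable names are illustrative only, and the forecast figure is copied from the run log.

```python
# Non-zero months taken from the ARFF data above; every other month is 0.
sales_by_month = {231: 85.120003, 242: 2354.120117, 245: 1760.160034}
series = [sales_by_month.get(m, 0.0) for m in range(231, 253)]

zero_count = series.count(0.0)
zero_share = zero_count / len(series)
largest = max(series)

# Forecast for instance 253 as reported in the run log.
reported_forecast = 69288.7472
ratio = reported_forecast / largest

print(f"{len(series)} months, {zero_count} zeros ({zero_share:.0%})")
print(f"forecast is about {ratio:.1f} times the largest observed value")
```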

Full run log:

=== Run information ===

Scheme:
    SMOreg -C 1.0 -N 0 -I "RegSMOImproved -T 0.001 -V -P 1.0E-12 -L 0.001 -W 1" -K "polyKernel -E 1.0 -C 250007"

Lagged and derived variable options:
    -F omzet -L 1 -M 11 -G maand_global

Relation:     QueryResult-weka.filters.unsupervised.attribute.Remove-R2-4
Instances:    22
Attributes:   2
              omzet
              maand_global

Transformed training data:

              omzet
              maand_global
              Lag_omzet-1
              Lag_omzet-2
              Lag_omzet-3
              Lag_omzet-4
              Lag_omzet-5
              Lag_omzet-6
              Lag_omzet-7
              Lag_omzet-8
              Lag_omzet-9
              Lag_omzet-10
              Lag_omzet-11
              maand_global^2
              maand_global^3
              maand_global*Lag_omzet-1
              maand_global*Lag_omzet-2
              maand_global*Lag_omzet-3
              maand_global*Lag_omzet-4
              maand_global*Lag_omzet-5
              maand_global*Lag_omzet-6
              maand_global*Lag_omzet-7
              maand_global*Lag_omzet-8
              maand_global*Lag_omzet-9
              maand_global*Lag_omzet-10
              maand_global*Lag_omzet-11

omzet:
SMOreg

weights (not support vectors):
 +       0.0392 * (normalized) maand_global
 +       0.0165 * (normalized) Lag_omzet-1
 +       0.0135 * (normalized) Lag_omzet-2
 +       0.3874 * (normalized) Lag_omzet-3
 -       0.0052 * (normalized) Lag_omzet-4
 -       0.005  * (normalized) Lag_omzet-5
 -       0.2884 * (normalized) Lag_omzet-6
 +       0.002  * (normalized) Lag_omzet-7
 -       0.003  * (normalized) Lag_omzet-8
 -       0.0282 * (normalized) Lag_omzet-9
 -       0.0351 * (normalized) Lag_omzet-10
 +       0.5198 * (normalized) Lag_omzet-11
 +       0.0401 * (normalized) maand_global^2
 +       0.0409 * (normalized) maand_global^3
 +       0.0185 * (normalized) maand_global*Lag_omzet-1
 +       0.0134 * (normalized) maand_global*Lag_omzet-2
 +       0.3817 * (normalized) maand_global*Lag_omzet-3
 -       0.0071 * (normalized) maand_global*Lag_omzet-4
 -       0.0058 * (normalized) maand_global*Lag_omzet-5
 -       0.287  * (normalized) maand_global*Lag_omzet-6
 +       0.0035 * (normalized) maand_global*Lag_omzet-7
 -       0.0014 * (normalized) maand_global*Lag_omzet-8
 -       0.0282 * (normalized) maand_global*Lag_omzet-9
 -       0.0351 * (normalized) maand_global*Lag_omzet-10
 +       0.5198 * (normalized) maand_global*Lag_omzet-11
 -       0.1092



Number of kernel evaluations: 210 (97.266% cached)

=== Future predictions from end of training data ===
inst#         omzet 
231           85.12 
232               0 
233               0 
234               0 
235               0 
236               0 
237               0 
238               0 
239               0 
240               0 
241               0 
242       2354.1201 
243               0 
244               0 
245         1760.16 
246               0 
247               0 
248               0 
249               0 
250               0 
251*      -4728.9192 

=== Future predictions from end of test data ===
inst#       omzet 
251             0 
252             0 
253*      69288.7472 

=== Evaluation on training data ===
Target                      1-step-ahead
========================================
omzet
  N                                    9
  Root mean squared error         2.2847

Total number of instances: 20

=== Evaluation on test data ===
Target                      1-step-ahead
========================================
omzet
  N                                    2
  Root mean squared error      4670.1525

Total number of instances: 2
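
For context on the two RMSE figures above, a naive baseline can be computed on the same split. The Python sketch below assumes the split implied by the log (months 231 to 250 for training, 251 and 252 held out) and uses a hypothetical `rmse` helper; it is a sanity check, not a fix.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Training months 231..250 and held-out months 251..252 from the question.
train = [85.120003] + [0.0] * 10 + [2354.120117, 0.0, 0.0, 1760.160034] + [0.0] * 5
test = [0.0, 0.0]

train_mean = sum(train) / len(train)

# Baseline 1: always predict zero. Baseline 2: always predict the training mean.
zero_baseline = rmse(test, [0.0] * len(test))
mean_baseline = rmse(test, [train_mean] * len(test))

print(f"always-zero RMSE on held-out months: {zero_baseline:.4f}")
print(f"training-mean RMSE on held-out months: {mean_baseline:.4f}")
```

Both baselines score far below the 4670.1525 reported for the held-out months, while the training RMSE of 2.2847 is very small, a gap that is typical of a model fitted too closely to very few rows.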


forecasting time-series weka