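Weka prediction goes wrong

Problem description

I have a time series dataset (see below) with a lot of zero values. It represents a customer's historical sales data by month, with zeros filled in for months without sales. I am trying to predict the next item in the series.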

I am using the timeseriesForecasting plugin, version 1.0.27 (the latest), in Weka 3.8.4.

The settings are very basic:

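  • Target selection: omzet
  • Number of time units to forecast: 1
  • Time stamp: maand_global
  • Perform evaluation: yes
  • Base learner (this is just the default SMOreg, with no configuration): weka.classifiers.functions.SMOreg -C 1.0 -N 0 -I "weka.classifiers.functions.supportVector.RegSMOImproved -T 0.001 -V -P 1.0E-12 -L 0.001 -W 1" -K "weka.classifiers.functions.supportVector.polyKernel -E 1.0 -C 250007"
  • Lag lengths: custom: minimum = 1, maximum = 11 (half of my dataset); the "More options" defaults for "powers of time" and "products of time" are true
  • Evaluation: RMSE; evaluate on training: true; evaluate on held-out training: 0.1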
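To make the lag setting concrete, here is a minimal sketch in plain Python of how lagged inputs could be built for this series. It is not the plugin's implementation: `make_lagged_rows` is a hypothetical helper, and the sketch assumes that rows without a full 11-month history are dropped and that the last 2 of the 22 instances are held out (the 0.1 setting). Under those assumptions only 9 training rows remain for 25 input attributes, which is consistent with the N = 9 shown in the training evaluation of the run log.

```python
# Sketch only: plain Python, not the timeseriesForecasting plugin's code.
# make_lagged_rows is a hypothetical helper introduced for illustration.

def make_lagged_rows(series, min_lag=1, max_lag=11):
    """Pair each value with its lagged predecessors.

    Assumption: rows that lack a complete lag history are dropped.
    """
    rows = []
    for t in range(max_lag, len(series)):
        lags = [series[t - k] for k in range(min_lag, max_lag + 1)]
        rows.append((series[t], lags))
    return rows


# omzet for maand_global 231..250: the 22 instances minus the 2 held out (0.1).
train = [85.120003] + [0.0] * 10 + [2354.120117, 0.0, 0.0, 1760.160034] + [0.0] * 5

rows = make_lagged_rows(train)

# Inputs listed under "Transformed training data" in the run log:
# the time stamp, 11 lags, 2 powers of time, 11 time*lag products.
n_inputs = 1 + 11 + 2 + 11

print(f"{len(train)} training instances -> {len(rows)} rows with complete lags")
print(f"{n_inputs} input attributes per row")
```

With far fewer complete rows than input attributes, a reduced maximum lag or disabling the power and product terms may be worth trying, though this sketch alone does not show that this is the cause of the large forecast.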

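With these settings, the next predicted value is 69288, which makes no sense, because it is huge compared with the other values.

Am I doing something wrong, or is this algorithm faulty or unsuitable?

(Note: I tried the same with the default MultilayerPerceptron, which produces 3636 as output. That is lower, but still unreasonably high.)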

Data (ARFF file):

@relation QueryResult-weka.filters.unsupervised.attribute.Remove-R2-4

@attribute omzet numeric
@attribute maand_global numeric

@data
85.120003,231
0,232
0,233
0,234
0,235
0,236
0,237
0,238
0,239
0,240
0,241
2354.120117,242
0,243
0,244
1760.160034,245
0,246
0,247
0,248
0,249
0,250
0,251
0,252
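As a quick sanity check on scale, the following sketch (plain Python, standard library only) summarizes the rows above and compares them with the reported forecast. The value 69288.7472 is the starred forecast from the run log; `parse_rows` is a hypothetical helper name and not part of Weka.

```python
# Sketch only: summarizes the @data rows above.
# parse_rows is a hypothetical helper introduced for illustration.

DATA = """\
85.120003,231
0,232
0,233
0,234
0,235
0,236
0,237
0,238
0,239
0,240
0,241
2354.120117,242
0,243
0,244
1760.160034,245
0,246
0,247
0,248
0,249
0,250
0,251
0,252
"""


def parse_rows(text):
    """Parse 'omzet,maand_global' lines into (float, int) tuples."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            omzet, maand = line.split(",")
            rows.append((float(omzet), int(maand)))
    return rows


rows = parse_rows(DATA)
sales = [omzet for omzet, _ in rows]
nonzero = [v for v in sales if v != 0]

forecast = 69288.7472  # starred value for inst# 253 in the run log

print(f"{len(rows)} months, {len(nonzero)} with sales")
print(f"largest observed omzet: {max(sales):.2f}")
print(f"forecast / largest observed: {forecast / max(sales):.1f}x")
```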

Full run log:

=== Run information ===

Scheme:
    SMOreg -C 1.0 -N 0 -I "RegSMOImproved -T 0.001 -V -P 1.0E-12 -L 0.001 -W 1" -K "polyKernel -E 1.0 -C 250007"

Lagged and derived variable options:
    -F omzet -L 1 -M 11 -G maand_global

Relation:     QueryResult-weka.filters.unsupervised.attribute.Remove-R2-4
Instances:    22
Attributes:   2
              omzet
              maand_global

Transformed training data:

              omzet
              maand_global
              Lag_omzet-1
              Lag_omzet-2
              Lag_omzet-3
              Lag_omzet-4
              Lag_omzet-5
              Lag_omzet-6
              Lag_omzet-7
              Lag_omzet-8
              Lag_omzet-9
              Lag_omzet-10
              Lag_omzet-11
              maand_global^2
              maand_global^3
              maand_global*Lag_omzet-1
              maand_global*Lag_omzet-2
              maand_global*Lag_omzet-3
              maand_global*Lag_omzet-4
              maand_global*Lag_omzet-5
              maand_global*Lag_omzet-6
              maand_global*Lag_omzet-7
              maand_global*Lag_omzet-8
              maand_global*Lag_omzet-9
              maand_global*Lag_omzet-10
              maand_global*Lag_omzet-11

omzet:
SMOreg

weights (not support vectors):
 +       0.0392 * (normalized) maand_global
 +       0.0165 * (normalized) Lag_omzet-1
 +       0.0135 * (normalized) Lag_omzet-2
 +       0.3874 * (normalized) Lag_omzet-3
 -       0.0052 * (normalized) Lag_omzet-4
 -       0.005  * (normalized) Lag_omzet-5
 -       0.2884 * (normalized) Lag_omzet-6
 +       0.002  * (normalized) Lag_omzet-7
 -       0.003  * (normalized) Lag_omzet-8
 -       0.0282 * (normalized) Lag_omzet-9
 -       0.0351 * (normalized) Lag_omzet-10
 +       0.5198 * (normalized) Lag_omzet-11
 +       0.0401 * (normalized) maand_global^2
 +       0.0409 * (normalized) maand_global^3
 +       0.0185 * (normalized) maand_global*Lag_omzet-1
 +       0.0134 * (normalized) maand_global*Lag_omzet-2
 +       0.3817 * (normalized) maand_global*Lag_omzet-3
 -       0.0071 * (normalized) maand_global*Lag_omzet-4
 -       0.0058 * (normalized) maand_global*Lag_omzet-5
 -       0.287  * (normalized) maand_global*Lag_omzet-6
 +       0.0035 * (normalized) maand_global*Lag_omzet-7
 -       0.0014 * (normalized) maand_global*Lag_omzet-8
 -       0.0282 * (normalized) maand_global*Lag_omzet-9
 -       0.0351 * (normalized) maand_global*Lag_omzet-10
 +       0.5198 * (normalized) maand_global*Lag_omzet-11
 -       0.1092



Number of kernel evaluations: 210 (97.266% cached)

=== Future predictions from end of training data ===
inst#         omzet 
231           85.12 
232               0 
233               0 
234               0 
235               0 
236               0 
237               0 
238               0 
239               0 
240               0 
241               0 
242       2354.1201 
243               0 
244               0 
245         1760.16 
246               0 
247               0 
248               0 
249               0 
250               0 
251*      -4728.9192 

=== Future predictions from end of test data ===
inst#       omzet 
251             0 
252             0 
253*      69288.7472 

=== Evaluation on training data ===
Target                      1-step-ahead
========================================
omzet
  N                                    9
  Root mean squared error         2.2847

Total number of instances: 20

=== Evaluation on test data ===
Target                      1-step-ahead
========================================
omzet
  N                                    2
  Root mean squared error      4670.1525

Total number of instances: 2
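The gap between the two RMSE figures can be checked by hand. The sketch below assumes, and this is only an assumption, that the starred forecast for instance 251 (-4728.9192) is the first of the two 1-step-ahead test predictions, and that the actual omzet for month 251 is 0, as in the data. Under that reading the reported test RMSE implies a second error of similar size, on the order of 4,600.

```python
import math


def rmse(errors):
    """Root mean squared error of a list of (predicted - actual) differences."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Values copied from the run log.
train_rmse = 2.2847
test_rmse = 4670.1525
n_test = 2

# Assumption: the starred forecast for inst# 251 is the first test prediction;
# the actual omzet for month 251 is 0.
first_error = -4728.9192 - 0.0

# Back out the magnitude of the second error from the RMSE definition:
# test_rmse**2 * n_test = first_error**2 + second_error**2
second_error = math.sqrt(test_rmse ** 2 * n_test - first_error ** 2)

print(f"implied |error| on the second test month: {second_error:.1f}")
print(f"test RMSE / training RMSE: {test_rmse / train_rmse:.0f}x")

# Consistency check: recomputing RMSE from both errors returns the logged value.
assert math.isclose(rmse([first_error, second_error]), test_rmse)
```

A training error near 2 next to a test error in the thousands is the pattern one would expect from a model that fits the few training rows closely but does not generalize. That reading is an interpretation of the log, not something the log states.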
