Problem description
I have the following time series:
data = data.frame(
  Date = rep(c('2017-01-01','2017-02-01','2017-03-01','2017-04-01','2017-05-01','2017-06-01'), times = 3),
  store = rep(c('A','B','C'), each = 6),
  prod_id = rep(c('p1','p2','p3','p4','p5','p6'), times = 3),
  sales = c('12.1','13','15','10','12','9.0','12.5','13.3','14.8','11','10','12.1','13','12.2','11','10.9','13.4','11.1'))
data
Date store prod_id sales
1 2017-01-01 A p1 12.1
2 2017-02-01 A p2 13
3 2017-03-01 A p3 15
4 2017-04-01 A p4 10
5 2017-05-01 A p5 12
6 2017-06-01 A p6 9.0
7 2017-01-01 B p1 12.5
8 2017-02-01 B p2 13.3
9 2017-03-01 B p3 14.8
10 2017-04-01 B p4 11
11 2017-05-01 B p5 10
12 2017-06-01 B p6 12.1
13 2017-01-01 C p1 13
14 2017-02-01 C p2 12.2
15 2017-03-01 C p3 11
16 2017-04-01 C p4 10.9
17 2017-05-01 C p5 13.4
18 2017-06-01 C p6 11.1
The unique values in the columns are:
sapply(data[c('Date','store','prod_id')],unique)
$Date
[1] "2017-01-01" "2017-02-01" "2017-03-01" "2017-04-01" "2017-05-01" "2017-06-01"
$store
[1] "A" "B" "C"
$prod_id
[1] "p1" "p2" "p3" "p4" "p5" "p6"
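I need help forecasting product sales store-wise. A crude approach would be a for loop that first selects each store and then, inside it, a second for loop that goes over each product's time series and forecasts 1 period ahead. Can the nesting functionality of purrr from the tidyverse make this process more efficient? (TS model: auto.arima from the forecast package)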
Solution
Here is one option: convert 'Date' to a proper date class, nest the data by 'store', convert each nested data set to a tsibble, and fit a model to it
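library(dplyr)
library(purrr)
library(fpp3)
library(tsibble)
data %>%
  # monthly index: a plain Date index would be read as daily data with gaps
  mutate(Date = yearmonth(as.Date(Date)),
         sales = as.numeric(sales)) %>%  # sales is stored as character
  group_by(store) %>%
  nest() %>%
  # fit one ARIMA model per store on that store's own tsibble
  mutate(data = map(data, ~ as_tsibble(.x, index = Date) %>%
                      model(arima = ARIMA(sales))))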
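The answer above stops at the fitted models, while the question asks for a 1-period forecast. A minimal sketch of that last step could look like the following. It rebuilds the 18 rows from the question with numeric sales, and it assumes a monthly (yearmonth) index; the column names model, fc and next_sales and the object name result are illustrative only and not part of the original answer. With only six observations per store, the selected ARIMA models will be very simple, so treat the numbers as a demonstration of the workflow rather than a usable forecast.

```r
library(fpp3)   # attaches dplyr, tidyr, tsibble and fable
library(purrr)

# Same rows as the table in the question, with sales stored as numbers
data <- data.frame(
  Date    = rep(c("2017-01-01", "2017-02-01", "2017-03-01",
                  "2017-04-01", "2017-05-01", "2017-06-01"), times = 3),
  store   = rep(c("A", "B", "C"), each = 6),
  prod_id = rep(paste0("p", 1:6), times = 3),
  sales   = c(12.1, 13, 15, 10, 12, 9.0,
              12.5, 13.3, 14.8, 11, 10, 12.1,
              13, 12.2, 11, 10.9, 13.4, 11.1)
)

result <- data %>%
  mutate(Date = yearmonth(as.Date(Date))) %>%   # assumed monthly frequency
  group_by(store) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    # one ARIMA model per store
    model = map(data, ~ as_tsibble(.x, index = Date) %>%
                  model(arima = ARIMA(sales))),
    # forecast one period (one month) ahead for each store
    fc = map(model, ~ forecast(.x, h = 1)),
    # point forecast: the mean of the forecast distribution
    next_sales = map_dbl(fc, ~ .x$.mean)
  )

result %>% select(store, next_sales)
```

If the nested layout is not needed, the same result can probably be reached without purrr by declaring store as the tsibble key, as in as_tsibble(data, key = store, index = Date), and then calling model() and forecast(h = 1) once; fable then fits one model per key value.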