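How to add a confidence interval to a circular histogram (von Mises distribution)

Problem description

I have time data and would like to plot its hourly frequency on a 24-hour clock.

The data are converted to class circular, and the estimates of the "periodic mean" mu and the "concentration" kappa are computed with mle.vonmises().

The plot is generated with ggplot2, using geom_histogram() and coord_polar(). The periodic mean is drawn on the plot with a simple call to geom_vline().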
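For intuition about what the periodic mean represents: to my understanding, the maximum-likelihood estimate of mu for a von Mises distribution is the sample circular mean direction. A minimal base R sketch of that quantity follows; the helper name circ_mean_hours is hypothetical and is not part of the circular package.

```r
# Circular mean of times given in decimal hours on a 24-hour clock.
# circ_mean_hours is a hypothetical helper, not a circular-package function.
circ_mean_hours <- function(hrs) {
  theta <- 2 * pi * hrs / 24                        # hours -> radians
  mean_angle <- atan2(mean(sin(theta)), mean(cos(theta)))
  (mean_angle * 24 / (2 * pi)) %% 24                # radians -> hours in [0, 24)
}

# 23:00 and 01:00 average to midnight on the clock,
# whereas the arithmetic mean would give noon.
circ_mean_hours(c(23, 1))
mean(c(23, 1))
```

This is why an ordinary mean() is not appropriate for times that may straddle midnight.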

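Question

I would like to draw a 95% confidence interval around the mean. Then I would like to check visually whether a given timestamp (e.g. "22:00:00") lies inside the CI. How can I do this with the von Mises distribution and ggplot2?

The code below shows how far I have got.

Data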

timestamps <- c("08:43:48","09:17:52","12:56:22","12:27:32","10:59:23","07:22:45","11:13:59","10:13:26","10:07:01","06:09:56","12:43:17","07:07:35","09:36:44","10:45:00","08:27:36","07:55:35","11:32:56","13:18:35","11:09:51","09:46:33","06:59:12","10:19:36","09:39:47","09:39:46","18:23:54")

Code

library(lubridate)
library(circular)
library(ggplot2)

## Convert from char to hours
timestamps_hrs <- as.numeric(hms(timestamps)) / 3600

## Convert to class circular
timestamps_hrs_circ <- circular(timestamps_hrs,units = "hours",template = "clock24")

## Estimate the periodic mean and the concentration 
## from the von Mises distribution
estimates <- mle.vonmises(timestamps_hrs_circ)
periodic_mean <- estimates$mu %% 24
concentration <- estimates$kappa

## Clock plot // Circular Histogram
clock01 <- ggplot(data.frame(timestamps_hrs_circ),aes(x = timestamps_hrs_circ)) +
  geom_histogram(breaks = seq(0,24),colour = "blue",fill = "lightblue") +
  coord_polar() + 
  scale_x_continuous("", limits = c(0, 24), breaks = seq(0, 24), minor_breaks = NULL) +
  theme_light()

clock01

## Add the periodic_mean
clock01 + 
  geom_vline(xintercept = as.numeric(periodic_mean),color = "red",linetype = 3,size = 1.25) 

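This produces the following plot:

[Figure: circular histogram of the hourly frequencies on a 24-hour clock, with the periodic mean drawn as a dotted red line]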

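Solution

I think I have found an approximate solution. Since we know the parameters mu and kappa (the periodic mean and the concentration, respectively), we know the distribution. This in turn means that we know the density at any given timestamp, and that we can calculate the density cutoff for the 95% level, i.e. the density at the 2.5% quantile of the fitted distribution.

With this in hand, we generate a timestamp for every minute of the day. We transform the timestamps as needed, compute their densities, and compare them with the cutoff.

This way we know, to a precision of 1 minute, whether a given time lies within the confidence interval.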
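To make the density idea concrete, here is a minimal base R sketch of the von Mises density on a 24-hour clock. It is a plain re-implementation for illustration, not the circular package's code. The function name dvm_hours and the parameter values are hypothetical, and its values may differ from dvonmises() by a constant scale factor depending on the units used.

```r
# von Mises density on a 24-hour clock, written out in base R.
# hrs and mu_hrs are in decimal hours; kappa is the concentration.
# dvm_hours is a hypothetical helper, not a circular-package function.
dvm_hours <- function(hrs, mu_hrs, kappa) {
  theta <- 2 * pi * hrs / 24            # hours -> radians
  mu    <- 2 * pi * mu_hrs / 24
  dens_rad <- exp(kappa * cos(theta - mu)) / (2 * pi * besselI(kappa, nu = 0))
  dens_rad * 2 * pi / 24                # density per hour rather than per radian
}

# Hypothetical parameter values, for illustration only
mu_hrs <- 10
kappa  <- 2

# The density is highest at the mean and lowest 12 hours away from it
dvm_hours(c(10, 16, 22), mu_hrs, kappa)

# It integrates to (approximately) 1 over the full 24-hour circle
integrate(dvm_hours, lower = 0, upper = 24, mu_hrs = mu_hrs, kappa = kappa)$value
```

Because the density is symmetric and unimodal around the mean, keeping every time whose density is at least the value at the 2.5% quantile selects the central arc between the 2.5% and 97.5% quantiles.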

Code

(This assumes the code from the question has already been run.)

library(dplyr)

## density cutoff: the density at the lower 2.5% quantile
quantile <- qvonmises((1 - 0.95)/2, mu = periodic_mean, kappa = concentration)
cutoff <- dvonmises(quantile, mu = periodic_mean, kappa = concentration)

## generate a timestamp for every minute in a day
## then the transformations needed
ts_1min <- format(seq.POSIXt(as.POSIXct(Sys.Date()), as.POSIXct(Sys.Date()+1), by = "1 min"), "%H:%M:%S", tz = "GMT")
ts_1min_hrs <- as.numeric(hms(ts_1min)) / 3600
ts_1min_hrs_circ <- circular(ts_1min_hrs, units = "hours", template = "clock24")

## generate densities to compare with the cutoff
dens_1min <- dvonmises(ts_1min_hrs_circ, mu = periodic_mean, kappa = concentration)

## compare: vector of FALSE/TRUE
feat_1min <- dens_1min >= cutoff
df_1min_feat <- data.frame(ts = ts_1min_hrs_circ, feature = feat_1min)

## get the min and max time of the CI
CI <- df_1min_feat %>% filter(feature == TRUE) %>% summarise(min = min(ts), max = max(ts))
CI
#        min      max
#   5.283333 14.91667

Using the above information and geom_rect(), we can get what we wanted.
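The geom_rect() step can be sketched as follows. This is a minimal, self-contained sketch under stated assumptions, not the original plotting code: it re-parses the question's timestamps with base R so that only ggplot2 is needed, copies the interval bounds from the CI output, and uses a band height (ymax = 5), fill colour and transparency chosen purely for illustration.

```r
library(ggplot2)

timestamps <- c("08:43:48","09:17:52","12:56:22","12:27:32","10:59:23",
                "07:22:45","11:13:59","10:13:26","10:07:01","06:09:56",
                "12:43:17","07:07:35","09:36:44","10:45:00","08:27:36",
                "07:55:35","11:32:56","13:18:35","11:09:51","09:46:33",
                "06:59:12","10:19:36","09:39:47","09:39:46","18:23:54")

# Decimal hours, parsed with base R
hrs <- vapply(strsplit(timestamps, ":", fixed = TRUE),
              function(p) sum(as.numeric(p) / c(1, 60, 3600)), numeric(1))

# Interval bounds copied from the CI output (assumed not to wrap past midnight)
ci <- data.frame(min = 5.283333, max = 14.91667)

clock02 <- ggplot(data.frame(hrs = hrs), aes(x = hrs)) +
  # Shaded band first, so the histogram bars are drawn on top of it.
  # ymax = 5 is an assumed height that matches the tallest bar here.
  geom_rect(data = ci, inherit.aes = FALSE,
            aes(xmin = min, xmax = max, ymin = 0, ymax = 5),
            fill = "green", alpha = 0.3) +
  geom_histogram(breaks = seq(0, 24), colour = "blue", fill = "lightblue") +
  coord_polar() +
  scale_x_continuous("", limits = c(0, 24), breaks = seq(0, 24), minor_breaks = NULL) +
  theme_light()

clock02
```

In the session from the question, the same geom_rect() layer could presumably be added to clock01 with the CI data frame in place of ci.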

The result is shown in the figure below:

[Figure: the circular histogram with the 95% interval highlighted]
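Besides the visual check, the same test can be done numerically. The sketch below uses base R only; the helper names to_hours and in_arc are hypothetical, and the bounds are copied from the CI output. The modular comparison also handles an interval that wraps past midnight, which a plain min/max comparison would not.

```r
# Convert "HH:MM:SS" strings to decimal hours (hypothetical helper)
to_hours <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE),
         function(p) sum(as.numeric(p) / c(1, 60, 3600)), numeric(1))
}

# TRUE if hrs lies on the arc running clockwise from lo to hi
# on a 24-hour clock (hypothetical helper)
in_arc <- function(hrs, lo, hi) {
  ((hrs - lo) %% 24) <= ((hi - lo) %% 24)
}

# Interval bounds copied from the CI output
ci_min <- 5.283333
ci_max <- 14.91667

in_arc(to_hours("22:00:00"), ci_min, ci_max)  # FALSE: 22:00 is outside the interval
in_arc(to_hours("10:30:00"), ci_min, ci_max)  # TRUE: 10:30 is inside the interval
```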

I hope someone else can benefit from this too.