How should I interpret the divergence and low effective sample size warnings?

Problem description

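I am running a tutorial that teaches beginners how to do regression with PyMC3. I am using the Ted talk data as an example, trying to find out how the number of comments, the number of transcript languages, and the length of the video predict the popularity of a Ted talk. I ran the following PyMC3 code: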

    import pymc3 as pm

    # ted_talk is assumed to be a pandas DataFrame loaded beforehand.
    with pm.Model() as model:
        # Priors for the intercept, the three slopes, and the noise scale
        intercept = pm.Normal('Intercept', mu=5, sigma=3)
        beta_duration = pm.Normal('duration', mu=0.05, sigma=0.3)
        beta_languages = pm.Normal('languages', sigma=0.1)
        beta_comments = pm.Normal('comments', sigma=0.1)
        epsilon = pm.HalfCauchy('epsilon', 5)

        mu = (intercept + beta_duration * ted_talk['duration']
              + beta_languages * ted_talk['languages'] + beta_comments * ted_talk['comments'])
        likelihood = pm.Normal('likelihood', mu=mu, sigma=epsilon, observed=ted_talk['views'])
        trace = pm.sample(4000, tune=2000, chains=3)

Result:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (3 chains in 1 job)
NUTS: [epsilon,comments,languages,duration,Intercept]

Sampling 3 chains for 2_000 tune and 4_000 draw iterations (6_000 + 12_000 draws total) took 91 seconds.
There were 973 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.5480812333460533, but should be close to 0.8. Try to increase the number of tuning steps.
There were 973 divergences after tuning. Increase `target_accept` or reparameterize.
There were 973 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.

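Question 1: What are the possible reasons the MCMC simulation still returns divergences after tuning? The program suggests both increasing `target_accept` and increasing the number of tuning steps. Which of the two do you think is more useful? If the answer depends heavily on the situation, I would like to know why.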
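
Two remedies that are often suggested for warnings like the ones above are putting the predictors on a comparable scale and passing a higher `target_accept` to `pm.sample`. The sketch below only illustrates that idea under assumptions that are not in the question: the helper names `standardize` and `sample_standardized` are hypothetical, the inputs are assumed to be plain numeric sequences, and the slope priors are placeholders rather than recommendations. Whether this removes the divergences for the Ted talk data is untested.

```python
import math


def standardize(values):
    """Return z-scores: subtract the mean, divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def sample_standardized(duration, languages, comments, views, target_accept=0.95):
    """Hypothetical helper: fit the regression on standardized predictors."""
    # Imported lazily so that standardize() can be used without PyMC3 installed.
    import numpy as np
    import pymc3 as pm

    x_duration = np.asarray(standardize(duration))
    x_languages = np.asarray(standardize(languages))
    x_comments = np.asarray(standardize(comments))

    with pm.Model():
        # Placeholder priors (assumption): slopes are now per standard deviation.
        intercept = pm.Normal("Intercept", mu=5, sigma=3)
        beta_duration = pm.Normal("duration", mu=0, sigma=1)
        beta_languages = pm.Normal("languages", mu=0, sigma=1)
        beta_comments = pm.Normal("comments", mu=0, sigma=1)
        epsilon = pm.HalfCauchy("epsilon", beta=5)

        mu = (intercept
              + beta_duration * x_duration
              + beta_languages * x_languages
              + beta_comments * x_comments)
        pm.Normal("likelihood", mu=mu, sigma=epsilon, observed=np.asarray(views))

        # A higher target_accept makes NUTS take smaller steps: slower sampling,
        # but usually fewer divergences.
        return pm.sample(4000, tune=2000, chains=3, target_accept=target_accept)
```

A call such as `sample_standardized(ted_talk['duration'], ted_talk['languages'], ted_talk['comments'], ted_talk['views'])` would be one way to try it. If divergences persist even at a high `target_accept`, that points toward the model or the scale of the data rather than the sampler settings.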

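Question 2: If the number of effective samples is too small, what are the potential problems? I have not seen any "threshold" for deciding whether the number of effective samples is too large or too small (including in mcmc_diagnostic). How many effective samples do you consider reasonable for a Bayesian regression model?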
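
To make the notion of "effective samples" concrete, here is a minimal pure-Python sketch of a single-chain estimate, ESS = N / (1 + 2 Σ ρ_k), where the sum over autocorrelations ρ_k stops at the first non-positive value. This is a simplified textbook estimator, not the one PyMC3 or ArviZ uses (ArviZ's `az.ess` works across chains and is what one would normally call on a real trace). The function names and the simulated AR(1) chains are assumptions made for illustration only.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(chain):
    """Simplified ESS: N / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(chain)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = autocorrelation(chain, lag)
        if rho <= 0:  # truncate at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def ar1_chain(n, phi, seed):
    """Simulate x_t = phi * x_{t-1} + noise, a stand-in for a sticky sampler."""
    rng = random.Random(seed)
    x = 0.0
    chain = []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        chain.append(x)
    return chain


n = 2000
independent = ar1_chain(n, phi=0.0, seed=1)  # draws with no autocorrelation
sticky = ar1_chain(n, phi=0.9, seed=1)       # strongly autocorrelated draws

ess_independent = effective_sample_size(independent)
ess_sticky = effective_sample_size(sticky)
print(f"independent chain: ESS ~ {ess_independent:.0f} of {n}")
print(f"sticky chain:      ESS ~ {ess_sticky:.0f} of {n}")
```

The strongly autocorrelated chain carries far fewer effective draws than its length suggests, which is the situation the "smaller than 10%" warning in the output above is flagging. A low ESS mainly means that posterior means and, even more so, tail quantiles are estimated with more Monte Carlo error than the raw number of draws implies.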

Thank you very much for your time! Your help is greatly appreciated!
