Manual interaction plot for a linear regression in R

Problem description

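I am trying to predict the mean abundance of animals seen during different moon phases (a factor), using log-transformed abundance data (which gives a better fit) and a few other variables. The best model (lowest AIC) turned out to include the interaction of Phase with survey Duration, plus cloud cover (Duration and Clouds are both continuous):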

LMoon <- lm(ln ~ Phase * Duration + Clouds, data = abund)

summary(LMoon)

Call:
lm(formula = ln ~ Phase * Duration + Clouds, data = abund)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.75416 -0.46311  0.09522  0.46591  1.85978 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       0.382031   0.876865   0.436 0.664125    
Phase2            2.130065   1.226305   1.737 0.085851 .  
Phase3            1.971060   1.818542   1.084 0.281351    
Phase4            0.608043   1.140122   0.533 0.595146    
Phase5            4.786674   1.151850   4.156 7.44e-05 ***
Phase6            0.958706   1.046831   0.916 0.362238    
Phase7            0.254711   3.425214   0.074 0.940888    
Phase8            0.865995   1.043916   0.830 0.409005    
Duration          0.069153   0.035407   1.953 0.053952 .  
Clouds           -0.004259   0.002401  -1.774 0.079494 .  
Phase2:Duration  -0.087843   0.047818  -1.837 0.069545 .  
Phase3:Duration  -0.089908   0.069652  -1.291 0.200109    
Phase4:Duration  -0.005424   0.046675  -0.116 0.907749    
Phase5:Duration  -0.172016   0.049369  -3.484 0.000768 ***
Phase6:Duration  -0.035597   0.041435  -0.859 0.392583    
Phase7:Duration   0.024084   0.176773   0.136 0.891939    
Phase8:Duration  -0.033424   0.042064  -0.795 0.428963    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7721 on 89 degrees of freedom
Multiple R-squared:  0.3368,	Adjusted R-squared:  0.2176 
F-statistic: 2.825 on 16 and 89 DF,  p-value: 0.0009894

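Now, because of this interaction, I need to draw an interaction plot (the CIs are too wide when plotting lsmeans). I tried the different functions mentioned here, but none of them worked. Apparently I need to calculate and plot everything by hand, like this: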

library(ggplot2)  # provides qplot() and geom_abline()

# Per-phase intercepts: reference intercept plus each Phase offset
intercepts <- c(coef(LMoon)["(Intercept)"],
                coef(LMoon)["(Intercept)"] + coef(LMoon)["Phase2"],
                coef(LMoon)["(Intercept)"] + coef(LMoon)["Phase3"],
                coef(LMoon)["(Intercept)"] + coef(LMoon)["Phase4"],
                coef(LMoon)["(Intercept)"] + coef(LMoon)["Phase5"],
                coef(LMoon)["(Intercept)"] + coef(LMoon)["Phase6"],
                coef(LMoon)["(Intercept)"] + coef(LMoon)["Phase7"],
                coef(LMoon)["(Intercept)"] + coef(LMoon)["Phase8"])

# Per-phase slopes: reference Duration slope plus each interaction term.
# The grouping column is named Phase so that aes(color = Phase) can find it.
lines.df <- data.frame(intercepts = intercepts,
                       slopes = c(coef(LMoon)["Duration"],
                                  coef(LMoon)["Duration"] + coef(LMoon)["Phase2:Duration"],
                                  coef(LMoon)["Duration"] + coef(LMoon)["Phase3:Duration"],
                                  coef(LMoon)["Duration"] + coef(LMoon)["Phase4:Duration"],
                                  coef(LMoon)["Duration"] + coef(LMoon)["Phase5:Duration"],
                                  coef(LMoon)["Duration"] + coef(LMoon)["Phase6:Duration"],
                                  coef(LMoon)["Duration"] + coef(LMoon)["Phase7:Duration"],
                                  coef(LMoon)["Duration"] + coef(LMoon)["Phase8:Duration"]),
                       Phase = levels(abund$Phase))

qplot(x = Duration, y = Sp2, color = Phase, data = abund) +
  geom_abline(aes(intercept = intercepts, slope = slopes, color = Phase), data = lines.df)
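The same intercepts and slopes can be built more compactly by indexing coef() by name. This is only a minimal sketch: it assumes the coefficient names shown in the summary above, and the simulated abund data frame is a hypothetical stand-in (not the real survey data) that exists only to make the block runnable. Note that lines built this way correspond to Clouds = 0, because the Clouds term is left out of the intercept.

```r
# Hypothetical stand-in data (NOT the real survey data), same column names
set.seed(42)
n <- 106
abund <- data.frame(
  Phase    = factor(rep(1:8, length.out = n)),
  Duration = sample(15:45, n, replace = TRUE),
  Clouds   = sample(0:100, n, replace = TRUE)
)
abund$ln <- 0.4 + 0.05 * abund$Duration - 0.004 * abund$Clouds +
  rnorm(n, sd = 0.7)
LMoon <- lm(ln ~ Phase * Duration + Clouds, data = abund)

b   <- coef(LMoon)
lev <- levels(abund$Phase)

# Phase 1 is the reference level, so its offsets are 0
int_off   <- c(0, b[paste0("Phase", lev[-1])])
slope_off <- c(0, b[paste0("Phase", lev[-1], ":Duration")])

lines.df <- data.frame(
  Phase      = lev,
  intercepts = unname(b["(Intercept)"] + int_off),  # log scale, Clouds = 0
  slopes     = unname(b["Duration"] + slope_off)    # log scale
)
print(lines.df)
```

Looking the coefficients up by name avoids typing each term eight times and keeps working if the number of Phase levels changes.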

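The plot I get is wrong, because the y values are on the original raw scale, while the lines are based on an lm fitted to the log-transformed data.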

[Figure: interaction plot of abundance against duration, by lunar phase]

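To back-transform this, I was told that I will not actually end up with straight lines. Instead of using abline(), I should create a set of, say, 100 new x values covering the range of the Duration data and use the coefficients to calculate the predicted y values. Then I should plot these with lines(), and the result should look like a smooth curve.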
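Writing the model out for a single phase k shows why the back-transformed lines bend. This assumes ln is the natural log of Sp2 (the post does not say how ln was computed); the β symbols are introduced here purely for illustration and stand for the coefficients in the summary above, with β_k = β_{k:D} = 0 for the reference phase.

```latex
\[
\widehat{\ln y} = (\beta_0 + \beta_k) + (\beta_D + \beta_{k:D})\,\mathrm{Duration} + \beta_C\,\mathrm{Clouds}
\]
\[
\hat{y} = \exp\!\big(\widehat{\ln y}\big)
        = e^{\beta_0 + \beta_k + \beta_C\,\mathrm{Clouds}} \cdot e^{(\beta_D + \beta_{k:D})\,\mathrm{Duration}}
\]
```

Because Duration sits inside the exponential, each phase traces an exponential curve on the raw scale, and the height of that curve also depends on the value chosen for Clouds.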

This is where I get lost.

So I created a new set of x values for the range of survey durations (minimum 15 to maximum 45 minutes): dur2 <- seq(from = 15, to = 45, length.out = 100)

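Then, once I have these values, I should use the coefficients of the LM to get the predicted y value for each x value. After that, I back-transform the y values to the original scale, and then use the x values and the back-transformed y values to add the lines to the plot.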
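The steps just described can be sketched in base R as follows. Several things here are assumptions rather than facts from the post: the simulated abund data frame is a hypothetical stand-in, ln is taken to be log(Sp2) so that exp() is the back-transformation (if it was log(Sp2 + 1), expm1() would be needed instead), and Clouds is held at its mean because some value has to be chosen for it.

```r
# Hypothetical stand-in data (NOT the real survey data), same column names
set.seed(42)
n <- 106
abund <- data.frame(
  Phase    = factor(rep(1:8, length.out = n)),
  Duration = sample(15:45, n, replace = TRUE),
  Clouds   = sample(0:100, n, replace = TRUE)
)
abund$ln  <- 0.4 + 0.05 * abund$Duration - 0.004 * abund$Clouds +
  rnorm(n, sd = 0.7)
abund$Sp2 <- exp(abund$ln)  # assumption: ln = log(Sp2)
LMoon <- lm(ln ~ Phase * Duration + Clouds, data = abund)

dur2 <- seq(from = 15, to = 45, length.out = 100)
b    <- coef(LMoon)
lev  <- levels(abund$Phase)
clouds_fixed <- mean(abund$Clouds)  # assumption: hold Clouds at its mean

# Per-phase intercept and slope on the log scale (Phase 1 is the reference)
int_k   <- b["(Intercept)"] + c(0, b[paste0("Phase", lev[-1])])
slope_k <- b["Duration"] + c(0, b[paste0("Phase", lev[-1], ":Duration")])

# Raw-scale points, then one back-transformed curve per phase
plot(abund$Duration, abund$Sp2, col = as.integer(abund$Phase), pch = 16,
     xlab = "Duration", ylab = "Sp2")
curves <- matrix(NA_real_, nrow = length(dur2), ncol = length(lev))
for (k in seq_along(lev)) {
  y_log <- int_k[k] + slope_k[k] * dur2 + b["Clouds"] * clouds_fixed
  curves[, k] <- exp(y_log)              # back to the raw scale
  lines(dur2, curves[, k], col = k)
}
legend("topleft", legend = lev, col = seq_along(lev), lty = 1, title = "Phase")
```

The back-transformed curve is the exponential of the predicted mean of log abundance, which is not the same thing as the predicted arithmetic mean of abundance on the raw scale.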

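How exactly do I get the predicted values now? I cannot use any predict-type function; I have tried them all. They just do not work with my model, so doing it by hand is the only way, but I do not know how...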
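One way to get predictions without any predict-type function is to build the design matrix for a grid of new values and multiply it by the coefficient vector, which is essentially the calculation predict() performs for an lm. The sketch below again uses a hypothetical simulated abund data frame, holds Clouds at its mean (an arbitrary choice), and assumes ln = log(Sp2) for the back-transformation.

```r
# Hypothetical stand-in data (NOT the real survey data), same column names
set.seed(42)
n <- 106
abund <- data.frame(
  Phase    = factor(rep(1:8, length.out = n)),
  Duration = sample(15:45, n, replace = TRUE),
  Clouds   = sample(0:100, n, replace = TRUE)
)
abund$ln <- 0.4 + 0.05 * abund$Duration - 0.004 * abund$Clouds +
  rnorm(n, sd = 0.7)
LMoon <- lm(ln ~ Phase * Duration + Clouds, data = abund)

# Grid: 100 durations for each of the 8 phases, Clouds fixed at its mean
newdat <- expand.grid(
  Duration = seq(from = 15, to = 45, length.out = 100),
  Phase    = factor(levels(abund$Phase), levels = levels(abund$Phase)),
  Clouds   = mean(abund$Clouds)
)

# Design matrix for the right-hand side of the model formula
X <- model.matrix(delete.response(terms(LMoon)), data = newdat)
X <- X[, names(coef(LMoon))]  # make sure columns line up with coefficients

newdat$ln_hat  <- as.vector(X %*% coef(LMoon))  # prediction on the log scale
newdat$Sp2_hat <- exp(newdat$ln_hat)            # assumption: ln = log(Sp2)
head(newdat)
```

The rows for a single phase, for example newdat[newdat$Phase == "5", ], can then be passed to lines() as x = Duration and y = Sp2_hat.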

I hope someone can help me. I have been trying for weeks now, and I am desperate and close to giving up.

Cheers!

PS: here is the data:

> dput(subset(abund,Phase %in% c("Phase1","Phase2")))

structure(list(Year = integer(0),Date = structure(integer(0),.Label = c("01/08/2009","01/08/2016","02/07/2019","02/08/2009","02/08/2012","02/08/2016","02/09/2007","03/08/2007","03/08/2009","03/08/2014","03/08/2015","04/07/2019","04/08/2009","04/08/2013","05/08/2009","05/08/2014","05/08/2015","06/07/2008","06/07/2019","07/08/2009","08/07/2010","09/07/2010","09/08/2015","10/08/2009","11/08/2009","12/08/2009","13/08/2009","13/08/2014","14/08/2009","14/08/2012","16/07/2006","18/07/2009","18/08/2015","19/07/2011","20/08/2009","21/07/2011","21/09/2009","22/07/2011","22/07/2016","22/07/2017","23/07/2007","23/07/2016","23/07/2017","24/07/2017","25/07/2007","25/07/2010","25/07/2017","25/08/2016","26/07/2010","26/07/2011","27/07/2006","27/07/2011","27/07/2012","28/07/2016","29/06/2019","29/07/2005","29/07/2009","29/07/2010","29/07/2016","29/07/2019","30/07/2005","30/07/2007","30/07/2016","30/08/2005","31/07/2005","31/07/2009","31/07/2014","31/07/2016"),class = "factor"),NrSurvey = integer(0),Duration = integer(0),Sp2 = integer(0),Phase = structure(integer(0),.Label = c("1","2","3","4","5","6","7","8"),class = "factor"),Clouds = integer(0),Visibility = integer(0),ln = numeric(0)),row.names = integer(0),class = "data.frame")
