强制 GAM 模型拟合为单调并通过 R mgcv 的固定点 (x0, y0) 到目前为止的尝试:在两个约束条件下分别地

问题描述

我正在尝试将 GAM 模型拟合到两个约束下的数据 同时:(1) 拟合是单调的(递增),(2) 拟合经过一个固定点,例如,{ {1}}。

到目前为止,我设法让这两个约束单独工作

  • 对于 (1),基于 mgcv::pcls() documentation examples,通过使用 (x0,y0) 获得足以满足单调性的线性约束,并通过 mgcv::mono.con() 使用约束估计模型系数。

  • 对于 (2),基于 this post,通过使用模型公式中的偏移项将节点位置 x0 处的样条值设置为 0 +。

然而,我很难同时结合这两个约束。猜测一个方法mgcv::pcls(),但我无法解决 (a)使用偏移量或 (b) 设置等式约束(我认为这可以产生我的 (2) 约束设置)执行类似的技巧,将节点位置 x0 处的样条值设置为 0 +

我还注意到,对于我的约束条件 (2),将节点位置 x0 处的样条值设置为 0 的方法产生了奇怪的摆动结果(与不受约束的 GAM 拟合相比)——如下所示。

到目前为止的尝试:在两个约束条件下分别地

为数据拟合一个平滑函数

模拟一些数据

mgcv::pcls()

GAM 无约束(用于比较)

library(mgcv)
set.seed(1)
x <- sort(runif(100) * 4 - 1)
f <- exp(4*x)/(1+exp(4*x))
y <- f + rnorm(100) * 0.1
dat <- data.frame(x=x,y=y)

GAM 约束:(1) 拟合是单调的(递增)

k <- 13
fit0 <- gam(y ~ s(x,k = k,bs = "cr"),data = dat)
# predict from unconstrained GAM fit
newdata <- data.frame(x = seq(-1,3,length.out = 1000))
newdata$y_pred_fit0 <- predict(fit0,newdata = newdata)

蓝线:受限;红线:无约束

enter image description here

GAM 约束:(2) 拟合通过 k <- 13 # Show regular spline fit (and save fitted object) f.ug <- gam(y~s(x,k=k,bs="cr")) # explicitly construct smooth term's design matrix sm <- smoothCon(s(x,bs="cr"),dat,knots=NULL)[[1]] # find linear constraints sufficient for monotonicity of a cubic regression spline # it assumes "cr" is the basis and its knots are provided as input F <- mono.con(sm$xp) G <- list( X=sm$X,C=matrix(0,0),# [0 x 0] matrix (no equality constraints) sp=f.ug$sp,# smoothing parameter estimates (taken from unconstrained model) p=sm$xp,# array of feasible initial parameter estimates y=y,w= dat$y * 0 + 1 # weights for data ) G$Ain <- F$A # matrix for the inequality constraints G$bin <- F$b # vector for the inequality constraints G$S <- sm$S # list of penalty matrices; The first parameter it penalizes is given by off[i]+1 G$off <- 0 # Offset values locating the elements of M$S in the correct location within each penalty coefficient matrix. (Zero offset implies starting in first location) p <- pcls(G); # fit spline (using smoothing parameter estimates from unconstrained fit) # predict newdata$y_pred_fit2 <- Predict.matrix(sm,data.frame(x = newdata$x)) %*% p # plot plot(y ~ x,data = dat) lines(y_pred_fit0 ~ x,data = newdata,col = 2,lwd = 2) lines(y_pred_fit2 ~ x,col = 4,lwd = 2)

(x0,y0)=(-1,-0.1)

绿线:受限;红线:无约束

enter image description here

解决方法

我认为你可以用 x 增加数据向量 y(x0,y0),然后在第一次观察上(即添加一个权重向量到你的 G {1}} 个列表)。

替代简单的加权策略,我们可以从初步平滑的结果开始编写二次规划问题。这在下面的第二个 R 代码中进行了说明(在这种情况下,我使用了 p 样条平滑器,参见 Eilers 和 Marx 1991)。

希望这会有所帮助 (a similar problem is discussed here)。

Rcode 示例 1(权重策略)

set.seed(123)
N = 100
x <- sort(runif(N) * 4 - 1)
f <- exp(4*x)/(1+exp(4*x))
y <- f + rnorm(N) * 0.1
x = c(-1,x)
y = c(-0.1,y)
dat = data.frame(x = x,y= y)

k <- 13
fit0 <- gam(y ~ s(x,k = k,bs = "cr"),data = dat)
# predict from unconstrained GAM fit
newdata <- data.frame(x = seq(-1,3,length.out = 1000))
newdata$y_pred_fit0 <- predict(fit0,newdata = newdata)

k <- 13

# Show regular spline fit (and save fitted object)
f.ug <- gam(y~s(x,k=k,bs="cr"))

# explicitly construct smooth term's design matrix
sm <- smoothCon(s(x,bs="cr"),dat,knots=NULL)[[1]]
# find linear constraints sufficient for monotonicity of a cubic regression spline
# it assumes "cr" is the basis and its knots are provided as input
F <- mono.con(sm$xp)

G <- list(
X=sm$X,C=matrix(0,0),# [0 x 0] matrix (no equality constraints)
sp=f.ug$sp,# smoothing parameter estimates (taken from unconstrained model)
p=sm$xp,# array of feasible initial parameter estimates
y=y,w= c(1e8,1:N * 0 + 1)  # weights for data    
)
G$Ain <- F$A        # matrix for the inequality constraints
G$bin <- F$b        # vector for the inequality constraints
G$S <- sm$S         # list of penalty matrices; The first parameter it penalizes is given by off[i]+1
G$off <- 0          # Offset values locating the elements of M$S in the correct location within each penalty coefficient matrix.  (Zero offset implies starting in first location)

p <- pcls(G);       # fit spline (using smoothing parameter estimates from unconstrained fit)

# predict 
newdata$y_pred_fit2 <- Predict.matrix(sm,data.frame(x = newdata$x)) %*% p
# plot
plot(y ~ x,data = dat)
lines(y_pred_fit0 ~ x,data = newdata,col = 2,lwd = 2)
lines(y_pred_fit2 ~ x,col = 4,lwd = 2)
abline(v = -1)
abline(h = -0.1)

enter image description here

rm(list = ls())
library(mgcv)
library(pracma)
library(colorout)

set.seed(123)
N   = 100
x   = sort(runif(N) * 4 - 1)
f   = exp(4*x)/(1+exp(4*x))
y   = f + rnorm(N) * 0.1
x0  = -1
y0  = -0.1 
dat = data.frame(x = x,y= y)

k = 50
# Show regular spline fit (and save fitted object)
f.ug = gam(y~s(x,bs="ps"))

# explicitly construct smooth term's design matrix
sm = smoothCon(s(x,bs="ps"),knots=NULL)[[1]]

# Build quadprog to estimate the coefficients 
scf = sapply(f.ug$smooth,'[[','S.scale')
lam = f.ug$sp / scf
Xp  = rbind(sm$X,sqrt(lam) * f.ug$smooth[[1]]$D)  
yp  = c(dat$y,rep(0,k - 2))
X0  = Predict.matrix(sm,data.frame(x = x0))
sm$deriv = 1
X1 = Predict.matrix(sm,data.frame(x = dat$x))
coef_mono = pracma::lsqlincon(Xp,yp,Aeq = X0,beq = y0,A = -X1,b = rep(0,N))

# fitted values
fit = sm$X %*% coef_mono
sm$deriv = 0
xf = seq(-1,len = 1000)
Xf = Predict.matrix(sm,data.frame(x = xf))
fine_fit = Xf %*% coef_mono

# plot
par(mfrow = c(2,1),mar = c(3,3))
plot(dat$x,dat$y,pch = 1,main= 'Data and fit')
lines(dat$x,f.ug$fitted,lwd = 2,col = 2)
lines(dat$x,fit,lty = 1,lwd = 2)
lines(xf,fine_fit,col = 3,lty = 2)
abline(h = -0.1)
abline(v = -1)

plot(dat$x,X1 %*% coef_mono,type = 'l',main = 'Derivative of the fit',lwd = 2)
abline(h = 0.0)

enter image description here

,

以下包似乎实现了您正在寻找的内容:

建议的形状约束平滑已被纳入广义 具有无约束和形状受限平滑项的混合的加法模型 (单 GAM)。 [...]

建议的建模方法已在 R 包 monogam 中实现。 模型设置与 mgcv(gam) 中的相同,但添加了形状约束 平滑。为了与无约束的 GAM 保持一致,该包提供了 关键功能类似于与 mgcv(gam) 相关联的功能。

相关问答

Selenium Web驱动程序和Java。元素在(x,y)点处不可单击。其...
Python-如何使用点“。” 访问字典成员?
Java 字符串是不可变的。到底是什么意思?
Java中的“ final”关键字如何工作?(我仍然可以修改对象。...
“loop:”在Java代码中。这是什么,为什么要编译?
java.lang.ClassNotFoundException:sun.jdbc.odbc.JdbcOdbc...