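Problem description
I am trying to write a script that takes input arrays of x and y values and fits them to a 5-PL curve, defined by the equation F(x) = D + (A-D)/((1 + (x/C)^B)^E). I then want to use the fitted curve to take a given y value and solve for the corresponding x value, using the equation x = C*(((A-D)/(y-D))^(1/E) - 1)^(1/B).
The answer below fixed the earlier error, but the fit is still poor. I added a print function that feeds a handful of y values from across the whole range into the curve fit, and it produces almost exactly the same x value across the whole range. Any ideas what might be happening here?
Edit: for anyone looking at this now, the problem appears to have been my estimate of B. In most cases the Hill slope should be between -1 and 1, not in the thousands. That threw the estimate too far off.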
import numpy as np
import scipy.optimize as sp

def logistic5(x,A,B,C,D,E):
    '''5PL logistic equation'''
    log = D + (A-D)/(np.power((1 + np.power((x/C),B)),E))
    return log

def residuals(p,y,x):
    '''Deviations of data from fitted 5PL curve'''
    A,B,C,D,E = p
    err = y - logistic5(x,A,B,C,D,E)
    print(err)
    return err

def log_solve_for_x(curve,y):
    '''Returns the estimated x value for the provided y value'''
    A,B,C,D,E = curve
    return C*(np.power((np.power(((A-D)/(-D+y)),(1/E))-1),(1/B)))

# Toy data set
x = np.array([130,38,15,4.63,1.41])
y = np.array([9121,1987,1017,343,117])
# Set initial guess for parameters
A = np.amin(y) # Min asymptote
D = np.amax(y) # Max asymptote
B = (D-A)/(np.amax(x)-np.amin(x)) # Steepness
C = (np.amax(x)-np.amin(x))/2 # inflection point
E = 1 # Asymmetry factor
# Optimize curve for initial parameters
p0 = [A,B,C,D,E]
# set bounds for each parameter
pu = []
pl = []
for p in p0:
    pu.append(p*1.5)
    pl.append(p*0.5)
print(pu)
print(pl)
print("Initial guess of parameters is: ",p0)
curve = sp.least_squares(fun=residuals,x0=p0,args=(y,x),bounds=(pl,pu))
curve = curve.x.tolist()
print("Optimized curve parameters are: ",curve)
# Predict x values based on given y
y = [1000,2000,3000,4000,5000,6000,7000,8000,9000]
for sample in y:
    solve = log_solve_for_x(curve,sample)
    print("Predicted X value for y =",sample," is: ",solve)
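As a sanity check on the inverse formula used in log_solve_for_x, the sketch below evaluates the 5PL curve at a few x values and then inverts the results. The parameter values are made up purely for illustration and are not a fit result, and solve_for_x is a hypothetical standalone version of the inverse.

```python
import numpy as np

def logistic5(x, A, B, C, D, E):
    """5PL curve: D + (A - D) / (1 + (x/C)**B)**E."""
    return D + (A - D) / np.power(1 + np.power(x / C, B), E)

def solve_for_x(y, A, B, C, D, E):
    """Inverse of logistic5, valid for y strictly between A and D."""
    return C * np.power(np.power((A - D) / (y - D), 1 / E) - 1, 1 / B)

# Hypothetical parameters chosen only for illustration (not fitted values).
params = dict(A=100.0, B=1.2, C=50.0, D=10000.0, E=1.0)

x_true = np.array([1.41, 4.63, 15.0, 38.0, 130.0])
y_curve = logistic5(x_true, **params)      # forward evaluation
x_back = solve_for_x(y_curve, **params)    # invert back to x

print(np.allclose(x_back, x_true))
```

If the round trip holds for known parameters, nearly identical predicted x values after a fit point to the fitted parameters rather than to the inverse formula.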
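Solution
Your curve is not defined for every combination of parameter values, but you do not give least_squares that information. At some point the solver steps into an inadmissible region and gets stuck there, receiving nans from residuals, which is why you see the messages about invalid power. The restrictions on the exponents are trivial: you can simply require E>=0, B>=0. The restriction on the base is not trivial. You either need to switch to a solver that supports general constraints (for example scipy.optimize.minimize) and add a constraint base >= 0, or limit the search to the admissible domain in some other way, for example: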
pu = []
pl = []
for p in p0:
    pu.append(p*1.5)
    pl.append(p*.5)
curve = sp.least_squares(fun=residuals,x0=p0,args=(y,x),bounds=(pl,pu))
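Scaling each guess by 0.5 and 1.5 keeps the parameters positive only when the guesses themselves are positive. An alternative is to pass explicit bounds that encode B >= 0, C > 0 and E > 0 directly. The sketch below assumes that approach; the numeric limits and the initial guess are illustrative assumptions, not recommended values.

```python
import numpy as np
from scipy.optimize import least_squares

def logistic5(x, A, B, C, D, E):
    """5PL curve, same form as in the question."""
    return D + (A - D) / np.power(1 + np.power(x / C, B), E)

def residuals(p, y, x):
    """Deviations of the data from the 5PL curve."""
    return y - logistic5(x, *p)

# Toy data from the question
x = np.array([130, 38, 15, 4.63, 1.41])
y = np.array([9121, 1987, 1017, 343, 117], dtype=float)

# Parameter order: A, B, C, D, E. All limits below are assumptions.
lower = [0.0,     0.0,  1e-3,           0.0,            1e-3]
upper = [y.max(), 10.0, 10.0 * x.max(), 10.0 * y.max(), 10.0]

# Assumed starting point: asymptotes from the data, unit slope and asymmetry.
p0 = [y.min(), 1.0, np.median(x), y.max(), 1.0]

fit = least_squares(residuals, x0=p0, args=(y, x), bounds=(lower, upper))
print("Parameters:", fit.x)
print("Cost:", fit.cost)
```

With x > 0 and C bounded away from zero, x/C stays positive, so neither power receives a negative base and the residuals stay finite.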
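You could also try to fix residuals so that it works for any parameter values, for example by replacing nan with the distance from the initial guess. But this is likely to work inefficiently.
To improve the fit, you can try a better initial point, multiple starting points, or both.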
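One way to read the nan-replacement idea is sketched below. The function safe_residuals, its p_ref argument and the penalty scaling are all hypothetical names and choices made for this illustration; they are not part of SciPy or of the original code.

```python
import numpy as np

def logistic5(x, A, B, C, D, E):
    """5PL curve, same form as in the question."""
    return D + (A - D) / np.power(1 + np.power(x / C, B), E)

def safe_residuals(p, y, x, p_ref, penalty=1e6):
    """Residuals that stay finite even where the curve is undefined.

    p_ref is the initial guess; penalty is an arbitrary, untuned scale.
    """
    p = np.asarray(p, dtype=float)
    # Silence NumPy warnings for invalid powers; nans are handled below.
    with np.errstate(invalid="ignore", divide="ignore", over="ignore"):
        err = y - logistic5(x, *p)
    bad = ~np.isfinite(err)
    if bad.any():
        # Penalty grows with the distance from the reference point.
        distance = np.linalg.norm(p - np.asarray(p_ref, dtype=float))
        err = np.where(bad, penalty * (1.0 + distance), err)
    return err

x = np.array([130, 38, 15, 4.63, 1.41])
y = np.array([9121, 1987, 1017, 343, 117], dtype=float)

p_ref = [y.min(), 1.0, np.median(x), y.max(), 1.0]   # assumed initial guess
p_bad = [y.min(), 1.5, -15.0, y.max(), 1.0]          # negative C gives nan

err_ok = safe_residuals(p_ref, y, x, p_ref)
err_bad = safe_residuals(p_bad, y, x, p_ref)
print(np.isfinite(err_bad).all())
```

The penalty makes the residuals discontinuous at the edge of the admissible region, which is one reason this approach can be inefficient compared with bounds.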
A = np.amin(y) # Min asymptote
D = np.amax(y) # Max asymptote
B = (D-A)/np.amax(x)*10 # Steepness
C = np.amax(x)/10 # inflection point
E = 0.001 # Asymmetry factor
p0 = [A,B,C,D,E]
print("Initial guess of parameters is: ",p0)
pu = []
pl = []
for p in p0:
    pu.append(p*1.5)
    pl.append(p*.5)
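best_cost = np.inf
# Multi-start: run the solver from 100 random points inside the bounds
for i in range(100):
    for j in range(5):
        p0[j] = np.random.uniform(pl[j],pu[j])
    curve = sp.least_squares(fun=residuals,x0=p0,args=(y,x),bounds=(pl,pu))
    print(p0,curve.cost)
    # Keep the parameters with the lowest cost seen so far
    if best_cost > curve.cost:
        best_cost = curve.cost
        curve_out = curve.x.tolist()
print("Optimized curve parameters are: ",curve_out)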
import matplotlib.pyplot as plt  # needed for the plot below
plt.plot(x,y,'.')
xx = np.linspace(0,150,100)
yy = []
for xi in xx:
    yy.append(logistic5(xi,*curve_out))
plt.plot(xx,yy)
plt.show()