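Problem description
I am trying to perform a linear fit on the values in two DataFrames. The code below is part of my script, in which I mostly work with DataFrames.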
import numpy as np
import pandas as pd
import pylab as plt
import scipy
from scipy.optimize import curve_fit

X = np.array([[76.17E-3,52.62E-3,42.95E-3,29.78E-3,27.50E-3,21.78E-3,14.00E-3,7.45E-3]])
Y = np.array([[4.573085e+06,3.906632e+06,3.589304e+06,3.408189e+06,3.149472e+06,3.010599e+06,2.678995e+06,2.599270e+06]])
X = pd.DataFrame(data=X)
Y = pd.DataFrame(data=Y)
print(type(X))
print(X.shape)
print(type(Y))
print(Y.shape)

def Parameters(X,A,B):
    return A + B*X

def fit_Parameters(X):
    As = []
    Bs = []
    Fit_AB = []
    popt = scipy.optimize.curve_fit(Parameters,X,Y,p0=None,maxfev=5000,method='lm')
    A,B = popt
    fitted_parameters = Parameters(X,B)
    As.append(A)
    Bs.append(B)
    Fit_AB.append(fitted_parameters)
    print(fitted_parameters)
    return As,Bs,Fit_AB

As,Fit_AB = fit_Parameters(X)
plt.plot(X,marker='*',color='b',markersize=13)
plt.plot(X,Fit_AB,linestyle='dashed',color='k')
plt.show()

<class 'pandas.core.frame.DataFrame'>
(1,8)
<class 'pandas.core.frame.DataFrame'>
(1,8)
Traceback (most recent call last):
  File "//Try_pandas to fit.py",line 37,in <module>
    As,Fit_AB = fit_Parameters(X)
  File "//Try_pandas to fit.py",line 28,in fit_Parameters
    Parameters,method='lm')
  File "C:\ProgramData\Anaconda3\Lib\site-packages\scipy\optimize\minpack.py",line 763,in curve_fit
    res = leastsq(func,p0,Dfun=jac,full_output=1,**kwargs)
  File "C:\ProgramData\Anaconda3\Lib\site-packages\scipy\optimize\minpack.py",line 392,in leastsq
    raise TypeError('Improper input: N=%s must not exceed M=%s' % (n,m))
TypeError: Improper input: N=2 must not exceed M=1
I also tried doing this with arrays instead of DataFrames, but I still get the same error. Does anyone know why this error occurs?
Thank you.
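Solutions
Regarding this error: your X and Y should be one-dimensional. With shape (1,8) the fit sees a single row of data (M=1) for two parameters (N=2), which is what the message "N=2 must not exceed M=1" is complaining about.
Here is how to fix it: if you cannot control how the data is shaped, you can use squeeze() to drop the length-1 axis.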
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

xdata = np.array([[76.17E-3,52.62E-3,42.95E-3,29.78E-3,27.50E-3,21.78E-3,14.00E-3,7.45E-3]])
ydata = np.array([[4.573085e+06,3.906632e+06,3.589304e+06,3.408189e+06,3.149472e+06,3.010599e+06,2.678995e+06,2.599270e+06]])
X = pd.DataFrame(data=xdata)
Y = pd.DataFrame(data=ydata)
print(type(X))
print(X.shape)  # one row, eight columns
print(type(Y))
print(Y.shape)

def Parameters(X, A, B):
    # Straight line with intercept A and slope B
    return A + B*X

# Drop the length-1 axis so that curve_fit receives 1-D arrays
xdata = xdata.squeeze()
ydata = ydata.squeeze()
print("x:", xdata)
print("y:", ydata)
# curve_fit returns the optimal parameters and their covariance matrix
popt, pcov = curve_fit(Parameters, xdata, ydata, p0=None, maxfev=5000, method='lm')
print(popt)
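
To make the shape problem concrete, here is a small sketch that only inspects shapes and does not run the fit. It assumes the data is built with double brackets as in the question; the variable names (x2d, x_squeezed, x_raveled, x_row) are made up for illustration.

```python
import numpy as np
import pandas as pd

# Double brackets give one row and eight columns, as in the question
x2d = np.array([[76.17E-3, 52.62E-3, 42.95E-3, 29.78E-3,
                 27.50E-3, 21.78E-3, 14.00E-3, 7.45E-3]])
X = pd.DataFrame(data=x2d)

print(x2d.shape)  # first axis has length 1
print(X.shape)    # the DataFrame keeps the same 2-D shape

# Three ways to obtain a flat, length-8 array
x_squeezed = x2d.squeeze()       # drop axes of length 1
x_raveled = X.to_numpy().ravel() # flatten the DataFrame's values
x_row = X.iloc[0].to_numpy()     # take the single row as a 1-D array

print(x_squeezed.shape, x_raveled.shape, x_row.shape)
```

All three flattened arrays hold the same eight values, so any of them can be passed to curve_fit.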

As your traceback indicates, the code fails at this line:
popt = scipy.optimize.curve_fit(Parameters,X,Y,method='lm')
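You can fix this by replacing that line with the following three lines, which flatten the DataFrame values into plain lists:
XX=[each for lst in X.values for each in lst]
YY=[each for lst in Y.values for each in lst]
popt, pcov = scipy.optimize.curve_fit(Parameters,XX,YY,method='lm')
The script will still hit another error further down, but I consider that a separate issue.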
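
The answer does not say what the later error is. As a sketch only, assuming it comes from how the curve_fit result is unpacked and passed on inside the asker's function, the whole fit could look something like the code below. The names linear_model and fit_line are hypothetical, and DataFrame.to_numpy().ravel() is swapped in for the list comprehensions.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def linear_model(x, a, b):
    # Straight line with intercept a and slope b (hypothetical name)
    return a + b * x

def fit_line(x_df, y_df):
    # Flatten the (1, 8) DataFrames into 1-D arrays before fitting
    x = x_df.to_numpy().ravel()
    y = y_df.to_numpy().ravel()
    # curve_fit returns (optimal parameters, covariance matrix)
    popt, pcov = curve_fit(linear_model, x, y, method='lm')
    a, b = popt
    # Return the parameters and the fitted values at the input points
    return a, b, linear_model(x, a, b)

X = pd.DataFrame(np.array([[76.17E-3, 52.62E-3, 42.95E-3, 29.78E-3,
                            27.50E-3, 21.78E-3, 14.00E-3, 7.45E-3]]))
Y = pd.DataFrame(np.array([[4.573085e+06, 3.906632e+06, 3.589304e+06, 3.408189e+06,
                            3.149472e+06, 3.010599e+06, 2.678995e+06, 2.599270e+06]]))

a, b, fitted = fit_line(X, Y)
print("intercept:", a)
print("slope:", b)
```

The returned fitted array has the same length as the flattened x values, so it can be plotted against them directly.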
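
You can get the desired result with plain one-dimensional NumPy arrays, without pandas:
import numpy as np
from scipy.optimize import curve_fit

# 1-D arrays (single brackets), using the data from the question
X = np.array([76.17E-3,52.62E-3,42.95E-3,29.78E-3,27.50E-3,21.78E-3,14.00E-3,7.45E-3])
Y = np.array([4.573085e+06,3.906632e+06,3.589304e+06,3.408189e+06,3.149472e+06,3.010599e+06,2.678995e+06,2.599270e+06])

def func(X, A, B):
    # Straight line with intercept A and slope B
    return A + B*X

popt, _ = curve_fit(func, X, Y, method='lm')
popt
array([ 2371977.38020794,29163368.07469793])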
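
Since the model is a straight line, the fit can also be cross-checked with an ordinary least-squares routine instead of an iterative optimizer. The sketch below uses np.polyfit on the full eight-point data from the question. This is a swapped-in technique for verification, not something the answers above use.

```python
import numpy as np

# Full data from the question as 1-D arrays
X = np.array([76.17E-3, 52.62E-3, 42.95E-3, 29.78E-3,
              27.50E-3, 21.78E-3, 14.00E-3, 7.45E-3])
Y = np.array([4.573085e+06, 3.906632e+06, 3.589304e+06, 3.408189e+06,
              3.149472e+06, 3.010599e+06, 2.678995e+06, 2.599270e+06])

# polyfit returns coefficients from highest degree to lowest: [slope, intercept]
slope, intercept = np.polyfit(X, Y, deg=1)
print("intercept (A):", intercept)
print("slope (B):", slope)
```

The intercept and slope should agree with the curve_fit result shown above to within numerical tolerance.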