curve_fit with DataFrames raises TypeError: Improper input

Problem description

I am trying to fit a straight line to the values in two DataFrames. The code below is part of my script, in which I mostly work with DataFrames.

import numpy as np
import pandas as pd
import pylab as plt
import scipy
from scipy.optimize import curve_fit

X = np.array([[76.17E-3,52.62E-3,42.95E-3,29.78E-3,27.50E-3,21.78E-3,14.00E-3,7.45E-3]])

Y = np.array([[4.573085e+06,3.906632e+06,3.589304e+06,3.408189e+06,3.149472e+06,3.010599e+06,2.678995e+06,2.599270e+06]])

X = pd.DataFrame(data=X)
Y = pd.DataFrame(data=Y)

print(type(X))
print(X.shape)
print(type(Y))
print(Y.shape)

def Parameters(X,A,B):
    return A + B*X

def fit_Parameters(X):
    As = []
    Bs = []
    Fit_AB = []
    popt = scipy.optimize.curve_fit(Parameters,X,Y,p0=None,maxfev=5000,method='lm')
    A,B = popt
    fitted_parameters = Parameters(X,B)
    As.append(A)
    Bs.append(B)
    Fit_AB.append(fitted_parameters)
    print(fitted_parameters)
    return As,Bs,Fit_AB

As,Fit_AB = fit_Parameters(X)

plt.plot(X,marker='*',color='b',markersize=13)
plt.plot(X,Fit_AB,linestyle='dashed',color='k')
plt.show()

I get the following output:

<class 'pandas.core.frame.DataFrame'>
(1,8)
<class 'pandas.core.frame.DataFrame'>
(1,8)
Traceback (most recent call last):
  File "//Try_pandas to fit.py",line 37,in <module>   
    As,Fit_AB = fit_Parameters(X)
  File "//Try_pandas to fit.py",line 28,in fit_Parameters
    Parameters,method='lm')
  File "C:\ProgramData\Anaconda3\Lib\site-packages\scipy\optimize\minpack.py",line 763,in curve_fit
    res = leastsq(func,p0,Dfun=jac,full_output=1,**kwargs)
  File "C:\ProgramData\Anaconda3\Lib\site-packages\scipy\optimize\minpack.py",line 392,in leastsq
    raise TypeError('Improper input: N=%s must not exceed M=%s' % (n,m))
TypeError: Improper input: N=2 must not exceed M=1

I also tried this with arrays instead of DataFrames, but I still get the same error. Does anyone know why this error occurs?

Thanks.

Solutions

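The error occurs because your X and Y should be one-dimensional. With shape (1, 8), the first dimension has length 1, which is the M=1 in the error message, while the model has N=2 parameters (A and B).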

Here is how to fix it: if you have no control over how the data is shaped, you can use squeeze():

import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

# Note the double brackets: both arrays have shape (1, 8)
xdata = np.array([[76.17E-3,52.62E-3,42.95E-3,29.78E-3,27.50E-3,21.78E-3,14.00E-3,7.45E-3]])

ydata = np.array([[4.573085e+06,3.906632e+06,3.589304e+06,3.408189e+06,3.149472e+06,3.010599e+06,2.678995e+06,2.599270e+06]])

X = pd.DataFrame(data=xdata)
Y = pd.DataFrame(data=ydata)

print(type(X))
print(X.shape)
print(type(Y))
print(Y.shape)

def Parameters(X, A, B):
    return A + B*X

# squeeze() drops the length-1 axis, so both arrays end up with shape (8,)
xdata = xdata.squeeze()
ydata = ydata.squeeze()

print("x:", xdata)
print("y:", ydata)
# curve_fit returns a tuple: the fitted parameters and their covariance matrix
popt, pcov = curve_fit(Parameters, xdata, ydata, p0=None, maxfev=5000, method='lm')
print(popt)
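
If the data already lives in DataFrames, as in the question, the same flattening can be applied to the frames themselves. The following is a minimal sketch, assuming the (1, 8) layout from the question; the names `x1d` and `y1d` and the use of `to_numpy().ravel()` are illustrative choices, not part of the answer above.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

# Same data as in the question, stored as single-row DataFrames
X = pd.DataFrame(np.array([[76.17e-3, 52.62e-3, 42.95e-3, 29.78e-3,
                            27.50e-3, 21.78e-3, 14.00e-3, 7.45e-3]]))
Y = pd.DataFrame(np.array([[4.573085e+06, 3.906632e+06, 3.589304e+06, 3.408189e+06,
                            3.149472e+06, 3.010599e+06, 2.678995e+06, 2.599270e+06]]))

def Parameters(x, A, B):
    # Straight line with intercept A and slope B
    return A + B * x

# Convert each frame to a NumPy array and flatten it to one dimension
x1d = X.to_numpy().ravel()
y1d = Y.to_numpy().ravel()

print(X.shape)    # (1, 8)
print(x1d.shape)  # (8,)

popt, pcov = curve_fit(Parameters, x1d, y1d, method='lm')
A, B = popt
print(A, B)
```

`ravel()` flattens regardless of whether the frame is a single row or a single column, which is why it is used here instead of relying on one particular orientation.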

As your error message indicates, the code fails at this line:

popt = scipy.optimize.curve_fit(Parameters,X,Y,method='lm')

You can fix this by replacing that line with the following three lines:

XX=[each for lst in X.values for each in lst]
YY=[each for lst in Y.values for each in lst]
popt = scipy.optimize.curve_fit(Parameters,XX,YY,method='lm')

The script will still raise an error further down, but I consider that a separate issue.
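
The later error is most likely in how the result is used: `curve_fit` returns a tuple of the fitted parameters and their covariance matrix, and `Parameters` needs both `A` and `B`. Below is one possible sketch of a corrected `fit_Parameters`. The extra `Y` argument, the local names `xx` and `yy`, and the simplified return value are assumptions made for illustration, not the asker's final code.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

X = pd.DataFrame(np.array([[76.17e-3, 52.62e-3, 42.95e-3, 29.78e-3,
                            27.50e-3, 21.78e-3, 14.00e-3, 7.45e-3]]))
Y = pd.DataFrame(np.array([[4.573085e+06, 3.906632e+06, 3.589304e+06, 3.408189e+06,
                            3.149472e+06, 3.010599e+06, 2.678995e+06, 2.599270e+06]]))

def Parameters(x, A, B):
    return A + B * x

def fit_Parameters(X, Y):
    # Flatten the single-row DataFrames into plain lists (the three-line fix above)
    xx = [each for lst in X.values for each in lst]
    yy = [each for lst in Y.values for each in lst]
    # Unpack both return values: parameters and covariance matrix
    popt, pcov = curve_fit(Parameters, xx, yy, method='lm')
    A, B = popt
    # Evaluate the fitted line at the original x values, passing both A and B
    fitted = Parameters(np.asarray(xx), A, B)
    return A, B, fitted

A, B, fitted = fit_Parameters(X, Y)
print(A, B)
print(fitted)
```

Passing `Y` explicitly instead of reading it from the enclosing scope keeps the function usable for other data sets.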

You can get the result you want with plain one-dimensional NumPy arrays, without pandas:

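import numpy as np
from scipy.optimize import curve_fit

# Single brackets: both arrays are one-dimensional, shape (8,)
X = np.array([76.17E-3,52.62E-3,42.95E-3,29.78E-3,27.50E-3,21.78E-3,14.00E-3,7.45E-3])
Y = np.array([4.573085e+06,3.906632e+06,3.589304e+06,3.408189e+06,3.149472e+06,3.010599e+06,2.678995e+06,2.599270e+06])

def func(X, A, B):
    return A + B*X

popt, _ = curve_fit(func, X, Y, method='lm')

popt
array([ 2371977.38020794,29163368.07469793])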
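
Because the model is a straight line, the result can be cross-checked against an ordinary least-squares polynomial fit. The sketch below uses `np.polyfit`, which is not part of the original answer and is shown only as a sanity check on the values above.

```python
import numpy as np

X = np.array([76.17e-3, 52.62e-3, 42.95e-3, 29.78e-3,
              27.50e-3, 21.78e-3, 14.00e-3, 7.45e-3])
Y = np.array([4.573085e+06, 3.906632e+06, 3.589304e+06, 3.408189e+06,
              3.149472e+06, 3.010599e+06, 2.678995e+06, 2.599270e+06])

# polyfit returns coefficients from the highest degree down: slope first, then intercept
B, A = np.polyfit(X, Y, 1)
print(A, B)
```

The intercept and slope should agree with `popt` from `curve_fit` to within numerical tolerance.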