Problem description
I am trying to set up a simple linear regression example for compositional data. I am using the following code:
from pandas import DataFrame
import numpy as np
from skbio import TreeNode
from gneiss.regression import ols
from IPython.display import display
#define table of compositions
yTrain = DataFrame({'y1': [0.8,0.3,0.5],'y2': [0.2,0.7,0.5]})
#define predictors for compositions
xTrain = DataFrame({'x1': [1,3,2]})
#Once these variables are defined,a regression can be performed. These proportions will be converted to balances according to the tree specified. And the regression formula is specified to run temp and ph against the proportions in a single model.
model = ols('x1',yTrain,xTrain)
model.fit()
xTest = DataFrame({'x1': [1,3]})
yTest = model.predict(xTest)
display(yTest)
I get the error "matrices are not aligned". Any ideas on how to get this to run?
Solution
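You seem to be mixing up the x and y matrices between the training and testing phases. Your xTest should have the same structure as yTrain. In your code, xTest looks like xTrain, which seems to correspond to the labels.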
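As a quick way to see the structural mismatch described above, the column layouts of the frames can be compared directly. This is only a sketch using pandas; the helper name same_structure is hypothetical and is not part of gneiss.

```python
from pandas import DataFrame

def same_structure(train, test):
    """Return True when both frames have the same column names in the same order.

    Hypothetical helper for illustration only; not a gneiss function.
    """
    return list(train.columns) == list(test.columns)

# Frames copied from the question
yTrain = DataFrame({'y1': [0.8, 0.3, 0.5], 'y2': [0.2, 0.7, 0.5]})
xTrain = DataFrame({'x1': [1, 3, 2]})
xTest = DataFrame({'x1': [1, 3]})

# xTest shares its layout with xTrain, not with yTrain
print(same_structure(yTrain, xTest))
print(same_structure(xTrain, xTest))
```

The first comparison is False and the second is True, which is the mix-up in question: xTest is laid out like xTrain rather than like yTrain.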
The general convention in ML is to use x for the inputs and y for the outputs. In your case, you used y as the input and x as the labels during training.
For example, try setting xTest to the following:
xTest = DataFrame({'y1': [0.1,0.4,0.6],'y2': [0.4,0.2,0.8]})
That should get rid of the error. Ideally, you would do something along these lines:
from pandas import DataFrame
import numpy as np
from skbio import TreeNode
from gneiss.regression import ols
from IPython.display import display
#define table of compositions
xTrain = DataFrame({'x1': [0.8,0.3,0.5],'x2': [0.2,0.7,0.5]})
#define predictors for compositions
yTrain = DataFrame({'y1': [1,3,2]})
model = ols('y1',xTrain,yTrain)
model.fit()
xTest = DataFrame({'x1': [0.1,0.4,0.6],'x2': [0.4,0.2,0.8]})
yTest = model.predict(xTest)
display(yTest)
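To illustrate why the test input must have the same columns as the training input, here is a minimal sketch in plain NumPy. It is not how gneiss fits its model; it is an ordinary least-squares fit without an intercept, assumed here only to show how a shape mismatch at prediction time arises. The names X_train, X_test_ok and X_test_bad are made up for this example.

```python
import numpy as np

# Training inputs: 3 samples with 2 features each; one target per sample
X_train = np.array([[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]])
y_train = np.array([1.0, 3.0, 2.0])

# Ordinary least squares: one coefficient per input column, so coef has shape (2,)
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Test input with the same two columns as the training input: prediction works
X_test_ok = np.array([[0.1, 0.4], [0.4, 0.2], [0.6, 0.8]])
predictions = X_test_ok @ coef
print(predictions.shape)

# Test input with a single column: the matrix product is undefined
X_test_bad = np.array([[1.0], [3.0]])
mismatch_raised = False
try:
    X_test_bad @ coef
except ValueError as err:
    mismatch_raised = True
    print("shape mismatch:", err)
```

The fitted coefficients have one entry per training column, so any input passed at prediction time needs the same number of columns in the same order.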