How do I write the cost function formula from Andrew Ng's assignment in Octave?

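Problem description

My implementation (see below) returns the scalar value 3.18, which is not the correct answer. The value should be 0.693. Where does my code deviate from the equation?

Here are the steps to load the data and run the cost function in Octave: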

data = load('ex2data1.txt');
X = data(:,[1,2]); y = data(:,3);
[m,n] = size(X);
X = [ones(m,1) X];
initial_theta = zeros(n + 1,1);
[cost,grad] = costFunction(initial_theta,X,y);

The ex2data1.txt data file is included in this package: data link.

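The formula for the cost function is

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log\left(h_\theta(x^{(i)})\right) - \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right], \qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$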

This is the code I am using:

function [J,grad] = costFunction(theta,X,y)

m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0; %#ok<NASGU>
grad = zeros(size(theta)); %#ok<NASGU>

hx = sigmoid(X * theta)';
m = length(X);

J = sum(-y' * log(hx) - (1 - y')*log(1 - hx)) / m;

grad = X' * (hx - y) / m;

end

Here is the sigmoid function:

function g = sigmoid(z)
g = 1/(1+exp(-z));
end

Solution

Here is the code for the sigmoid function, which I think is where you made the mistake:

function g = sigmoid(z)
   g = zeros(size(z));
   temp=1+exp(-1.*z);
   g=1./temp;
end


function [J,grad] = costFunction(theta,X,y)
   m = length(y); 
   J = 0;
   grad = zeros(size(theta));
   h=X*theta;
   xtemp=sigmoid(h);
   temp1=(-y'*log(xtemp));
   temp2=(1-y)'*log(1-xtemp);
   J=1/m*sum(temp1-temp2);
   grad=1/m*(X'*(xtemp-y));
end

I think it should be (1-y)', as in temp2 = (1-y)'*log(1-xtemp).

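Your sigmoid function is incorrect. The incoming data is a vector, but the operation you are using performs matrix division. It needs to be element-wise.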

function g = sigmoid(z)
    g = 1.0 ./ (1.0 + exp(-z));
end

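By writing 1 / A, where A is an expression, you are in effect computing the inverse of A. Since inverses only exist for square matrices, this computes the pseudo-inverse instead, which is definitely not what you want.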
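A small sketch of the difference, using a made-up three-element column vector (the variable names are illustrative and not part of the assignment). In Octave, `1 / A` with a column vector `A` does not raise an error. It returns a row vector `x` that satisfies `x * A = 1`, whereas `1 ./ A` divides element by element:

```octave
% Illustrative input: z = 0 for every example, as happens when theta is all zeros
z = zeros(3, 1);
A = 1 + exp(-z);          % column vector [2; 2; 2]

g_matrix = 1 / A;         % matrix right division: solves g_matrix * A = 1
g_elem   = 1 ./ A;        % element-wise division: sigmoid(0) = 0.5 for each entry

disp(size(g_matrix));     % a 1x3 row vector, not the shape of z
disp(size(g_elem));       % a 3x1 column vector, same shape as z
disp(g_elem');            % every entry is 0.5
```

The row-vector result would explain why a transpose seemed necessary in the original code, and why the values passed to log were not the expected 0.5.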

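You can keep most of your costFunction code as it is, since you are already using the dot product. I would get rid of the sum, because the dot product already implies it. I have marked my changes in the comments: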

function [J,grad] = costFunction(theta,X,y)

m = length(y); % number of training examples

% You need to return the following variables correctly 
%J = 0; %#ok<NASGU> <-- Don't need to declare this as you'll create the variables later
%grad = zeros(size(theta)); %#ok<NASGU>

hx = sigmoid(X * theta);  % <-- Remove transpose
m = length(X);

J = (-y' * log(hx) - (1 - y')*log(1 - hx)) / m; % <-- Remove sum

grad = X' * (hx - y) / m;

end
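
As a quick sanity check that does not need ex2data1.txt, the sketch below runs the same expressions on a small made-up data set (the four rows of X and y are hypothetical). With theta set to all zeros, every prediction is 0.5, so the cost should come out as log(2) ≈ 0.693 for any labels, and the dot-product form should agree with an explicit element-wise sum:

```octave
% Hypothetical toy data (NOT ex2data1.txt): intercept column plus two features
X = [1 34.6 78.0; 1 30.3 43.9; 1 60.2 86.3; 1 79.0 75.3];
y = [0; 0; 1; 1];
m = length(y);
theta = zeros(size(X, 2), 1);

sigmoid = @(z) 1 ./ (1 + exp(-z));   % element-wise sigmoid
hx = sigmoid(X * theta);             % m x 1 column of predictions, all 0.5 here

% Dot-product form: the inner product already sums over the examples
J = (-y' * log(hx) - (1 - y)' * log(1 - hx)) / m;

% Equivalent element-wise form with an explicit sum
J_sum = sum(-y .* log(hx) - (1 - y) .* log(1 - hx)) / m;

grad = X' * (hx - y) / m;            % (n+1) x 1 gradient

printf('%.3f\n', J);                 % prints 0.693
```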
