Given a quadratic function with unknown constant coefficients, how do I find those constants using gradient descent?

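Problem description

Hi everyone.

I am a beginner in machine learning and have just started learning about gradient descent. However, I have run into a big problem. The problem is as follows: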

Given the points [0,0], [1,3/2], [2,1], the equation is f = (a2)*x^2 + (a1)*x + a0.

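Solving by hand, I got the answer [a2, a1, a0] = [-1, 5/2, 0], but I am having trouble writing Python code that finds this solution from the given data using gradient descent.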
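As a quick sanity check of the hand-derived answer, the sketch below evaluates the quadratic with [a2, a1, a0] = [-1, 5/2, 0] at x = 0, 1, 2. The helper name `quadratic` is hypothetical, and exact fractions are used only to avoid rounding. With these coefficients the curve passes through y = 0, 3/2, and 1.

```python
from fractions import Fraction

def quadratic(a2, a1, a0, x):
    """Evaluate a2*x^2 + a1*x + a0 (hypothetical helper for illustration)."""
    return a2 * x**2 + a1 * x + a0

# Hand-derived coefficients from the question: [a2, a1, a0] = [-1, 5/2, 0]
a2, a1, a0 = Fraction(-1), Fraction(5, 2), Fraction(0)

# Evaluate at the three x values used in the question
values = [quadratic(a2, a1, a0, x) for x in (0, 1, 2)]
print(values)  # [Fraction(0, 1), Fraction(3, 2), Fraction(1, 1)]
```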

For my part, I tried to write the gradient descent code in the simplest and quickest way I could:

learningRate = 0.1

make a series of numbers for x

initialize a2, a1, a0 to 1, 1, 1

partial derivatives for a2, a1, a0 (a2_p: 2x, a1_p: x, a0_p: 1)

gradient descent method: (e.g.) a2 = a2 - (learningRate)( y - [(a2)*x^2 + (a1)*x + a0] )(a2_p)

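PS. Honestly, I don't know where 'x' and 'y' or a2, a1, a0 should go.

However, I get a wrong answer every time, and the result is different on each run. So I would appreciate a hint about the correct formula or the correct sequence of steps in the code.

Thank you for reading my very basic question.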

Solution

There are some mistakes in your equations.

For the function f(x) = a2*x^2+a1*x+a0, the partial derivatives with respect to a2, a1, and a0 are x^2, x, and 1, respectively.
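One way to convince yourself that the partial derivatives of f with respect to a2, a1, and a0 are x^2, x, and 1 is a finite-difference check. The following is a minimal sketch; the function names, the test point x = 3.0, and the step size h are illustrative assumptions.

```python
def f(a2, a1, a0, x):
    """The quadratic model a2*x^2 + a1*x + a0."""
    return a2 * x**2 + a1 * x + a0

def numeric_partials(a2, a1, a0, x, h=1e-6):
    """Central-difference estimates of df/da2, df/da1, df/da0 at a fixed x."""
    d_a2 = (f(a2 + h, a1, a0, x) - f(a2 - h, a1, a0, x)) / (2 * h)
    d_a1 = (f(a2, a1 + h, a0, x) - f(a2, a1 - h, a0, x)) / (2 * h)
    d_a0 = (f(a2, a1, a0 + h, x) - f(a2, a1, a0 - h, x)) / (2 * h)
    return d_a2, d_a1, d_a0

x = 3.0  # arbitrary test point
numeric = numeric_partials(1.0, 1.0, 1.0, x)
analytic = (x**2, x, 1.0)  # the analytic partial derivatives

# Compare the numeric estimates against the analytic values
partials_match = all(abs(n - a) < 1e-6 for n, a in zip(numeric, analytic))
print(partials_match)
```

Note that the derivatives are taken with respect to the coefficients, not x, which is why the result for a2 is x^2 rather than 2x.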

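Suppose the cost function is (1/2)*(y-f(x))^2.

The partial derivative of the cost function with respect to ai is -(y-f(x)) * (partial derivative of f(x) with respect to ai), where i is in [0,2].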

So the gradient descent equation is:
ai = ai + learning_rate*(y-f(x)) * (partial derivative of f(x) with respect to ai), where i is in [0,2].
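To make the update rule concrete, here is a minimal sketch of a single update on one data point. The function name `sgd_step` is hypothetical; the starting values (1, 1, 1), the learning rate 0.1, and the point (2, 1) are taken from the question. The error y - f(x) is computed once and reused for all three coefficients.

```python
def sgd_step(params, x, y, learning_rate):
    """Apply ai = ai + learning_rate*(y - f(x))*df/dai once for a single point."""
    a2, a1, a0 = params
    error = y - (a2 * x**2 + a1 * x + a0)  # y - f(x), evaluated before any update
    partials = (x**2, x, 1)                # df/da2, df/da1, df/da0
    return tuple(a + learning_rate * error * p for a, p in zip(params, partials))

# One step from (1, 1, 1) on the point (x, y) = (2, 1)
new_params = sgd_step((1.0, 1.0, 1.0), x=2, y=1, learning_rate=0.1)
print(new_params)
```

Here f(2) = 7, so the error is -6 and the step moves the coefficients to roughly (-1.4, -0.2, 0.4).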

I hope this code helps:

#Training sample (the middle point (1,1.5) is reconstructed from the hand-derived solution [-1,5/2,0])
sample = [(0,0),(1,1.5),(2,1)]

#Our function => a2*x^2+a1*x+a0
class Function():
    def __init__(self,a2,a1,a0):
        self.a2 = a2
        self.a1 = a1
        self.a0 = a0
    
    def eval(self,x):
        return self.a2*x**2+self.a1*x+self.a0
    
    def partial_a2(self,x):
        return x**2
    
    def partial_a1(self,x):
        return x
    
    def partial_a0(self,x):
        return 1

#Initialise function
f = Function(1,1,1)

#To Calculate loss from the sample
def loss(sample,f):
    return sum([(y-f.eval(x))**2 for x,y in sample])/len(sample)

epochs = 100000
lr = 0.0005
#To record the best values (a2,a1,a0) and the lowest loss seen so far
best_values = (f.a2,f.a1,f.a0)
min_loss = float("inf")

for epoch in range(epochs):
    for x,y in sample:
        #Gradient descent: compute the error once so all three updates use the same prediction
        error = y-f.eval(x)
        f.a2 = f.a2+lr*error*f.partial_a2(x)
        f.a1 = f.a1+lr*error*f.partial_a1(x)
        f.a0 = f.a0+lr*error*f.partial_a0(x)

    #Storing the best values
    epoch_loss = loss(sample,f)
    if min_loss > epoch_loss:
        min_loss = epoch_loss
        best_values = (f.a2,f.a1,f.a0)

print("Loss:",min_loss)
print("Best values (a2,a1,a0):",best_values)
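For comparison, a batch variant accumulates the gradient over all points before updating the coefficients, instead of updating after each point. This is a sketch under stated assumptions, not the method used above: the function name `fit_quadratic`, the learning rate 0.05, and the step count are illustrative choices, and the middle sample point (1, 1.5) is inferred from the hand-derived solution [-1, 5/2, 0].

```python
def fit_quadratic(sample, lr=0.05, steps=10000, init=(1.0, 1.0, 1.0)):
    """Batch gradient descent on the cost sum((1/2)*(y - f(x))^2) over the sample."""
    a2, a1, a0 = init
    for _ in range(steps):
        g2 = g1 = g0 = 0.0
        for x, y in sample:
            error = y - (a2 * x**2 + a1 * x + a0)
            # Accumulate (y - f(x)) * df/dai for each coefficient
            g2 += error * x**2
            g1 += error * x
            g0 += error
        # Update all coefficients together after seeing every point
        a2 += lr * g2
        a1 += lr * g1
        a0 += lr * g0
    return a2, a1, a0

# Assumed sample: the middle point (1, 1.5) is inferred, not given verbatim
sample = [(0, 0), (1, 1.5), (2, 1)]
a2, a1, a0 = fit_quadratic(sample)
print(a2, a1, a0)
```

With these settings the coefficients should approach -1, 2.5, and 0. A much larger learning rate can make the updates diverge for this data, so it is worth lowering it if the values blow up.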
