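Problem description
I am trying to rate NFL teams by minimizing a constrained sum of squared errors. The error is defined as the actual point margin in a game minus the predicted point margin. My data consists of game-by-game scores. It looks like this: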
# Imports
import pandas as pd
import numpy as np
# Data
dat = {
    "Home_Team": ["KC Chiefs", "LA Chargers", "Baltimore Ravens"],
    "Away_Team": ["Houston Texans", "Miami Dolphins", "KC Chiefs"],
    "Home_score": [34, 20, 20],
    "Away_score": [20, 23, 34],
    "Margin": [14, -3, -34],
}
df = pd.DataFrame(dat)
df
          Home_Team       Away_Team  Home_score  Away_score  Margin
0         KC Chiefs  Houston Texans          34          20      14
1       LA Chargers  Miami Dolphins          20          23      -3
2  Baltimore Ravens       KC Chiefs          20          34     -34
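The margin is Margin = Home_score - Away_score. My goal is to come up with a numeric rating for each team such that the ratings of all teams average to zero. So if the Chiefs have a rating of 3.0, they are 3 points better than an average team.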
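As a small check on the definition above, the margin can be derived from the two score columns instead of being typed by hand. This is only a sketch that rebuilds the same three games. Note that for the third game it gives 20 - 34 = -14, whereas the table above lists -34.

```python
import pandas as pd

# Same three games as in the question, without a hand-typed Margin column
df = pd.DataFrame({
    "Home_Team": ["KC Chiefs", "LA Chargers", "Baltimore Ravens"],
    "Away_Team": ["Houston Texans", "Miami Dolphins", "KC Chiefs"],
    "Home_score": [34, 20, 20],
    "Away_score": [20, 23, 34],
})

# Margin = Home_score - Away_score, computed row by row
df["Margin"] = df["Home_score"] - df["Away_score"]
print(df["Margin"].tolist())
```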
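Given a set of ratings, we generate predictions this way: the predicted margin of victory for the home team is Home_Edge + Home_rating - Away_rating, where Home_Edge is the home-field advantage (a constant shared by all home teams), Home_rating is the home team's rating, and Away_rating is the away team's rating. Besides the team ratings, I also want to find the best value of Home_Edge.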
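To make the prediction rule concrete, here is a minimal sketch for a single game. The function name predicted_margin, the ratings and the home edge are all hypothetical values chosen for illustration, not fitted results.

```python
def predicted_margin(home_edge, ratings, home_team, away_team):
    """Predicted home margin: Home_Edge + Home_rating - Away_rating."""
    return home_edge + ratings[home_team] - ratings[away_team]

# Hypothetical numbers, for illustration only
ratings = {"KC Chiefs": 3.0, "Houston Texans": -3.0}
home_edge = 2.0

pred = predicted_margin(home_edge, ratings, "KC Chiefs", "Houston Texans")
actual = 34 - 20          # KC Chiefs 34, Houston Texans 20
error = actual - pred     # actual margin minus predicted margin
print(pred, error)
```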
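As I said earlier, the error in a prediction is the actual margin minus the predicted margin, and I want to minimize the sum of the squares of these errors. I am trying to do this with scipy.optimize as follows: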
# Our objective function, where x is our array of parameters:
# x[0] is the home edge, x[1] the home rating, and x[2] the away rating.
# Y is the true, observed margin
def obj_fun(x, Y):
    y = x[0] + x[1] - x[2]
    return np.sum((y - Y)**2)
# Define the constraint function. We have that the ratings average to 0
def con(x):
    return np.mean(x[1])
# Constraint dictionary
cons = {'type': 'eq','fun': con}
# Minimize sum of squared errors
from scipy import optimize
# Initial guesses (numbers I randomly thought of in my head)
home_edge = 0.892
home_ratings = np.array([1.46,9.67,-0.82])
away_ratings = np.array([-3.10,-6.57,1.46])
x_init = [np.repeat(home_edge,3),home_ratings,away_ratings]
# Minimize
results = optimize.minimize(fun=obj_fun, args=(df["Margin"]), x0=x_init, constraints=cons)
print(results.x)
[-2.9413615 0. 4.72534244 1.46 9.67 -0.82
-3.1 -6.57 1.46 ]
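I expected my output to contain 6 values, not 9, so I am not sure where I went wrong. There should be one value for the home edge plus five more (one for each team in the data). What is wrong? Thanks!
Solution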
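One possible reading of what goes wrong, offered as a sketch and not a verified answer: x_init is a list of three length-3 arrays, which optimize.minimize flattens into 9 scalar parameters. Inside obj_fun, x[0], x[1] and x[2] are then just the first three of those scalars, and con pins only x[1] to zero, which would explain why the last six printed values are the untouched initial guesses. The sketch below assumes the intended model has one Home_Edge plus one rating per team, and uses integer team indices to look up each game's home and away ratings. The names teams, idx, home_idx and away_idx are introduced here for illustration, Margin is recomputed from the scores, and the zero starting point is an arbitrary choice.

```python
import numpy as np
import pandas as pd
from scipy import optimize

df = pd.DataFrame({
    "Home_Team": ["KC Chiefs", "LA Chargers", "Baltimore Ravens"],
    "Away_Team": ["Houston Texans", "Miami Dolphins", "KC Chiefs"],
    "Home_score": [34, 20, 20],
    "Away_score": [20, 23, 34],
})
df["Margin"] = df["Home_score"] - df["Away_score"]

# One integer index per distinct team (5 teams in this data)
teams = sorted(set(df["Home_Team"]) | set(df["Away_Team"]))
idx = {team: i for i, team in enumerate(teams)}
home_idx = df["Home_Team"].map(idx).to_numpy()
away_idx = df["Away_Team"].map(idx).to_numpy()
margins = df["Margin"].to_numpy(dtype=float)

def obj_fun(x, home_idx, away_idx, margins):
    # x[0] is the home edge, x[1:] holds one rating per team
    home_edge, ratings = x[0], x[1:]
    predicted = home_edge + ratings[home_idx] - ratings[away_idx]
    return np.sum((margins - predicted) ** 2)

def con(x):
    # Equality constraint: the team ratings average to zero
    return np.mean(x[1:])

x_init = np.zeros(1 + len(teams))  # 1 home edge + 5 ratings = 6 parameters
results = optimize.minimize(
    obj_fun,
    x_init,
    args=(home_idx, away_idx, margins),
    method="SLSQP",
    constraints={"type": "eq", "fun": con},
)

home_edge = results.x[0]
ratings = pd.Series(results.x[1:], index=teams)
print(home_edge)
print(ratings)
```

With only three games and six parameters the fit is underdetermined, so the particular ratings returned here are not unique; many more games would be needed for the values to mean anything.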