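Problem description
I built an optimization script for fantasy football. Essentially, it has to maximize projected points across a 6-player lineup with 1 captain (CPT) and 5 FLEX spots. My problem is that the solver picks the same player both as captain and as a FLEX. I want to constrain it so that no player name is selected twice. Apart from that, everything works. Below is the code I am using, along with a sample CSV file: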
import pulp
import pandas as pd
import numpy as np
from itertools import chain
import csv
file_name = 'C:/Users/Michael Arena/Desktop/Football/Showdown/Simulation_Showdown.csv'
raw_data = pd.read_csv(file_name,engine="python",index_col=False,header=0,delimiter=",",quoting = 3)
player_ids = raw_data.index
player_vars = pulp.LpVariable.dicts('player',player_ids,cat='Binary')
prob = pulp.LpProblem("DFS Optimizer",pulp.LpMaximize)
prob += pulp.lpSum([raw_data['Projection'][i]*player_vars[i] for i in player_ids])
##Total Salary upper:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) <= 50000
##Total Salary lower:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) >= 10000
##Exactly 6 players:
prob += pulp.lpSum([player_vars[i] for i in player_ids]) == 6
## 5 Flex:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'FLEX']) >= 5
##1 Captain:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'CPT']) == 1
pulp.pulpTestAll()
prob.status
prob.solve()
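raw_data["is_drafted"] = 0.0
for var in prob.variables():
    # Variable names look like "player_<row id>"; strip the 7-character prefix
    # and write the solver's value into that row by column label, so the
    # assignment does not depend on the position of the is_drafted column.
    raw_data.loc[int(var.name[7:]), "is_drafted"] = var.varValue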
my_team = raw_data[raw_data["is_drafted"] != 0]
my_team = my_team[["Name","Position","Team","Salary","Projection","Opponent"]]
print(my_team.head(10))
print("Total used amount of salary cap: {}".format(my_team["Salary"].sum()))
print("Projected points: {}".format(my_team["Projection"].sum().round(1)))
print(my_team["Projection"].sum().round(1))
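Solution
You may want to break your data into a double index of (name, position); things become a lot easier that way. The other solution here is fine, but it is very brittle because it assumes that every name available as FLEX also appears as CPT, and in the same order.
Here is an idea that does not need that assumption and will be more flexible if you extend the model. Note that I used pandas to pull in the data, but you could just as well spend a few lines reading the csv yourself. You could also use pandas indexing (a multi-index in this case) to build the model, but I think basic Python indexing is clearer here, so I put the key data into 2 dictionaries.
I scaled the model down so that it selects from 6 name-position pairs in this .csv: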
Script
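A minimal sketch of the (name, position) idea, not necessarily the answer's own script: the player names, salaries, and projections below are made up, the 1.5x captain multiplier is an assumption about how CPT rows relate to FLEX rows, and `pick`, `salary`, and `projection` are hypothetical names. The key data lives in two dictionaries keyed by (name, position), and one constraint per name keeps a player from filling both slots.

```python
import pulp

# Hypothetical FLEX data: name -> (salary, projection).
flex = {
    "Ann": (9000, 20.0), "Bob": (8000, 18.0), "Cal": (7000, 15.0),
    "Dee": (6000, 12.0), "Eli": (5000, 10.0), "Fay": (4000, 8.0),
    "Gus": (3000, 5.0),
}

# Two dictionaries keyed by (name, position).
# Assumption: the CPT row costs and scores 1.5x the FLEX row.
salary, projection = {}, {}
for name, (sal, proj) in flex.items():
    salary[name, "FLEX"] = sal
    projection[name, "FLEX"] = proj
    salary[name, "CPT"] = int(sal * 1.5)
    projection[name, "CPT"] = proj * 1.5

pairs = list(salary)                    # every (name, position) key
names = sorted({n for n, _ in pairs})

prob = pulp.LpProblem("DFS_Optimizer", pulp.LpMaximize)
pick = {(n, p): pulp.LpVariable(f"pick_{n}_{p}", cat="Binary") for n, p in pairs}

# Objective: total projected points of the selected pairs.
prob += pulp.lpSum(projection[k] * pick[k] for k in pairs)
# Salary cap.
prob += pulp.lpSum(salary[k] * pick[k] for k in pairs) <= 50000
# Exactly 1 captain and 5 FLEX.
prob += pulp.lpSum(pick[n, "CPT"] for n in names) == 1
prob += pulp.lpSum(pick[n, "FLEX"] for n in names) == 5
# The important part: each name may be used in at most one position.
for n in names:
    prob += pick[n, "CPT"] + pick[n, "FLEX"] <= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))

lineup = sorted(k for k in pairs if pick[k].value() > 0.5)
for name, pos in lineup:
    print(name, pos, salary[name, pos], projection[name, pos])
```

Because the constraint is written per name rather than per row offset, it does not matter in which order the rows arrive or whether some player has only one of the two positions listed; in that case the loop would need to skip the missing key.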
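By definition, the names in the CSV column will be repeated, so you need to add something to your script that tells the two rows for each player apart.
Here is the solution I am talking about: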
import pulp
import numpy as np
#%% DUMMY ARRAY TO SIMULATE YOUR DATA
Nplayers = 7
raw_data = {}
names = [chr(ord('A')+i) for i in range(Nplayers)]*2
raw_data['Position'] = ['FLEX']*Nplayers + ['CPT']*Nplayers
raw_data['Projection'] = np.random.rand(2*Nplayers) * 10
raw_data['Salary'] = (np.random.rand(2*Nplayers)*10000).astype(int)
player_ids = np.arange(2*Nplayers)
#%% YOUR CODE (I commented few stuff)
#player_ids = raw_data.index
player_vars = pulp.LpVariable.dicts('player',player_ids,cat='Binary')
prob = pulp.LpProblem("DFS Optimizer",pulp.LpMaximize)
prob += pulp.lpSum([raw_data['Projection'][i]*player_vars[i] for i in player_ids])
##Total Salary upper:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) <= 50000
##Total Salary lower:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) >= 10000
##Exactly 6 players:
prob += pulp.lpSum([player_vars[i] for i in player_ids]) == 6
## 5 Flex:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'FLEX']) >= 5
##1 Captain:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'CPT']) == 1
# HERE IS THE IMPORTANT STUFF!!!
# you can create a clever way to match 'flex_indx' with 'cpt_indx' by their names
flex_indx = np.arange(Nplayers)
cpt_indx = np.arange(Nplayers)+Nplayers
for i in range(Nplayers):
    # At most one of the CPT row and the FLEX row of the same player
    prob += pulp.lpSum([player_vars[cpt_indx[i]],player_vars[flex_indx[i]]])<=1
#pulp.pulpTestAll()
#prob.status
prob.solve()
#%% PLOT RESULT
tot_proj = 0.0
tot_salary = 0.0
print('-'*40)
print('%9s : %s : %4s : %4s : %4s'%('VAR','NAMES','POS','PROJ','SALARY'))
print('-'*40)
for v in prob.variables():
    if v.value()>0:
        num = int(v.name[7:])
        tot_proj += raw_data['Projection'][num]
        tot_salary += raw_data['Salary'][num]
        print('%9s : %5s : %4s : %4.1f : %6u'%(v.name,names[num],raw_data['Position'][num],raw_data['Projection'][num],raw_data['Salary'][num]))
print('-'*40)
print('TOTAL' + ' '*20 + ': %4.1f : %6u'%(tot_proj,tot_salary))
print('-'*40)
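The comment in the code above leaves open how to match `flex_indx` with `cpt_indx` by name. One possible way, sketched here under the assumption that row ids are simply positions in the Name column, is to group row ids by player name; the helper `rows_by_name` and the shuffled sample names are hypothetical.

```python
from collections import defaultdict

def rows_by_name(names):
    """Map each player name to the list of row ids where it appears."""
    groups = defaultdict(list)
    for row_id, name in enumerate(names):
        groups[name].append(row_id)
    return dict(groups)

# Hypothetical Name column, deliberately not in matching FLEX/CPT order.
names = ["B", "A", "C", "A", "C", "B"]
groups = rows_by_name(names)
print(groups)

# Each group would then yield one "at most one row per name" constraint:
# for rows in groups.values():
#     prob += pulp.lpSum(player_vars[i] for i in rows) <= 1
```

Grouping this way drops the assumption that the FLEX and CPT blocks list players in the same order, and it also covers a player who appears in only one of the two blocks.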