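Python PuLP - no repeated names

Problem description

I built an optimization script for fantasy football. Essentially, it has to maximize projected points across 6 players: 1 captain (CPT) and 5 FLEX positions. My problem is that whoever the solver picks as captain also gets picked as a FLEX. I want to constrain it so that no player name is selected twice. Everything else works. Below are the code I am using and a sample of the CSV file: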

CSV File that I am Optimizing "Projection"

import pulp
import pandas as pd
import numpy as np
from itertools import chain
import csv

file_name = 'C:/Users/Michael Arena/Desktop/Football/Showdown/Simulation_Showdown.csv'
raw_data = pd.read_csv(file_name,engine="python",index_col=False,header=0,delimiter=",",quoting = 3)

player_ids = raw_data.index
player_vars = pulp.LpVariable.dicts('player',player_ids,cat='Binary')


# PuLP does not allow spaces in problem names (it warns and converts them to underscores)
prob = pulp.LpProblem("DFS_Optimizer",pulp.LpMaximize)

prob += pulp.lpSum([raw_data['Projection'][i]*player_vars[i] for i in player_ids])

##Total Salary upper:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) <= 50000

##Total Salary lower:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) >= 10000

##Exactly 6 players:
prob += pulp.lpSum([player_vars[i] for i in player_ids]) == 6

## 5 Flex:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'FLEX']) >= 5

##1 Captain:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'CPT']) == 1


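# Solve first, then inspect the status; the status is only meaningful after solve().
# (pulp.pulpTestAll() only runs PuLP's own self-tests and is not needed in this script.)
prob.solve()
print(pulp.LpStatus[prob.status])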


raw_data["is_drafted"] = 0.0
for i in player_ids:
    # Copy each binary decision (0 or 1) back onto its row, addressing the
    # column by label instead of by a hard-coded position
    raw_data.loc[i, "is_drafted"] = player_vars[i].varValue


my_team = raw_data[raw_data["is_drafted"] != 0]
my_team = my_team[["Name","Position","Team","Salary","Projection","Opponent"]]

print(my_team.head(10))

print("Total used amount of salary cap: {}".format(my_team["Salary"].sum()))
print("Projected points: {}".format(my_team["Projection"].sum().round(1)))



print(my_team["Projection"].sum().round(1))

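Solution

You may want to break your data down into a double index of (name, position); things get a lot easier that way. The other solution here is fine, but it is quite brittle, because it assumes that every name that can be a FLEX also appears as a CPT, and in the same order.

Here is an idea that does not need that assumption and is more flexible if you extend the model. Note that how I pulled in the data is incidental; you could just as well read the csv in a few lines. Also, you could use pandas indexing (a multi-index in this case) to build the model, but I think basic Python indexing is clearer here, so I put the key data into 2 dictionaries.

I reduced the size of the model so that it picks from 6 name-position pairs in this .csv: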

[Sample .csv, script, and output for this answer are not preserved in this copy.]
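A minimal sketch of the (name, position) indexing idea described in this answer. It is not the answer author's script: the player names, projections, salaries, the 35000 cap, and the reduced lineup of 1 CPT plus 2 FLEX are all made-up assumptions, chosen only so that the example stays small. The key data lives in two plain dictionaries keyed by (name, position), and the "no repeated name" rule becomes one constraint per name.

```python
import pulp

# Hypothetical data keyed by (name, position); every number here is made up.
projection = {
    ("Adams", "CPT"): 30.0, ("Adams", "FLEX"): 20.0,
    ("Brown", "CPT"): 27.0, ("Brown", "FLEX"): 18.0,
    ("Clark", "CPT"): 15.0, ("Clark", "FLEX"): 10.0,
}
salary = {
    ("Adams", "CPT"): 15000, ("Adams", "FLEX"): 10000,
    ("Brown", "CPT"): 13500, ("Brown", "FLEX"): 9000,
    ("Clark", "CPT"): 7500, ("Clark", "FLEX"): 5000,
}
keys = list(projection)
names = sorted({name for name, _ in keys})

prob = pulp.LpProblem("showdown_sketch", pulp.LpMaximize)

# One binary variable per (name, position) pair, with an explicit clean name.
pick = {k: pulp.LpVariable(f"pick_{k[0]}_{k[1]}", cat="Binary") for k in keys}

# Objective: total projected points of the picked pairs.
prob += pulp.lpSum(projection[k] * pick[k] for k in keys)

# Assumed salary cap for this reduced example.
prob += pulp.lpSum(salary[k] * pick[k] for k in keys) <= 35000

# Reduced lineup (assumption): exactly 1 CPT and 2 FLEX.
prob += pulp.lpSum(pick[k] for k in keys if k[1] == "CPT") == 1
prob += pulp.lpSum(pick[k] for k in keys if k[1] == "FLEX") == 2

# The important part: each name may be picked at most once, in any position.
for name in names:
    prob += pulp.lpSum(pick[k] for k in keys if k[0] == name) <= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))

chosen = [k for k in keys
          if pick[k].varValue is not None and pick[k].varValue > 0.5]
for name, position in chosen:
    print(name, position, projection[(name, position)], salary[(name, position)])
```

Because the per-name constraint looks names up directly in the keys, it does not depend on the CPT and FLEX rows being listed in the same order, or on every name appearing in both positions.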

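By definition, because each name appears more than once in the CSV's name column, you need to add something to the script that ties the repeated rows together.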
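One possible way to do that, sketched under assumptions: group the row ids by player name, so that each group can later be turned into a single "at most one of these rows" constraint. The helper `rows_by_name` and the sample `name_column` are hypothetical, and the row ids assume the default 0..n-1 index used in the question.

```python
from collections import defaultdict


def rows_by_name(names):
    """Map each player name to the list of row ids that carry that name."""
    groups = defaultdict(list)
    for row_id, name in enumerate(names):
        groups[name].append(row_id)
    return dict(groups)


# Hypothetical Name column: each player is listed once as FLEX and once as CPT.
name_column = ["A", "B", "C", "A", "B", "C"]
groups = rows_by_name(name_column)
print(groups)

# In the PuLP model from the question, each group would then give one constraint:
#   for rows in groups.values():
#       prob += pulp.lpSum(player_vars[i] for i in rows) <= 1
```

Grouping by name rather than by position in the file means the constraint still holds if the rows are reordered or if a player is listed in only one position.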


Here is the solution I am talking about:

import pulp
import numpy as np


#%% DUMMY ARRAY TO SIMULATE YOUR DATA
Nplayers = 7

raw_data = {}

names = [chr(ord('A')+i) for i in range(Nplayers)]*2
raw_data['Position'] = ['FLEX']*Nplayers + ['CPT']*Nplayers
raw_data['Projection'] = np.random.rand(2*Nplayers) * 10
raw_data['Salary'] = (np.random.rand(2*Nplayers)*10000).astype(int)
player_ids = np.arange(2*Nplayers)


#%% YOUR CODE (I commented out a few things)

#player_ids = raw_data.index
player_vars = pulp.LpVariable.dicts('player',player_ids,cat='Binary')

prob = pulp.LpProblem("DFS_Optimizer",pulp.LpMaximize)

prob += pulp.lpSum([raw_data['Projection'][i]*player_vars[i] for i in player_ids])

##Total Salary upper:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) <= 50000
##Total Salary lower:
prob += pulp.lpSum([raw_data['Salary'][i]*player_vars[i] for i in player_ids]) >= 10000
##Exactly 6 players:
prob += pulp.lpSum([player_vars[i] for i in player_ids]) == 6
## 5 Flex:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'FLEX']) >= 5
##1 Captain:
prob += pulp.lpSum([player_vars[i] for i in player_ids if raw_data['Position'][i] == 'CPT']) == 1



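# HERE IS THE IMPORTANT PART:
# row i (FLEX) and row i + Nplayers (CPT) are the same player in this dummy data,
# so at most one of the two may be picked. With real data, build 'flex_indx' and
# 'cpt_indx' by matching rows on the player's name instead of relying on row order.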
flex_indx = np.arange(Nplayers)
cpt_indx = np.arange(Nplayers)+Nplayers
for i in range(Nplayers):
    prob += pulp.lpSum([player_vars[cpt_indx[i]],player_vars[flex_indx[i]]])<=1



#pulp.pulpTestAll()
#prob.status
prob.solve()

#%% PRINT RESULT

tot_proj = 0.0
tot_salary = 0.0

print('-'*40)
print('%9s : %s : %4s : %4s : %4s'%('VAR','NAMES','POS','PROJ','SALARY'))
print('-'*40)
for v in prob.variables():
    if v.value()>0:
        num = int(v.name[7:])
        tot_proj += raw_data['Projection'][num]
        tot_salary += raw_data['Salary'][num] 
        print('%9s : %5s : %4s : %4.1f : %6u'%(v.name,names[num],raw_data['Position'][num],raw_data['Projection'][num],raw_data['Salary'][num]))
print('-'*40)
print('TOTAL' + ' '*20 + ': %4.1f : %6u'%(tot_proj,tot_salary))
print('-'*40)
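To confirm that a solved lineup really contains no repeated names, a small check along the following lines could be run on the selected names. This is only a sketch: `repeated_names` is a hypothetical helper, and the two sample lineups are made up.

```python
from collections import Counter


def repeated_names(lineup_names):
    """Return the names that occur more than once in a lineup, sorted."""
    counts = Counter(lineup_names)
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up lineups: the first repeats "A" (captain and FLEX), the second is clean.
bad = repeated_names(["A", "B", "A", "C", "D", "E"])
ok = repeated_names(["A", "B", "C", "D", "E", "F"])
print(bad, ok)
```

With the script above, the list to check would be the `names[num]` values of the variables whose value is 1.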