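Problem description
I have a pandas data frame that shows the number of shots and goals from different coordinates across a list of hockey games. The data frame lists shots and goals as a tuple like (4, 2), and I want to add another column that divides the goals by the shots to give the shooting percentage for each coordinate. This is my code so far...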
# shots maps an (x, y) coordinate to a (shot count, goal count) tuple
for key in contents['liveData']['plays']['allPlays']:
    # for plays in key['result']['event']:
    # print(key)
    if key['result']['event'] == "Shot":
        # print(key['result']['event'])
        scoordinates = (key['coordinates']['x'], key['coordinates']['y'])
        if scoordinates not in shots:
            shots[scoordinates] = (1, 0)
        else:
            # add one shot to the running (shots, goals) tally
            shots[scoordinates] = tuple(map(sum, zip((1, 0), shots[scoordinates])))
    if key['result']['event'] == "Goal":
        # print(key['result']['event'])
        gcoordinates = (key['coordinates']['x'], key['coordinates']['y'])
        if gcoordinates not in shots:
            shots[gcoordinates] = (1, 1)
        else:
            # a goal counts as both a shot and a goal
            shots[gcoordinates] = tuple(map(sum, zip((1, 1), shots[gcoordinates])))
# create data frame using pandas
pd.set_option("display.max_rows", None, "display.max_columns", None)
sdf = pd.DataFrame(list(shots.items()), columns=['Coordinates', 'Occurences (S,G)'])
file.write(f"{sdf}\n")
This produces the following data frame:
Coordinates Occurences (S,G)
0 (78.0,-19.0) (2,1)
1 (-37.0,-10.0) (2,0)
2 (47.0,-23.0) (3,1)
3 (53.0,14.0) (1,0)
4 (77.0,-2.0) (8,4)
5 (80.0,1.0) (12,5)
6 (74.0,14.0) (7,0)
7 (87.0,-3.0) (1,1)
If anyone can help, that would be great!
Solution
Try this:
df['new_col']=df['old_col'].apply( lambda x: x[1]/x[0])
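As a minimal sketch of how that one-liner maps onto the question's data frame: `new_col` and `old_col` above are placeholders, so here they are assumed to be 'Percentage' and 'Occurences (S,G)'. The three sample rows are copied from the question's output, and the frame name `sdf` follows the question's code.

```python
import pandas as pd

# Sample rows copied from the question; each tuple is (shots, goals).
sdf = pd.DataFrame(
    {
        "Coordinates": [(78.0, -19.0), (-37.0, -10.0), (80.0, 1.0)],
        "Occurences (S,G)": [(2, 1), (2, 0), (12, 5)],
    }
)

# Divide goals (index 1) by shots (index 0) for every row.
# 'Percentage' is an assumed column name, not one from the original answer.
sdf["Percentage"] = sdf["Occurences (S,G)"].apply(lambda sg: sg[1] / sg[0])

print(sdf)
```

Because the question's loop only creates an entry once a shot or goal has been seen, the shot count should never be zero, so the division is safe for data built that way.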
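Just divide the two columns. This is the "longer" way: split the (S, G) tuples into their own columns, then divide. Alternatively, use the one-liner with a lambda that Ave799 provided. Both work, but Ave799's is probably the preferred way.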
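import pandas as pd

# Rows taken from the data frame shown in the question
data = pd.DataFrame([[(78.0, -19.0), (2, 1)],
                     [(-37.0, -10.0), (2, 0)],
                     [(47.0, -23.0), (3, 1)],
                     [(53.0, 14.0), (1, 0)],
                     [(77.0, -2.0), (8, 4)],
                     [(80.0, 1.0), (12, 5)],
                     [(74.0, 14.0), (7, 0)],
                     [(87.0, -3.0), (1, 1)]],
                    columns=['Coordinates', 'Occurences (S,G)'])
# Split each (S, G) tuple into separate S and G columns, then divide
data[['S', 'G']] = pd.DataFrame(data['Occurences (S,G)'].tolist(), index=data.index)
data['Percentage'] = data['G'] / data['S']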
Output:
print(data)
      Coordinates  Occurences (S,G)   S  G  Percentage
0    (78.0,-19.0)             (2,1)   2  1    0.500000
1   (-37.0,-10.0)             (2,0)   2  0    0.000000
2    (47.0,-23.0)             (3,1)   3  1    0.333333
3     (53.0,14.0)             (1,0)   1  0    0.000000
4     (77.0,-2.0)             (8,4)   8  4    0.500000
5      (80.0,1.0)            (12,5)  12  5    0.416667
6     (74.0,14.0)             (7,0)   7  0    0.000000
7     (87.0,-3.0)             (1,1)   1  1    1.000000
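To back up the claim that both answers work, here is a small sketch that computes the percentage both ways on three rows from the question and checks that the results agree. The variable names `via_apply`, `counts`, `via_columns`, and `same` are illustrative only and do not come from either answer.

```python
import pandas as pd

# Three rows copied from the question; each tuple is (shots, goals).
data = pd.DataFrame(
    [[(78.0, -19.0), (2, 1)], [(47.0, -23.0), (3, 1)], [(80.0, 1.0), (12, 5)]],
    columns=["Coordinates", "Occurences (S,G)"],
)

# Approach 1: row-wise lambda over the tuple column.
via_apply = data["Occurences (S,G)"].apply(lambda sg: sg[1] / sg[0])

# Approach 2: split the tuples into S and G columns, then divide the columns.
counts = pd.DataFrame(
    data["Occurences (S,G)"].tolist(), index=data.index, columns=["S", "G"]
)
via_columns = counts["G"] / counts["S"]

# Compare with a small tolerance instead of exact float equality.
same = bool((via_apply - via_columns).abs().max() < 1e-12)
print(same)
```

The split-column version keeps S and G available for later use, such as filtering out coordinates with very few shots, while the lambda version leaves the frame with a single extra column.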