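Problem description
I want to sort a DataFrame from high to low based on column B. I could not find an answer on how to sort by the outer (i.e. first) index level.
I have the following sample data: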
               A    B
Item Type
0    X     'rtr'    2
     Tier  'sfg'  104
1    X     'zad'    7
     Tier  'asd'  132
2    X     'frs'    4
     Tier  'plg'  140
3    X     'gfq'    9
     Tier  'bcd'  100
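Each outer index entry contains a "Tier" row. I want to sort the outer index "Item" based on the "B" value of each "Tier" row. Column "A" can be ignored for sorting purposes, but it needs to stay in the DataFrame. The expected result is: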
               A    B
Item Type
2    X     'frs'    4
     Tier  'plg'  140
1    X     'zad'    7
     Tier  'asd'  132
0    X     'rtr'    2
     Tier  'sfg'  104
3    X     'gfq'    9
     Tier  'bcd'  100
Solution
New reply #2
Based on all the input received, here is the solution. I hope this works for you.
import pandas as pd

# Load the source data (file name as given in this answer)
df = pd.read_csv("xyz.txt")
df1 = df.copy()

# Capture the original position of each row; used later as a tie-breaker
df1['idx'] = df1.index

# Keep only the rows whose Type is 'Tier'
# (assumes every Index value has a 'Tier' row)
tier = df1.loc[df1['Type'] == 'Tier']

# Sort the Tier rows by Total (ascending; pass ascending=False for high to low)
tier = tier.sort_values('Total')

# Index values in sorted order; the groups are printed in this order
tier_list = tier['Index'].tolist()

# Dictionary that maps each Index value to its rank in that order
sorterIndex = dict(zip(tier_list, range(len(tier_list))))

# Rank column used to sort the whole DataFrame numerically
df1['Tier_Rank'] = df1['Index'].map(sorterIndex)

# Sort by rank, then by original position so rows inside a group keep their order
df1.sort_values(['Tier_Rank', 'idx'], ascending=[True, True], inplace=True)

# Drop the temporary columns we created
df1.drop(['Tier_Rank', 'idx'], axis=1, inplace=True)

# Print the DataFrame
print(df1)
Based on the source data, this is the final output. Let me know if this meets your needs.
Index Type Id ... Intellect Strength Total
12 2 Chest Armor "6917529202229928161" ... 17 8 62
13 2 Gauntlets "6917529202229927889" ... 16 14 60
14 2 Helmet "6917529202223945870" ... 10 9 66
15 2 Leg Armor "6917529202802011569" ... 15 2 61
16 2 Set NaN ... 58 33 249
17 2 Tier NaN ... 5 3 22
24 4 Chest Armor "6917529202229928161" ... 17 8 62
25 4 Gauntlets "6917529202802009244" ... 7 9 63
26 4 Helmet "6917529202223945870" ... 10 9 66
27 4 Leg Armor "6917529202802011569" ... 15 2 61
28 4 Set NaN ... 49 28 252
29 4 Tier NaN ... 4 2 22
42 7 Chest Armor "6917529202229928161" ... 17 8 62
43 7 Gauntlets "6917529202791088503" ... 7 14 61
44 7 Helmet "6917529202223945870" ... 10 9 66
45 7 Leg Armor "6917529202229923870" ... 7 19 57
46 7 Set NaN ... 41 50 246
47 7 Tier NaN ... 4 5 22
0 0 Chest Armor "6917529202229928161" ... 17 8 62
1 0 Gauntlets "6917529202778947311" ... 10 15 62
2 0 Helmet "6917529202223945870" ... 10 9 66
3 0 Leg Armor "6917529202802011569" ... 15 2 61
4 0 Set NaN ... 52 34 251
5 0 Tier NaN ... 5 3 23
6 1 Chest Armor "6917529202229928161" ... 17 8 62
7 1 Gauntlets "6917529202778947311" ... 10 15 62
8 1 Helmet "6917529202223945870" ... 10 9 66
9 1 Leg Armor "6917529202229923870" ... 7 19 57
10 1 Set NaN ... 44 51 247
11 1 Tier NaN ... 4 5 23
18 3 Chest Armor "6917529202229928161" ... 17 8 62
19 3 Gauntlets "6917529202229927889" ... 16 14 60
20 3 Helmet "6917529202223945870" ... 10 9 66
21 3 Leg Armor "6917529202229923870" ... 7 19 57
22 3 Set NaN ... 50 50 245
23 3 Tier NaN ... 5 5 23
30 5 Chest Armor "6917529202229928161" ... 17 8 62
31 5 Gauntlets "6917529202802009244" ... 7 9 63
32 5 Helmet "6917529202223945870" ... 10 9 66
33 5 Leg Armor "6917529202229923870" ... 7 19 57
34 5 Set NaN ... 41 45 248
35 5 Tier NaN ... 4 4 23
36 6 Chest Armor "6917529202229928161" ... 17 8 62
37 6 Gauntlets "6917529202791088503" ... 7 14 61
38 6 Helmet "6917529202223945870" ... 10 9 66
39 6 Leg Armor "6917529202802011569" ... 15 2 61
40 6 Set NaN ... 49 33 250
41 6 Tier NaN ... 4 3 23
[48 rows x 11 columns]
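The same rank-and-sort idea can also be tried directly on the MultiIndex sample from the question. This is only a minimal sketch, not the code that produced the output above: the frame is rebuilt by hand from the question's sample, the names `tier_b`, `row_tier_b` and `order` are made up for illustration, and it assumes every Item has exactly one "Tier" row. It sorts Items by the Tier row's B from high to low and keeps the rows inside each Item in their original order.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (values copied from the post).
index = pd.MultiIndex.from_product(
    [[0, 1, 2, 3], ["X", "Tier"]], names=["Item", "Type"]
)
df = pd.DataFrame(
    {
        "A": ["rtr", "sfg", "zad", "asd", "frs", "plg", "gfq", "bcd"],
        "B": [2, 104, 7, 132, 4, 140, 9, 100],
    },
    index=index,
)

# B value of the 'Tier' row for each Item, indexed by Item only.
is_tier = df.index.get_level_values("Type") == "Tier"
tier_b = df.loc[is_tier, "B"].droplevel("Type")

# Give every row the Tier B value of the Item it belongs to.
row_tier_b = df.index.get_level_values("Item").map(tier_b).to_numpy()

# Stable argsort on the negated values: highest Tier B first,
# rows within the same Item stay in their original order.
order = np.argsort(-row_tier_b, kind="stable")
result = df.iloc[order]
print(result)
```

Because the sort key is computed per row and the sort is stable, column A never takes part in the ordering but stays in the result.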
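New reply:
Based on the shared source data file, here is the groupby and sort. Let me know how you would like the values sorted. I assumed you want to sort by Index and then by Total.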
df = df.groupby(['Index','Type',])\
.agg({'Total':'mean'})\
.sort_values(['Index','Total'])
The output is as follows:
Total
Index Type
0 Tier 23
Leg Armor 61
Chest Armor 62
Gauntlets 62
Helmet 66
Set 251
1 Tier 23
Leg Armor 57
Chest Armor 62
Gauntlets 62
Helmet 66
Set 247
2 Tier 22
Gauntlets 60
Leg Armor 61
Chest Armor 62
Helmet 66
Set 249
3 Tier 23
Leg Armor 57
Gauntlets 60
Chest Armor 62
Helmet 66
Set 245
4 Tier 22
Leg Armor 61
Chest Armor 62
Gauntlets 63
Helmet 66
Set 252
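One detail worth spelling out: after the groupby, "Index" is an index level while "Total" is a column, and `sort_values` accepts both kinds of names in the same list. The sketch below shows this on a few made-up rows standing in for the shared data file, which is not reproduced here; the values and the name `grouped` are assumptions for illustration only.

```python
import pandas as pd

# Made-up rows standing in for the shared source file (an assumption).
df = pd.DataFrame(
    {
        "Index": [0, 0, 0, 1, 1, 1],
        "Type": ["Set", "Tier", "Helmet", "Set", "Tier", "Helmet"],
        "Total": [251, 23, 66, 247, 23, 66],
    }
)

# After the groupby, 'Index' and 'Type' become levels of a MultiIndex.
grouped = df.groupby(["Index", "Type"]).agg({"Total": "mean"})

# 'Index' refers to an index level, 'Total' to a column.
result = grouped.sort_values(["Index", "Total"])
print(result)
```

Within each Index value the rows come out ordered by Total, so the ordering is per group rather than across the whole frame.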
Initial response:
I don't have your original data, so I created some data to show you how to sort grouped data. See if this is what you are looking for.
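import pandas as pd
# 'Air' needs four values like the other columns; they are taken from the output shown after this code
df = pd.DataFrame({'Animal': ['Falcon','Falcon','Parrot','Parrot'],
                   'Type': ['Wild','Captive','Wild','Captive'],
                   'Air': ['Good','Bad','Bad','Good'],
                   'Max Speed': [380.,370.,24.,26.]})
# Group by all three keys so that they form the index shown in the output
df = df.groupby(['Animal','Type','Air'])\
.agg({'Max Speed':'mean'})\
.sort_values('Max Speed')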
print(df)
The output is as follows:
                     Max Speed
Animal Type    Air
Parrot Wild    Bad        24.0
       Captive Good       26.0
Falcon Captive Bad       370.0
       Wild    Good      380.0
Without the sort command, the output is different.
# Starting again from the original, ungrouped df
df = df.groupby(['Animal','Type','Air'])\
.agg({'Max Speed':'mean'})
This displays as follows. Max Speed is not sorted; instead, the rows follow the group keys, ordered by Animal and then by Type:
                     Max Speed
Animal Type    Air
Falcon Captive Bad       370.0
       Wild    Good      380.0
Parrot Captive Good       26.0
       Wild    Bad        24.0
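To see both orderings side by side, here is a small self-contained sketch. It assumes the four 'Air' values implied by the printed outputs and groups by all three keys; the names `grouped` and `by_speed` are made up for illustration.

```python
import pandas as pd

# Toy data; the 'Air' values are inferred from the printed outputs (an assumption).
df = pd.DataFrame(
    {
        "Animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
        "Type": ["Wild", "Captive", "Wild", "Captive"],
        "Air": ["Good", "Bad", "Bad", "Good"],
        "Max Speed": [380.0, 370.0, 24.0, 26.0],
    }
)

# groupby sorts the group keys by default (sort=True),
# so the rows come out ordered by Animal, then Type, then Air.
grouped = df.groupby(["Animal", "Type", "Air"]).agg({"Max Speed": "mean"})

# sort_values reorders the grouped rows by the aggregated value instead.
by_speed = grouped.sort_values("Max Speed")

print(grouped)
print(by_speed)
```

Keeping the grouped frame under its own name, rather than reassigning `df`, makes it possible to produce either ordering without rebuilding the data.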