Pandas: split a df into multiple dfs based on multiple columns

Problem description

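I have a df that I want to split into multiple dfs based on the values in the "Name" and "Plan" columns. For the df below, I expect the split to produce 6 dfs, with rows 1 and 6 ending up in the same df.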

df:

City    State       Name     Plan   Price
 A        CA     Star Inn     CTS    50
 B        CA      1 Inn       KVG    100
 C        IN     GS Hotel     KHA    25
 D        FL     HJ Resort    2QN    45
 E        AL     PQ Inn       POI    55
 A        CA     Star Inn     CTS    80
 A        CA     Star Inn     MNB    65
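To make the question reproducible, the table above could be built as follows. This is only a sketch: the variable name n_groups is introduced here for illustration, and the count it prints is simply the number of distinct (Name, Plan) pairs, which is how many dfs the split should produce.

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame(
    {
        "City": ["A", "B", "C", "D", "E", "A", "A"],
        "State": ["CA", "CA", "IN", "FL", "AL", "CA", "CA"],
        "Name": ["Star Inn", "1 Inn", "GS Hotel", "HJ Resort",
                 "PQ Inn", "Star Inn", "Star Inn"],
        "Plan": ["CTS", "KVG", "KHA", "2QN", "POI", "CTS", "MNB"],
        "Price": [50, 100, 25, 45, 55, 80, 65],
    }
)

# Number of distinct (Name, Plan) pairs, i.e. the number of dfs expected.
n_groups = df.groupby(["Name", "Plan"]).ngroups
print(n_groups)  # 6
```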

Expected output

df1:

City    State       Name     Plan   Price
 A        CA     Star Inn     CTS    50
 A        CA     Star Inn     CTS    80

df2:

City    State       Name     Plan   Price
 B        CA      1 Inn       KVG    100

And so on, up to df6...

Solution

This example splits the dataframe by Name and Plan and prints each resulting piece:

dataframes = []
for _,d in df.groupby(["Name","Plan"]):
    dataframes.append(d)

# print it:
for d in dataframes:
    print(d)
    print("-" * 80)

Prints:

  City State   Name Plan  Price
1    B    CA  1_Inn  KVG    100
--------------------------------------------------------------------------------
  City State      Name Plan  Price
2    C    IN  GS_Hotel  KHA     25
--------------------------------------------------------------------------------
  City State       Name Plan  Price
3    D    FL  HJ_Resort  2QN     45
--------------------------------------------------------------------------------
  City State    Name Plan  Price
4    E    AL  PQ_Inn  POI     55
--------------------------------------------------------------------------------
  City State      Name Plan  Price
0    A    CA  Star_Inn  CTS     50
5    A    CA  Star_Inn  CTS     80
--------------------------------------------------------------------------------
  City State      Name Plan  Price
6    A    CA  Star_Inn  MNB     65
--------------------------------------------------------------------------------
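Note that the groups above come out in sorted key order, so "1 Inn" is first rather than "Star Inn". If the pieces should follow the order in which each (Name, Plan) pair first appears, as in the expected df1, df2, ... of the question, passing sort=False to groupby is one option. A minimal sketch, assuming the sample data from the question (the names rows, df1 and df2 are only for illustration):

```python
import pandas as pd

# Same sample data as in the question.
rows = [
    ("A", "CA", "Star Inn", "CTS", 50),
    ("B", "CA", "1 Inn", "KVG", 100),
    ("C", "IN", "GS Hotel", "KHA", 25),
    ("D", "FL", "HJ Resort", "2QN", 45),
    ("E", "AL", "PQ Inn", "POI", 55),
    ("A", "CA", "Star Inn", "CTS", 80),
    ("A", "CA", "Star Inn", "MNB", 65),
]
df = pd.DataFrame(rows, columns=["City", "State", "Name", "Plan", "Price"])

# sort=False keeps the groups in order of first appearance
# instead of sorting them by their (Name, Plan) key.
dataframes = [d for _, d in df.groupby(["Name", "Plan"], sort=False)]

df1, df2 = dataframes[0], dataframes[1]
print(df1)  # the two Star Inn / CTS rows (index 0 and 5)
print(df2)  # the single 1 Inn / KVG row
```

Each piece keeps its original row index; call reset_index(drop=True) on a piece if a fresh 0-based index is preferred.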

Calling groupby in Pandas gives you a DataFrameGroupBy object:

grouped = df.groupby(["Name","Plan"])

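When you iterate over it, each step yields a tuple: the first element is the group key (here a tuple of the "Name" and "Plan" values, e.g. ("Star Inn", "CTS")), and the second element is the sub-df holding that group's rows: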

grouped = df.groupby(["Name","Plan"])
for _,split_df in grouped:
    print(split_df)
    print("-----")

This gives you:

  City State   Name Plan  Price
1    B    CA  1 Inn  KVG    100
-----
  City State      Name Plan  Price
2    C    IN  GS Hotel  KHA     25
-----
  City State       Name Plan  Price
3    D    FL  HJ Resort  2QN     45
-----
  City State    Name Plan  Price
4    E    AL  PQ Inn  POI     55
-----
  City State      Name Plan  Price
0    A    CA  Star Inn  CTS     50
5    A    CA  Star Inn  CTS     80
-----
  City State      Name Plan  Price
6    A    CA  Star Inn  MNB     65
-----
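Since the loop above discards the key with _, it may help to see what the key looks like and how it can be used to fetch one piece directly. The sketch below assumes the sample data from the question; by_key, star_cts and same are illustrative names, not part of the original answer.

```python
import pandas as pd

# Same sample data as in the question.
rows = [
    ("A", "CA", "Star Inn", "CTS", 50),
    ("B", "CA", "1 Inn", "KVG", 100),
    ("C", "IN", "GS Hotel", "KHA", 25),
    ("D", "FL", "HJ Resort", "2QN", 45),
    ("E", "AL", "PQ Inn", "POI", 55),
    ("A", "CA", "Star Inn", "CTS", 80),
    ("A", "CA", "Star Inn", "MNB", 65),
]
df = pd.DataFrame(rows, columns=["City", "State", "Name", "Plan", "Price"])

grouped = df.groupby(["Name", "Plan"])

# Each iteration yields (key, sub_df); with two grouping columns
# the key is a (Name, Plan) tuple.
by_key = {key: sub_df for key, sub_df in grouped}
print(list(by_key))

# Look up one piece by its key in the dict ...
star_cts = by_key[("Star Inn", "CTS")]

# ... or ask the groupby object directly, without building the dict.
same = grouped.get_group(("Star Inn", "CTS"))
print(same)
```

A dict keyed by (Name, Plan) tends to be easier to work with than numbered variables such as df1 ... df6, because the number of groups does not need to be known in advance.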