Slicing a pandas hierarchical index by the year level

Problem

I have a dataset, data1. I am trying to slice it by index based on a year read from input(), where data1 is:

                  stats
gender  year    
        
women   2003    cellphone use
        2007    height
        2007    cigarette use
        2008    weight
        2009    cellphone use
        2015    cigarette use
        2018    weight
        2020    height

This is my attempt at index slicing:

 isvalid_yr = False
 while not isvalid_yr:
     year_input = int(input("Input the year you want to compare data from: "))
     if year_input in data1.index.get_level_values('year'):
         idx = pd.IndexSlice
         isvalid_yr = True
         new_data1 = data1.loc(axis = 0)[idx[year_input:year_input],idx[:]]
     else:
         isvalid_yr = False
     try:
         if isvalid_yr == True:
             pass
         else:
             raise ValueError("Year not in data!")
     except ValueError as err:
         print("Year not in data!")

It gives me this output, which is not what I want:

Empty DataFrame
Columns: [stats]
Index: []

The final output I want is as follows:

Input the year you want to compare data from: 2007

Resulting new_data1:

                  stats
gender  year    

women   
        2007    height
        2007    cigarette use

Solution

Use xs to take a cross-section of the DataFrame at the 'year' level:

res = df.xs(2007,axis=0,level='year',drop_level=False)

res

                     stats
gender year               
women  2007         height
       2007  cigarette use
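The IndexSlice attempt in the question can also be made to work. A likely cause of the empty result is that the year slicer sits in the first position, which corresponds to the gender level, while year is the second level. The sketch below rebuilds the sample frame from the table in the question (the variable names by_slice and by_xs are illustrative only) and puts the year in the second slicer position:

```python
import pandas as pd

# Sample frame rebuilt from the table shown in the question.
df = pd.DataFrame({
    "gender": ["women"] * 8,
    "year": [2003, 2007, 2007, 2008, 2009, 2015, 2018, 2020],
    "stats": ["cellphone use", "height", "cigarette use", "weight",
              "cellphone use", "cigarette use", "weight", "height"],
}).set_index(["gender", "year"])

idx = pd.IndexSlice
year = 2007

# Slicers follow the level order: every gender first, then the requested year.
by_slice = df.loc[idx[:, year], :]

# The same selection via xs, keeping both index levels.
by_xs = df.xs(year, level="year", drop_level=False)

print(by_slice)
print(by_xs)
```

Both selections keep the gender and year levels. xs is the shorter spelling when only one level value is needed.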

With user input:

while True:
    try:
        year_input = int(
            input("Input the year you want to compare data from: ")
        )
        res = df.xs(year_input, level='year', drop_level=False)
        break
    except KeyError:
        print("Year not in data!")
    except ValueError:
        print("Please enter a valid year")

The df used above:

import pandas as pd

df = pd.DataFrame({
    'gender': ['women'] * 8, 'year': [2003, 2007, 2007, 2008, 2009, 2015, 2018, 2020],
    'stats': ['cellphone use', 'height', 'cigarette use', 'weight', 'cellphone use', 'cigarette use', 'weight', 'height']
}).set_index(['gender', 'year'])

df

                     stats
gender year               
women  2003  cellphone use
       2007         height
       2007  cigarette use
       2008         weight
       2009  cellphone use
       2015  cigarette use
       2018         weight
       2020         height