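Error when using a function with a pandas Series in Python

Problem description

As input, I have an array of the numbers 1 through 12. As output, I want an array that gives the season of the year for each number.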

import pandas as pd

month = pd.Series([i for i in range(1,13)])
def mkseason(n):
    if 3<=n<=5: season = 'spring'
    elif 6<=n<=8: season = 'summer'
    elif 9<=n<=11: season = 'fall'
    elif n<=2 or n==12: season = 'winter'
    else: season = 'unknown'
    return season

As a result, I want to get the array:

['winter','winter','spring','summer','fall','winter']

When I try something like this:

mkseason(month)

I get an error. How can I solve this? I need to use pandas without loops.

Solution
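The question does not quote the error, but calling mkseason(month) on a whole Series raises a ValueError about an ambiguous truth value. The sketch below reproduces it under the assumption that month is the Series from the question; the names mask and error_message are only illustrative.

```python
import pandas as pd

month = pd.Series(range(1, 13))

# Comparing a Series with a scalar works element-wise and yields a boolean Series.
# Combining two such masks needs `&`, with each comparison in parentheses.
mask = (3 <= month) & (month <= 5)
print(mask.tolist())

# A chained comparison such as `3 <= month <= 5` is evaluated as
# `(3 <= month) and (month <= 5)`, and `and` needs one single True/False value,
# which a multi-element Series cannot provide.
try:
    3 <= month <= 5
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

This is why the if/elif chain in mkseason works for a single number but not for a Series: the function has to be applied element by element, or the logic has to be rewritten with vectorized operations.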

Use modulo 12 and integer division by 3 to build the groups, then map the groups to season names with a dictionary:

# Store the result under a new name so that `month` keeps the original numbers
seasons = (((month % 12) // 3).map({0:'winter',1:'spring',2:'summer',3:'fall'})
                              .fillna('unknown'))
print(seasons)
0     winter
1     winter
2     spring
3     spring
4     spring
5     summer
6     summer
7     summer
8       fall
9       fall
10      fall
11    winter
dtype: object

Details:

print ((month % 12) // 3)
0     0
1     0
2     1
3     1
4     1
5     2
6     2
7     2
8     3
9     3
10    3
11    0
dtype: int64
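The grouping arithmetic shown above can also be checked in plain Python, without pandas. This is only a sketch; SEASON_BY_GROUP and season_by_arithmetic are illustrative names that do not appear in the answer.

```python
# Plain-Python check of the grouping arithmetic used in the pandas expression.
SEASON_BY_GROUP = {0: 'winter', 1: 'spring', 2: 'summer', 3: 'fall'}

def season_by_arithmetic(month):
    # 12 % 12 == 0, so December lands in group 0 together with January and February.
    return SEASON_BY_GROUP[(month % 12) // 3]

# Group number for every month, followed by the resulting season names.
groups = {m: (m % 12) // 3 for m in range(1, 13)}
seasons = [season_by_arithmetic(m) for m in range(1, 13)]
print(groups)
print(seasons)
```

One caveat: the modulo wraps around, so a value outside 1 to 12 is not necessarily reported as unknown. For example, (13 % 12) // 3 is 0, which maps to 'winter'; fillna('unknown') only catches values that the dictionary does not cover.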

Performance:

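# 130k rows (13 values * 10000)
# 13 added to test the 'unknown' case
# (note: (13 % 12) // 3 == 0, so the vectorized version labels 13 as 'winter')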
months = pd.Series([i for i in range(1,14)] * 10000)


In [199]: %timeit [season_for_month(m) for m in months]
58.3 ms ± 5.26 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [200]: %timeit (((months % 12) // 3).map({0:'winter',1:'spring',2:'summer',3:'fall'}).fillna('unknown'))
14.5 ms ± 286 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

If you want to use pandas, you can do the following:

import pandas as pd

def season_for_month(month: int) -> str:
    """Returns the season as a string for a given month index.

    Args:
        month: The month index.
    Returns:
        The season for the given month index
    """
    if 3 <= month <= 5:
        return 'spring'
    elif 6 <= month <= 8:
        return 'summer'
    elif 9 <= month <= 11:
        return 'fall'
    elif month <= 2 or month == 12:
        return 'winter'
    else: 
        return 'unknown'

def main():
    months = pd.Series(range(1,13))
    seasons = [season_for_month(m) for m in months]
    print(f'months = {months}')
    print(f'seasons = {seasons}')

if __name__ == '__main__':
    main()

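To get the seasons as a list of strings, we use a list comprehension, i.e. seasons = [season_for_month(m) for m in months], together with our function season_for_month, which takes a month as an integer and returns the corresponding season.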
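If the result should stay a pandas Series and the code should contain no explicit loop, one option is to pass the same function to Series.map. This is a sketch rather than part of the answer above, and it still calls the Python function once per element, so it should not be expected to run faster than the list comprehension.

```python
import pandas as pd

def season_for_month(month: int) -> str:
    """Return the season name for a month index (1-12), else 'unknown'."""
    if 3 <= month <= 5:
        return 'spring'
    elif 6 <= month <= 8:
        return 'summer'
    elif 9 <= month <= 11:
        return 'fall'
    elif month <= 2 or month == 12:
        return 'winter'
    else:
        return 'unknown'

# 13 is included only to exercise the 'unknown' branch.
months = pd.Series(range(1, 14))

# Series.map applies the function to each element and returns a new Series.
seasons = months.map(season_for_month)
print(seasons.tolist())
```

Unlike the modulo-based expression, this version keeps the original if/elif logic, so 13 falls through to 'unknown'.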