Heatmap with a pandas DatetimeIndex on both axes

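Problem description

I would like to make a heatmap from a pandas DataFrame (or Series) with a DatetimeIndex, so that the hours are on the x-axis and the days are on the y-axis, with both sets of tick labels shown in DatetimeIndex style.

If I do the following: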

    import pandas as pd
    import numpy as np
    import seaborn as sns

    df = pd.DataFrame(np.random.randint(10,size=4*24*200))
    df.index = pd.date_range(start='2019-02-01 11:30:00',periods=200*24*4,freq='15min')

    df['minute'] = df.index.hour*60 + df.index.minute
    df['dayofyear'] = df.index.month + df.index.dayofyear

    df = df.pivot(index='dayofyear',columns='minute',values=df.columns[0])
    sns.heatmap(df)

the index obviously loses its datetime format: the tick labels on both axes are plain integers.

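What I want is a heatmap whose tick labels are formatted as dates and times. I managed to produce something like that with a convoluted, non-generalizable function, which evidently does not work properly.

Does anyone know a neat way to create this kind of heatmap in Python?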


Edit:

The function I created:

    def plot_heatmap(df_in,plot_column=0,figsize=(20,12),vmin=None,vmax=None,cmap='jet',xlabel='hour (UTC)',ylabel='day',rotation=0,freq='5s'):
        '''
        Plots heatmap with date labels

        df_in:    pandas DataFrame or pandas Series
        plot_column:  column to plot if DataFrame has multiple columns

        ...

        '''

        # convert to DataFrame in case a Series is passed:
        try:
            df_in = df_in.to_frame()
        except AttributeError:
            pass
        
        # make a copy so the input is not overwritten (in case input is an object attribute)
        df = df_in.copy()

        # pad missing dates:
        idx = pd.date_range(df_in.index[0],df_in.index[-1],freq=freq)
        df = df.reindex(idx,fill_value=np.nan)


        df['hour'] = df.index.hour*3600 + df.index.minute*60 + df.index.second
        df['dayofyear'] = df.index.month + df.index.dayofyear

        # Create mesh for heatmap plotting:
        pivot = df.pivot(index='dayofyear',columns='hour',values=df.columns[plot_column])

        # plot
        plt.figure(figsize=figsize)
        sns.heatmap(pivot,cmap=cmap)

        # set xticks
        plt.xticks(np.linspace(0,pivot.shape[1],25),labels=range(25))
        plt.xlabel(xlabel)

        # set yticks
        ylabels = []
        ypositions = []

        day0 = df['dayofyear'].unique().min()
        for day in df['dayofyear'].unique():
            day_delta = day-day0
            # create pandas Timestamp
            temp_tick = df.index[0] + pd.Timedelta(days=int(day_delta))
            # check whether tick shall be shown or not
            if temp_tick.day==1 or temp_tick.day==15:
                temp_tick_nice = '%s-%s-%s' %(temp_tick.year,temp_tick.month,temp_tick.day)
                ylabels.append(temp_tick_nice)
                ypositions.append(day_delta)


        plt.yticks(ticks=ypositions,labels=ylabels,rotation=0)
        plt.ylabel(ylabel)

Solution

The date format is lost because you did this:

    df['dayofyear'] = df.index.month + df.index.dayofyear

Both operands here are integer-valued, so df['dayofyear'] is also of integer type.

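Instead, do the following:

    df['dayofyear'] = df.index.date

The rows of the pivoted table are then keyed by datetime.date objects, so the y-axis tick labels of the heatmap show dates.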
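The difference between the two row keys can be checked without plotting anything. The following is a minimal sketch on a small synthetic frame; the column names `value`, `day_int`, and `day` are illustrative assumptions and do not come from the question.

```python
import numpy as np
import pandas as pd

# Small synthetic frame: 3 days of 15-minute samples (illustrative data only).
idx = pd.date_range(start="2019-02-01 00:00:00", periods=3 * 24 * 4, freq="15min")
df = pd.DataFrame({"value": np.arange(len(idx))}, index=idx)

df["minute"] = df.index.hour * 60 + df.index.minute

# Integer row key, as in the question: the pivot index holds plain integers.
df["day_int"] = df.index.month + df.index.dayofyear
pivot_int = df.pivot(index="day_int", columns="minute", values="value")

# Date row key, as suggested above: the pivot index holds datetime.date objects.
df["day"] = df.index.date
pivot_date = df.pivot(index="day", columns="minute", values="value")

print(pivot_int.index.tolist())   # [34, 35, 36]
print(pivot_date.index.tolist())  # three datetime.date objects, 2019-02-01 to 2019-02-03
```

Since seaborn takes its tick labels from the index and columns of the frame it is given, the second pivot is the one that yields date labels on the y-axis.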


The best solution I have found so far is the following, for the case where the frequency of the DatetimeIndex is ...

    import pandas as pd
    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt  # needed for plt.yticks below

    freq = '30s'

    df = pd.DataFrame(np.random.randint(10,size=4*24*200*20))
    df.index = pd.date_range(start='2019-02-01 11:30:00',periods=200*24*4*20,freq=freq)

    # string time-of-day as column key, calendar date as row key
    df['hour'] = df.index.strftime('%H:%M:%S')
    df['dayofyear'] = df.index.date

    df = df.pivot(index='dayofyear',columns='hour',values=df.columns[0])
    # shorten the labels that seaborn will display
    df.columns = pd.DatetimeIndex(df.columns).strftime('%H:%M')
    df.index = pd.DatetimeIndex(df.index).strftime('%m/%Y')

    # label every 2 hours on the x-axis and every 30th row on the y-axis
    xticks_spacing = int(pd.Timedelta('2h')/pd.Timedelta(freq))
    ax = sns.heatmap(df,xticklabels=xticks_spacing,yticklabels=30)
    plt.yticks(rotation=0)

This produces a heatmap with an HH:MM label every two hours on the x-axis and a month/year label every 30 rows on the y-axis.
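The x tick spacing in the snippet above is simply the number of samples that fit into the desired label interval. A minimal sketch of that arithmetic follows; the helper name `columns_per_label` is hypothetical and only serves the illustration.

```python
import pandas as pd

def columns_per_label(label_every: str, freq: str) -> int:
    """Number of heatmap columns between two consecutive x tick labels."""
    # Dividing two Timedeltas gives a plain number (here: samples per interval).
    return int(pd.Timedelta(label_every) / pd.Timedelta(freq))

spacing_30s = columns_per_label("2h", "30s")
spacing_15min = columns_per_label("2h", "15min")
print(spacing_30s, spacing_15min)  # 240 8
```

Passing such an integer as `xticklabels` makes seaborn show only every n-th column name, so the labels stay two hours apart whatever the sampling frequency is.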

The only drawback is that with this method the month tick positions cannot be defined well or placed precisely...
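One possible way around this limitation, offered as a sketch under assumptions rather than as part of the original answer, is to compute the tick positions from the pivot index and set them on the axes explicitly instead of relying on `yticklabels=30`. The helper names `month_start_ticks` and `plot_with_month_ticks` are hypothetical, the data is synthetic, and the sketch assumes one row per calendar day with no missing days.

```python
import numpy as np
import pandas as pd

def month_start_ticks(dates, fmt="%Y-%m-%d"):
    """Return (positions, labels) for the rows that fall on the first of a month.

    `dates` is the pivot index (one entry per day, in plotted order).
    Positions are row centres: the heatmap draws row i between y=i and y=i+1.
    """
    dates = pd.DatetimeIndex(dates)
    rows = np.flatnonzero(dates.day == 1)
    return rows + 0.5, dates[rows].strftime(fmt).tolist()

def plot_with_month_ticks(pivot, positions, labels):
    """Draw the heatmap and place y ticks exactly on the month-start rows."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(pivot, xticklabels=8, yticklabels=False)
    ax.set_yticks(positions)
    ax.set_yticklabels(labels, rotation=0)
    plt.show()
    return ax

# Illustrative data: 200 days of 15-minute samples, pivoted by date and time of day.
idx = pd.date_range("2019-02-01", periods=200 * 24 * 4, freq="15min")
df = pd.DataFrame({"value": np.random.randint(10, size=len(idx))}, index=idx)
df["time"] = df.index.strftime("%H:%M")
df["day"] = df.index.date
pivot = df.pivot(index="day", columns="time", values="value")

positions, labels = month_start_ticks(pivot.index)
print(labels)
```

Calling `plot_with_month_ticks(pivot, positions, labels)` then draws the heatmap with one y tick per month start. The same idea extends to other rules, such as the 1st and 15th of each month used in the question's function, by changing the condition on `dates.day`.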