fillna() not working after .reset_index() used to create a DataFrame

Problem description

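I have 3 DataFrames containing call, message, and internet data for a set of users. I used groupby to find the number of calls (or messages, or GB) each user used per month, then used .reset_index() to turn the MultiIndex results back into regular DataFrames. On further analysis, I noticed NaN values for some user ids, because there were months in which some active users made no calls, sent no messages, or used no data. To fix this I tried .fillna(), but it has no effect: when I pull up a specific user_id that I know has NaN values for total_calls, it prints an empty DataFrame.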

I have tried:

calls_mins_per_month.fillna({'duration':0},inplace=True)
calls_mins_per_month['duration'] = calls_mins_per_month['duration'].fillna(0)
calls_mins_per_month['duration'].fillna(0,inplace=True)

Here is the code that builds my per-user, per-month calls DataFrame:

#For each user,find the number of calls made and minutes used per month:
calls_mins_per_month = megaline_calls.groupby(['user_id',"call_month"]).agg({"call_id": len,"duration": "sum"})
calls_mins_per_month.rename(columns={'call_id':'total_calls'},inplace=True)
calls_mins_per_month = calls_mins_per_month.reset_index()
#print(calls_mins_per_month['duration'].isna().sum())
calls_mins_per_month.fillna({'duration':0},inplace=True)

Can anyone point out what I'm doing wrong?

Solution

There are a few things you can try:

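The value may actually be a blank space, which fillna() does not treat as NaN; in that case use df.replace(' ', ''). Alternatively, check whether the apparent NaN values are really the string 'Nan'; if they are, you can run df.replace('Nan', ''). Note that fillna() only acts on real missing values, so replacing with np.nan instead of '' is what lets a later fillna(0) take effect.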
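A minimal sketch of that check, assuming the column really does hold placeholder strings. The sample data is made up for illustration; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'duration' holds placeholder strings instead of real NaN.
df = pd.DataFrame({
    "user_id": [1000, 1000, 1001],
    "call_month": [1, 2, 1],
    "duration": ["12.5", " ", "Nan"],
})

# No real missing values yet, so fillna() would have nothing to act on.
print(df["duration"].isna().sum())

# Turn the placeholders into real NaN, convert to numbers, then fill with 0.
cleaned = df["duration"].replace([" ", "Nan"], np.nan)
df["duration"] = pd.to_numeric(cleaned).fillna(0)
print(df)
```

If isna().sum() already reports the missing values you expect, the column does not contain placeholder strings and this is not the cause.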

Hope this helps you solve it.
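One more thing worth checking, since the lookup prints an empty DataFrame: groupby only produces rows for the (user_id, call_month) pairs that occur in the data, so a month with no calls may be missing entirely rather than present with NaN. This is a guess about the cause, not something the question confirms. A rough sketch with made-up data, where full_index is a name introduced here for illustration:

```python
import pandas as pd

# Hypothetical call log: user 1001 made no calls in month 2.
megaline_calls = pd.DataFrame({
    "user_id": [1000, 1000, 1001],
    "call_month": [1, 2, 1],
    "call_id": ["a", "b", "c"],
    "duration": [5.0, 3.0, 7.0],
})

calls_mins_per_month = megaline_calls.groupby(["user_id", "call_month"]).agg(
    total_calls=("call_id", "count"), duration=("duration", "sum")
)

# Build every user/month combination, then reindex so that the absent
# combinations become explicit rows filled with 0.
full_index = pd.MultiIndex.from_product(
    [megaline_calls["user_id"].unique(), megaline_calls["call_month"].unique()],
    names=["user_id", "call_month"],
)
calls_mins_per_month = (
    calls_mins_per_month.reindex(full_index, fill_value=0).reset_index()
)
print(calls_mins_per_month)
```

In this sketch the reindex happens before reset_index(), while user_id and call_month are still the index, which is what allows the missing pairs to be added.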
