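Problem description
I have an Excel file of about 300,000 rows with UID / Email / Filename as columns. I generated this file with pandas, and it contains all the duplicated emails.
The file names, emails, and UIDs all contain letters, digits, and special characters, so I cannot plot a chart from this data.
My current code is: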
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_excel('C:/Users/duplicate.xlsx')
email = df["email"]
UUID = df["UUID"]
colors = ["#1f77b4","#ff7f0e","#2ca02c","#d62728","#8c564b"]
explode = (0.1,0)
plt.pie(email,labels=UUID,explode=explode,colors=colors,autopct='%1.1f%%',shadow=True,startangle=140)
plt.title("Duplicate")
plt.show()
ValueError: could not convert string to float: '[email protected]'
I tried:
df['email'].astype(float)
df['UUID'].astype(float)
but that did not work.
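Solution
A pie chart needs a number for each wedge to represent its share of the whole. An email address cannot be converted to a float, so it cannot serve as the wedge size; the email addresses or UUIDs can only be used as labels.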
import matplotlib.pyplot as plt
import pandas as pd
# df = pd.read_excel('C:/Users/duplicate.xlsx')
email = ['[email protected]','[email protected]','[email protected]','[email protected]','[email protected]']
sizes = [15,25,36,50,41]
df = pd.DataFrame({'email':email,'sizes':sizes})
email = df["email"]
# UUID = df["UUID"]
colors = ["#1f77b4","#ff7f0e","#2ca02c","#d62728","#8c564b"]
explode = (0.1,0,0,0,0)  # one offset per wedge; must have the same length as sizes
plt.pie(sizes,labels=email,explode=explode,colors=colors,autopct='%1.1f%%',shadow=True,startangle=140)
plt.title("Duplicate")
plt.show()
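The sizes above are hard-coded. Since the file holds duplicated emails, one natural number to plot is how many times each address occurs. The following is a minimal sketch of that idea, not a definitive solution: the sample addresses and UUIDs are made up to stand in for the real Excel file, and `top_n` is a hypothetical cutoff added here because a pie with thousands of wedges would be unreadable.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample standing in for pd.read_excel('C:/Users/duplicate.xlsx');
# these addresses and UUIDs are placeholders, not real data.
df = pd.DataFrame({
    "email": ["a@example.com", "b@example.com", "a@example.com",
              "c@example.com", "a@example.com", "b@example.com"],
    "UUID": ["u1", "u2", "u3", "u4", "u5", "u6"],
})

# Count how many rows share each email address (most frequent first).
counts = df["email"].value_counts()

# Keep only the most frequent addresses and fold the rest into one wedge.
top_n = 2  # hypothetical cutoff; a larger value may suit real data
top = counts.head(top_n)
other = counts.iloc[top_n:].sum()
if other > 0:
    top = pd.concat([top, pd.Series({"other": other})])

# The counts are numeric, so they can be used as wedge sizes;
# the email addresses serve only as labels.
fig, ax = plt.subplots()
ax.pie(top.to_numpy(), labels=list(top.index), autopct="%1.1f%%", startangle=140)
ax.set_title("Duplicate")
# plt.show()  # uncomment to display the chart
```

With the real data, replace the sample DataFrame with the `pd.read_excel` call and adjust the column name if it differs from `email`.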