在python中使用plotly绘制具有百分比差异的漏斗图

问题描述

我的原始数据框看起来与下面的相似:

| Sales_entity | country | id |
|--------------|---------|----|
| sales B      | US      | 1  |
| sales C      | US      | 1  |
| sales C      | US      | 15 |
| sales B      | US      | 20 |
| sales D      | US      | 21 |
| sales C      | US      | 2  |
| sales D      | US      | 2  |
| sales D      | US      | 12 |
| sales D      | US      | 50 |
| sales A      | US      | 25 |
| sales A      | Canada  | 3  |
| sales A      | Canada  | 3  |
| sales A      | Canada  | 9  |
| sales A      | Canada  | 11 |
| sales C      | Canada  | 4  |
| sales C      | Canada  | 4  |
| sales B      | Canada  | 5  |
| sales D      | Canada  | 5  |
| sales B      | Canada  | 6  |
| sales B      | Canada  | 6  |
| sales B      | Canada  | 8  |
| sales C      | Canada  | 7  |
| sales A      | Canada  | 14 |

这是输入数据框:

df= pd.DataFrame({'id' : [1]*2+[15]+[20]+[21] +[2]*2+[12]+[50]+[25] + [3]*2+[9]+[11] + [4]*2 +[5]*2+[6]*2+[8]*1+[7]+[14],'Sales Entity' : ['sales B','sales C','sales B','sales D','sales A','sales A'],'country' : ['US']*10 + ['Canada']*13})

我想按国家和sales_entity分组,并分别计算ID。为此,我使用了以下代码 df1= df.groupby(['country','Sales Entity'])['id'].nunique().reset_index(name='count') 我的这段代码输出如下:

| country | sales_entity | id |
|---------|--------------|----|
| Canada  | sales A      | 4  |
| Canada  | sales B      | 3  |
| Canada  | sales C      | 2  |
| Canada  | sales D      | 1  |
| US      | sales A      | 1  |
| US      | sales B      | 2  |
| US      | sales C      | 3  |
| US      | sales D      | 4  |

从此输出中,我想分别为加拿大和美国生成一个漏斗图,以便该漏斗显示每个销售实体的价值以及增长/下降百分比。例如,如果我们看加拿大,则该渠道应以“销售额A”开头,其值为4,总百分比为100%。当我们将渠道下移至“销售额B”时,它应显示值3和百分比变化等。同样,对于美国,渠道应以“销售D”开头,然后是“销售C”,“销售B”和“销售A”,以显示其价值和百分比变化。在这种情况下,如何在Python中使用plotly创建漏斗图,而无需手动输入表格中的值?

解决方法

您可以从这里https://plotly.com/python/funnel-charts/获取参考。 对于显示百分比变化,您可以修改文本信息并从中选择任何内容 percent initialpercent previouspercent totalvaluetextinfo Docs

如果需要其他一些列,可以将其添加到数据中并使用 texttemplate Example

import plotly.graph_objects as go
from plotly.subplots import make_subplots

df= pd.DataFrame({'id' : [1]*2+[15]+[20]+[21] +[2]*2+[12]+[50]+[25] + [3]*2+[9]+[11] + [4]*2 +[5]*2+[6]*2+[8]*1+[7]+[14],'Sales Entity' : ['sales B','sales C','sales B','sales D','sales A','sales A'],'country' : ['US']*10 + ['Canada']*13})

df1= df.groupby(['country','Sales Entity'])['id'].nunique().reset_index(name='count')

USDF = df1[df1['country'] == 'US'].sort_values(['count'],ascending=[False])
CNDF = df1[df1['country'] == 'Canada'].sort_values(['count'],ascending=[False])

fig = make_subplots(rows=1,cols=2,subplot_titles=("US Funnel","Canada Funnel"))


fig.add_trace(
    go.Funnel(
    y = USDF['Sales Entity'],x = USDF['count'],textposition = "inside",textinfo = "value+percent previous",marker = {"color": "#1c1847"}
),row=1,col=1
)

fig.add_trace(
    go.Funnel(
    y = CNDF['Sales Entity'],x = CNDF['count'],marker = {"color": "#ff0000"}
),col=2
)

fig.update_layout(showlegend=False,height=600,width=1000)
fig.show()

输出

Output

,

我尝试了一下,指的是官方参考。 我制作了一个图表,比较了处于同一阶段的两个国家。 @venky同时回答了该问题,但我敢回答,因为措辞不同。感谢您有机会学习此内容。 从图中以图形方式导入graph_objects

canada_number = df1.loc[df1['country'] == 'Canada','count']
canada_stage = df1.loc[df1['country'] == 'Canada','Sales Entity']

us_number = df1.loc[df1['country'] == 'US','count']
us_stage = df1.loc[df1['country'] == 'US','Sales Entity']

fig = go.Figure()

fig.add_trace(go.Funnel(
    name = 'Canada',y = canada_stage,x = canada_number,textinfo = "value+percent initial"))

fig.add_trace(go.Funnel(
    name = 'US',orientation = "h",y = us_stage,x = us_number,textinfo = "value+percent previous"))

fig.show()

enter image description here