Problem description
I am getting this error:
TypeError: Object of type Interval is not JSON serializable
Here is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
from bokeh.io import output_file,show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.models import NumeralTickFormatter
def construct_labels(start,end):
    labels = []
    for index,x in enumerate(start):
        y = end[index]
        labels.append('({},{}]'.format(x,y))
    return labels
values = {'Length': np.random.uniform(0,4,10)}
df = pd.DataFrame(values,columns=['Length'])
bin_step_size = 0.5
# List of bin points.
p_bins = np.arange(0,(df['Length'].max() + bin_step_size),bin_step_size)
# Reduce the tail to create the left side bounds.
p_left_limits = p_bins[:-1].copy()
# Cut the head to create the right side bounds.
p_right_limits = np.delete(p_bins,0)
# Create the bins.
p_range_bins = pd.IntervalIndex.from_arrays(p_left_limits,p_right_limits)
# Create labels.
p_range_labels = construct_labels(p_left_limits,p_right_limits)
p_ranges_binned = pd.cut(
    df['Length'],p_range_bins,labels=p_range_labels,precision=0,include_lowest=True)
out = p_ranges_binned
counts = out.value_counts(sort=False)
total_element_count = len(df.index)
foo = pd.DataFrame({'bins': counts.index,'counts': counts})
foo.reset_index(drop=True,inplace=True)
foo['percent'] = foo['counts'].apply(lambda x: x / total_element_count)
foo['percent_full'] = foo['counts'].apply(lambda x: x / total_element_count * 100)
bin_labels = p_range_labels
# Data Container
source = ColumnDataSource(dict(
    bins=foo['bins'],percent=foo['percent'],count=foo['counts'],labels=pd.DataFrame({'labels': bin_labels})
))
p = figure(x_range=bin_labels,plot_height=600,plot_width=1200,title="Range Counts",toolbar_location=None,tools="")
p.vbar(x='labels',top='percent',width=0.9,source=source)
p.yaxis[0].formatter = NumeralTickFormatter(format="0.0%")
p.xaxis.major_label_orientation = math.pi / 2
p.xgrid.grid_line_color = None
p.y_range.start = 0
output_file("bars.html")
show(p)
Solution
The error comes from here:
source = ColumnDataSource(dict(
    bins=foo['bins'],percent=foo['percent'],count=foo['counts'],labels=pd.DataFrame({'labels': bin_labels})
))
The bins column you pass in holds pandas Interval objects, which cannot be JSON serialized.
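To see why the Interval values are the culprit, here is a minimal sketch that uses the standard library json module as a stand-in for Bokeh's own serializer (which is more involved). The helper name is_json_serializable is made up for illustration. It also shows that converting each interval to a string is one possible way to keep the column, assuming you actually need it in the source.

```python
import json

import pandas as pd

# Two intervals like the ones pd.cut produces: (0.0, 0.5] and (0.5, 1.0].
intervals = list(pd.IntervalIndex.from_breaks([0.0, 0.5, 1.0]))


def is_json_serializable(value):
    """Hypothetical helper: True if the standard json module can encode value."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


# pandas Interval objects are rejected by the JSON encoder...
raw_ok = is_json_serializable(intervals)

# ...but their string representations are ordinary strings and encode fine.
as_text = [str(interval) for interval in intervals]
text_ok = is_json_serializable(as_text)

print(raw_ok, text_ok)
```

If the string form is good enough for your purposes, passing a list like as_text in place of the original column should avoid the error without dropping the column entirely.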
Looking at the code, the bins column is not used in the plot, so you can change the source to:
source = ColumnDataSource(dict(
    percent=foo['percent'],labels=bin_labels
))
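Note that I also changed labels to bin_labels, which is a plain list; ColumnDataSource accepts lists as column values. You may want to format these labels further, though, since right now they look like this: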
['(0.0,0.5]','(0.5,1.0]','(1.0,1.5]','(1.5,2.0]','(2.0,2.5]','(2.5,3.0]','(3.0,3.5]','(3.5,4.0]']
Something more readable would probably look better on the axis.
With bins removed from the ColumnDataSource, the bar chart should render.
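As for tidying up the labels, one possible approach is sketched below. The function name pretty_labels, the "to" separator, and the decimals parameter are all assumptions chosen for illustration, not part of the original code; any format works as long as every label stays unique, since the labels are also used as the categorical x_range.

```python
def pretty_labels(left_limits, right_limits, decimals=1):
    """Hypothetical replacement for construct_labels with a friendlier format."""
    return [
        f"{left:.{decimals}f} to {right:.{decimals}f}"
        for left, right in zip(left_limits, right_limits)
    ]


# Example bin edges matching the 0.5 step size used in the question.
labels = pretty_labels([0.0, 0.5, 1.0], [0.5, 1.0, 1.5])
print(labels)  # ['0.0 to 0.5', '0.5 to 1.0', '1.0 to 1.5']
```

The same list would then be passed both as x_range to figure and as the labels column of the ColumnDataSource, so that the bars line up with the axis categories.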