Combining multiple probability distributions into one distribution in Python

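Problem description

I have an experiment that uses sensors, and I have about 5 data files containing data collected from the sensors in the time domain. For simplicity, suppose we focus on a single sensor; I need to obtain the probability distribution for each of the data files. I searched online and managed to find the best-fitting distribution using the following link: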

Fitting empirical distribution to theoretical ones with Scipy (Python)

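In my case, it turned out that the normal distribution fits my data. So I now have multiple distributions and want to combine them all into a single distribution. What I did was average the probability densities point by point: I summed the corresponding density values from the 5 distributions and divided by 5.

The averaging is done with the following code: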

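def average(l):
    """Element-wise mean of several equal-length lists (returns a lazy map)."""
    llen = len(l)
    def divide(x):
        return x / llen
    # zip(*l) groups the i-th values of every list; sum them, then divide.
    return map(divide, map(sum, zip(*l)))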

lt = []  # one list of pdf values per data file
for _ in range(5):
    # read sensor data
    # Obtain the probability distribution using code in the first link
    # Getting list of pdf:
    np_pdf = list(y_axis_pdf)

    lt.append(np_pdf)

Average_list = average(lt)
Average_list = list(Average_list)
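Note that a point-wise average like this is only meaningful when every pdf list is evaluated on the same x grid. A minimal sketch of that idea with NumPy follows; the (mean, std) pairs in `params` are hypothetical placeholders for the five fitted normals, not values from the actual sensor data. The result is an equal-weight mixture of the five distributions.

```python
import numpy as np
from scipy import stats

# Hypothetical (mean, std) pairs standing in for the five fitted normals.
params = [(10.0, 1.0), (10.5, 1.2), (9.8, 0.9), (10.2, 1.1), (10.1, 1.0)]

# Evaluate every pdf on one shared grid so the values line up point by point.
lo = min(m - 5 * s for m, s in params)
hi = max(m + 5 * s for m, s in params)
x = np.linspace(lo, hi, 2001)
pdfs = np.array([stats.norm.pdf(x, loc=m, scale=s) for m, s in params])

# The point-wise mean is an equal-weight mixture of the five densities.
mixture_pdf = pdfs.mean(axis=0)

# Trapezoidal integration: the mixture should still integrate to about 1.
area = float(np.sum((mixture_pdf[1:] + mixture_pdf[:-1]) / 2 * np.diff(x)))
```

An equal-weight mixture describes "a reading drawn from a randomly chosen file", so its spread is at least as wide as the individual distributions; it does not become narrower as more files are added.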

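However, I asked a few people and searched online, and was told that averaging is not the best approach. So what is the correct way to combine several probability distributions into one?

My second question: I searched online and found this article: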

How to Combine Independent Data Sets for the Same Quantity

How can I use the code from the first link with the method in this article?

Edit 1:

Based on @SeverinPappadeux's comment, I edited the code as follows:
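One common way to combine independent estimates of the same quantity is the normalized product of the densities; that this matches the article's method is an assumption here. For normal distributions the product has a closed form: the combined mean is the inverse-variance weighted mean, and the combined variance is the reciprocal of the summed inverse variances. The sketch below uses hypothetical (mean, std) pairs in place of the values fitted with the code from the first link, and checks the closed form against a numerical product on a grid.

```python
import numpy as np
from scipy import stats

# Hypothetical fitted parameters (mean, std) for the five data files.
params = [(10.0, 1.0), (10.5, 1.2), (9.8, 0.9), (10.2, 1.1), (10.1, 1.0)]
means = np.array([m for m, _ in params])
stds = np.array([s for _, s in params])

# Closed form for normals: inverse-variance weights, precisions add up.
weights = 1.0 / stds**2
combined_mean = np.sum(weights * means) / np.sum(weights)
combined_std = np.sqrt(1.0 / np.sum(weights))

# Numerical check: multiply the pdfs on a shared grid, then renormalize.
x = np.linspace(5.0, 15.0, 4001)
product = np.prod([stats.norm.pdf(x, m, s) for m, s in params], axis=0)
dx = x[1] - x[0]
product /= np.sum(product) * dx

closed_form = stats.norm.pdf(x, combined_mean, combined_std)
```

Unlike the point-wise average, this combined distribution is narrower than any single input, which is appropriate only if all files measure the same underlying quantity and are independent.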

# Combining all PDF files into one dataset:
pdf_data = [np_pdf_01,np_pdf_02,np_pdf_03,np_pdf_04,np_pdf_05]
pdf_dataframe_ini = pd.DataFrame(pdf_data)
pdf_dataframe = pd.DataFrame.transpose(pdf_dataframe_ini)

# Creating one PDF from the PDF dataset:
gmm = GMM(n_components=1)
gmm.fit(pdf_dataframe)
x_pdf_data = [x_axis_pdf_01,x_axis_pdf_02,x_axis_pdf_03,x_axis_pdf_04,x_axis_pdf_05]
x_pdf = average(x_pdf_data)
x_pdf = list(x_pdf)
x = np.linspace(np.min(x_pdf),np.max(x_pdf),len(x_pdf)).reshape(len(x_pdf),1)
logprob = gmm.score_samples(x)
pdf = np.exp(logprob)

I keep getting the following error:

logprob = gmm.score_samples(x)
ValueError: Expected the input data X have 10 features,but got 1 features

How can I fix this error and plot the pdf of the combined pdfs?

Source:

How can I plot the probability density function for a fitted Gaussian mixture model under scikit-learn?
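The error arises because the model was fitted on a table with one column per pdf list, so scikit-learn treats each column as a feature and then expects the same number of columns in `score_samples`. A possible fix, sketched below under assumptions, is to fit on the raw sensor readings pooled into a single column rather than on pdf values. The data here is synthetic and the name `files` is hypothetical; `GaussianMixture` is the current scikit-learn class that replaced the older `GMM`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical raw readings standing in for the five sensor data files.
rng = np.random.default_rng(0)
files = [rng.normal(loc=10.0 + 0.1 * i, scale=1.0, size=500) for i in range(5)]

# Pool the raw samples into one column: shape (n_samples, 1), i.e. 1 feature.
samples = np.concatenate(files).reshape(-1, 1)

gmm = GaussianMixture(n_components=1, random_state=0)
gmm.fit(samples)

# The evaluation grid needs the same number of columns as the fit data.
x = np.linspace(samples.min(), samples.max(), 200).reshape(-1, 1)
logprob = gmm.score_samples(x)  # log density at each grid point
pdf = np.exp(logprob)
```

The resulting `pdf` can be plotted against `x[:, 0]`. With `n_components=1` this amounts to fitting a single normal to the pooled readings; a larger `n_components` would instead model the files as a mixture.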

