Why does the TensorFlow Bernoulli distribution always return 0?

Problem description

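I am classifying texts based on word occurrences. One of the steps is to estimate, for a particular text, the probability of each possible class. To do this, I am given a corpus of NSAMPLES texts over a vocabulary of NFEATURES words, with each text labeled with one of NLABELS class labels. From this, I build a binary occurrence matrix where entry (sample, feature) is 1 iff text "sample" contains the word encoded by "feature".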
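As a minimal sketch of what such an occurrence matrix looks like, the following builds one for a tiny corpus. The names `texts`, `vocabulary`, and `word_index` are hypothetical and used only for illustration; the real corpus and tokenization are not shown in the question.

```python
import numpy as np

# Hypothetical toy corpus and vocabulary, for illustration only.
texts = ["the cat sat", "the dog barked", "a cat and a dog"]
vocabulary = ["cat", "dog", "sat", "barked"]

# Map each vocabulary word to its column index.
word_index = {word: j for j, word in enumerate(vocabulary)}

# Entry (i, j) is 1 iff text i contains vocabulary word j.
binary_data = np.zeros((len(texts), len(vocabulary)), dtype="int32")
for i, text in enumerate(texts):
    for word in set(text.split()):
        j = word_index.get(word)
        if j is not None:  # words outside the vocabulary are ignored
            binary_data[i, j] = 1

print(binary_data)
```

Rows correspond to texts and columns to vocabulary words, matching the (sample, feature) layout described above.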

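From the occurrence matrix, we can construct a matrix of conditional probabilities and then smooth it so that no probability is exactly 0.0 or 1.0, using the following code (copied from a Coursera notebook):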

import numpy as np

def laplace_smoothing(labels, binary_data, n_classes):
    # Compute the parameter estimates (adjusted fraction of documents in class that contain word)
    n_words = binary_data.shape[1]
    alpha = 1  # parameter for Laplace smoothing
    theta = np.zeros([n_classes, n_words])  # stores parameter values - prob. word given class
    for c_k in range(n_classes):  # 0, 1, ..., n_classes - 1
        class_mask = (labels == c_k)
        N = class_mask.sum()  # number of articles in class
        theta[c_k, :] = (binary_data[class_mask, :].sum(axis=0) + alpha) / (N + alpha * 2)
    return theta
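To make the effect of the smoothing concrete, here is a small worked example. It repeats the function so the block runs on its own; `toy_data` and `toy_labels` are hypothetical inputs chosen so that the unsmoothed estimates would include exact 0.0 and 1.0 values.

```python
import numpy as np

def laplace_smoothing(labels, binary_data, n_classes):
    # Same estimate as in the question: (count + alpha) / (N + 2 * alpha)
    n_words = binary_data.shape[1]
    alpha = 1
    theta = np.zeros([n_classes, n_words])
    for c_k in range(n_classes):
        class_mask = (labels == c_k)
        N = class_mask.sum()
        theta[c_k, :] = (binary_data[class_mask, :].sum(axis=0) + alpha) / (N + alpha * 2)
    return theta

# Hypothetical toy data: 4 documents, 3 words, 2 classes.
toy_data = np.array([[1, 0, 0],
                     [1, 0, 1],
                     [0, 0, 1],
                     [0, 0, 1]])
toy_labels = np.array([0, 0, 1, 1])

theta = laplace_smoothing(toy_labels, toy_data, 2)
print(theta)
# [[0.75 0.25 0.5 ]
#  [0.25 0.25 0.75]]
```

In class 0, word 0 appears in both documents and word 1 in neither, so the raw fractions would be 1.0 and 0.0; with alpha = 1 they become 0.75 and 0.25.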

To reproduce the problem, here is code that mocks the input and calls the functions above:

import numpy as np
import tensorflow_probability as tfp
tfd = tfp.distributions

NSAMPLES = 2000    # Size of corpus
NFEATURES = 10000  # Number of words in corpus
NLABELS = 10       # Number of classes
ONE_PROB = 0.02    # Probability that binary_datum will be 1

def mock_binary_data(nsamples, nfeatures, one_prob):
    # Each entry is 1 with probability one_prob, otherwise 0
    binary_data = (np.random.uniform(0, 1, (nsamples, nfeatures)) < one_prob).astype('int32')
    return binary_data

def mock_labels(nsamples, nlabels):
    # Random class label in [0, nlabels) for each sample
    labels = np.random.randint(0, nlabels, nsamples)
    return labels

binary_data = mock_binary_data(NSAMPLES, NFEATURES, ONE_PROB)
labels = mock_labels(NSAMPLES, NLABELS)
smoothed_data = laplace_smoothing(labels, binary_data, NLABELS)

# One Bernoulli per (class, word); the word axis is treated as a single event
bernoulli = tfd.Independent(tfd.Bernoulli(probs=smoothed_data), reinterpreted_batch_ndims=1)

test_random_data = mock_binary_data(1, NFEATURES, ONE_PROB)[0]
bernoulli.prob(test_random_data)

When I run this, I get:

<tf.Tensor: shape=(10,), dtype=float32, numpy=array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>

That is, all the probabilities are zero. Some step here must be incorrect. Could you help me find it?
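One likely cause, offered as a sketch rather than a confirmed diagnosis: with `reinterpreted_batch_ndims = 1`, the probability of a sample is the product of NFEATURES per-word Bernoulli probabilities, and a product of 10000 factors below 1 is too small to represent in float32. The NumPy-only illustration below uses a hypothetical constant row `theta_row` in place of one row of the smoothed matrix, so it shows the numerical effect without depending on TensorFlow.

```python
import numpy as np

rng = np.random.default_rng(0)
NFEATURES = 10000
ONE_PROB = 0.02

# Hypothetical stand-ins: one row of smoothed probabilities and one test sample.
theta_row = np.full(NFEATURES, ONE_PROB)
x = (rng.uniform(0, 1, NFEATURES) < ONE_PROB).astype("int32")

# Per-word Bernoulli probability: p if the word is present, 1 - p otherwise.
per_word = np.where(x == 1, theta_row, 1.0 - theta_row).astype("float32")

# Multiplying 10000 factors below 1 in float32 underflows.
joint_prob = np.prod(per_word)

# Summing logs keeps the same information on a representable scale.
joint_log_prob = np.sum(np.log(per_word.astype("float64")))

print(joint_prob)      # underflows to 0.0
print(joint_log_prob)  # finite, large negative number
```

If this is indeed the cause, the analogous change in the code above would be to call `bernoulli.log_prob(test_random_data)` instead of `bernoulli.prob(...)` and to compare classes on the log scale, for example by taking the argmax of the log-probabilities.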

bernoulli-probability python tensorflow tensorflow-probability