Is it possible to find the similarity between the rows of a matrix without a loop?

Problem description

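I have a 2D numpy array. I am trying to compute the similarity between its rows and store the result in a similarities array. Is this possible without a loop? Thanks for your time!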

# ratings.shape = (943,1682)

arri = np.zeros(943)
arri = np.where(arri == 0)[0]

arrj = np.zeros(943)
arrj = np.where(arrj ==0)[0]

similarities = np.zeros((ratings.shape[0],ratings.shape[0]))

similarities[arri,arrj] = np.abs(ratings[arri]-ratings[arrj])

I want to build a 2D similarities array in which similarities[i, j] is the difference between row i and row j of ratings.

[ValueError: shape mismatch: value array of shape (943,1682) could not be broadcast to indexing result of shape (943,)][1]

[1]: https://i.stack.imgur.com/gtst9.png

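Solution

The problem lies in how numpy walks through the index arrays when a 2D array is indexed with two arrays: the arrays are paired element by element, so only the positions (arri[k], arrj[k]) are addressed, not every combination of i and j.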
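To illustrate that pairing, here is a minimal sketch on a small array. The names a, rows, cols and picked are made up for this illustration and do not appear in the question.

```python
import numpy

a = numpy.arange(9).reshape(3, 3)   # [[0 1 2] [3 4 5] [6 7 8]]
rows = numpy.array([0, 1, 2])       # illustrative index arrays
cols = numpy.array([0, 1, 2])

# The two index arrays are paired element by element: (0,0), (1,1), (2,2).
# The result therefore has the shape of the index arrays, not of `a`.
picked = a[rows, cols]
print(picked.shape)  # (3,)
print(picked)        # [0 4 8]
```

This is why the assignment in the question expects a value of shape (943,) on the right-hand side and rejects one of shape (943,1682).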


First, some setup:

import numpy

ratings = numpy.arange(1,6)

indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]

ratings
[1 2 3 4 5]

indicesX
[[0]
 [1]
 [2]
 [3]
 [4]]

indicesY
[[0]
 [1]
 [2]
 [3]
 [4]]


Now let's see what your program produces:

similarities = numpy.zeros((ratings.shape[0],ratings.shape[0]))
similarities[indicesX,indicesY] = numpy.abs(ratings[indicesX]-ratings[0])

similarities

[[0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 2. 0. 0.]
 [0. 0. 0. 3. 0.]
 [0. 0. 0. 0. 4.]]

As you can see, numpy essentially iterates over similarities like this:

for i in range(5):
    similarities[indicesX[i],indicesY[i]] = numpy.abs(ratings[i]-ratings[0])

similarities

[[0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 2. 0. 0.]
 [0. 0. 0. 3. 0.]
 [0. 0. 0. 0. 4.]]

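Now we need indices like the following to cover the whole array:

indicesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
indicesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]

We build them like this: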

# Reshape indicesX from (x,1) to (x,). That's important for numpy.tile().
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX,ratings.shape[0])

indicesY = numpy.repeat(indicesY,ratings.shape[0])

indicesX
[0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]

indicesY
[0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]

Perfect! Now just call similarities[indicesX,indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY]) again, and we get:

similarities

[[0. 1. 2. 3. 4.]
 [1. 0. 1. 2. 3.]
 [2. 1. 0. 1. 2.]
 [3. 2. 1. 0. 1.]
 [4. 3. 2. 1. 0.]]

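Here is the complete code:

import numpy

ratings = numpy.arange(1,6)

# Column vectors of row numbers, shape (5,1)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]

similarities = numpy.zeros((ratings.shape[0],ratings.shape[0]))

# Reshape indicesX from (x,1) to (x,) so numpy.tile() yields a flat array
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX,ratings.shape[0])

# numpy.repeat() flattens its input, so no reshape is needed here
indicesY = numpy.repeat(indicesY,ratings.shape[0])

# Every (i,j) pair is now addressed exactly once
similarities[indicesX,indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
print(similarities)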
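As an alternative to building index arrays, the same pairwise table can probably be obtained with plain numpy broadcasting. This is a different technique from the one used above, shown only as a sketch. The array ratings_2d is a small made-up stand-in for the (943,1682) array in the question, and summing the absolute differences over each row pair is an assumption about what "difference between rows" should mean; the question does not say how a row pair is reduced to one number.

```python
import numpy

# 1D case from the answer: one rating per entry.
ratings = numpy.arange(1, 6)
# ratings[:, None] has shape (5, 1); ratings[None, :] has shape (1, 5).
# Subtracting them broadcasts to shape (5, 5), one entry per (i, j) pair.
similarities = numpy.abs(ratings[:, None] - ratings[None, :])
print(similarities)

# 2D case: a small hypothetical stand-in for the (943, 1682) array.
ratings_2d = numpy.array([[1, 2, 3],
                          [1, 2, 5],
                          [4, 2, 3]])
# Shape (3, 1, 3) minus shape (1, 3, 3) broadcasts to (3, 3, 3).
diff = numpy.abs(ratings_2d[:, None, :] - ratings_2d[None, :, :])
# Assumption: reduce each row pair to one number by summing the differences.
similarities_2d = diff.sum(axis=2)
print(similarities_2d)
```

Note that the intermediate array in the 2D case has one element per (i, j, column) triple. For a (943,1682) input that is roughly 1.5 billion elements, so processing the rows in chunks may be necessary if memory is tight.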

PS

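You commented on your own post in order to improve it. When you want to improve a question, you should edit the question itself rather than comment on it.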
