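Efficient way to find the coordinates of connected blobs in a binary image

Question

I am looking for the coordinates of connected blobs in a binary image (a 2D numpy array of 0s and 1s).

The skimage library provides a very fast way to label blobs within the array (which I found from similar SO posts). However, I want a list of the coordinates of each blob, not a labelled array. I have a solution that extracts the coordinates from the labelled image, but it is slow: far slower than the initial labelling.

Minimal reproducible example: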

import timeit
from skimage import measure
import numpy as np

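binary_image = np.array([
        [0,1,0,0,1,1,0,1,1,0,0,1],
        [0,1,0,1,1,1,0,1,1,1,0,1],
        [0,0,0,0,0,0,0,1,1,1,0,0],
        [0,1,1,1,1,0,0,0,0,1,0,0],
        [0,0,0,0,0,0,0,1,1,1,0,0],
        [0,0,1,0,0,0,0,0,0,0,0,0],
        [0,1,0,0,1,1,0,1,1,0,0,1],
        [0,0,0,0,0,0,0,1,1,1,0,0],
        [0,1,1,1,1,0,0,0,0,1,0,0],
        ])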

print(f"\n\n2d array of type: {type(binary_image)}:")
print(binary_image)

labels = measure.label(binary_image)

print(f"\n\n2d array with connected blobs labelled of type {type(labels)}:")
print(labels)

def extract_blobs_from_labelled_array(labelled_array):
    # The goal is to obtain lists of the coordinates
    # Of each distinct blob.

    blobs = []

    label = 1
    while True:
        indices_of_label = np.where(labelled_array==label)
        if not indices_of_label[0].size > 0:
            break
        else:
            blob = list(zip(*indices_of_label))
            label += 1
            blobs.append(blob)

    return blobs

if __name__ == "__main__":
    print("\n\nBeginning extract_blobs_from_labelled_array timing\n")
    print("Time taken:")
    print(
        timeit.timeit(
            'extract_blobs_from_labelled_array(labels)',globals=globals(),number=1
            )
        )
    print("\n\n")

Output:

2d array of type: <class 'numpy.ndarray'>:
[[0 1 0 0 1 1 0 1 1 0 0 1]
 [0 1 0 1 1 1 0 1 1 1 0 1]
 [0 0 0 0 0 0 0 1 1 1 0 0]
 [0 1 1 1 1 0 0 0 0 1 0 0]
 [0 0 0 0 0 0 0 1 1 1 0 0]
 [0 0 1 0 0 0 0 0 0 0 0 0]
 [0 1 0 0 1 1 0 1 1 0 0 1]
 [0 0 0 0 0 0 0 1 1 1 0 0]
 [0 1 1 1 1 0 0 0 0 1 0 0]]


2d array with connected blobs labelled of type <class 'numpy.ndarray'>:
[[ 0  1  0  0  2  2  0  3  3  0  0  4]
 [ 0  1  0  2  2  2  0  3  3  3  0  4]
 [ 0  0  0  0  0  0  0  3  3  3  0  0]
 [ 0  5  5  5  5  0  0  0  0  3  0  0]
 [ 0  0  0  0  0  0  0  3  3  3  0  0]
 [ 0  0  6  0  0  0  0  0  0  0  0  0]
 [ 0  6  0  0  7  7  0  8  8  0  0  9]
 [ 0  0  0  0  0  0  0  8  8  8  0  0]
 [ 0 10 10 10 10  0  0  0  0  8  0  0]]


Beginning extract_blobs_from_labelled_array timing

Time taken:
9.346099977847189e-05

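9e-05 seconds is small, but so is this example image. In practice I am working with very high-resolution images, for which the function takes approximately 10 minutes.

Is there a faster way to do this?

Side note: I am only using list(zip()) to try to get the numpy coordinates into something I am used to (I don't use numpy much, just plain Python). Should I skip this and just use the coordinates to index as-is? Would that speed things up?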

Answer

The part of the code that is slow is here:

    while True:
        indices_of_label = np.where(labelled_array == label)
        if not indices_of_label[0].size > 0:
            break
        else:
            blob = list(zip(*indices_of_label))
            label += 1
            blobs.append(blob)

First, a complete aside: you should avoid using while True when you know the number of elements you will be iterating over. It is a recipe for hard-to-find infinite-loop bugs.

Instead, you should use:

    for label in range(1, np.max(labelled_array) + 1):

and then you can drop the if ...: break. (Labels start at 1, because 0 is the background.)

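A second issue is indeed that you are using list(zip(*)), which is slow compared to NumPy functions. Here you could get approximately the same result with np.transpose(indices_of_label), which gives you a 2D array of shape (n_coords, n_dim), i.e. (n_coords, 2).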
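As a minimal sketch of these two smaller changes combined (a bounded for loop plus np.transpose), the function could look like the following. The name extract_blobs_loop and the tiny hand-made labelled array are illustrative assumptions, not part of the original code. It still scans the whole image once per label, so it does not address the main cost.

```python
import numpy as np

def extract_blobs_loop(labelled_array):
    """Return one (n_coords, 2) array of (row, col) coordinates per label."""
    blobs = []
    # Labels start at 1 (0 is background), so iterate 1..max inclusive.
    for label in range(1, int(labelled_array.max()) + 1):
        indices_of_label = np.where(labelled_array == label)
        # np.transpose turns the (rows, cols) tuple into an (n_coords, 2) array.
        blobs.append(np.transpose(indices_of_label))
    return blobs

# Hypothetical, hand-labelled example array (not produced by skimage here).
labelled = np.array([[0, 1, 0, 2],
                     [0, 1, 0, 2],
                     [3, 0, 0, 0]])
blobs = extract_blobs_loop(labelled)
for blob in blobs:
    print(blob.tolist())
```

Each entry is a NumPy array rather than a list of tuples; calling .tolist() on it gives nested Python lists if that is more convenient.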

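But the biggest issue is the expression labelled_array == label. This examines every pixel of the image once for every label. (Twice, actually, because you then run np.where(), which takes another pass.) This is a lot of unnecessary work, as the coordinates can be found in one pass.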
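To illustrate the idea of avoiding one scan per label, here is a rough pure-NumPy sketch. It is only an illustration under stated assumptions, not how any particular library does it: it scans the image once with np.nonzero, then sorts the foreground pixels by label and splits them into groups. The function name extract_blobs_single_scan and the example array are hypothetical.

```python
import numpy as np

def extract_blobs_single_scan(labelled_array):
    """Group foreground pixel coordinates by label without a per-label scan."""
    rows, cols = np.nonzero(labelled_array)      # one scan over the image
    if rows.size == 0:
        return []
    pixel_labels = labelled_array[rows, cols]
    # A stable sort keeps row-major order of the pixels within each label.
    order = np.argsort(pixel_labels, kind="stable")
    coords = np.column_stack((rows[order], cols[order]))
    sorted_labels = pixel_labels[order]
    # Positions where the label changes mark the start of the next blob.
    boundaries = np.flatnonzero(np.diff(sorted_labels)) + 1
    return np.split(coords, boundaries)

# Hypothetical, hand-labelled example array.
labelled = np.array([[0, 1, 0, 2],
                     [0, 1, 0, 2],
                     [3, 0, 0, 0]])
groups = extract_blobs_single_scan(labelled)
for group in groups:
    print(group.tolist())
```

The work here depends on the number of foreground pixels, not on the number of labels times the image size. If some label values are missing, the returned list simply has no entry for them.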

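The scikit-image function skimage.measure.regionprops can do this for you. regionprops goes over the image once and returns a list containing one RegionProps object per label. Each object has a .coords attribute containing the coordinates of every pixel in the blob. So your function can be reduced to a call to regionprops that collects the .coords attribute of each returned object.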
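A minimal sketch of the function rewritten around regionprops might look like this. The small test image is made up for illustration, and measure.label is called with its default connectivity, as in the question.

```python
import numpy as np
from skimage import measure

def extract_blobs_from_labelled_array(labelled_array):
    """Return a list with one (n_coords, 2) coordinate array per blob."""
    props = measure.regionprops(labelled_array)
    # .coords holds the (row, col) coordinates of every pixel in the region.
    return [p.coords for p in props]

# Small illustrative image: a vertical bar and a single isolated pixel.
binary_image = np.array([[0, 1, 0],
                         [0, 1, 0],
                         [0, 0, 0],
                         [1, 0, 0]])
labels = measure.label(binary_image)
blobs = extract_blobs_from_labelled_array(labels)
for blob in blobs:
    print(blob.tolist())
```

This keeps each blob as a NumPy array, which also answers the side note: the coordinates can be used as-is, without converting them with list(zip()).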

binary-image numpy python scikit-image