如何在 Python 文档字符串测试中处理舍入到负零

问题描述

与此问题相关的:How to have negative zero always formatted as positive zero in a python string?

我有以下使用 Numpy 实现 Matlab 的 orth.m 的函数我有一个依赖于 np.array2string 的 docstring 测试,使用suppress_small=True,这将使小值四舍五入为零。但是,有时它们会舍入到正零,有时会舍入到负零,这取决于答案是 1e-16 还是 -1e-17 或类似结果。发生哪种情况取决于 SVD 分解,并且可能因平台或 Python 版本而异,具体取决于所使用的底层线性代数求解器(BLAS、Lapack 等)

设计文档字符串测试以解决此问题的最佳方法是什么?

在最终的 doctest 中,有时 Q[0,1] 项是 -0。有时是 0。

import doctest
import numpy as np

def orth(A):
    r"""
    Orthogonalization basis for the range of A.

    That is,Q.T @ Q = I,the columns of Q span the same space as the columns of A,and the number
    of columns of Q is the rank of A.

    Parameters
    ----------
    A : 2D ndarray
        Input matrix

    Returns
    -------
    Q : 2D ndarray
        Orthogonalization matrix of A

    Notes
    -----
    #.  Based on the Matlab orth.m function.

    Examples
    --------
    >>> import numpy as np

    Full rank matrix
    >>> A = np.array([[1,1],[-1,-2,0],[0,1,-1]])
    >>> r = np.linalg.matrix_rank(A)
    >>> print(r)
    3

    >>> Q = orth(A)
    >>> with np.printoptions(precision=8):
    ...     print(Q)
    [[-0.12000026 -0.80971228  0.57442663]
     [ 0.90175265  0.15312282  0.40422217]
     [-0.41526149  0.5664975   0.71178541]]

    Rank deficient matrix
    >>> A = np.array([[1,[1,1]])
    >>> r = np.linalg.matrix_rank(A)
    >>> print(r)
    2

    >>> Q = orth(A)
    >>> print(np.array2string(Q,precision=8,suppress_small=True))  # Sometimes this fails
    [[-0.70710678 -0.        ]
     [ 0.          1.        ]
     [-0.70710678  0.        ]]

    """
    # compute the SVD
    (Q,S,_) = np.linalg.svd(A,full_matrices=False)
    # calculate a tolerance based on the first eigenvalue (instead of just using a small number)
    tol = np.max(A.shape) * S[0] * np.finfo(float).eps
    # sum the number of eigenvalues that are greater than the calculated tolerance
    r = np.sum(S > tol,axis=0)
    # return the columns corresponding to the non-zero eigenvalues
    Q = Q[:,np.arange(r)]
    return Q

if __name__ == '__main__':
    doctest.testmod(verbose=False)

解决方法

您可以打印 rounded 数组加上 0.0 以消除 -0

A = np.array([[1,1],[0,1,0],[1,1]])
Q = orth(A)
Q[0,1] = -1e-16   # simulate a small floating point deviation

print(np.array2string(Q.round(8)+0.0,precision=8,suppress_small=True))
#[[-0.70710678  0.        ]
# [ 0.          1.        ]
# [-0.70710678  0.        ]]

所以你的文档字符串应该是:

>>> Q = orth(A)
>>> print(np.array2string(Q.round(8)+0.0,suppress_small=True)) # guarantee non-negative zeros
[[-0.70710678  0.        ]
 [ 0.          1.        ]
 [-0.70710678  0.        ]]
,

这是我想出的另一种选择,尽管我认为我更喜欢将数组四舍五入到给定的精度。在这种方法中,您将整个数组移动了一些大于舍入误差但小于精度比较的量。那样的话,小数字仍然会稍微为正。

    >>> Q = orth(A)
    >>> print(np.array2string(Q + np.full(Q.shape,1e-14),suppress_small=True))
    [[-0.70710678  0.        ]
     [ 0.          1.        ]
     [-0.70710678  0.        ]]