Python 的模式匹配性能如何?是 O(1) 吗?

问题描述

我们知道,Python 将在 3.10 中支持 pattern matching。撇开用法和语法不谈,我似乎在 PEP 或 official tutorial 中找不到任何关于性能的词。

所以我想知道它的性能如何?是 O(1) 还是和 if..elif..else 一样?

解决方法

将脱糖模式匹配为链式 elif 语句和 isinsance 调用。它绝对不是 O(1) 或接近任何类型的查找表。

编辑:更准确地说,在 merged pull request 中,您可以看到 match 语句变成了负责处理各种模式的操作码链。这些操作码按顺序处理,很像链式 elif 语句。

,

仅就时间复杂度而言,这两种结构基本上是等价的。

但是,在考虑可读性和代码结构时,它们非常不同。作为一般规则,如果您要匹配或测试多个科目,if-elif-else 阶梯可能是工作:

def stringify_response(response: int) -> str:
    if 100 <= response < 200:
        kind = "info"
    elif 200 <= response < 300:
        kind = "success"
    elif 300 <= response < 400:
        kind = "redirect"
    elif 400 <= response < 500:
        kind = "client error"
    elif 500 <= response < 600:
        kind = "server error"
    else:
        kind = "unexpected response"
    return f"{kind} ({response})"

另一方面,如果您要匹配单个主题的结构,那么结构模式匹配可能是完成这项工作的最佳工具:>

from ast import *

def lint_function_def(function_def: FunctionDef) -> None:
    match function_def.body:
        case *_,Assign([Name(na)],e),Return(Name(nr,lineno=l)) if na == nr:
            print(f"line {l}: consider returning {unparse(e)} instead of "
                  f"assigning to {na!r} on line {e.lineno}")

未来的改进

话虽如此,对模式匹配的句法支持的好处之一是编译器和解释器拥有非凡的上下文知识。因此,他们可以做出普通控制流无法做出的假设。

考虑来自 PEP 636 的这个声明(稍微简化):

match command.split():
    case ["north"] | ["go","north"]:
        # Code for moving north.
    case ["get",obj] | ["pick","up",obj,"up"]:
        # Code for picking up the given object.

“当前”(CPython 3.10)模式编译器非常幼稚,将其编译为如下所示:

_subject = command.split()
import _collections_abc
if (
    isinstance(_subject,_collections_abc.Sequence)
    and len(_subject) == 1
    and _subject[0] == "north"
):
    # Code for moving north.
elif (
    isinstance(_subject,_collections_abc.Sequence)
    and len(_subject) == 2
    and _subject[0] == "go"
    and _subject[1] == "north"
):
    # Code for moving north.
elif (
    isinstance(_subject,_collections_abc.Sequence)
    and len(_subject) == 2
    and _subject[0] == "get"
):
    obj = _subject[1]
    # Code for picking up the given object.
elif (
    isinstance(_subject,_collections_abc.Sequence)
    and len(_subject) == 3
    and _subject[0] == "pick"
    and _subject[1] == "up"
):
    obj = _subject[2]
    # Code for picking up the given object.
elif (
    isinstance(_subject,_collections_abc.Sequence)
    and len(_subject) == 3
    and _subject[0] == "pick"
    and _subject[2] == "up"
):
    obj = _subject[1]
    # Code for picking up the given object.

尽管这个例子很简单,但生成的代码充满了冗余检查。匹配命令 "pick Spam up" 检查 isinstance(_subject,_collections_abc.Sequence) 五次,调用 len(_subject) 五次,检查 _subject[0] == "pick" 两次!

然而,编译器完全有可能生成更高效的“决策树”,而不执行不必要的工作。这可能是这样的:

_subject = command.split()
import _collections_abc
if isinstance(_subject,_collections_abc.Sequence):
    _len_subject = len(_subject)
    if _len_subject == 1:
        if _subject[0] == "north":
            # Code for moving north.
    elif _len_subject == 2:
        _subject_0 = _subject[0]
        if _subject_0 == "go":
            if _subject[1] == "north":
                # Code for moving north.
        elif _subject_0 == "get":
            obj = _subject[1]
            # Code for picking up the given object.
    elif _len_subject == 3:
        if _subject[0] == "pick":
            _subject_1 = _subject[1]
            if _subject_1 == "up":
                obj = _subject[2]
                # Code for picking up the given object.
            elif _subject[2] == "up":
                obj = _subject_1
                # Code for picking up the given object.

当你有守卫或嵌套模式时,它开始变得更加复杂......但我希望我们能够在 CPython 3.11 中加入这样的东西!