Python crashes when accessing a pointer in a struct returned by ctypes

Problem description

I am trying to access, from Python, the tree returned by this function:

TokenTreeNode* generatetokenTree(const wchar_t** tokens,const unsigned int tokenCount,const BlockToken* blockTokens,const unsigned int blockTokenCount);

Definition of TokenTreeNode:

typedef struct tree_node_struct {
    wchar_t* token;
    size_t childCount;
    struct tree_node_struct* parent;
    struct tree_node_struct** children;
} TokenTreeNode;

Definition of BlockToken:

typedef struct {
    const wchar_t* begin;
    const wchar_t* end;
} BlockToken;

This is the corresponding ctypes interface code in Python:

import ctypes
from typing import List

class CBlockToken(ctypes.Structure):
    _fields_ = [('begin',ctypes.c_wchar_p),('end',ctypes.c_wchar_p)]

class CTokenTreeNode(ctypes.Structure):
    _fields_ = [('token',ctypes.c_wchar_p),('childCount',ctypes.c_size_t),('parent',ctypes.POINTER('CTokenTreeNode')),('children',ctypes.POINTER(ctypes.POINTER('CTokenTreeNode')))]

def generate_token_tree(tokens: List[str],blockTokens: List[CBlockToken]) -> CTokenTreeNode:
    nativelib = ctypes.WinDLL(source_path + '\\nativelib\\x64\\Release\\nativelib.dll')

    generatetokenTree = nativelib.generatetokenTree
    generatetokenTree.argtype = [
        ctypes.POINTER(ctypes.c_wchar_p),ctypes.c_uint,ctypes.POINTER(CBlockToken),ctypes.c_uint
    ]
    generatetokenTree.restype = ctypes.POINTER(CTokenTreeNode)

    c_arg_tokens             = (ctypes.c_wchar_p * len(tokens))(*tokens)
    c_arg_token_count        = len(tokens)
    c_arg_block_tokens       = (CBlockToken * len(blockTokens))(*blockTokens)
    c_arg_block_tokens_count = len(blockTokens)
    
    root_node = generatetokenTree(c_arg_tokens,c_arg_token_count,c_arg_block_tokens,c_arg_block_tokens_count)

    return root_node.contents
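
One thing worth checking in the interface code above is how the self-referential pointer fields are declared. Passing the string 'CTokenTreeNode' to ctypes.POINTER does not bind the pointer to the real structure class, which is a plausible cause of a crash on dereference. The ctypes documentation describes declaring the class first and assigning _fields_ afterwards. The following is a minimal sketch of that pattern for this struct, not a verified fix for the library in question; node_size is a name introduced here only for the sanity check.

```python
import ctypes

class CTokenTreeNode(ctypes.Structure):
    # Body left empty on purpose: the class has to exist before it can
    # appear inside its own field list.
    pass

# Assign the fields once the class object exists, so POINTER() receives
# the actual structure type instead of a name string.
CTokenTreeNode._fields_ = [
    ('token', ctypes.c_wchar_p),
    ('childCount', ctypes.c_size_t),
    ('parent', ctypes.POINTER(CTokenTreeNode)),
    ('children', ctypes.POINTER(ctypes.POINTER(CTokenTreeNode))),
]

# Sanity check (assumes size_t is pointer-sized, as on common platforms):
# four pointer-sized members should occupy four machine words.
node_size = ctypes.sizeof(CTokenTreeNode)
```

Separately, ctypes reads the parameter types from the attribute argtypes (plural). The attribute argtype assigned in the code above is not one that ctypes consults, so the declared argument types are probably not being applied to the call.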

When I try to access the children of the returned root node, Python crashes.

>>> test_tokens  = ['var','test','=','"','string','"']
>>> block_tokens = [parser.CBlockToken('"','"')]
>>> root_node = parser.generate_token_tree(test_tokens,block_tokens)
>>> root_node.childCount
4
>>> root_node.children[0].contents

(Python crashes here)

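If I access the generated tree from C++, everything works as expected. To me, the problem seems to be that, for some reason, Python is not allowed to access memory allocated by the external library. However, I do not know how to solve or even debug this issue.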
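
As a debugging aid for a crash like this, the standard-library faulthandler module can dump the Python traceback when the process hits a fatal error such as an access violation, which at least shows which Python line triggered it. The sketch below writes the dump to a temporary log file; the file and the name crash_log are arbitrary choices for this example, and the default target is sys.stderr.

```python
import faulthandler
import tempfile

# Keep the file object alive for the lifetime of the process: faulthandler
# writes to its file descriptor when a fatal error occurs.
crash_log = tempfile.NamedTemporaryFile(mode='w', suffix='.log', delete=False)

# From here on, a fatal error dumps the tracebacks of all threads to the log.
faulthandler.enable(file=crash_log, all_threads=True)

print('fatal-error tracebacks will be written to', crash_log.name)
```

The same handler can also be switched on without code changes by starting the interpreter with python -X faulthandler.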

Edit: changed the type of TokenTreeNode.childCount as suggested by @MilesBudnek.

Edit 2: This is the memory layout of the tree at the end of generatetokenTreeNode(), according to the debugger:

-       rootNode->children,10   0x000001e3c7454250 {0x000001e3c74544f0 {token=0x000001e3c7456b30 L"var" childCount=0 parent=0x000001e3c7454a90 {...} ...},...} tree_node_struct *[10]
+       [0] 0x000001e3c74544f0 {token=0x000001e3c7456b30 L"var" childCount=0 parent=0x000001e3c7454a90 {token=0x0000000000000000 <NULL> ...} ...}   tree_node_struct *
+       [1] 0x000001e3c74547f0 {token=0x000001e3c74565e0 L"test" childCount=0 parent=0x000001e3c7454a90 {token=0x0000000000000000 <NULL> ...} ...}  tree_node_struct *
+       [2] 0x000001e3c7454130 {token=0x000001e3c537d0e0 L"=" childCount=0 parent=0x000001e3c7454a90 {token=0x0000000000000000 <NULL> ...} ...} tree_node_struct *
-       [3] 0x000001e3c7454d30 {token=0x000001e3c7456360 L"\"\"" childCount=1 parent=0x000001e3c7454a90 {token=0x0000000000000000 <NULL> ...} ...}  tree_node_struct *
    -       children    0x000001e3c74569f0 {0x000001e3c7454df0 {token=0x000001e3c74567c0 L"string" childCount=0 parent=0x000001e3c7454d30 {...} ...}}   tree_node_struct * *
    -           0x000001e3c7454df0 {token=0x000001e3c74567c0 L"string" childCount=0 parent=0x000001e3c7454d30 {token=...} ...}  tree_node_struct *
        +       token   0x000001e3c74567c0 L"string"    wchar_t *
        childCount  0   unsigned __int64
        +       parent  0x000001e3c7454d30 {token=0x000001e3c7456360 L"\"\"" childCount=1 parent=0x000001e3c7454a90 {token=0x0000000000000000 <NULL> ...} ...}  tree_node_struct *
        +       children    0x0000000000000000 {???}    tree_node_struct * *

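Here the function is being called from the Python code, and we can see that the memory layout is correct before Python accesses the data structure. If we then try to read a child of the root node, an access violation occurs.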
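
For reference, here is a minimal sketch of how such a tree can be walked from Python when the node type is declared with properly typed self-referential pointers. It does not call the DLL: the helper make_node is hypothetical and builds an equivalent tree in Python memory purely so the traversal can be run, and walk is likewise a name introduced for this example.

```python
import ctypes

class CTokenTreeNode(ctypes.Structure):
    pass  # declared first so the field list can refer to the class itself

NodePtr = ctypes.POINTER(CTokenTreeNode)
CTokenTreeNode._fields_ = [
    ('token', ctypes.c_wchar_p),
    ('childCount', ctypes.c_size_t),
    ('parent', NodePtr),
    ('children', ctypes.POINTER(NodePtr)),
]

def walk(node, depth=0):
    """Yield (depth, token) for a node and its descendants, depth first."""
    yield depth, node.token
    # Index only within childCount; children is NULL for leaf nodes.
    for i in range(node.childCount):
        yield from walk(node.children[i].contents, depth + 1)

def make_node(token, children=()):
    """Hypothetical helper: build a node in Python memory for testing."""
    children = tuple(children)
    node = CTokenTreeNode(token=token, childCount=len(children))
    if children:
        array = (NodePtr * len(children))(*(ctypes.pointer(c) for c in children))
        node.children = ctypes.cast(array, ctypes.POINTER(NodePtr))
        for child in children:
            child.parent = ctypes.pointer(node)
        # Hold explicit references so the array and children are not freed.
        node._keepalive = (array, children)
    return node

# Same shape as the tree in the debugger output: a root with a NULL token
# and four children, the last of which has one child.
root = make_node(None, [
    make_node('var'),
    make_node('test'),
    make_node('='),
    make_node('""', [make_node('string')]),
])
visited = list(walk(root))
```

Bounding the loop by childCount matters because ctypes pointers do no range checking: indexing past the end of the children array, or into a NULL children pointer, reads arbitrary memory.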
