Boost Property Tree fails to retrieve simple JSON in a multi-threaded context

Problem description

I'm trying to parse a simple JSON string using Boost.PropertyTree in my C/C++ application:

{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}

Here is how I set it up in my multi-threaded C/C++ application (the JSON string is defined manually to demonstrate the problem).

ParseJson.cpp

#ifdef __cplusplus
extern "C"
{
#endif

#include "ParseJson.hpp"

#ifdef __cplusplus
}
#endif

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

extern "C" MyStruct * const parseJsonMessage(char * jsonMessage,unsigned int const messageLength) {
    MyStruct * myStruct = new MyStruct();
    // Create empty property tree object.
    ptree tree;

    if (myStruct != nullptr) {
        try {
            // Create an istringstream from the JSON message.
            std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");   // doesn't work
            std::istringstream isstreamJson(jsonMessageString);

            // Parse the JSON into the property tree.
            std::cout << "Reading JSON ..." << jsonMessageString << "...";
            read_json(isstreamJson,tree);
            std::cout << " Done!" << std::endl;

            // Get the values from the property tree.
            printf("version: %d\n",tree.get<int>("header.version"));
            printf("source: %d\n",tree.get<int>("header.source"));
            printf("coffee: %s\n",tree.get<std::string>("coffee").c_str());
        }
        catch (boost::property_tree::ptree_bad_path badpathException) {
            std::cout << "Exception caught for bad path: " << badpathException.what() << std::endl;
            return nullptr;
        }
        catch (boost::property_tree::ptree_bad_data badDataException) {
            std::cout << "Exception caught for bad data: " << badDataException.what() << std::endl;
            return nullptr;
        }
        catch (std::exception exception) {
            std::cout << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
            return nullptr;
        }
    }
    return myStruct;
}

The read_json() call appears to complete, but the get() calls that retrieve the parsed data from the property tree fail:

Reading JSON ...{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}... Done!
Exception caught for bad path: No such node (header.version)

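I'm using Boost 1.53 on RHEL 7 (the compiler is gcc/g++ version 4.8.5), and I have tried both suggestions from a post about Boost.PropertyTree and multithreading. I have defined the BOOST_SPIRIT_THREADSAFE compile definition globally for the project. I also tried the atomic swap solution suggested in that post. Neither had any effect on the symptoms.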

Oddly, I can get at the values manually using other public methods of Boost.PropertyTree:

std::cout << "front.key: " << tree.front().first << std::endl;
std::cout << "front.front.key: " << tree.front().second.front().first << std::endl;
std::cout << "front.front.value: " << tree.front().second.front().second.get_value_optional<std::string>() << std::endl;

which shows that the JSON was actually parsed:

front.key: header
front.front.key: version
front.front.value:  42

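Note that I had to use std::string to get the header.version value, because trying get_value_optional<int>() also crashes.

However, this manual approach doesn't scale; my application needs to accept several more complex JSON structures.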

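When I try more complex JSON strings, they are also parsed successfully, but accessing values with the get() methods fails as well, this time crashing the program. Here is one of the GDB backtraces I pulled from the crashes, but I'm not familiar enough with Boost to get anything useful out of it: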

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffebfff700 (LWP 7176)]
0x00007ffff5aa8200 in std::locale::locale(std::locale const&) () from /lib64/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install boost-system-1.53.0-28.el7.x86_64 boost-thread-1.53.0-28.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_7.2.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-10.el7.x86_64 libcom_err-1.42.9-16.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 systemd-libs-219-67.el7_7.2.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x00007ffff5aa8200 in std::locale::locale(std::locale const&) () from /lib64/libstdc++.so.6
#1  0x00007ffff5ab6051 in std::basic_ios<char,std::char_traits<char> >::imbue(std::locale const&) () from /lib64/libstdc++.so.6
#2  0x000000000041e322 in boost::property_tree::stream_translator<char,std::char_traits<char>,std::allocator<char>,int>::get_value(std::string const&) ()
#3  0x000000000041c5b2 in boost::optional<int> boost::property_tree::basic_ptree<std::string,std::string,std::less<std::string> >::get_value_optional<int,boost::property_tree::stream_translator<char,int> >(boost::property_tree::stream_translator<char,int>) const ()
#4  0x000000000041aa61 in boost::enable_if<boost::property_tree::detail::is_translator<boost::property_tree::stream_translator<char,int> >,int>::type boost::property_tree::basic_ptree<std::string,std::less<std::string> >::get_value<int,int>) const ()
#5  0x000000000041985d in int boost::property_tree::basic_ptree<std::string,std::less<std::string> >::get_value<int>() const ()
#6  0x0000000000418673 in int boost::property_tree::basic_ptree<std::string,std::less<std::string> >::get<int>(boost::property_tree::string_path<std::string,boost::property_tree::id_translator<std::string> > const&) const ()
#7  0x0000000000414f4a in parseJsonMessage ()
#8  0x000000000040d8cd in Processthread () at ../../src/Processing.c:906
#9  0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff55538cd in clone () from /lib64/libc.so.6

FWIW, I tried putting this code into a simple (single-threaded) main.cpp:

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

int main(int numArgs,char * const * const args) {
    
    ptree tree;

    try {
        // Create an istringstream from the JSON message.
        std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");
        std::istringstream isstreamJson(jsonMessageString);

        // Parse the JSON into the property tree.
        std::cout << "Reading JSON..." << jsonMessageString << "...";
        read_json(isstreamJson,tree);
        std::cout << " Done!" << std::endl;
        // Print what we parsed.
        std::cout << "version: " << tree.get<int>("header.version") << std::endl;
        std::cout << "source: " << tree.get<int>("header.source") << std::endl;
        std::cout << "coffee: " << tree.get<std::string>("coffee") << std::endl;
    }
    catch (boost::property_tree::ptree_bad_path badpathException) {
        std::cout << "Exception caught for bad path: " << badpathException.what() << std::endl;
        return -1;
    }
    catch (boost::property_tree::ptree_bad_data badDataException) {
        std::cout << "Exception caught for bad data: " << badDataException.what() << std::endl;
        return -1;
    }
    catch (std::exception exception) {
        std::cout << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
        return -1;
    }
    std::cout << "Program completed!" << std::endl;
    return 0;
}

That code works fine:

bash-4.2$ g++ -std=c++11 main.cpp -o main.exe
bash-4.2$ ./main.exe 
Reading JSON...{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}... Done!
version: 42
source: 1
coffee: colombian
Program completed!

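So why do the Boost.PropertyTree get() methods not work in the multi-threaded application? Could the fact that the multi-threaded application is a mix of C and C++ code be causing the problem? I see that my specific compiler version (GCC 4.8.5) has not been explicitly verified with this Boost library... could this be a compiler issue? Or is there a problem with Boost 1.53?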


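Update based on the answer provided:

Admittedly, the original code I wrote for the parseJsonMessage method was messy (the result of dozens of debugging iterations and of removing code unrelated to the problem). A cleaner version without the distractions (and possible red herrings) follows: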

#ifdef __cplusplus
extern "C"
{
#endif

#include "DirectIpRev3.hpp"

#ifdef __cplusplus
}
#endif

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

extern "C" void parseJsonMessage2() {
    // Create empty property tree object.
    ptree tree;
    std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");   //doesn't work
    std::istringstream isstreamJson(jsonMessageString);
    try {
        read_json(isstreamJson,tree);
        std::cout << tree.get<int>("header.version") << std::endl;
        std::cout << tree.get<int>("header.source") << std::endl;
        std::cout << tree.get<std::string>("coffee") << std::endl;
    }
    catch (boost::property_tree::ptree_bad_path const & badpathException) {
        std::cerr << "Exception caught for bad path: " << badpathException.what() << std::endl;
    }
    catch (boost::property_tree::ptree_bad_data const & badDataException) {
        std::cerr << "Exception caught for bad data: " << badDataException.what() << std::endl;
    }
    catch (std::exception const & exception) {
        std::cerr << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
    }
}

Running this condensed function in my multi-threaded program produces an exception:

Exception caught when parsing message into Boost.Property tree: <unspecified file>(1): expected object or array

Without the exception handling, it prints a bit more information:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::property_tree::json_parser::json_parser_error> >'
  what():  <unspecified file>(1): expected object or array

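I'm still not quite sure what could be causing the failure here, but I'm leaning towards using nlohmann as suggested.

Solution

Please don't use Property Tree to "parse" "JSON". See nlohmann or Boost.JSON.

Further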

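  • You're using raw new and delete without any apparent good reason
  • You have unused parameters
  • You're catching polymorphic exceptions by value
  • You leak memory whenever an exception occurs, and you return a null pointer on an allocation error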

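Combining these, I'm 99% sure that your crash is caused by something else: Undefined Behaviour has a tendency to show up somewhere else after memory has been corrupted (e.g. stack smashing, use-after-delete, out-of-bounds accesses, etc.).

Using my crystal ball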

  1. A guess: you don't show it, but the struct probably looks like

    typedef struct MyStructT {
        int version;
        int source;
        char const* coffee;
    } MyStruct;
    

    A naive mistake would be to assign coffee the same way you print it:

    myStruct->coffee = tree.get<std::string>("coffee").c_str();
    

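    The "obvious" (?) problem here is that c_str() points to memory the struct does not own: get<std::string>() returns a temporary, and even a pointer into the value node would be owned transitively by the ptree. By the time the function returns, that pointer is stale. Oops. UB.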

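  2. You're allocating the struct with new (even though, being extern "C", it is probably POD, so new gives you a false sense of security: nothing meaningful gets constructed, and with a plain new MyStruct all members would have indeterminate values anyway).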

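    Another naive mistake would be to pass the struct to C code that de-allocates it with ::free (like it does for everything malloc-ed, right?). That's another potential source of UB.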

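  3. If you have "fixed" the first idea by e.g. using strdup, you may run into more memory leaks. Even if you correctly use delete myStruct (or start using malloc), you have to remember to ::free the string allocated by strdup.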

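  4. Your API is typical C style (which may be intentional), but it leaves the door wide open to out-of-bounds reads when a wrong messageLength is passed. The likelihood of that is increased by the observation that you don't even use the parameters in the sample code above.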

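Multi-Threading Stress Test

Here is a multi-threaded stress test, run on Coliru. Each of 25 threads performs 999 iterations (the for (auto i = 1000; --i;) loop stops one short of 1000).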

Live On Coliru

#ifdef __cplusplus
extern "C"
{
#endif

typedef struct MyStructT {
    int version;
    int source;
    char* coffee;
} MyStruct;

//#include "ParseJson.hpp"

#ifdef __cplusplus
}
#endif

#include <cstring>   // ::strdup (POSIX)
#include <iostream>
#include <memory>    // std::make_unique
#include <sstream>
#include <string>

#define BOOST_BIND_GLOBAL_PLACEHOLDERS
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

extern "C" MyStruct* parseJsonMessage(char const* jsonMessage,unsigned int const messageLength) {
    auto myStruct = std::make_unique<MyStruct>(); // make it exception safe
    // Create empty property tree object.
    ptree tree;

    if (myStruct != nullptr) {
        try {
            // Create an istringstream from the JSON message.
            std::istringstream isStreamJson(std::string(jsonMessage,messageLength));

            // Parse the JSON into the property tree.
            //std::cout << "Reading JSON ..." << isStreamJson.str() << "...";
            read_json(isStreamJson,tree);
            //std::cout << " Done!" << std::endl;

            // Get the values from the property tree.
            myStruct->version = tree.get<int>("header.version");
            myStruct->source = tree.get<int>("header.source");
            myStruct->coffee = ::strdup(tree.get<std::string>("coffee").c_str());
            return myStruct.release();
        }
        catch (boost::property_tree::ptree_bad_path const& badPathException) {
            std::cerr << "Exception caught for bad path: " << badPathException.what() << std::endl;
        }
        catch (boost::property_tree::ptree_bad_data const& badDataException) {
            std::cerr << "Exception caught for bad data: " << badDataException.what() << std::endl;
        }
        catch (std::exception const& exception) {
            std::cerr << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
        }
    }
    return nullptr;
}

#include <cstdio>       // ::printf
#include <cstdlib>      // ::free
#include <list>
#include <string>
#include <string_view>
#include <thread>

int main() {
    static std::string_view msg = R"({"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"})";

    auto task = [] {
        for (auto i = 1000; --i;) {
            auto s = parseJsonMessage(msg.data(),msg.size());

            ::printf("version: %d\n",s->version);
            ::printf("source: %d\n",s->source);
            ::printf("coffee: %s\n",s->coffee);

            ::free(s->coffee);
            delete s; // not ::free!
        }
    };

    std::list<std::thread> pool;

    for (int i = 0; i < 25; ++i)
        pool.emplace_back(task);

    for (auto& t : pool)
        t.join();
}

Output (sorted and uniq-ed):

  24975 coffee: colombian
  24975 source: 1
  24975 version: 42