C++: How do I use regular expressions and iterate over strings that contain non-Latin text?

Problem description

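I am developing an application that encodes messages with various ciphers, such as Atbash, Scytale, and Caesar. I decided to split it into two parts: a Qt GUI and a separate library containing all the encoding functions, which can be reused in other projects later. The first cipher, Atbash, needs regular expressions and iteration over the code points (letters) of the input message. Plain std::string and std::regex do not seem to work with multi-byte UTF-8 code points, so I came up with several possible solutions: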

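Use std::string and std::regex with a single-byte encoding for my language, such as CP1251 or CP866. Converting such a std::string to a QString, which uses UTF-16, would be complicated.
Use QString and QRegExp. This works seamlessly with my particular application, but it would make the library hard to use outside Qt later.
Use wchar_t, std::wstring, and std::wregex. I have read that wstring is not portable across platforms.
Use std::u32string together with u32regex from Boost.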
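
To make the std::u32string option more concrete, here is a minimal sketch of how UTF-8 bytes could be turned into a sequence of code points without any external library. The function name decode_utf8 is hypothetical and not part of the question. The decoder is deliberately simplified: it rejects malformed lead and continuation bytes, but it does not check for overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the question): decodes UTF-8 into UTF-32
// so that every element of the result is exactly one code point.
// Simplified: no checks for overlong encodings or surrogates.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    for (std::size_t i = 0; i < bytes.size();) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte tells how many bytes the sequence occupies.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > bytes.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

With a helper like this, a range-based for loop over the returned std::u32string visits one letter per iteration, regardless of how many bytes that letter occupied in UTF-8.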

Which option should I use?

Code example:

#include <regex>
#include <string>

std::wstring atbash::encode(const std::wstring& message) {
    // Reject any message that contains characters outside the supported set.
    if (!std::regex_match(message,std::wregex(L"^[а-яёА-ЯЁa-zA-Z .,]+$"))) {
        throw "A message can contain only Cyrillic and Latin letters, spaces, dots and commas.";
    }

    auto result = std::wstring();
    for (auto symbol : message) {
        // Spaces, punctuation and ё/Ё are copied unchanged.
        if (std::regex_match(std::wstring(1,symbol),std::wregex(L"[ .,ёЁ]"))) {
            result += symbol;
        // Every other letter is mirrored within its own alphabet range.
        } else if (std::regex_match(std::wstring(1,symbol),std::wregex(L"[а-я]"))) {
            result += L'а' + L'я' - symbol;
        } else if (std::regex_match(std::wstring(1,symbol),std::wregex(L"[А-Я]"))) {
            result += L'А' + L'Я' - symbol;
        } else if (std::regex_match(std::wstring(1,symbol),std::wregex(L"[a-z]"))) {
            result += L'a' + L'z' - symbol;
        } else if (std::regex_match(std::wstring(1,symbol),std::wregex(L"[A-Z]"))) {
            result += L'A' + L'Z' - symbol;
        }
    }
    return result;
}
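
For comparison, the same mapping can be sketched on std::u32string with plain code point range checks, which avoids constructing a regex for every character. This is only an illustration under stated assumptions, not a recommended final design: the names atbash_encode and mirror are hypothetical, the Cyrillic ranges are written as escapes (U+0430 to U+044F for а-я, U+0410 to U+042F for А-Я) so the sketch does not depend on the source file encoding, and invalid input raises std::invalid_argument instead of a string literal. As in the code above, ё (U+0451) and Ё (U+0401) are copied unchanged because they lie outside those contiguous ranges.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: mirrors c inside the inclusive range [first, last].
char32_t mirror(char32_t c, char32_t first, char32_t last) {
    return static_cast<char32_t>(first + last - c);
}

// Hypothetical UTF-32 variant of the encode function from the question.
std::u32string atbash_encode(const std::u32string& message) {
    std::u32string result;
    result.reserve(message.size());
    for (char32_t c : message) {
        if (c == U' ' || c == U'.' || c == U',' ||
            c == U'\u0451' || c == U'\u0401') {          // space, dot, comma, ё, Ё
            result += c;
        } else if (c >= U'\u0430' && c <= U'\u044F') {   // а-я
            result += mirror(c, U'\u0430', U'\u044F');
        } else if (c >= U'\u0410' && c <= U'\u042F') {   // А-Я
            result += mirror(c, U'\u0410', U'\u042F');
        } else if (c >= U'a' && c <= U'z') {
            result += mirror(c, U'a', U'z');
        } else if (c >= U'A' && c <= U'Z') {
            result += mirror(c, U'A', U'Z');
        } else {
            throw std::invalid_argument(
                "A message can contain only Cyrillic and Latin letters, "
                "spaces, dots and commas.");
        }
    }
    return result;
}
```

Because each range is mirrored onto itself, applying atbash_encode twice returns the original message, which makes the function easy to test.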

c++ stl utf utf-16