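Unicode string containing Unicode characters always ends up empty

Question

I am compiling with the -municode -DUNICODE -D_UNICODE flags and using _tmain to enable Unicode support.

However, whenever I operate on a TCHAR array that contains a Unicode character, the string ends up truncated, no matter where that character is located.

For example: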

TCHAR buffer[255];
wcscpy(buffer,L"test-");
wcscat(buffer,L"Азәрбајҹан");
/* buffer now contains "test-" */

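My actual use case is retrieving the username. If it contains special characters, the result is empty, whether it comes from GetEnvironmentVariable, GetUserName, or a hardcoded string like the one above.

Edit:

Here is a complete minimal reproducible example.

Compiled with gcc -o error.exe error.c -municode, using the following compiler: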

gcc.exe (Rev3, Built by MSYS2 project) 10.1.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Code:

#define _UNICODE
#define UNICODE

#include <tchar.h>
#include <stdio.h>
#include <stdlib.h>

int _tmain(int argc,TCHAR* argv[]) {
    FILE* fp;
    TCHAR buffer[255];

    _tcscpy(buffer,_T("test-"));
    _tcscat(buffer,_T("Азәрбајҹан"));
    /* _tcslen returns size_t, so cast it to match %d */
    _tprintf(_T("Length: %d,Content: %ls\n"),(int)_tcslen(buffer),buffer);

    fp = _tfopen(_T("test.txt"),_T("w"));
    _ftprintf(fp,_T("%ls"),buffer);
    fclose(fp);
    return 0;
}

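This example prints a length of 15 with the content test-, and writes test- to test.txt. The length is correct (5 + 10 characters), so the string is complete in memory; only the output is cut off at the first non-ASCII character.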
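
To separate "the buffer is wrong" from "the output step is wrong", it can help to dump the buffer as plain ASCII hex, which no console or file encoding can alter. The following is only a minimal sketch; dump_code_units and check_buffer are hypothetical helper names, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper (not from the question): writes every wide
   character of s to out as hex digits followed by a space. The output
   is pure ASCII, so it does not depend on any locale or code page.
   Returns the number of wide characters in s. */
static size_t dump_code_units(const wchar_t *s, FILE *out) {
    size_t n = 0;
    assert(s != NULL && out != NULL);
    for (; s[n] != L'\0'; ++n)
        fprintf(out, "%04lX ", (unsigned long)s[n]);
    fputc('\n', out);
    return n;
}

/* Builds the same string as the question and dumps it to out.
   Returns the number of wide characters found in the buffer. */
static size_t check_buffer(FILE *out) {
    wchar_t buffer[255];

    wcscpy(buffer, L"test-");
    wcscat(buffer, L"Азәрбајҹан");
    return dump_code_units(buffer, out);
}
```

Calling check_buffer(stdout) from main should report 15 characters if the concatenation worked. In that case the buffer is fine and the characters are being lost when the wide string is converted for output.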

Answer

For wide characters I usually use wchar_t.

If that is an option for you, you can use something like this:

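#include <tchar.h>
#include <fcntl.h>
#include <io.h>
#include <stdio.h>
#include <wchar.h>

int _tmain() {

#ifdef UNICODE
    /* Put stdin/stdout into wide text mode so that wide output is not
       converted through the narrow, locale-dependent encoding. */
    _setmode(_fileno(stdin),_O_WTEXT);
    _setmode(_fileno(stdout),_O_WTEXT);
#endif

    wchar_t buffer[255];

    wcscpy(buffer,L"test-");
    wcscat(buffer,L"Азәрбајҹан");
    /* %ls always means a wide string; %s is wide only in Microsoft's wprintf */
    wprintf(L"%ls\n",buffer);

    return 0;
}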

Output:

With VS 2019 (MSVC) and "Use Unicode Character Set" enabled:

With gcc version 9.2.0 (tdm64-1):
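
For the file-writing part of the question, one alternative that does not depend on the stream's text mode is to encode the wide string as UTF-8 yourself and write raw bytes. This is a sketch under assumptions, not something taken from the answer above: utf8_encode and write_utf8 are hypothetical helpers, and the surrogate handling assumes wchar_t holds UTF-16 code units on Windows.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: encodes one code point as UTF-8 into out, which
   must have room for 4 bytes. Returns the number of bytes produced. */
static size_t utf8_encode(unsigned long cp, unsigned char *out) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical helper: writes s to fp as UTF-8 bytes and returns the
   number of bytes written. A UTF-16 surrogate pair (possible where
   wchar_t is 16 bits wide) is combined into one code point first. */
static size_t write_utf8(const wchar_t *s, FILE *fp) {
    size_t total = 0;
    assert(s != NULL && fp != NULL);
    while (*s != L'\0') {
        unsigned long cp = (unsigned long)*s++;
        unsigned char bytes[4];
        size_t n;

        if (cp >= 0xD800 && cp <= 0xDBFF && *s >= 0xDC00 && *s <= 0xDFFF)
            cp = 0x10000 + ((cp - 0xD800) << 10)
                         + ((unsigned long)*s++ - 0xDC00);
        n = utf8_encode(cp, bytes);
        total += fwrite(bytes, 1, n, fp);
    }
    return total;
}
```

A caller would open the file in binary mode, for example fopen("test.txt", "wb"), and pass the buffer to write_utf8. Binary mode matters here because the bytes are already encoded and should not be translated again by the runtime.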
