Why can't we use the preprocessor to create custom-delimited strings?

I was playing around with the C preprocessor when something that looked this simple failed:

#define STR_START "
#define STR_END "

int puts(const char *);

int main() {
    puts(STR_START hello world STR_END);
}

When I compile it with gcc (note: clang gives similar errors), it fails with these errors:

$ gcc test.c
test.c:1:19: warning: missing terminating " character
test.c:2:17: warning: missing terminating " character
test.c: In function ‘main’:
test.c:7: error: missing terminating " character
test.c:7: error: ‘hello’ undeclared (first use in this function)
test.c:7: error: (Each undeclared identifier is reported only once
test.c:7: error: for each function it appears in.)
test.c:7: error: expected ‘)’ before ‘world’
test.c:7: error: missing terminating " character

That confused me, so I ran the file through the preprocessor alone:

$ gcc -E test.c
# 1 "test.c"
# 1 ""
# 1 ""
# 1 "test.c"
test.c:1:19: warning: missing terminating " character
test.c:2:17: warning: missing terminating " character

int puts(const char *);

int main() {
    puts(" hello world ");
}

So despite the warnings, the preprocessor output is perfectly valid code (see the puts(" hello world "); line)!

If macros in C are just textual replacement, why does my initial example fail? Is this a compiler bug? If not, where in the standard is this case covered?

Note: I am not looking for a way to make my initial snippet compile. I only want to understand why this case fails.

Answer

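The problem is that even though the code expands to " hello world ", the preprocessor never recognizes it as a single string literal token; instead, it sees the (invalid) token sequence ", hello, world, ". The source is split into preprocessing tokens in translation phase 3, before macros are expanded in phase 4, and the result of expansion is not tokenized again. The -E output only looks valid because it prints the spelling of those tokens as text, which would form a string literal if it were lexed from scratch.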

N1570:

6.4 Lexical elements
…
3 A token is the minimal lexical element of the language in translation phases 7 and 8. The
categories of tokens are: keywords, identifiers, constants, string literals, and punctuators.
A preprocessing token is the minimal lexical element of the language in translation
phases 3 through 6. The categories of preprocessing tokens are: header names,
identifiers, preprocessing numbers, character constants, string literals, punctuators, and
single non-white-space characters that do not lexically match the other preprocessing
token categories.⁶⁹⁾ If a ' or a " character matches the last category, the behavior is
undefined. Preprocessing tokens can be separated by white space; this consists of
comments (described later), or white-space characters (space, horizontal tab, new-line,
vertical tab, and form-feed), or both. As described in 6.10, in certain circumstances
during translation phase 4, white space (or the absence thereof) serves as more than
preprocessing token separation. White space may appear within a preprocessing token
only as part of a header name or between the quotation characters in a character constant
or string literal.

69) An additional category, placemarkers, is used internally in translation phase 4 (see 6.10.3.3); it cannot
occur in source files.

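Note that under this definition neither ' nor " is a punctuator. A lone " in a macro body therefore falls into the last category (a single non-white-space character that matches nothing else), and the behavior is undefined.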
