Extracting custom source code attributes with Clang tooling

Problem description

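I am developing a tool that uses clang to parse and type-check code, and I am trying to find out whether there is a way to get non-standard clang attributes out of the source code. For example, I would like to annotate a loop with a loop invariant, or annotate an expression with [[not_null]]. Is this possible? Maybe there is some option that says ...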

Here is a concrete example:

struct custom { };
int test() {
  int x;
  switch (1) {
  case 1:
    x = 1;
    [[fallthrough]] ;
  case 2:
    x = 2;
    break;
  }
  [[custom()]]
  if (1) { return 1; } else { return 0; }
}

The resulting AST is as follows:

`-FunctionDecl 0x55a9780ebba0 <line:2:1,line:14:1> line:2:5 test 'int ()'
  `-CompoundStmt 0x55a9780ec060 <col:12,line:14:1>
    |-DeclStmt 0x55a9780ebd08 <line:3:3,col:8>
    | `-VarDecl 0x55a9780ebca0 <col:3,col:7> col:7 used x 'int'
    |-SwitchStmt 0x55a9780ebd40 <line:4:3,line:11:3>
    | |-IntegerLiteral 0x55a9780ebd20 <line:4:11> 'int' 1
    | `-CompoundStmt 0x55a9780ebf40 <col:14,line:11:3>
    |   |-CaseStmt 0x55a9780ebda0 <line:5:3,line:6:9>
    |   | |-ConstantExpr 0x55a9780ebd80 <line:5:8> 'int' Int: 1
    |   | | `-IntegerLiteral 0x55a9780ebd60 <col:8> 'int' 1
    |   | `-BinaryOperator 0x55a9780ebe08 <line:6:5,col:9> 'int' lvalue '='
    |   |   |-DeclRefExpr 0x55a9780ebdc8 <col:5> 'int' lvalue Var 0x55a9780ebca0 'x' 'int'
    |   |   `-IntegerLiteral 0x55a9780ebde8 <col:9> 'int' 1
    |   |-AttributedStmt 0x55a9780ebe58 <line:7:5,col:21>  <<<<<<<<<<<<<<<<<<<<<<<<<<< GOT Fallthrough
    |   | |-FallThroughAttr 0x55a9780ebe30 <col:7>
    |   | `-NullStmt 0x55a9780ebe28 <col:21>
    |   |-CaseStmt 0x55a9780ebeb0 <line:8:3,line:9:9>
    |   | |-ConstantExpr 0x55a9780ebe90 <line:8:8> 'int' Int: 2
    |   | | `-IntegerLiteral 0x55a9780ebe70 <col:8> 'int' 2
    |   | `-BinaryOperator 0x55a9780ebf18 <line:9:5,col:9> 'int' lvalue '='
    |   |   |-DeclRefExpr 0x55a9780ebed8 <col:5> 'int' lvalue Var 0x55a9780ebca0 'x' 'int'
    |   |   `-IntegerLiteral 0x55a9780ebef8 <col:9> 'int' 2
    |   `-BreakStmt 0x55a9780ebf38 <line:10:5>
    `-IfStmt 0x55a9780ec038 <line:13:3,col:41> has_else  <<<<<<<<<<<<<<<<<<<<<<<<<<< MISSING custom
      |-ImplicitCastExpr 0x55a9780ebf90 <col:7> 'bool' <IntegralToBoolean>
      | `-IntegerLiteral 0x55a9780ebf70 <col:7> 'int' 1
      |-CompoundStmt 0x55a9780ebfd8 <col:10,col:22>
      | `-ReturnStmt 0x55a9780ebfc8 <col:12,col:19>
      |   `-IntegerLiteral 0x55a9780ebfa8 <col:19> 'int' 1
      `-CompoundStmt 0x55a9780ec020 <col:29,col:41>
        `-ReturnStmt 0x55a9780ec010 <col:31,col:38>
          `-IntegerLiteral 0x55a9780ebff0 <col:38> 'int' 0

Solution

Yes, it is possible. AFAIK there are two ways.

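The first is to add your new attribute definition to Clang itself. You can do this by modifying tools/clang/include/clang/Basic/Attr.td in the Clang sources, after which Clang knows about your new attribute. The downside is that you may have to rebuild Clang from source after every change.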

The second is to use the existing annotate attribute, for example [[clang::annotate("not_null")]]. Personally, I prefer this approach for its flexibility.
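As a rough sketch of how such annotations could be kept out of the way of other compilers: the macro below expands to clang::annotate only when the compiler reports support for it. The names ANNOTATE and read_value are hypothetical, and the __has_cpp_attribute guard is my own addition, not something the approach requires.

```cpp
// Hypothetical helper macro (not part of Clang): expands to
// [[clang::annotate(...)]] where the compiler supports it, else to nothing.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::annotate)
#    define ANNOTATE(text) [[clang::annotate(text)]]
#  endif
#endif
#ifndef ANNOTATE
#  define ANNOTATE(text)
#endif

// The annotations are attached to declarations: the function itself
// and its pointer parameter.
ANNOTATE("my_attr") int read_value(ANNOTATE("not_null") const int *p) {
  return *p;
}
```

The annotation has no effect on the generated program; it only shows up as an attribute on the corresponding declaration in Clang's AST, which is where a tool can pick it up.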

Here is an MWE, assuming I want to find function decls that carry a custom attribute:

#define MY_ATTR [[clang::annotate("my_attr")]]

...

// I want to extract this attribute
MY_ATTR void foo(...) { ... }

I can then implement the following visitor function:

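bool VisitFunctionDecl(FunctionDecl *D) {
  if (D->hasAttrs()) {
    // Printing policy derived from the language options of the current AST.
    PrintingPolicy Policy(D->getASTContext().getLangOpts());
    for (const Attr *attr : D->getAttrs()) {
      // Name of the attribute as spelled in the source.
      std::string attr_name(attr->getSpelling());
      std::string attr_annotate("annotate");
      std::string attr_my("annotate(\"my_attr\")");
      if (attr_name == attr_annotate) {
        // Pretty-print the attribute to recover its argument as text.
        std::string SS;
        llvm::raw_string_ostream S(SS);
        attr->printPretty(S, Policy);
        std::string attr_string(S.str());
        if (attr_string.find(attr_my) != std::string::npos) {
          // found it! do stuff here!
          // ...
        }
      }
    }
  }
  return true;
}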
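If you end up with more than one annotation, it may be handier to pull the payload out of the printed attribute text instead of checking for one fixed substring. Below is a minimal sketch using only the standard library (C++17). The helper name annotationPayload is hypothetical, and it assumes the printed text contains annotate("...") with no escaped quotes inside the payload.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: returns the first string argument of annotate("...")
// found in the printed attribute text, or std::nullopt if there is none.
// Assumption: the payload itself contains no escaped double quotes.
std::optional<std::string> annotationPayload(std::string_view printed) {
  constexpr std::string_view marker = "annotate(\"";
  const auto start = printed.find(marker);
  if (start == std::string_view::npos)
    return std::nullopt;
  const auto begin = start + marker.size();  // first character of the payload
  const auto end = printed.find('"', begin); // closing quote of the payload
  if (end == std::string_view::npos)
    return std::nullopt;
  return std::string(printed.substr(begin, end - begin));
}
```

In the visitor you would pass it the string produced by printPretty and dispatch on the returned payload, e.g. "my_attr" or "not_null".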

I think you can easily adapt this to loop annotations.