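How can I extract metadata from a Java method using an AST?

Problem description

Is there an AST tool that makes it easy to extract metadata from a Java method?

For example, given the following code: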

/*
 Checks if a target integer is present in the list of integers.
*/
public Boolean contains(Integer target,List<Integer> numbers) {
    for(Integer number: numbers){
        if(number.equals(target)){
            return true;
        }
    }
    return false;
}

The metadata would be:

Metadata = {
    "comment": "Checks if a target integer is present in the list of integers.",
    "identifier": "contains",
    "parameters": "Integer target,List<Integer> numbers",
    "return_statement": "Boolean false"
}

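Solution

I wrote this class a long time ago. It actually spans about four different classes, spread across a package named JavaParserBridge, and it greatly simplifies what you are trying to do. I stripped out everything unnecessary and boiled it down to about 100 lines. It took about an hour...

I hope it all makes sense. I usually add a lot of comments to my code, but since this involves another library, is being posted on Stack Overflow, and is really just one big constructor, I will refer you to the documentation pages for Java Parser instead.

To use this class, pass the source code of a Java class file as a single java.lang.String to the method named getMethods(String); it returns a Vector<Method>. Each element of the returned Vector is an instance of Method, which should contain all of the meta-information you asked for in your question.

Important: you can get the JAR file for this package from its GitHub page. You need the JAR named: javaparser-core-3.16.2.jar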

import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.body.TypeDeclaration;
import com.github.javaparser.ast.body.MethodDeclaration;
import com.github.javaparser.ast.body.Parameter;
import com.github.javaparser.ast.type.ReferenceType;
import com.github.javaparser.ast.type.TypeParameter;
import com.github.javaparser.ast.Node;
import com.github.javaparser.ast.NodeList;
import com.github.javaparser.ast.Modifier; // Modifiers are the key-words such as "public,private,static,etc..."
import com.github.javaparser.printer.lexicalpreservation.LexicalPreservingPrinter;
import com.github.javaparser.printer.lexicalpreservation.PhantomNodeLogic;

import java.io.IOException;
import java.util.Vector;


public class Method
{
    public final String name,signature,jdComment,body,returnType;
    public final String[] modifiers,parameterNames,parameterTypes,exceptions;

    private Method (MethodDeclaration md)
    {

        NodeList<Parameter>     paramList       = md.getParameters();
        NodeList<ReferenceType> exceptionList   = md.getThrownExceptions();
        NodeList<Modifier>      modifiersList   = md.getModifiers();

        this.name           = md.getNameAsString();
        this.signature      = md.getDeclarationAsString();
        this.jdComment      = (md.hasJavaDocComment() ? md.getJavadocComment().get().toString() : null);
        this.returnType     = md.getType().toString();
        this.modifiers      = new String[modifiersList.size()];
        this.parameterNames = new String[paramList.size()];
        this.parameterTypes = new String[paramList.size()];
        this.exceptions     = new String[exceptionList.size()];
        this.body           = (md.getBody().isPresent()
                                ?   LexicalPreservingPrinter.print
                                        (LexicalPreservingPrinter.setup(md.getBody().get()))
                                :   null);

        int i=0;
        for (Modifier modifier : modifiersList) modifiers[i++] = modifier.toString();

        i=0;
        for (Parameter p : paramList)
        {
            parameterNames[i]           = p.getName().toString();
            parameterTypes[i]           = p.getType().toString();
            i++;
        }

        i=0;
        for (ReferenceType r : exceptionList) this.exceptions[i++] = r.toString();
    }

    public static Vector<Method> getMethods(String sourceFileAsString) throws IOException
    {
        // This is the "Return Value" for this method (a Vector)
        final Vector<Method> methods = new Vector<>();

        // This asks Java Parser to parse the source code file.
        // The String-parameter 'sourceFileAsString' should hold the entire contents of that file.

        CompilationUnit cu = StaticJavaParser.parse(sourceFileAsString);

        // This will "walk" all of the methods that were parsed by
        // StaticJavaParser, and retrieve the method information.
        // The method information is stored in a class simply called "Method"

        cu.walk(MethodDeclaration.class,(MethodDeclaration md) -> methods.add(new Method(md)));

        // There is one important thing to do: clear the cache
        // Memory leaks shall occur if you do not.

        PhantomNodeLogic.cleanUpCache(); 

        // return the Vector<Method>
        return methods;
    }
}
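
Since getMethods(String) expects the whole source file as one String, here is a minimal sketch of loading a file that way using only the standard library. The class name SourceLoader and its load method are hypothetical helpers, not part of JavaParser or of the class above, and UTF-8 is an assumed encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of JavaParser): loads a source file as one String.
class SourceLoader
{
    // Reads the whole file and decodes it as UTF-8 (an assumption about the encoding).
    static String load(Path file) throws IOException
    {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] argv) throws IOException
    {
        // Write a tiny class to a temporary file so the demo runs anywhere.
        Path tmp = Files.createTempFile("Example", ".java");
        Files.write(tmp, "class Example { int one() { return 1; } }\n".getBytes(StandardCharsets.UTF_8));

        String source = load(tmp);
        System.out.print(source);

        // With JavaParser on the classpath, the next step would be:
        //     Vector<Method> methods = Method.getMethods(source);

        Files.delete(tmp);
    }
}
```

The String returned by load is what you would hand to Method.getMethods.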

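You need to add this method to the class above... I rarely (if ever) post more than one answer to a single Stack Overflow question. But since this turned into a lot of code, rather than overcomplicate the first answer, I am posting this main method as a separate answer to your question.

Include this method in the class above and it will correctly process the functions.json file that I downloaded from your website. functions.json is the file that contains the list of methods together with their database IDs.

ALSO: make sure to add the line import java.util.regex.*; because this method uses the Java classes Pattern and Matcher. It also uses BufferedReader, FileReader and File, so import those from java.io as well.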


    public static void main(String[] argv) throws IOException
    {
        // Sample line from "functions.json":
        // "321": "\tpublic int getPushesLowerbound() {\n\t\treturn pushesLowerbound;\n\t}\n",
        //
        // If you have not used "Regular Expressions" before, you are just
        // going to have to read about them.  This "Regular Expression" parses your
        // JSON "functions.json" file.  It is a little complicated,but not too bad.

        Pattern         P1          = Pattern.compile("^\\s+\"(\\d+)\"\\:\\s+\"(.*?)\\\\n\",$");
        BufferedReader  br          = new BufferedReader(new FileReader(new File("functions.json")));
        String          s           = br.readLine();

        // Any time you have a "Constructor" instead of a method,you should
        // use some other method in `StaticJavaParser` to deal with it.
        // for now,I am just going to keep a "Fail List" instead..

        int             failCount   = 0;
        Vector<String>  failIDs     = new Vector<>();
 
        while ((s = br.readLine()) != null && ! s.equals("}"))
        {
            // Parse the JSON using a Regular Expression.  It is just easier to do it this way
            // You have a VERY BASIC json file.

            Matcher m = P1.matcher(s);
            
            // I do not think any of the String's will fail the regular expression matcher.
            // Just in case,continue if the Regular Expression Match Failed.
            if (! m.find()) { System.out.print("."); continue; }
            
            // The ID is the first JSON element matched by the regular expression
            String id = m.group(1);
            
            // The source code is the second JSON element matched by the regular-expression
            // NOTE: Your source-code is not perfect... It has "escape sequences", so these sequences
            //       have to be "unescaped"
            // ALSO: this is not the most efficient way to "un-escape" an escape-sequence, but I would
            //       have to include an external library to do it the right way, so I'm going to leave
            //       this version here for you to think about.
            String src = m.group(2)
                .replace("\\\\","" + ((char) 55555))
                .replace("\\n","\n")
                .replace("\\t","\t")
                .replace("\\\"","\"")
                .replace("" + ((char) 55555),"\\");

            // Java Parser has a method EXPLICITLY FOR parsing method Declarations.
            // Your "functions.json" file has a list of method-declarations.
            MethodDeclaration   md          = null;

            // I found one that failed - it was a constructor..
            // Record its ID so that it shows up in the "Failed ID's" report at the end.
            try
                { md = StaticJavaParser.parseMethodDeclaration(src); }
            catch (Exception e)
                { System.out.println(src); e.printStackTrace(); failCount++; failIDs.add(id); continue; }

            Method method = new Method(md);

            System.out.print(
                "ID:           " + id + '\n' +
                "Name:         " + method.name + '\n' +
                "Return Type:  " + method.returnType + '\n' +
                "Parameters:   "
            );

            for (int i=0; i < method.parameterNames.length; i++)
                System.out.print(method.parameterNames[i] + '(' + method.parameterTypes[i] + ")  ");

            System.out.println("\n");

            PhantomNodeLogic.cleanUpCache();
        }
        
        System.out.print(
            "Fail Count: " + failCount + "\n" +
            "Failed ID's: "
        );
        for (String failID : failIDs) System.out.print(failID + " ");
        System.out.println();
    }
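
To make the line-matching and un-escaping steps easier to follow in isolation, here is a small stand-alone sketch that uses only the standard library. It reuses the same regular expression as P1 and replaces the chained replace(...) calls with a single left-to-right pass, which avoids the placeholder character. The class name JsonLineDemo and the unescape helper are hypothetical, and the four leading spaces on the sample line are an assumption about how functions.json is indented (the pattern requires leading whitespace).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: shows what the regular expression captures and
// one way to un-escape the captured source in a single pass.
class JsonLineDemo
{
    // Same expression as P1: group 1 is the numeric ID, group 2 the escaped source
    // (without the final escaped newline).
    static final Pattern LINE = Pattern.compile("^\\s+\"(\\d+)\"\\:\\s+\"(.*?)\\\\n\",$");

    // Decodes \\, \n, \t and \" from left to right; any other escape is left unchanged.
    static String unescape(String s)
    {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++)
        {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 == s.length()) { sb.append(c); continue; }
            char next = s.charAt(++i);
            switch (next)
            {
                case 'n':  sb.append('\n'); break;
                case 't':  sb.append('\t'); break;
                case '"':  sb.append('"');  break;
                case '\\': sb.append('\\'); break;
                default:   sb.append('\\').append(next); // unknown escape: keep as-is
            }
        }
        return sb.toString();
    }

    public static void main(String[] argv)
    {
        // The sample line quoted in the comment at the top of main(String[] argv),
        // with assumed leading indentation.
        String line = "    \"321\": \"\\tpublic int getPushesLowerbound() {\\n\\t\\treturn pushesLowerbound;\\n\\t}\\n\",";
        Matcher m = LINE.matcher(line);
        if (m.find())
        {
            System.out.println("ID: " + m.group(1));
            System.out.println(unescape(m.group(2)));
        }
    }
}
```

Because each backslash is consumed together with the character that follows it, an escaped backslash followed by the letter n stays a backslash and an n instead of turning into a newline, which is the case the placeholder trick in main was guarding against.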

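The method above produces the kind of output shown below. Since you have a million methods, it will take a while to run.

Note: not every entry in that list is a valid method. If an entry is a constructor rather than a method, it needs to be parsed as a constructor. There is a "fail list" for the entries that JavaParser cannot parse; I will leave it as an exercise for you to work out how to handle constructors (which are not parsed by the method named parseMethodDeclaration in class StaticJavaParser).

Important (again): whenever one of the functions in your database is a constructor rather than a method, the JavaParser method used in main(String[] argv) will throw an exception.

Note: this will run for a very long time. I have posted only a very small subset of this method's output: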


ID:           32808641
Name:         addUnboundTypePropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808649
Name:         addNamePropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808650
Name:         addInputParameterPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808651
Name:         addQualifiedNamePropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808652
Name:         addOutputParameterPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808656
Name:         addReturnParameterPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808658
Name:         addSignatureParameterPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808659
Name:         getLabelProvider
Return Type:  IItemLabelProvider
Parameters:   namedElement(NamedElement)

ID:           32808661
Name:         getLabel
Return Type:  String
Parameters:   namedElement(NamedElement)

ID:           32808677
Name:         addBodyPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808678
Name:         addLanguagePropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808696
Name:         addKindPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808707
Name:         addStaticPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808708
Name:         addKindPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808709
Name:         addSemanticsPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808711
Name:         addConstrainedElementPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808713
Name:         addDefinedFeaturePropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808727
Name:         addNestingNamespacePropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808741
Name:         addKindPropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32808749
Name:         addSuperTypePropertyDescriptor
Return Type:  void
Parameters:   object(Object)

ID:           32814359
Name:         getResource
Return Type:  ResourceBundle
Parameters:   name(String)  language(String)

ID:           32814360
Name:         store
Return Type:  void
Parameters:   resource(ResourceBundle)  name(String)  language(String)

ID:           32814364
Name:         getString
Return Type:  String
Parameters:   key(String)  resourceName(String)  language(String)

ID:           32814400
Name:         getGlobalCompletionRate
Return Type:  double
Parameters:

ID:           32814409
Name:         setCurrentSubTask
Return Type:  void
Parameters:   subTask(TaskMonitor)  subTaskShare(double)

ID:           32814429
Name:         enforceCompletion
Return Type:  void
Parameters:

ID:           32814431
Name:         getCurrentActiveSubTask
Return Type:  TaskMonitor
Parameters:

ID:           32814469
Name:         checkTaskState
Return Type:  void
Parameters:

ID:           32814619
Name:         getReportAsText
Return Type:  String
Parameters:   report(ProcessReport)

ID:           32815305
Name:         showRecoveryResultWindow
Return Type:  void
Parameters:   context(ProcessContext)

ID:           32815353
Name:         validateStructure
Return Type:  void
Parameters:

ID:           32815413
Name:         buildArchive
Return Type:  void
Parameters:   context(ProcessContext)

ID:           32815445
Name:         checkArchiveCompatibility
Return Type:  boolean
Parameters:   archive(File)

ID:           32815446
Name:         checkStupidConfigurations
Return Type:  boolean
Parameters:

ID:           32815472
Name:         getDescription
Return Type:  String
Parameters:

ID:           32815501
Name:         getDataDirectory
Return Type:  File
Parameters:   archive(File)

When the code I posted encounters a constructor, it prints the source that failed to parse, followed by the stack trace of the exception.
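
One possible way to approach the constructor exercise is to try several parsers in turn and count a failure only when all of them reject the snippet. The sketch below shows just that control flow with the standard library. The class ParserFallback, the helper firstSuccess and the two regex-based stand-in parsers are all hypothetical. With JavaParser on the classpath, the first entry would wrap StaticJavaParser.parseMethodDeclaration and the second a parser that accepts constructors; I believe StaticJavaParser.parseBodyDeclaration(String) does, but treat that as an assumption to check against your version.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of a "try the next parser" fallback chain.
class ParserFallback
{
    // Applies each parser in order and returns the first result that does not throw.
    // An empty Optional means every parser rejected the input.
    static Optional<String> firstSuccess(String src, List<Function<String, String>> parsers)
    {
        for (Function<String, String> parser : parsers)
        {
            try { return Optional.of(parser.apply(src)); }
            catch (RuntimeException e) { /* rejected: fall through to the next parser */ }
        }
        return Optional.empty();
    }

    // Stand-in for a method parser: accepts "public <type> <name>(" and throws otherwise.
    static String asMethod(String src)
    {
        if (!src.matches("(?s)\\s*public\\s+\\w+\\s+\\w+\\s*\\(.*"))
            throw new IllegalArgumentException("not a method");
        return "method";
    }

    // Stand-in for a constructor parser: accepts "public <name>(" and throws otherwise.
    static String asConstructor(String src)
    {
        if (!src.matches("(?s)\\s*public\\s+\\w+\\s*\\(.*"))
            throw new IllegalArgumentException("not a constructor");
        return "constructor";
    }

    public static void main(String[] argv)
    {
        List<Function<String, String>> parsers =
            Arrays.asList(ParserFallback::asMethod, ParserFallback::asConstructor);

        for (String src : new String[] { "public int f() { return 1; }", "public Foo(int x) { }", "int x = 1;" })
            System.out.println(firstSuccess(src, parsers).orElse("FAILED") + " <- " + src);
    }
}
```

In the main(String[] argv) loop from the answer, the empty case is where failCount would be incremented and the ID recorded.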