You are currently logged into Student View
    CIS 410/510
    Assignment 2
    Skip To Content
    Dashboard
    • Dashboard
    • Calendar
    • Inbox
    • Help
    • My Dashboard
    • CIS 410/510
    • Assignments
    • Assignment 2
    Spring 2021
    • Home
    • Announcements
    • Zoom Meetings
    • Syllabus
    • Assignments
    • Grades
    • Modules
    • Files
    • Pages
    • People
    • Perusall

    Assignment 2

    • Due Apr 21 by 11:59pm
    • Points 100
    • Submitting a text entry box

    In this assignment, you will learn about and use Clang's AST visitors to perform several different code analyses. In your uoregon-pat Bitbucket repository, create a subdirectory named a2 and place all files for this assignment there. Make sure to add source code (not object or binary files), commit, and push frequently. 

    1. Warm-up: Simple matching

    (30 pts)

    Implement a simple AST traversal  -- you can choose any of the mechanisms described in week2 materials. The goal of your traversal is to output the locations (line and column) of all variables (declarations or definitions) that match a string provided as input. For example, if your binary is called find-waldo, you could apply it to the source file in problem 2 and produce the following output:

    ./find-waldo -match y example.c
    y: 2,32
    y: 5,5

    (You don't have to match this exactly, but you should get approximately similar results.)

    2. A more useful traversal: Counting ops

    a. (30 pts)

    Write a Clang tool named ops-counter that traverses the AST and counts the number of memory operations (i.e., variable references) and floating-point operations in the input source code. You do not have to consider loop iterations, so for example, given input code like this:

    // File: example.c
    void foo( double x[10], double y[10] ) {
    y[0] = 0;
    for (int i = 1; i < 10; i++) {
    y[i] = ( x[i] - x[i-1] ) / 2;
    }
    }

    Your tool could produce output such as:

    ./ops-counter example.c 
    foo: FLOPS=2, MemOp=4

    Clearly, this is not the actual number of operations that would be performed were you to run the code. To get the right number, we need to know how many iterations the loop has, which looks trivial here, but is more difficult in general and will be covered later in the term. You can, however, decide to optionally handle certain loops with simple bounds (like the above example), producing 18 FLOPS, and 25 MemOps in this case (+5 extra credit). That is still not completely right since it does not account for the fact that x[i-1] is the same memory location as x[i] in the previous iteration, but that's yet another type of analysis that is beyond the scope of this problem.

    3. Finding paths

    (40 pts)

    One of the applications of code analysis and transformation is documentation generation. In a recent paper we were reading  (https://openreview.net/pdf?id=H1gKYo09tX), ASTs are used to generate code representations suitable for different tasks. In this paper, the authors extract paths from the AST of Java or C# code, and use these paths to train a model that can be used for different generation tasks, such as summarization (Section 4.1), in which they predict Java methods’ names from their bodies, and captioning (Section 4.2), where they generate natural language descriptions of C# code snippets. You will not do all this in just one problem of your homework assignment, but you will try to produce one of the critical components of this approach -- the AST paths, described in Section 2 of the paper. The main idea is to generate all pairwise terminal-to-terminal paths through the AST, which are then encoded as numerical vectors and used as input to an encoder-decoder model. Since this is not a machine learning class, we will not worry about any aspects of this, other than producing the paths of tokens. 

    Write a tool called path-finder that prints out a random sample of k pairwise terminal-to-terminal paths (recall that all leaf nodes are terminals). The number of paths should be a command-line argument to your tool. Do not implement this as a text processor over an existing clang -ast-dump text output. Feel free to visualize the ASTs for any examples you use for debugging and testing, e.g., with clang -cc1 -ast-view example.c. For the example in Problem 2, you will get a dot (Graphviz) graph file that can be visualized with various tools, but most easily with GraphvizOnline to produce:

    example.c.png    

    Some of the paths you can generate for this example (highlighted above) include:

    ./path-finder -k 3
    DeclRefExpr ImplicitCastExpr ArraySubscriptExpr BinaryOperator ImplicitCastExpr IntegerLiteral
    DeclRefExpr ImplicitCastExpr ArraySubscriptExpr BinaryOperator BinaryOperator ParenExpr BinaryOperator ImplicitCastExpr ArraySubscriptExpr ImplicitCastExpr DeclRefExpr
    DeclRefExpr ImplicitCastExpr ArraySubscriptExpr BinaryOperator ImplicitCastExpr DeclRefExpr

    How to submit:

    Simply add, commit, and push your solutions to your Bitbucket repository. It's fine to push incremental work (in fact, I strongly encourage it!). When you are completely done, "Submit" the assignment in Canvas by simply copying and pasting your repository's URL in the submission text field. 

    Feel free to discuss any details on Discord, but please avoid sharing implementation details (code) directly among yourselves.

    Additional resources

    • Clang compilation databases: https://eli.thegreenplace.net/2014/05/21/compilation-databases-for-clang-based-tools (you probably won't need to create them, but in case you get a compilation database error, check this)
    1619074799 04/21/2021 11:59pm
    • Text Entry
    Copy and paste or type your submission right here.
    Additional Comments:
    Rating max score to > pts

    Rubric

     
     
     
     
     
     
     
         
    Can't change a rubric once you've started using it.  
    Find a Rubric
    Find Rubric
    Title
    You've already rated students with this rubric. Any major changes could affect their assessment results.
    Title
    Criteria Ratings Pts
    Edit criterion description Delete criterion row
    This criterion is linked to a Learning Outcome Description of criterion
    threshold: 5 pts
    Edit rating Delete rating
    5 to >0 pts
    Full Marks
    blank
    Edit rating Delete rating
    0 to >0 pts
    No Marks
    blank_2
    This area will be used by the assessor to leave comments related to this criterion.
    pts
      / 5 pts
    --
    Additional Comments
    Total Points: 5 out of 5
    24dfa92b-7c5d-41b0-83c1-59a20479d20c