ROSE
0.9.6a
Consider making this a namespace.
For the general file IO we should consider a special file name, similar to "rose_..."
Utilize Boost visitors to take advantage of the Boost graph structures' abilities.
*One improvement that should be implemented ASAP is changing the algorithm from a recursive algorithm to an iterative algorithm. Keeping the memory requirements down is much easier in this form and would probably increase the size of graph that the algorithm can handle.
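A minimal sketch of the recursive-to-iterative change, assuming the algorithm is a depth-first enumeration of paths over an adjacency-list graph. The `Graph` alias and `countPathsIterative` are hypothetical names for illustration, not part of ROSE. The explicit stack holds the state that the recursive calls would otherwise keep on the call stack:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical adjacency-list graph: adj[v] lists the successors of node v.
using Graph = std::vector<std::vector<std::size_t>>;

// Count the simple paths from `start` to `goal` without recursion.
// Each stack entry is (node, index of the next successor to try).
std::size_t countPathsIterative(const Graph& adj, std::size_t start, std::size_t goal) {
    assert(start < adj.size() && goal < adj.size());
    std::vector<bool> onPath(adj.size(), false);
    std::vector<std::pair<std::size_t, std::size_t>> stack;
    std::size_t paths = 0;

    stack.emplace_back(start, 0);
    onPath[start] = true;
    while (!stack.empty()) {
        const std::size_t node = stack.back().first;
        const std::size_t next = stack.back().second;
        if (node == goal || next == adj[node].size()) {
            // Either a complete path was found or this node is exhausted: backtrack.
            if (node == goal) ++paths;
            onPath[node] = false;
            stack.pop_back();
            continue;
        }
        ++stack.back().second;  // remember where to resume at this node
        const std::size_t succ = adj[node][next];
        if (!onPath[succ]) {    // skip nodes already on the current path
            onPath[succ] = true;
            stack.emplace_back(succ, 0);
        }
    }
    return paths;
}
```

The stack lives on the heap and grows only with the depth of the current path, so the limit becomes available memory rather than the call stack size.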
*Another improvement that should be implemented when possible is to allow for loop analysis. This could be implemented by simply running the algorithm on the loop, but there would need to be a provision that kept the algorithm from stopping as soon as it starts. This could be done by separating the node into two nodes, one with all the inedges and one with all the outedges. OR one could collect the loops when they are deleted (the whole loop is calculated necessarily), though nested loops would have to be considered further in order to find a way to deal with them.
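The node-splitting idea could be sketched as follows, assuming an adjacency-list graph. `Graph` and `splitNode` are hypothetical names, not ROSE functions. The original node keeps its out-edges, and a new node receives all of its in-edges, so every loop through the node becomes an ordinary path from the node to its new partner:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical adjacency-list graph: adj[v] lists the successors of node v.
using Graph = std::vector<std::vector<std::size_t>>;

// Split node v in two: v keeps all out-edges, and a new node takes all in-edges.
// Returns the index of the new in-edge node.
std::size_t splitNode(Graph& adj, std::size_t v) {
    assert(v < adj.size());
    const std::size_t inNode = adj.size();
    adj.emplace_back();  // the new in-edge node has no out-edges
    for (auto& successors : adj) {
        for (auto& target : successors) {
            if (target == v) target = inNode;  // redirect every edge that entered v
        }
    }
    return inNode;
}
```

After the split, a path search from `v` to the returned node no longer stops immediately, because the start and end are distinct nodes. Nested loops are not addressed by this sketch.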
*It is possible that graph matching algorithms might prove useful to distinguish different types of graphs contained within the CFG and optimize traversal over them. Look up graph matching algorithms or pattern matching (potentially) for more information on such algorithms, though I do not believe there is an existing literature on matching subgraphs for this purpose.
*The parallelism in this program should be optimized by someone experienced in parallelization optimization.
Most types are shared, but named types are copied, and the copies need to have their declarations reset to the new AST.
Base class modifiers are shared and this should be fixed.
Friend functions in classes are not represented by symbols in the global scope. This is not always the rule, but it is the default for ROSE, and it is set up inconsistently in the generated AST copy. See copytest2007_39.C.
copytest2007_46.C is too difficult to figure out (likely because the SgTemplateArguments are shared).
copytest2007_47.C is too complex and likely demonstrates an error.
copytest2007_49.C is too complex and likely demonstrates an error.
Add constant folding, since we currently unfold all folded expressions in the code generation phase. This could also be verified, since EDG has already computed the constant folded result and we have stored that explicitly as well as the expression tree from which it was folded. The proposed constant folding would also work on AST fragments constructed explicitly in ROSE from lower level mechanisms.
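A minimal sketch of bottom-up constant folding on a toy expression tree. The `Expr` type and its helper functions are hypothetical stand-ins for illustration and do not correspond to the ROSE IR:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical toy expression node (not a ROSE IR node).
struct Expr {
    enum Kind { Constant, Variable, Add, Multiply };
    Kind kind = Constant;
    long value = 0;                  // meaningful only for Constant
    std::unique_ptr<Expr> lhs, rhs;  // meaningful only for Add and Multiply
};

std::unique_ptr<Expr> constant(long v) {
    auto e = std::make_unique<Expr>();
    e->value = v;
    return e;
}

std::unique_ptr<Expr> variable() {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Variable;
    return e;
}

std::unique_ptr<Expr> binary(Expr::Kind kind, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    assert(kind == Expr::Add || kind == Expr::Multiply);
    auto e = std::make_unique<Expr>();
    e->kind = kind;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Fold children first, then replace an operator whose operands are both
// constants with a single Constant node holding the computed value.
void fold(Expr& e) {
    if (e.kind != Expr::Add && e.kind != Expr::Multiply) return;
    assert(e.lhs && e.rhs);
    fold(*e.lhs);
    fold(*e.rhs);
    if (e.lhs->kind == Expr::Constant && e.rhs->kind == Expr::Constant) {
        e.value = (e.kind == Expr::Add) ? e.lhs->value + e.rhs->value
                                        : e.lhs->value * e.rhs->value;
        e.kind = Expr::Constant;
        e.lhs.reset();
        e.rhs.reset();
    }
}
```

In the setting described above, the value produced by such a pass could be compared against the folded value already stored from the front end as a consistency check. The sketch ignores overflow and type rules.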
The ROSE source code might be easier to organize if we have an include directory just for the included files from the transformations. Then use separate directories for the implementations (we want to always separate the implementation from the header file, if possible, and place it in a separate file). This would make it easier to add transformations and make the maintenance of the development tree a bit easier. As it is, each new directory forces a new include path to be specified and a new library to be generated (in /config/Makefile.for.ROSE.includes.and.libs).
isOutputInCodeGeneration() is orthogonal to isCompilerGenerated and isTransformation(). Currently IR nodes that are marked as isTransformation() are output, but these need to be marked as also being isOutputInCodeGeneration() so that orthogonality of the concepts is maintained.
It is possible to call get_file_info() on a SgFileInfo object and this needs to be fixed because it does not make any sense. This is because get_file_info is defined as a virtual function on SgNode. Not sure this is a great design, but maybe it just needs a local implementation of a private get_file_info() member function so that it can't be called (can be hidden).
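A sketch of the "private local implementation" idea, using hypothetical `Node` and `FileInfo` classes rather than the real ROSE classes. Redeclaring the virtual function as private in the derived class turns a direct call on that class into a compile error, although a call through the base class still dispatches to it:

```cpp
#include <cassert>

class FileInfo;

// Hypothetical base class standing in for the common IR base.
class Node {
public:
    virtual ~Node() = default;
    virtual const FileInfo* get_file_info() const { return nullptr; }
};

// Hypothetical file-info class; asking it for "its" file info makes no sense,
// so the override is private and cannot be called on a FileInfo directly.
class FileInfo : public Node {
public:
    int line = 0;

private:
    const FileInfo* get_file_info() const override { return this; }
};

// Access through the base interface is still possible; access checks use the
// static type (Node), and the call dispatches to the private override.
const FileInfo& locationOf(const Node& node) {
    const FileInfo* info = node.get_file_info();
    assert(info != nullptr);
    return *info;
}
```

With these definitions, `FileInfo fi; fi.get_file_info();` fails to compile, while `locationOf(fi)` still works. The hiding is therefore only partial, which may or may not be acceptable for the design question raised above.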
Should there be a simpler way to copy a SgFileInfo object than: "new Sg_File_Info(*fileInfo);" or "fileInfo->copy();"; likely not!
Define the subset of IR nodes which would all have:
Remove the functions: isCompilerGeneratedNodeToBeUnparsed(), setCompilerGeneratedNodeToBeUnparsed(), and unsetCompilerGeneratedNodeToBeUnparsed() from where they are called.
Consider putting the endOfConstruct information into the single Sg_File_Info object. Currently the SgLocatedNode stores two Sg_File_Info objects, one for the beginning and the end of each construct. This would save significant space in the AST. Additional information in the Sg_File_Info could be:
Consider using "short int" instead of "int" for the file_id, line, and col (and maybe the classificationBitField) to reduce the size of the data structure. Padding is not a significant issue since data structures are allocated in contiguous memory (except for padding to at least the nearest byte if bit field widths are used).
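The size trade-off can be checked with a small sketch. `WideInfo` and `NarrowInfo` are hypothetical layouts that mirror only the fields named above, not the actual Sg_File_Info definition. The cost is range: a `short` is only guaranteed to hold values up to 32767, so very long files or very large file tables would need a range check:

```cpp
#include <cassert>
#include <climits>

// Hypothetical layouts for comparison; the real class has more members.
struct WideInfo {
    int file_id;
    int line;
    int col;
    unsigned int classification;
};

struct NarrowInfo {
    short file_id;
    short line;
    short col;
    unsigned short classification;
};

static_assert(sizeof(NarrowInfo) <= sizeof(WideInfo), "narrow layout must not be larger");

// True if the value can be stored in a short without loss.
bool fitsInShort(long value) {
    return value >= SHRT_MIN && value <= SHRT_MAX;
}

// Narrowing helper that refuses out-of-range values in debug builds.
short narrow(long value) {
    assert(fitsInShort(value));
    return static_cast<short>(value);
}
```

On common platforms with 16-bit `short` and 32-bit `int`, the narrow layout is half the size of the wide one.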
Consider placing the VARARGS expression nodes into a common base class derived from SgExpression.
I have removed the access functions from the explicit storage of type information in SgExpression objects as phase 1 of a 2 phase approach to eliminate the storage of the type in the SgExpression IR nodes. This type should be computed where required, which would avoid it being held redundantly. This mechanism is being redone internally. Some IR nodes will likely have to store their type explicitly (function expressions for example, though it might be better computed through the symbol). It is not clear if computing the type will be better than storing the type explicitly. It might be required for SgBinaryOp IR nodes to store the type if it is not clear from either the lhs or rhs (if no simple rule exists).
SgScopeOp is deprecated and will be removed in a future version of ROSE. It is a hold over from support for CC++ which is not supported in SAGE III anymore.
SgRefExp is deprecated and will be removed in a future version of ROSE. It is not used anywhere within SAGE III and I don't know why it is there.
Need to find an example of where SgClassNameRefExp is used. It is built in the EDG/Sage III translation, but not in a way that makes it obvious that it is still used within Sage III. So this may have to be removed at a later date.
To support Fortran parser we need an IR node which will represent the ambiguity of an array access or function call expression. These are then resolved within the AST after parsing (requires AST Fixup rule).
Fortran support requires support for function call using: "foo(temp=*<label>)" this might force the development of a label expression to support this. Code using this compiles with gfortran, so it appears to be F90.
The ROSEAttributesListContainerPtr p_preprocessorDirectivesAndCommentsList should be implemented as a list instead of a pointer to a list. This might require a list copy in the internal handling, but would simplify the design. The memory constraint that favors a pointer to a list over a list does not apply here, because the list is almost always valid (most source code includes at least one comment or CPP directive) and there is only one SgFile object per source file (so there are relatively few SgFile nodes in even a very large AST).
This IR node now has a Sg_File_Info pointer; however, it needs to be made consistent with the filename that is returned from SgFile::get_fileName().
The default constructor for SgFile sets the SgGlobal pointer to NULL; perhaps it would be better if it set it to a valid SgGlobal object, so that we would have a better defined empty list of declarations.
Yarden has suggested we provide a way to modify the link line that would be generated to support the backend compilation. I think we should have a list of strings that could be added to the link line (appending to the end would be the simplest). Otherwise we need a virtual function that could be overridden to customize the link command generation (however, we want to discourage the derivation of user defined IR nodes from existing IR nodes since this would break some of the internal mechanisms that use the memory pools).
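The simpler of the two options, a list of strings appended to the end of the link line, might look like the following sketch. `buildLinkCommand` and its parameters are hypothetical and do not reflect how ROSE currently assembles the backend command:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build a link command from the linker name, the object files, and a
// user-supplied list of extra arguments appended at the end.
std::string buildLinkCommand(const std::string& linker,
                             const std::vector<std::string>& objectFiles,
                             const std::vector<std::string>& extraLinkArgs) {
    assert(!linker.empty());
    std::string command = linker;
    for (const auto& object : objectFiles) {
        command += " " + object;
    }
    for (const auto& arg : extraLinkArgs) {
        command += " " + arg;  // user additions always come last
    }
    return command;
}
```

This keeps customization in plain data, so no user-defined class has to derive from an IR node. The sketch does no shell quoting, which a real implementation would need for arguments containing spaces.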
Evaluate if this should be derived from SgSupport, like other "list" based IR nodes.
Evaluate if we should even have this IR node. If the SgVariableDeclaration were to be fixed to really use the list of SgInitializedName objects where multiple variables are declared in the same variable declaration then we might not need this (I think). And if it didn't exist it would make the use of the SgForStatement a little bit simpler.
The conditional in this test is currently an expression, but should be a SgConditional or a SgStatement (e.g. so that it can be a variable declaration).
Now that the test is a SgStatement, perhaps the name of the field should be "test" instead of "test_expr".
Need to mark function declarations appearing in the file rose_edg_required_macros_and_functions.h as compiler generated, since they are either builtin functions for gcc and g++ or builtin functions that gcc and g++ require and which EDG fails to include as builtin when compiling with EDG's GNU_COMPATABILITY_MODE (current default for ROSE).
Need to better handle friend injection rules; currently the SgFunctionSymbol for a friend function is placed into the global scope. It likely should be the outer scope for a non-defining declaration and the class scope for a defining declaration. But the exact rules for this are more complex. So the location of the SgFunctionSymbol in the symbol table of SgGlobal is a poor approximation.
Check scopes of variables in function parameter list, should point to function definition, if the function definition exists, else they are undefined. If they are undefined then we still have to have something for them to point to, we could propose that this be the scope of the function declaration (I think this is what is done). The test in the tutorial tests this and it seems to be correct.
Not clear if this should be a declaration statement (it might make more sense derived from SgSupport, or perhaps from SgLocatedNode with other IR nodes that are currently derived from SgSupport; see SgLocatedNode for details).
If this should be a SgDeclarationStatement (and there is a reasonable argument for this) then perhaps the declaration containing any default parameters should be the defining declaration, independent of the defining declaration of the associated function declaration.
Figure out why SgMemberFunctionRefExp is required instead of just SgFunctionRefExp.
Make the use of a SgMemberFunctionSymbol in a SgFunctionRefExp an error. The result will not unparse correctly (suggested by Jeremiah).
Evaluate if this should be derived from SgSupport (consistent with SgSymbolTable).
Evaluate if we might like to have the p_function_type_table be a SgSymbolTable rather than a pointer to a SgSymbolTable (see implementation note).
The AstAttributeMechanism type should be handled like other IR nodes, with its own memory pool, except that in all cases where it would be used, it would be a base class to a user-defined derived type and thus would not fit in our memory pool.
Consider name change of "SgLocatedNode" to "SgSourceNode".
Consider moving some of the IR nodes currently in SgSupport to this IR node. IR nodes that might be moved would include:
Define a string conversion operator so that we can handle "SgName name; string s = name;" This would start the process of internally having SgName contain a C++ style string.
Change SgName to store a C++ style std::string, instead of a C style char*.
Some of the member functions defined in this class will be removed (head(), tail(), etc.) because they represent low level string handling which is best done on a C++ style string more directly using C++ string operators.
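The conversion operator and the std::string storage could be sketched together as below. `Name` is a hypothetical stand-in for SgName, reduced to the parts discussed in these items:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a name class that stores a std::string internally.
class Name {
public:
    Name() = default;
    Name(const char* text) {
        assert(text != nullptr);
        text_ = text;
    }
    Name(std::string text) : text_(std::move(text)) {}

    // Conversion operator: allows "Name name; std::string s = name;".
    operator std::string() const { return text_; }

    // Direct access for callers that want to use std::string operations
    // instead of low level helpers such as head() or tail().
    const std::string& str() const { return text_; }

private:
    std::string text_;
};
```

With the string stored directly, operations such as `name.str().substr(...)` or `name.str().find(...)` replace the low level helpers.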
Consider having a function which could generate a list of all the SgNamespaceDeclarationStatement IR nodes that match the same namespace. This would make a good first project for a new student.
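As a starting point for that project, the matching step might be sketched as follows. `NamespaceDecl` is a hypothetical record, not the ROSE IR node, and the sketch assumes that declarations of the same namespace can be identified by a fully qualified name collected in an earlier traversal:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one namespace declaration found in a traversal.
struct NamespaceDecl {
    std::string qualifiedName;  // e.g. "::A::B"
    int line;                   // where this (re)opening of the namespace appears
};

// Return every declaration that reopens the same namespace as `target`,
// in the order in which the declarations were collected.
std::vector<const NamespaceDecl*> matchingNamespaces(
        const std::vector<const NamespaceDecl*>& allDecls,
        const NamespaceDecl& target) {
    std::vector<const NamespaceDecl*> matches;
    for (const NamespaceDecl* decl : allDecls) {
        assert(decl != nullptr);
        if (decl->qualifiedName == target.qualifiedName) {
            matches.push_back(decl);
        }
    }
    return matches;
}
```

Unnamed namespaces and namespace aliases would need extra handling that this sketch leaves out.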
Include a graph to show how scopes are handled within the AST.
Provide some examples to detail the difference between placement, constructor, and builtin arguments.
I believe we can associate the constructors from the class with new operators.
It should be moved to only those IR nodes where it makes sense, e.g. excluded from:
Consider eliminating the set_freepointer() function since only the internal memory allocation mechanisms should use it (and they are forced to access the data member directly since they traverse the memory pools directly and member functions can only be called on allocated objects initialized via the new operator (with a proper constructor call, so that the this pointer is set properly)). Perhaps we don't need access functions for this data member at all.
This function needs a better name since it is unclear what the "complete" string is.
Make the "*PtrList" typed objects non-pointer data members (lists) instead of pointer to lists.
Implement a "-dumpversion" for compatibility with GNU (icc does the same).
There are a number of statements that contain a SgBasicBlock where they should contain a SgStatement. In each case changing the data member to be a SgStatement will unfortunately change the constructor parameter list and thus the ROSE API. So these changes have to be organized at a point where it is clear we will be changing some details of the ROSE API (prior to external release). Problem IR nodes are:
Fortran modifiers can be used as statements, and support for this must be added to the IR: see sections 5.2 and 5.3 in the Fortran 2003 standard. Note that type modifiers can be used as statements.
Fortran support requires statements in section 6.3.
Fortran support requires "where" and "forall" statements.
Fortran support requires case statement ranges (GNU extension for C, but standard in Fortran).
Fortran support requires statements in section 8.1.4, 8.1.5.
Fortran support requires statements in section 15 (modifiers for ISO_C_BINDING).
FIXED: The conditional test should be a SgStatement so that a declaration can be used, it is currently an SgExpression (specifically a SgExpressionRoot).
The body of the SgSwitch should really be a SgStatement, not a SgBasicBlock. DuffsDevice can be modified to show an example of this, but there are also much more trivial examples. See the comment about this in the SgStatement todo list.
The rose_hash_multimap should perhaps be included as a data member instead of implemented as a pointer. We should consider this detail.
We should decide if we want to give Symbol Tables a name or not, it seems that we rarely if ever do this so perhaps we should not have such a field.
Template declarations marked as friend don't seem to be marked as friend internally.
The scope of a SgTemplateDeclaration should be a SgTemplateInstantiationDefn, since it could be associated with more than one definition. What we need, and don't have yet, is a SgTemplateDefinition to accompany the SgTemplateDeclaration; then a SgTemplateDeclaration could have a SgTemplateDefinition for a parent and/or scope when it is a member function, or namespace or global scope (typically) otherwise.
Consider that get_type() returns a SgDefaultType and should return the SgType associated with the last expression in the list (research details of the list of pointers in the C++ throw operator).
Several classes derived from SgType are not used and can be removed:
The signed types (except for signed char) are not used in SAGE III and do not exist in C or C++. These IR nodes should be removed, specifically SgTypeSignedShort, SgTypeSignedInt, SgTypeSignedLong.
For Fortran support we need to add the kind, length data member to specify the width. To support handling of kind, length parameters we should use the information about the target backend compiler and map kind information to bit widths (not a high priority).
Labels appear to be used as types in "foo(*,*)", see example from Chris (LANL, 4/19/2007).
It might be better for this to be a list of SgTypedefTypes
Think about if we could also store a reference to all pointers and reference types where they share a common base_type.
Finish explanation of variable declaration, relationship to variable definition, and the scope issue.
Template static variable declarations are instantiated, and this is at least sometimes an error (at least when not part of a transformation). See test2005_69.C for an example of this problem.
If a SgVariableDefinition is built internally as part of a SgVariableDeclaration it should be marked as compiler generated if the "extern" keyword was not used in the SgVariableDeclaration. This needs to be looked into.
Constant folding happens when the bitfield is a variable and the variable name is lost. This IR node needs to be modified to alternatively store the associated SgExpression (in case it is a root of an expression tree).
Need to figure out if it is such a great idea for a single symbol to be in two scopes, or if it would be better to use two different symbols (since there are two different SgInitializedName objects built, the last one referencing the previous one through the p_prev_decl_item pointer).
The get_type() function can return NULL when the get_definition() is NULL. I think we should have assertions to make sure that get_definition() is a valid pointer and that get_type() does not return NULL.
Test to verify that each variable reference is associated with the innermost scoped variable with that name, except where name qualified. Applies most easily to local variables. The same test could be used for function references, actually all references.
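The core of such a test is a lookup that walks outward through enclosing scopes and a comparison against the scope recorded for the reference. A minimal sketch follows, with hypothetical `Scope` and `VarRef` types in place of the ROSE scope and reference IR nodes; it ignores name qualification:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical scope: a set of declared names plus a link to the enclosing scope.
struct Scope {
    const Scope* parent;
    std::set<std::string> declared;
};

// Walk outward from `from` and return the innermost scope declaring `name`,
// or nullptr if no enclosing scope declares it.
const Scope* resolve(const Scope* from, const std::string& name) {
    for (const Scope* s = from; s != nullptr; s = s->parent) {
        if (s->declared.count(name) != 0) return s;
    }
    return nullptr;
}

// Hypothetical variable reference: where it is used and which scope's
// declaration it has been bound to.
struct VarRef {
    std::string name;
    const Scope* usedIn;
    const Scope* boundTo;
};

// The check itself: the recorded binding must equal the innermost declaration.
bool bindsToInnermost(const VarRef& ref) {
    assert(ref.usedIn != nullptr);
    return resolve(ref.usedIn, ref.name) == ref.boundTo;
}
```

The same lookup would apply to function references if the scope records function names as well.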
Make sure that declarations appear before variable references.
Finish documentation!