// taintAnalysis.h (ROSE 0.9.6a)
#ifndef ROSE_TaintAnalysis_H
#define ROSE_TaintAnalysis_H

// Tainted flow analysis.
//
// The original version of this tainted flow analysis was written 2012-09 by someone other than the author of the
// genericDataflow framework. It is based on the sign analysis (sgnAnalysis.[Ch]) in this same directory since documentation
// for the genericDataflow framework is fairly sparse (5 pages in the tutorial, not counting the code listings, and no
// doxygen documentation).
//
// This file contains two types of comments:
// 1. Comments that try to document some of the things I've discovered through playing with the genericDataflow framework.
// 2. Comments and suggestions about usability, consistency, applicability to binary analysis, etc.
//
// [RPM 2012-09]


// USABILITY: Names of header files aren't consistent across the genericDataflow files. E.g., in the "lattice" directory we
// have "lattice.h" that defines the Lattice class, but "ConstrGraph.h" that defines "ConstrGraph" (and apparently no
// documentation as to what "Constr" means).
#include "lattice.h"
#include "dataflow.h"
#include "liveDeadVarAnalysis.h" // misspelled? Shouldn't it be liveDeadVarsAnalysis or LiveDeadVarsAnalysis?


// USABILITY: The abundant use of dynamic_cast makes it seem like something's wrong with the whole dataflow design. And I
// couldn't find any documentation about when it's okay to cast from Lattice to one of its subclasses, so I've made
// the assumption throughout that when the dynamic_cast returns null, the node in question points to a variable or
// expression that the live/dead analysis has determined to be dead.
//
// USABILITY: No doxygen comments throughout genericDataflow framework?!? But it looks like there's some doxygen-like stuff
// describing a few function parameters, so is it using some other documentation system? I at least added the headers
// to the docs/Rose/rose.cfg file so doxygen picks up the structure.
//
// USABILITY: The genericDataflow framework always produces files named "index.html", "summary.html", and "detail.html" and
// a directory named "dbg_imgs" regardless of any debug settings. These are apparently the result of the Dbg::init()
// call made from the main program, and this call is required (omitting it results in segmentation faults). The
// file names are constant and therefore one should expect tests to clobber each other's outputs when run in
// parallel. The contents of the files cannot be trusted. Furthermore, since these are HTML files, an aborted test
// will generate only a partial HTML file which some browsers will choke on.
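
/* Illustrative sketch, not part of ROSE. The dynamic_cast convention described above can be shown in isolation: a failed
   downcast (or a null pointer) is reported as "dead". DemoLattice, DemoTaint, DemoOther, and describe() are hypothetical
   names invented for this example, and whether a null result really always means "dead" in genericDataflow remains the
   assumption stated above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a polymorphic lattice base class (not the genericDataflow Lattice).
struct DemoLattice {
    virtual ~DemoLattice() {}
};

// Hypothetical subclass that carries a taint flag.
struct DemoTaint : DemoLattice {
    bool tainted;
    explicit DemoTaint(bool t) : tainted(t) {}
};

// Hypothetical unrelated subclass, used to show a failed downcast.
struct DemoOther : DemoLattice {};

// Describe a lattice pointer. Anything that is not a DemoTaint, including a null pointer,
// is treated as belonging to a dead variable (an assumption, as in the comment above).
std::string describe(DemoLattice *lat) {
    DemoTaint *t = dynamic_cast<DemoTaint*>(lat);
    if (!t)
        return "dead";
    return t->tainted ? "tainted" : "untainted";
}
```

   Funnelling every downcast through one helper like describe() keeps the "null means dead" assumption in a single
   place, so it is easy to revisit if the assumption turns out to be wrong.
*/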


/******************************************************************************************************************************
 * Taint Lattice
 ******************************************************************************************************************************/

class TaintLattice: public FiniteLattice {
public:

    enum Vertex {
        VERTEX_BOTTOM,
        VERTEX_UNTAINTED,
        VERTEX_TAINTED
        // no need for a top since that would imply that the value is tainted. I.e., VERTEX_TAINTED *is* our top.
    };

protected:
    Vertex vertex;

public:

    TaintLattice(): vertex(VERTEX_BOTTOM) {}

    virtual void initialize() /*override*/ {
        *this = TaintLattice();
    }

    Vertex get_vertex() const { return vertex; }
    bool set_vertex(Vertex v);

    virtual Lattice *copy() const /*override*/ {
        return new TaintLattice(*this);
    }

    // USABILITY: The base class defines copy() without a const argument, so we must do the same here.
    virtual void copy(/*const*/ Lattice *other_) /*override*/;


    // USABILITY: The base class defines '==' with non-const argument and "this", so we must do the same here.
    // USABILITY: This is not a real equality predicate since it's not symmetric. In other words, (A==B) does not imply (B==A)
    // for all values of A and B.
    virtual bool operator==(/*const*/ Lattice *other_) /*const*/ /*override*/;

    // USABILITY: The base class defines str() with non-const "this", so we must do the same here. That means that if we want
    // to use this functionality from our own methods (that have const "this") we have to distill it out to some
    // other place.
    // USABILITY: The "prefix" argument is pointless. Why not just use StringUtility::prefixLines() in the base class rather
    // than replicate this functionality all over the place?
    virtual std::string str(/*const*/ std::string /*&*/prefix) /*const*/ /*override*/ {
        return prefix + to_string();
    }

    // USABILITY: We define this only because of deficiencies with the "str" signature in the base class. Otherwise our
    // printing method (operator<<) could just use str(). We're trying to avoid evil const_cast.
    std::string to_string() const;

    // USABILITY: The base class defines meetUpdate() with a non-const argument, so we must do the same here.
    virtual bool meetUpdate(/*const*/ Lattice *other_) /*override*/;

    friend std::ostream& operator<<(std::ostream &o, const TaintLattice &lattice);
};
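
/* Illustrative sketch, not part of ROSE. The behaviour the declarations above suggest can be modelled by a small
   standalone lattice. MiniTaintLattice and all of its members are hypothetical names. The sketch assumes that the
   vertices are ordered bottom < untainted < tainted, that merging keeps the higher of two vertices, and that the
   boolean results mean "this value changed"; none of that is confirmed by this header. It also uses the const-correct,
   symmetric signatures that the USABILITY notes above ask for.

```cpp
#include <cassert>
#include <string>

// Hypothetical three-point lattice ordered BOTTOM < UNTAINTED < TAINTED.
class MiniTaintLattice {
public:
    enum Vertex { BOTTOM, UNTAINTED, TAINTED };

    MiniTaintLattice() : vertex_(BOTTOM) {}
    explicit MiniTaintLattice(Vertex v) : vertex_(v) {}

    Vertex get_vertex() const { return vertex_; }

    // Store a vertex and report whether the stored value changed.
    bool set_vertex(Vertex v) {
        bool changed = (v != vertex_);
        vertex_ = v;
        return changed;
    }

    // Merge another value into this one by keeping the higher of the two vertices.
    // Returns true only when this value changed, which a fixed-point loop can use to decide whether to keep iterating.
    bool meet_update(const MiniTaintLattice &other) {
        if (other.vertex_ > vertex_) {
            vertex_ = other.vertex_;
            return true;
        }
        return false;
    }

    // Const-correct comparison; (a == b) and (b == a) always agree.
    bool operator==(const MiniTaintLattice &other) const { return vertex_ == other.vertex_; }

    // Const printing helper, so callers never need a const_cast.
    std::string to_string() const {
        switch (vertex_) {
            case BOTTOM:    return "bottom";
            case UNTAINTED: return "untainted";
            case TAINTED:   return "tainted";
        }
        return "unknown";
    }

private:
    Vertex vertex_;
};
```

   Because a value can only move upward and the lattice has three vertices, each value changes at most twice, which is
   what lets an iterative analysis built on such a lattice terminate.
*/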
129 
130 /******************************************************************************************************************************
131  * Taint Flow Analysis
132  ******************************************************************************************************************************/
133 
class TaintAnalysis: public IntraFWDataflow {
protected:
    LiveDeadVarsAnalysis* ldv_analysis;
    std::ostream *debug;

public:
    // USABILITY: Documentation as to why a live/dead analysis is used in SgnAnalysis would be nice. I tried doing it without
    // originally to make things simpler, but it seems that the FiniteVarsExprsProductLattice depends on it even
    // though I saw commented out code and comments somewhere(?) that indicated otherwise.
    TaintAnalysis(LiveDeadVarsAnalysis *ldv_analysis)
        : ldv_analysis(ldv_analysis), debug(NULL) {}

    std::ostream *get_debug() const { return debug; }
    void set_debug(std::ostream *os) { debug = os; }

    // BINARIES: The "Function" type is a wrapper around SgFunctionDeclaration and the data flow traversals depend on this
    // fact. Binaries don't have SgFunctionDeclaration nodes (they have SgAsmFunction, which is a bit different).
    //
    // NOTE: The "DataflowNode" is just a VirtualCFG::DataflowNode that contains a VirtualCFG::CFGNode pointer and a
    // "filter". I didn't find any documentation for how "filter" is used.
    //
    // USABILITY: The "initLattices" and "initFacts" are not documented. They're apparently only outputs for this function
    // since they seem to be empty on every call and are not const. They're apparently not parallel arrays since
    // the examples I was looking at don't push the same number of items into each vector.
    //
    // USABILITY: Copied from src/midend/programAnalysis/genericDataflow/simpleAnalyses/sgnAnalysis.C. I'm not sure what
    // it's doing yet since there's no doxygen documentation for FiniteVarsExprsProductLattice or any of its
    // members.
    //
    // BINARIES: This might not work for binaries because FiniteVarsExprsProductLattice seems to do things in terms of
    // variables. Variables are typically lacking from binary specimens and most existing binary analysis
    // describes things in terms of static register names or dynamic memory locations.
    void genInitState(const Function& func, const DataflowNode& node, const NodeState& state,
                      std::vector<Lattice*>& initLattices, std::vector<NodeFact*>& initFacts);

    // USABILITY: Not documented in doxygen, so I'm more or less copying from the SgnAnalysis::transfer() method defined in
    // src/midend/programAnalysis/genericDataflow/sgnAnalysis.C.
    bool transfer(const Function& func, const DataflowNode& node_, NodeState& state, const std::vector<Lattice*>& dfInfo);

protected:
    static std::string lattice_info(const TaintLattice *lattice) {
        return lattice ? lattice->to_string() : "dead";
    }

    bool magic_tainted(SgNode *node, FiniteVarsExprsProductLattice *prodLat);
};
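
/* Illustrative sketch, not part of ROSE. The general idea behind a taint transfer function can be shown on a toy
   statement model: an assignment gives its destination the highest taint level found among its sources, and the
   function reports whether the state changed. DemoVertex, DemoState, DemoAssign, lookup(), and demo_transfer() are
   hypothetical names; this does not use the genericDataflow types, and it is only an assumption that the transfer()
   declared above reports changes the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical taint levels ordered DEMO_BOTTOM < DEMO_UNTAINTED < DEMO_TAINTED.
enum DemoVertex { DEMO_BOTTOM, DEMO_UNTAINTED, DEMO_TAINTED };

// Hypothetical analysis state: one taint level per variable name.
typedef std::map<std::string, DemoVertex> DemoState;

// Hypothetical statement: dst is assigned some expression over the variables in srcs.
struct DemoAssign {
    std::string dst;
    std::vector<std::string> srcs;
};

// Look up a variable, treating variables with no entry as DEMO_BOTTOM.
DemoVertex lookup(const DemoState &state, const std::string &name) {
    DemoState::const_iterator it = state.find(name);
    return it == state.end() ? DEMO_BOTTOM : it->second;
}

// Transfer function for one assignment. An expression with no variable sources (a constant)
// is untainted; otherwise the result is the highest level among the sources.
// Returns true if the destination's level changed.
bool demo_transfer(const DemoAssign &stmt, DemoState &state) {
    DemoVertex result = stmt.srcs.empty() ? DEMO_UNTAINTED : DEMO_BOTTOM;
    for (std::size_t i = 0; i < stmt.srcs.size(); ++i) {
        DemoVertex v = lookup(state, stmt.srcs[i]);
        if (v > result)
            result = v;
    }
    bool changed = (lookup(state, stmt.dst) != result);
    state[stmt.dst] = result;
    return changed;
}
```

   Applying demo_transfer() to the same statement a second time with an unchanged state returns false, which is the
   signal an iterative driver would use to stop revisiting that statement.
*/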

#endif