ROSE  0.9.6a
Partitioner::CodeCriteria Class Reference

Criteria to decide whether a region of memory contains code.

#include <Partitioner.h>

Classes

struct  Criterion
 
struct  DictionaryEntry
 

Public Member Functions

 CodeCriteria ()
 
 CodeCriteria (const RegionStats *mean, const RegionStats *variance, double threshold)
 
virtual ~CodeCriteria ()
 
virtual CodeCriteria * create () const
 
virtual double get_mean (size_t id) const
 
virtual void set_mean (size_t id, double mean)
 
virtual double get_variance (size_t id) const
 
virtual void set_variance (size_t id, double variance)
 
virtual double get_weight (size_t id) const
 
virtual void set_weight (size_t id, double weight)
 
void set_value (size_t id, double mean, double variance, double weight)
 
double get_threshold () const
 
void set_threshold (double th)
 
virtual double get_vote (const RegionStats *, std::vector< double > *votes=NULL) const
 
virtual bool satisfied_by (const RegionStats *, double *raw_vote_ptr=NULL, std::ostream *debug=NULL) const
 
virtual void print (std::ostream &, const RegionStats *stats=NULL, const std::vector< double > *votes=NULL, const double *total_vote=NULL) const
 

Static Public Member Functions

static size_t define_criterion (const std::string &name, const std::string &desc, size_t id=(size_t)(-1))
 
static size_t find_criterion (const std::string &name)
 
static size_t get_ncriteria ()
 
static const std::string & get_name (size_t id)
 
static const std::string & get_desc (size_t id)
 

Protected Member Functions

virtual void init (const RegionStats *mean, const RegionStats *variance, double threshold)
 

Static Protected Member Functions

static void init_class ()
 

Private Attributes

std::vector< Criterion > criteria
 
double threshold
 

Static Private Attributes

static std::vector< DictionaryEntry > dictionary
 

Friends

std::ostream & operator<< (std::ostream &, const CodeCriteria &)
 

Detailed Description

Criteria to decide whether a region of memory contains code.

One often needs to decide whether an arbitrary region of memory contains code or data. A CodeCriteria object helps answer that question. Such an object contains criteria for multiple analyses. The criteria can be initialized by hand, or by running the analyses over parts of the program that are already known to be code (see Partitioner::aggregate_statistics()). In the latter case, the criteria are automatically fine-tuned based on characteristics of the specimen executable itself.

Each criterion is assumed to have a Gaussian distribution (this class can be specialized if something else is needed) and therefore stores a mean and variance. Each criterion also stores a weight relative to the other criteria.

To determine the probability that a sample contains code, the analyses, $A_i$, are run over the sample to produce a set of analysis results, $R_i$. Each analysis result is compared against the corresponding probability density function, $f_i(x)$, to obtain the likelihood (in the range zero to one) that the sample is code. The probability density function is characterized by the criterion mean, $\mu_i$, and variance, $\sigma_i^2$. The Gaussian probability density function is:

\[ f_i(x) = \frac{1}{\sqrt{2 \pi \sigma_i^2}} e^{-\frac{(x - \mu_i)^2}{2\sigma_i^2}} \]
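
As a minimal sketch, the density above can be evaluated directly. The function name `gaussian_pdf` is hypothetical and is not part of the CodeCriteria interface; the variance is assumed to be strictly positive.

```cpp
#include <cmath>

// Gaussian probability density with mean `mu` and variance `var`.
// Hypothetical helper for illustration; requires var > 0.
double gaussian_pdf(double x, double mu, double var) {
    const double pi = std::acos(-1.0);
    const double d = x - mu;
    return std::exp(-(d * d) / (2.0 * var)) / std::sqrt(2.0 * pi * var);
}
```

The density peaks at the mean and is symmetric about it, which is why the likelihood described next depends only on the distance of a result from the mean.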

The likelihood, $C_i(R_i)$, that $R_i$ is representative of valid code is computed as the area under the probability density curve that lies farther from the mean than $R_i$ does, counting both tails. In other words:

\[ C_i(x) = 2 \int_{-\infty}^{\mu_i-|\mu_i-x|} f_i(t)\, dt = 1 - {\rm erf}\left(\frac{|x-\mu_i|}{\sqrt{2\sigma_i^2}}\right) \]
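
The two-tailed area can be computed with the complementary error function, since $1 - {\rm erf}(z) = {\rm erfc}(z)$. The sketch below assumes a strictly positive variance, and the name `code_likelihood` is hypothetical rather than something provided by this class.

```cpp
#include <cmath>

// Two-tailed Gaussian area lying farther from `mu` than `x` does.
// Hypothetical helper for illustration; requires var > 0.
double code_likelihood(double x, double mu, double var) {
    return std::erfc(std::fabs(x - mu) / std::sqrt(2.0 * var));
}
```

A result equal to the mean yields a likelihood of one, and a result one standard deviation away yields roughly 0.317, matching the familiar rule that about 68% of a Gaussian lies within one standard deviation of the mean.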

A criterion that has an undefined $R_i$ value does not contribute to the final vote. Similarly, criteria that have zero variance contribute a vote of zero or one:

\[ C_i(x) = \left\{ \begin{array}{cl} 1 & \quad \mbox{if } x = \mu_i \\ 0 & \quad \mbox{if } x \ne \mu_i \end{array} \right. \]
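
The special cases can be folded into a single per-criterion function. This sketch assumes that an undefined analysis result is represented as NaN and that a non-contributing vote is also reported as NaN; both conventions, and the name `criterion_vote`, are assumptions made for illustration.

```cpp
#include <cmath>

// Vote of one criterion for result `x`, given mean `mu` and variance `var`.
// Assumptions: an undefined result is NaN, and a NaN return value means
// "this criterion does not contribute". Requires var >= 0.
double criterion_vote(double x, double mu, double var) {
    if (std::isnan(x))
        return x;                        // undefined result: no contribution
    if (var == 0.0)
        return x == mu ? 1.0 : 0.0;      // degenerate distribution: all or nothing
    return std::erfc(std::fabs(x - mu) / std::sqrt(2.0 * var));
}
```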

The individual probabilities from each analysis are weighted relative to one another to obtain a final probability, which is then compared against a threshold. If the probability is equal to or greater than the threshold, then the sample is considered to be code.
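
One plausible reading of "weighted relative to one another" is a weighted average of the contributing votes. The sketch below assumes that interpretation, again uses NaN to mark criteria that do not contribute, and uses hypothetical names (`combine_votes`, `meets_threshold`); it is not taken from the implementation of get_vote() or satisfied_by().

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted average of per-criterion votes. Votes that are NaN are skipped.
// Returns NaN when no criterion contributes. Hypothetical helper.
double combine_votes(const std::vector<double>& votes,
                     const std::vector<double>& weights) {
    double sum = 0.0, total_weight = 0.0;
    const std::size_t n = std::min(votes.size(), weights.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (std::isnan(votes[i]))
            continue;                    // undefined result: no contribution
        sum += weights[i] * votes[i];
        total_weight += weights[i];
    }
    return total_weight > 0.0 ? sum / total_weight : std::nan("");
}

// A sample counts as code when the combined vote reaches the threshold.
bool meets_threshold(const std::vector<double>& votes,
                     const std::vector<double>& weights,
                     double threshold) {
    const double vote = combine_votes(votes, weights);
    return !std::isnan(vote) && vote >= threshold;
}
```

With votes of 1 and 0 and weights of 1 and 3, the combined vote is 0.25, so the sample is accepted at a threshold of 0.25 but rejected at any higher threshold.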

The Partitioner never instantiates a CodeCriteria object directly, but rather always uses the new_code_criteria() virtual method. This allows the user to easily augment this class to do something more interesting.
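
The factory-method arrangement described above can be illustrated with simplified stand-in classes. Every type in this sketch (`CriteriaSketch`, `StrictCriteriaSketch`, `PartitionerSketch`, `MyPartitionerSketch`) is hypothetical, and the argument list of the stand-in `new_code_criteria()` is deliberately reduced to nothing; only the pattern is meant to carry over.

```cpp
#include <memory>
#include <string>

// Stand-in for the criteria base class.
struct CriteriaSketch {
    virtual ~CriteriaSketch() {}
    virtual std::string kind() const { return "default"; }
};

// A user-defined specialization of the criteria.
struct StrictCriteriaSketch : CriteriaSketch {
    std::string kind() const override { return "strict"; }
};

// Stand-in for the partitioner: it only ever obtains criteria through
// the virtual factory method, never by naming a concrete class.
struct PartitionerSketch {
    virtual ~PartitionerSketch() {}
    virtual std::unique_ptr<CriteriaSketch> new_code_criteria() const {
        return std::unique_ptr<CriteriaSketch>(new CriteriaSketch);
    }
    std::string criteria_kind() const { return new_code_criteria()->kind(); }
};

// Overriding the factory swaps in the specialized criteria everywhere.
struct MyPartitionerSketch : PartitionerSketch {
    std::unique_ptr<CriteriaSketch> new_code_criteria() const override {
        return std::unique_ptr<CriteriaSketch>(new StrictCriteriaSketch);
    }
};
```

Because the base class calls the factory through a virtual method, existing code paths pick up the specialized criteria without modification.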

Here's an example of using this class to determine if some uncategorized region of memory contains code. First we compute aggregate statistics across all the known functions. Then we use the mean and variance in those statistics to create a code criteria specification. Then we run the same analyses over the uncategorized region of memory and ask whether the results satisfy the criteria. This example is essentially the implementation of Partitioner::is_code().

partitioner->aggregate_statistics(); // compute stats if not already cached
Partitioner::RegionStats *mean = partitioner->get_aggregate_mean();
Partitioner::RegionStats *variance = partitioner->get_aggregate_variance();
Partitioner::CodeCriteria *cc = partitioner->new_code_criteria(mean, variance);
ExtentMap uncategorized_region = ....;
Partitioner::RegionStats *stats = region_statistics(uncategorized_region);
if (cc->satisfied_by(stats))
    std::cout <<"this looks like code" <<std::endl;
delete stats;
delete cc;

Definition at line 841 of file Partitioner.h.

Constructor & Destructor Documentation

Partitioner::CodeCriteria::CodeCriteria ( )
inline

Definition at line 862 of file Partitioner.h.

References init_class().

Partitioner::CodeCriteria::CodeCriteria ( const RegionStats *  mean,
const RegionStats *  variance,
double  threshold 
)
inline

Definition at line 863 of file Partitioner.h.

References init(), and init_class().

virtual Partitioner::CodeCriteria::~CodeCriteria ( )
inline virtual

Definition at line 867 of file Partitioner.h.

Member Function Documentation

virtual CodeCriteria* Partitioner::CodeCriteria::create ( ) const
virtual

static size_t Partitioner::CodeCriteria::define_criterion ( const std::string &  name,
const std::string &  desc,
size_t  id = (size_t)(-1) 
)
static

static size_t Partitioner::CodeCriteria::find_criterion ( const std::string &  name)
static

static size_t Partitioner::CodeCriteria::get_ncriteria ( )
static

static const std::string& Partitioner::CodeCriteria::get_name ( size_t  id)
static

static const std::string& Partitioner::CodeCriteria::get_desc ( size_t  id)
static

virtual double Partitioner::CodeCriteria::get_mean ( size_t  id) const
virtual

virtual void Partitioner::CodeCriteria::set_mean ( size_t  id,
double  mean 
)
virtual

Referenced by set_value().

virtual double Partitioner::CodeCriteria::get_variance ( size_t  id) const
virtual

virtual void Partitioner::CodeCriteria::set_variance ( size_t  id,
double  variance 
)
virtual

Referenced by set_value().

virtual double Partitioner::CodeCriteria::get_weight ( size_t  id) const
virtual

virtual void Partitioner::CodeCriteria::set_weight ( size_t  id,
double  weight 
)
virtual

Referenced by set_value().

void Partitioner::CodeCriteria::set_value ( size_t  id,
double  mean,
double  variance,
double  weight 
)
inline

Definition at line 882 of file Partitioner.h.

References set_mean(), set_variance(), and set_weight().

double Partitioner::CodeCriteria::get_threshold ( ) const
inline

Definition at line 888 of file Partitioner.h.

References threshold.

void Partitioner::CodeCriteria::set_threshold ( double  th)
inline

Definition at line 889 of file Partitioner.h.

References threshold.

virtual double Partitioner::CodeCriteria::get_vote ( const RegionStats *  ,
std::vector< double > *  votes = NULL 
) const
virtual

virtual bool Partitioner::CodeCriteria::satisfied_by ( const RegionStats *  ,
double *  raw_vote_ptr = NULL,
std::ostream *  debug = NULL 
) const
virtual

virtual void Partitioner::CodeCriteria::print ( std::ostream &  ,
const RegionStats *  stats = NULL,
const std::vector< double > *  votes = NULL,
const double *  total_vote = NULL 
) const
virtual

static void Partitioner::CodeCriteria::init_class ( )
static protected

Referenced by CodeCriteria().

virtual void Partitioner::CodeCriteria::init ( const RegionStats *  mean,
const RegionStats *  variance,
double  threshold 
)
protected virtual

Referenced by CodeCriteria().

Friends And Related Function Documentation

std::ostream& operator<< ( std::ostream &  ,
const CodeCriteria &  
)
friend

Member Data Documentation

std::vector<DictionaryEntry> Partitioner::CodeCriteria::dictionary
static private

Definition at line 857 of file Partitioner.h.

std::vector<Criterion> Partitioner::CodeCriteria::criteria
private

Definition at line 858 of file Partitioner.h.

double Partitioner::CodeCriteria::threshold
private

Definition at line 859 of file Partitioner.h.

Referenced by get_threshold(), and set_threshold().


The documentation for this class was generated from the following file: