Criteria to decide whether a region of memory contains code.

#include <Partitioner.h>

Classes
struct	Criterion

struct	DictionaryEntry

Public Member Functions
	CodeCriteria ()

	CodeCriteria (const RegionStats *mean, const RegionStats *variance, double threshold)

virtual	~CodeCriteria ()

virtual CodeCriteria *	create () const

virtual double	get_mean (size_t id) const

virtual void	set_mean (size_t id, double mean)

virtual double	get_variance (size_t id) const

virtual void	set_variance (size_t id, double variance)

virtual double	get_weight (size_t id) const

virtual void	set_weight (size_t id, double weight)

void	set_value (size_t id, double mean, double variance, double weight)

double	get_threshold () const

void	set_threshold (double th)

virtual double	get_vote (const RegionStats *, std::vector< double > *votes=NULL) const

virtual bool	satisfied_by (const RegionStats *, double *raw_vote_ptr=NULL, std::ostream *debug=NULL) const

virtual void	print (std::ostream &, const RegionStats *stats=NULL, const std::vector< double > *votes=NULL, const double *total_vote=NULL) const

Static Public Member Functions
static size_t	define_criterion (const std::string &name, const std::string &desc, size_t id=(size_t)(-1))

static size_t	find_criterion (const std::string &name)

static size_t	get_ncriteria ()

static const std::string &	get_name (size_t id)

static const std::string &	get_desc (size_t id)

Protected Member Functions
virtual void	init (const RegionStats *mean, const RegionStats *variance, double threshold)

Static Protected Member Functions
static void	init_class ()

Private Attributes
std::vector< Criterion >	criteria

double	threshold

Static Private Attributes
static std::vector < DictionaryEntry >	dictionary

Friends
std::ostream &	operator<< (std::ostream &, const CodeCriteria &)

Detailed Description

Criteria to decide whether a region of memory contains code.

One often needs to decide whether an arbitrary region of memory contains code or data. A CodeCriteria object helps answer that question. Such an object holds criteria for multiple analyses. The criteria can be initialized by hand or by running the analyses over parts of the program already known to be code (see Partitioner::aggregate_statistics()). In the latter case, the criteria are automatically fine-tuned to the characteristics of the specimen executable itself.
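As a rough illustration of how a criterion's mean and variance could be derived from regions already known to be code, the sketch below summarizes one analysis result measured over several known functions. The `MeanVariance` struct and `summarize()` function are hypothetical stand-ins, not part of Partitioner.h, and the use of the population variance (dividing by N rather than N-1) is an assumption; the source does not say which estimator Partitioner::aggregate_statistics() uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the aggregate of one statistic; not a ROSE type.
struct MeanVariance {
    double mean;
    double variance;
};

// Summarize one analysis result measured over regions known to be code.
// Assumption: population variance (divide by N).
MeanVariance summarize(const std::vector<double> &samples) {
    assert(!samples.empty());
    const double n = static_cast<double>(samples.size());

    double sum = 0.0;
    for (double s : samples)
        sum += s;
    const double mean = sum / n;

    double squared_deviation = 0.0;
    for (double s : samples)
        squared_deviation += (s - mean) * (s - mean);

    return MeanVariance{mean, squared_deviation / n};
}
```

The resulting pair plays the role of the per-criterion mean and variance that a CodeCriteria object stores.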

Each criterion is assumed to have a Gaussian distribution (this class can be specialized if something else is needed) and therefore stores a mean and variance. Each criterion also stores a weight relative to the other criteria.

To determine the probability that a sample contains code, the analyses $A_i$ are run over the sample to produce a set of analysis results $R_i$. Each analysis result is compared against the corresponding probability density function $f_i(x)$ to obtain the likelihood (in the range zero to one) that the sample is code. The probability density function is characterized by the criterion's mean $\mu_i$ and variance $\sigma_i^2$. The Gaussian probability density function is:

$f_i(x) = \frac{1}{\sqrt{2 \pi \sigma_i^2}} e^{-\frac{(x - \mu_i)^2}{2\sigma_i^2}}$

The likelihood $C_i(R_i)$ that $R_i$ is representative of valid code is computed as the area under the probability density curve that lies further from the mean than $R_i$, counting both tails. In other words:

$C_i(x) = 2 \int_{-\infty}^{\mu_i-|\mu_i-x|} f_i(t)\, dt = 1 - {\rm erf}\left(\frac{|x-\mu_i|}{\sqrt{2\sigma_i^2}}\right)$

A criterion that has an undefined $R_i$ value does not contribute to the final vote. Similarly, criteria that have zero variance contribute a vote of zero or one:

$C_i(x) = \left\{ \begin{array}{cl} 1 & \quad \mbox{if } x = \mu_i \\ 0 & \quad \mbox{if } x \ne \mu_i \end{array} \right.$
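The per-criterion vote described above can be sketched as a small free function. This is an illustration, not the body of CodeCriteria::get_vote(). It uses the identity that the two-tailed area beyond $x$ equals ${\rm erfc}(|x-\mu_i|/\sqrt{2\sigma_i^2})$, where ${\rm erfc}(z) = 1 - {\rm erf}(z)$. The name `criterion_vote` is hypothetical, and representing an undefined $R_i$ as NaN is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Vote in [0, 1] that analysis result x is typical of code, for a Gaussian
// criterion with the given mean and variance.
// Assumption: an undefined result is passed as NaN and yields NaN, so the
// caller can leave that criterion out of the final vote.
double criterion_vote(double x, double mean, double variance) {
    assert(variance >= 0.0);
    if (std::isnan(x))
        return x;                       // undefined result: no vote
    if (variance == 0.0)
        return x == mean ? 1.0 : 0.0;   // degenerate case: all-or-nothing
    // Two-tailed area further from the mean than x.
    return std::erfc(std::fabs(x - mean) / std::sqrt(2.0 * variance));
}
```

A result equal to the mean votes 1, and the vote falls toward 0 as the result moves away from the mean; one standard deviation away gives roughly 0.317.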

The individual probabilities from each analysis are weighted relative to one another to obtain a final probability, which is then compared against a threshold. If the probability is equal to or greater than the threshold, then the sample is considered to be code.
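One way to picture the weighting and threshold test is a weighted average over the criteria that actually produced a vote. The source does not spell out the exact combination rule, so the normalization by the total weight of contributing criteria is an assumption, as are the names `combine_votes` and `looks_like_code` and the use of NaN to mark a criterion that did not vote.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted average of per-criterion votes. NaN votes (undefined results)
// are skipped together with their weights.
// Returns NaN when no criterion contributes.
double combine_votes(const std::vector<double> &votes,
                     const std::vector<double> &weights) {
    assert(votes.size() == weights.size());
    double weighted_sum = 0.0;
    double total_weight = 0.0;
    for (std::size_t i = 0; i < votes.size(); ++i) {
        if (std::isnan(votes[i]))
            continue;
        weighted_sum += weights[i] * votes[i];
        total_weight += weights[i];
    }
    return total_weight > 0.0 ? weighted_sum / total_weight : std::nan("");
}

// The sample counts as code when the final vote reaches the threshold.
// A NaN vote compares false, so "no information" is treated as "not code".
bool looks_like_code(double final_vote, double threshold) {
    return final_vote >= threshold;
}
```

With votes of 1.0 and 0.5 at weights 1 and 3, plus a third criterion that did not vote, the final vote is (1.0 + 1.5) / 4 = 0.625.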

The Partitioner never instantiates a CodeCriteria object directly, but rather always uses the new_code_criteria() virtual method. This allows the user to easily augment this class to do something more interesting.
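The design choice here is the factory-method pattern: because construction goes through one virtual method, a subclass can substitute its own criteria type without touching the code that uses it. The sketch below shows the pattern with stand-in classes only. `Criteria`, `Engine`, `StrictCriteria`, and `StrictEngine` are hypothetical and much simpler than the real Partitioner and CodeCriteria, and the sketch returns `std::unique_ptr` where the real new_code_criteria() hands back a raw pointer that the caller deletes.

```cpp
#include <memory>

// Stand-ins for illustration only; these are not the ROSE declarations.
struct Criteria {
    virtual ~Criteria() {}
    virtual bool satisfied_by(double vote, double threshold) const {
        return vote >= threshold;
    }
};

struct Engine {
    virtual ~Engine() {}
    // Factory method: the only place a Criteria object is constructed.
    virtual std::unique_ptr<Criteria> new_criteria() const {
        return std::unique_ptr<Criteria>(new Criteria);
    }
    // Client code never names a concrete Criteria type.
    bool is_code(double vote, double threshold) const {
        return new_criteria()->satisfied_by(vote, threshold);
    }
};

// A user extension: require the vote to strictly exceed the threshold.
struct StrictCriteria : Criteria {
    bool satisfied_by(double vote, double threshold) const override {
        return vote > threshold;
    }
};

// Overriding the factory method is enough to change what is_code() uses.
struct StrictEngine : Engine {
    std::unique_ptr<Criteria> new_criteria() const override {
        return std::unique_ptr<Criteria>(new StrictCriteria);
    }
};
```

Calling `is_code(0.5, 0.5)` on an `Engine` accepts the sample, while the same call on a `StrictEngine` rejects it, even though `is_code()` itself is unchanged.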

Here's an example of using this class to determine if some uncategorized region of memory contains code. First we compute aggregate statistics across all the known functions. Then we use the mean and variance in those statistics to create a code criteria specification. Then we run the same analyses over the uncategorized region of memory and ask whether the results satisfy the criteria. This example is essentially the implementation of Partitioner::is_code().

// Compute aggregate statistics over all known functions (cached after the first call).
partitioner->aggregate_statistics();
Partitioner::RegionStats *mean = partitioner->get_aggregate_mean();
Partitioner::RegionStats *variance = partitioner->get_aggregate_variance();

// Build the criteria from those statistics.
Partitioner::CodeCriteria *cc = partitioner->new_code_criteria(mean, variance);

// Run the same analyses over the region in question and test the result.
ExtentMap uncategorized_region = ....;
Partitioner::RegionStats *stats = partitioner->region_statistics(uncategorized_region);
if (cc->satisfied_by(stats))
    std::cout << "this looks like code" << std::endl;

// Both objects were allocated for this query, so release them here.
delete stats;
delete cc;

Definition at line 841 of file Partitioner.h.

Constructor & Destructor Documentation

Partitioner::CodeCriteria::CodeCriteria ( )

inline

Definition at line 862 of file Partitioner.h.

References init_class().

Partitioner::CodeCriteria::CodeCriteria	(	const RegionStats *	mean,
		const RegionStats *	variance,
		double	threshold
	)

inline

Definition at line 863 of file Partitioner.h.

References init(), and init_class().

virtual Partitioner::CodeCriteria::~CodeCriteria ( )

inline virtual

Definition at line 867 of file Partitioner.h.

Member Function Documentation

virtual CodeCriteria* Partitioner::CodeCriteria::create ( ) const

virtual

static size_t Partitioner::CodeCriteria::define_criterion	(	const std::string &	name,
		const std::string &	desc,
		size_t	id = `(size_t)(-1)`
	)

static

static size_t Partitioner::CodeCriteria::find_criterion ( const std::string & name)

static

static size_t Partitioner::CodeCriteria::get_ncriteria ( )

static

static const std::string& Partitioner::CodeCriteria::get_name ( size_t id)

static

static const std::string& Partitioner::CodeCriteria::get_desc ( size_t id)

static

virtual double Partitioner::CodeCriteria::get_mean ( size_t id) const

virtual

virtual void Partitioner::CodeCriteria::set_mean	(	size_t	id,
		double	mean
	)

virtual

Referenced by set_value().

virtual double Partitioner::CodeCriteria::get_variance ( size_t id) const

virtual

virtual void Partitioner::CodeCriteria::set_variance	(	size_t	id,
		double	variance
	)

virtual

Referenced by set_value().

virtual double Partitioner::CodeCriteria::get_weight ( size_t id) const

virtual

virtual void Partitioner::CodeCriteria::set_weight	(	size_t	id,
		double	weight
	)

virtual

Referenced by set_value().

void Partitioner::CodeCriteria::set_value	(	size_t	id,
		double	mean,
		double	variance,
		double	weight
	)

inline

Definition at line 882 of file Partitioner.h.

References set_mean(), set_variance(), and set_weight().

double Partitioner::CodeCriteria::get_threshold ( ) const

inline

Definition at line 888 of file Partitioner.h.

References threshold.

void Partitioner::CodeCriteria::set_threshold ( double th)

inline

Definition at line 889 of file Partitioner.h.

References threshold.

virtual double Partitioner::CodeCriteria::get_vote	(	const RegionStats *	,
		std::vector< double > *	votes = `NULL`
	)		const

virtual

virtual bool Partitioner::CodeCriteria::satisfied_by	(	const RegionStats *	,
		double *	raw_vote_ptr = `NULL`,
		std::ostream *	debug = `NULL`
	)		const

virtual

virtual void Partitioner::CodeCriteria::print	(	std::ostream &	,
		const RegionStats *	stats = `NULL`,
		const std::vector< double > *	votes = `NULL`,
		const double *	total_vote = `NULL`
	)		const

virtual

static void Partitioner::CodeCriteria::init_class ( )

static protected

Referenced by CodeCriteria().

virtual void Partitioner::CodeCriteria::init	(	const RegionStats *	mean,
		const RegionStats *	variance,
		double	threshold
	)

protected virtual

Referenced by CodeCriteria().

Friends And Related Function Documentation

std::ostream& operator<<	(	std::ostream &	,
		const CodeCriteria &
	)

friend

Member Data Documentation

std::vector<DictionaryEntry> Partitioner::CodeCriteria::dictionary

static private

Definition at line 857 of file Partitioner.h.

std::vector<Criterion> Partitioner::CodeCriteria::criteria

private

Definition at line 858 of file Partitioner.h.

double Partitioner::CodeCriteria::threshold

private

Definition at line 859 of file Partitioner.h.

Referenced by get_threshold(), and set_threshold().

The documentation for this class was generated from the following file:

Partitioner.h
