ROSE  0.9.6a
Partitioner Class Reference

Partitions instructions into basic blocks and functions. More...

#include <Partitioner.h>


Classes

struct  AbandonFunctionDiscovery
 Exception thrown to defer function block discovery. More...
 
struct  BasicBlock
 Represents a basic block within the Partitioner. More...
 
class  BlockAnalysisCache
 Analysis that can be cached in a block. More...
 
struct  BlockConfig
 Basic block configuration information. More...
 
class  ByteRangeCallback
 Base class for byte scanning callbacks. More...
 
class  CodeCriteria
 Criteria to decide whether a region of memory contains code. More...
 
struct  DataBlock
 Represents a region of static data within the address space being disassembled. More...
 
class  DataRangeMapValue
 Value type for DataRangeMap. More...
 
struct  Exception
 
struct  FindData
 Callback to add unassigned addresses to a function. More...
 
struct  FindDataPadding
 Callback to detect padding. More...
 
struct  FindFunctionFragments
 Callback to insert unreachable code fragments. More...
 
struct  FindInsnPadding
 Callback to create inter-function instruction padding. More...
 
struct  FindInterPadFunctions
 Callback to find functions that are between padding. More...
 
struct  FindPostFunctionInsns
 Callback to add post-function instructions to the preceding function. More...
 
struct  FindThunks
 Callback to find thunks. More...
 
struct  FindThunkTables
 Callback to find thunk tables. More...
 
class  Function
 Represents a function within the Partitioner. More...
 
class  FunctionRangeMapValue
 Value type for FunctionRangeMap. More...
 
struct  FunctionStart
 Information about each function starting address. More...
 
class  InsnRangeCallback
 Base class for instruction scanning callbacks. More...
 
class  Instruction
 Holds an instruction along with some other information about the instruction. More...
 
class  IPDParser
 This is the parser for the instruction partitioning data (IPD) files. More...
 
class  RegionStats
 Statistics computed over a region of an address space. More...
 

Public Types

typedef std::map< rose_addr_t,
Disassembler::AddressSet > 
BasicBlockStarts
 Map of basic block starting addresses. More...
 
typedef std::map< rose_addr_t,
FunctionStart > 
FunctionStarts
 Map describing the starting address of each known function. More...
 
typedef RangeMap< Extent,
FunctionRangeMapValue > 
FunctionRangeMap
 Range map associating addresses with functions. More...
 
typedef RangeMap< Extent,
DataRangeMapValue > 
DataRangeMap
 Range map associating addresses with functions. More...
 
typedef ROSE_Callbacks::List
< InsnRangeCallback > 
InsnRangeCallbacks
 
typedef ROSE_Callbacks::List
< ByteRangeCallback > 
ByteRangeCallbacks
 

Public Member Functions

BasicBlockStarts detectBasicBlocks (const Disassembler::InstructionMap &) const __attribute__((deprecated))
 Find the beginnings of basic blocks based on instruction type and call targets. More...
 
FunctionStarts detectFunctions (SgAsmInterpretation *, const Disassembler::InstructionMap &insns, BasicBlockStarts &bb_starts) const __attribute__((deprecated))
 Returns a list of the currently defined functions. More...
 
 Partitioner ()
 
virtual ~Partitioner ()
 
virtual void set_search (unsigned heuristics)
 Sets the set of heuristics used by the partitioner. More...
 
virtual unsigned get_search () const
 Returns a bit mask of SgAsmFunction::FunctionReason bits indicating which heuristics would be used by the partitioner. More...
 
void set_allow_discontiguous_blocks (bool b)
 Turns on/off the allowing of discontiguous basic blocks. More...
 
bool get_allow_discontiguous_blocks () const
 Returns an indication of whether discontiguous blocks are allowed. More...
 
void set_debug (FILE *f)
 Sends diagnostics to the specified output stream. More...
 
FILE * get_debug () const
 Returns the file currently used for debugging; null implies no debugging. More...
 
void set_progress_reporting (FILE *, unsigned min_interval)
 Set progress reporting properties. More...
 
void add_function_detector (FunctionDetector f)
 Adds a user-defined function detector to this partitioner. More...
 
virtual SgAsmBlock * partition (SgAsmInterpretation *, const Disassembler::InstructionMap &, MemoryMap *mmap=NULL)
 Top-level function to run the partitioner on some instructions and build an AST. More...
 
virtual SgAsmBlock * partition (SgAsmInterpretation *, Disassembler *, MemoryMap *)
 Top-level function to run the partitioner, calling the specified disassembler as necessary to generate instructions. More...
 
virtual void clear ()
 Reset partitioner to initial conditions by discarding all instructions, basic blocks, functions, and configuration file settings and definitions. More...
 
virtual void load_config (const std::string &filename)
 Loads the specified configuration file. More...
 
virtual void add_instructions (const Disassembler::InstructionMap &insns)
 Adds additional instructions to be processed. More...
 
Disassembler::InstructionMap get_instructions () const
 Get the list of all instructions. More...
 
const Disassembler::BadMap & get_disassembler_errors () const
 Get the list of disassembler errors. More...
 
void clear_disassembler_errors ()
 Clears errors from the disassembler. More...
 
virtual Instruction * find_instruction (rose_addr_t, bool create=true)
 Finds an instruction at the specified address. More...
 
virtual Instruction * discard (Instruction *, bool discard_entire_block=false)
 Drop an instruction from consideration. More...
 
virtual BasicBlock * discard (BasicBlock *)
 Drop a basic block from the partitioner. More...
 
virtual Function * add_function (rose_addr_t entry_va, unsigned reasons, std::string name="")
 Adds a new function definition to the partitioner. More...
 
virtual Function * find_function (rose_addr_t entry_va)
 Looks up a function by address. More...
 
virtual SgAsmBlock * build_ast (SgAsmInterpretation *interp=NULL)
 Builds the AST describing all the functions. More...
 
virtual void fixup_cfg_edges (SgNode *ast)
 Update control flow graph edge nodes. More...
 
virtual void fixup_pointers (SgNode *ast, SgAsmInterpretation *interp=NULL)
 Updates pointers inside instructions. More...
 
virtual RegionStats * new_region_stats ()
 Create a new region statistics object. More...
 
virtual RegionStats * aggregate_statistics (bool do_variance=true)
 Computes aggregate statistics over all known functions. More...
 
virtual void clear_aggregate_statistics ()
 Causes the partitioner to forget statistics. More...
 
virtual bool is_code (const ExtentMap &region, double *raw_vote_ptr=NULL, std::ostream *debug=NULL)
 Determines if a region contains code. More...
 
virtual void append (BasicBlock *, Instruction *)
 Add an instruction to a basic block. More...
 
virtual void append (BasicBlock *, DataBlock *, unsigned reasons)
 Associate a data block with a basic block. More...
 
virtual void append (Function *, BasicBlock *, unsigned reasons, bool keep=false)
 Append basic block to function. More...
 
virtual void append (Function *, DataBlock *, unsigned reasons, bool force=false)
 Append data region to function. More...
 
virtual void remove (Function *, BasicBlock *)
 Remove a basic block from a function. More...
 
virtual void remove (Function *, DataBlock *)
 Remove a data block from a function. More...
 
virtual void remove (BasicBlock *, DataBlock *)
 Remove a data block from a basic block. More...
 
virtual BasicBlock * find_bb_containing (rose_addr_t, bool create=true)
 Finds a basic block containing the specified instruction address. More...
 
virtual BasicBlock * find_bb_starting (rose_addr_t, bool create=true)
 Makes sure the block at the specified address exists. More...
 
virtual DataBlock * find_db_starting (rose_addr_t, size_t size)
 Finds (or creates) a data block. More...
 
virtual Disassembler::AddressSet successors (BasicBlock *, bool *complete=NULL)
 Returns known successors of a basic block. More...
 
virtual rose_addr_t call_target (BasicBlock *)
 Returns call target if block could be a function call. More...
 
virtual void truncate (BasicBlock *, rose_addr_t)
 Reduces the size of a basic block by truncating its list of instructions. More...
 
virtual void discover_first_block (Function *)
 Adds first basic block to empty function before we start discovering blocks of any other functions. More...
 
virtual void discover_blocks (Function *, unsigned reason)
 
virtual void discover_blocks (Function *, rose_addr_t, unsigned reason)
 Discover the basic blocks that belong to the current function. More...
 
virtual void pre_cfg (SgAsmInterpretation *interp=NULL)
 Detects functions before analyzing the CFG. More...
 
virtual void analyze_cfg (SgAsmBlock::Reason)
 Detect functions by analyzing the CFG. More...
 
virtual void post_cfg (SgAsmInterpretation *interp=NULL)
 Detects functions after analyzing the CFG. More...
 
virtual SgAsmFunction * build_ast (Function *)
 Build an AST for a single function. More...
 
virtual SgAsmBlock * build_ast (BasicBlock *)
 Build an AST for a single basic block. More...
 
virtual SgAsmBlock * build_ast (DataBlock *)
 Build an AST for a single data block. More...
 
virtual bool pops_return_address (rose_addr_t)
 Determines if a block pops the stack w/o returning. More...
 
virtual void update_analyses (BasicBlock *)
 Runs local block analyses if their cached results are invalid and caches the results. More...
 
virtual rose_addr_t canonic_block (rose_addr_t)
 Follow alias links in basic blocks. More...
 
virtual bool is_function_call (BasicBlock *, rose_addr_t *)
 Returns true if basic block appears to end with a function call. More...
 
virtual bool is_thunk (Function *)
 Determines if function is a thunk. More...
 
virtual Function * effective_function (DataBlock *)
 Returns the function to which this data block is effectively assigned. More...
 
virtual void mark_call_insns ()
 Naive marking of CALL instruction targets as functions. More...
 
virtual void mark_ipd_configuration ()
 Seeds partitioner with IPD configuration information. More...
 
virtual void mark_entry_targets (SgAsmGenericHeader *)
 Seeds functions for program entry points. More...
 
virtual void mark_export_entries (SgAsmGenericHeader *)
 Seeds functions for PE exports. More...
 
virtual void mark_eh_frames (SgAsmGenericHeader *)
 Seeds functions for error handling frames. More...
 
virtual void mark_elf_plt_entries (SgAsmGenericHeader *)
 Seeds functions that are dynamically linked via .plt. More...
 
virtual void mark_func_symbols (SgAsmGenericHeader *)
 Seeds functions that correspond to function symbols. More...
 
virtual void mark_func_patterns ()
 Seeds functions according to byte and instruction patterns. More...
 
virtual void name_plt_entries (SgAsmGenericHeader *)
 Gives names to dynamic linking trampolines for ELF. More...
 
virtual void name_import_entries (SgAsmGenericHeader *)
 Gives names to dynamic linking thunks for PE. More...
 
virtual void find_pe_iat_extents (SgAsmGenericHeader *)
 Find the addresses for all PE Import Address Tables. More...
 
virtual size_t function_extent (FunctionRangeMap *extents)
 Adds extents for all defined functions. More...
 
virtual size_t function_extent (Function *, FunctionRangeMap *extents=NULL, rose_addr_t *lo_addr=NULL, rose_addr_t *hi_addr=NULL)
 Returns information about the function addresses. More...
 
virtual size_t datablock_extent (DataBlock *, DataRangeMap *extents=NULL, rose_addr_t *lo_addr=NULL, rose_addr_t *hi_addr=NULL)
 Returns information about the datablock addresses. More...
 
virtual size_t datablock_extent (DataRangeMap *extent)
 Adds assigned datablocks to extent. More...
 
virtual size_t padding_extent (DataRangeMap *extent)
 Adds padding datablocks to extent. More...
 
virtual bool is_contiguous (Function *, bool strict=false)
 Returns an indication of whether a function is contiguous. More...
 
void progress (FILE *, const char *fmt,...) const __attribute__((format(printf
 Conditionally prints a progress report. More...
 
virtual size_t detach_thunks ()
 Splits thunks off of the start of functions. More...
 
virtual bool detach_thunk (Function *)
 Splits one thunk off the start of a function if possible. More...
 
void name_pe_dynlink_thunks (SgAsmInterpretation *interp)
 Gives names to PE dynamic linking thunks if possible. More...
 
virtual void adjust_padding ()
 Adjusts ownership of padding data blocks. More...
 
virtual void merge_function_fragments ()
 Merge function fragments. More...
 
virtual void merge_functions (Function *parent, Function *other)
 Merge two functions. More...
 
Disassembler::AddressSet discover_jump_table (BasicBlock *bb, bool do_create=true, ExtentMap *table_addresses=NULL)
 Looks for a jump table. More...
 
void set_map (MemoryMap *mmap, MemoryMap *ro_mmap=NULL)
 Accessors for the memory maps. More...
 
MemoryMap * get_map () const
 Accessors for the memory maps. More...
 
virtual CodeCriteria * new_code_criteria ()
 Create a new criteria object. More...
 
virtual CodeCriteria * new_code_criteria (const RegionStats *mean, const RegionStats *variance, double threshold)
 Create a new criteria object. More...
 
virtual RegionStats * region_statistics (const ExtentMap &)
 Computes various statistics over part of an address space. More...
 
virtual RegionStats * region_statistics (Function *)
 Computes various statistics over part of an address space. More...
 
virtual RegionStats * region_statistics ()
 Computes various statistics over part of an address space. More...
 
virtual RegionStats * get_aggregate_mean () const
 Accessors for cached aggregate statistics. More...
 
virtual RegionStats * get_aggregate_variance () const
 Accessors for cached aggregate statistics. More...
 
virtual size_t count_kinds (const InstructionMap &)
 Counts the number of distinct kinds of instructions. More...
 
virtual size_t count_kinds ()
 Counts the number of distinct kinds of instructions. More...
 
virtual size_t count_privileged (const InstructionMap &)
 Counts the number of privileged instructions. More...
 
virtual size_t count_privileged ()
 Counts the number of privileged instructions. More...
 
virtual double ratio_privileged ()
 Counts the number of privileged instructions. More...
 
virtual size_t count_floating_point (const InstructionMap &)
 Counts the number of floating point instructions. More...
 
virtual size_t count_floating_point ()
 Counts the number of floating point instructions. More...
 
virtual double ratio_floating_point ()
 Counts the number of floating point instructions. More...
 
virtual size_t count_registers (const InstructionMap &, double *mean=NULL, double *variance=NULL)
 Counts the number of register references. More...
 
virtual size_t count_registers (double *mean=NULL, double *variance=NULL)
 Counts the number of register references. More...
 
virtual double ratio_registers (double *mean=NULL, double *variance=NULL)
 Counts the number of register references. More...
 
virtual double count_size_variance (const InstructionMap &insns)
 Returns the variance of instruction bit widths. More...
 
virtual double count_size_variance ()
 Returns the variance of instruction bit widths. More...
 
virtual CodeCriteria * get_code_criteria () const
 Accessors for code criteria. More...
 
virtual void set_code_criteria (CodeCriteria *cc)
 Accessors for code criteria. More...
 
virtual void scan_contiguous_insns (InstructionMap insns, InsnRangeCallbacks &cblist, Instruction *insn_prev, Instruction *insn_end)
 Scans contiguous sequences of instructions. More...
 
void scan_contiguous_insns (const InstructionMap &insns, InsnRangeCallback *callback, Instruction *insn_prev, Instruction *insn_end)
 Scans contiguous sequences of instructions. More...
 
virtual void scan_unassigned_insns (InsnRangeCallbacks &callbacks)
 Scans ranges of unassigned instructions. More...
 
void scan_unassigned_insns (InsnRangeCallback *callback)
 Scans ranges of unassigned instructions. More...
 
virtual void scan_intrafunc_insns (InsnRangeCallbacks &callbacks)
 Scans the unassigned instructions within a function. More...
 
void scan_intrafunc_insns (InsnRangeCallback *callback)
 Scans the unassigned instructions within a function. More...
 
virtual void scan_interfunc_insns (InsnRangeCallbacks &callbacks)
 Scans the instructions between functions. More...
 
void scan_interfunc_insns (InsnRangeCallback *callback)
 Scans the instructions between functions. More...
 
virtual void scan_unassigned_bytes (ByteRangeCallbacks &callbacks, MemoryMap *restrict_map=NULL)
 Scans ranges of the address space that have not been assigned to any function. More...
 
void scan_unassigned_bytes (ByteRangeCallback *callback, MemoryMap *restrict_map=NULL)
 Scans ranges of the address space that have not been assigned to any function. More...
 
virtual void scan_intrafunc_bytes (ByteRangeCallbacks &callbacks, MemoryMap *restrict_map=NULL)
 Scans unassigned ranges of the address space within a function. More...
 
void scan_intrafunc_bytes (ByteRangeCallback *callback, MemoryMap *restrict_map=NULL)
 Scans unassigned ranges of the address space within a function. More...
 
virtual void scan_interfunc_bytes (ByteRangeCallbacks &callbacks, MemoryMap *restrict_map=NULL)
 Scans unassigned ranges of the address space between functions. More...
 
void scan_interfunc_bytes (ByteRangeCallback *callback, MemoryMap *restrict_map=NULL)
 Scans unassigned ranges of the address space between functions. More...
 
bool is_pe_dynlink_thunk (Instruction *)
 Returns true if the basic block is a PE dynamic linking thunk. More...
 
bool is_pe_dynlink_thunk (BasicBlock *)
 Returns true if the basic block is a PE dynamic linking thunk. More...
 
bool is_pe_dynlink_thunk (Function *)
 Returns true if the basic block is a PE dynamic linking thunk. More...
 

Static Public Member Functions

static unsigned parse_switches (const std::string &, unsigned initial_flags)
 Parses a string describing the heuristics and returns the bit vector that can be passed to set_search(). More...
 
static void disassembleInterpretation (SgAsmInterpretation *)
 Called by frontend() to disassemble an entire interpretation. More...
 
static rose_addr_t get_indirection_addr (SgAsmInstruction *, rose_addr_t offset)
 Return the virtual address that holds the branch target for an indirect branch. More...
 
static rose_addr_t value_of (SgAsmValueExpression *)
 Returns the integer value of a value expression since there's no virtual method for doing this. More...
 

Public Attributes

Disassembler * disassembler
 Optional disassembler to call when an instruction is needed. More...
 
InstructionMap insns
 Instruction cache, filled in by user or populated by disassembler. More...
 
MemoryMap * map
 Memory map used for disassembly if disassembler is present. More...
 
MemoryMap ro_map
 The read-only parts of 'map', used for insn semantics mem reads. More...
 
ExtentMap pe_iat_extents
 Virtual addresses for all PE Import Address Tables. More...
 
Disassembler::BadMap bad_insns
 Captured disassembler exceptions. More...
 
BasicBlocks basic_blocks
 All known basic blocks. More...
 
Functions functions
 All known functions, pending and complete. More...
 
DataBlocks data_blocks
 Blocks that point to static data. More...
 
unsigned func_heuristics
 Bit mask of SgAsmFunction::FunctionReason bits. More...
 
std::vector< FunctionDetector > user_detectors
 List of user-defined function detection methods. More...
 
FILE * debug
 Stream where diagnostics are sent (or null). More...
 
bool allow_discont_blocks
 Allow basic blocks to be discontiguous in virtual memory. More...
 
BlockConfigMap block_config
 IPD configuration info for basic blocks. More...
 

Static Public Attributes

static time_t progress_interval = 10
 Minimum interval between progress reports. More...
 
static time_t progress_time = 0
 Time of last report, or zero if no report has been generated. More...
 
static FILE * progress_file = stderr
 File to which reports are made. More...
 
static const rose_addr_t NO_TARGET = (rose_addr_t)-1
 

Protected Types

typedef std::map< rose_addr_t,
Instruction * > 
InstructionMap
 
typedef std::vector
< Instruction * > 
InstructionVector
 
typedef std::map< rose_addr_t,
BasicBlock * > 
BasicBlocks
 
typedef std::map< rose_addr_t,
DataBlock * > 
DataBlocks
 
typedef std::map< rose_addr_t,
Function * > 
Functions
 
typedef void(* FunctionDetector )(Partitioner *, SgAsmGenericHeader *)
 Data type for user-defined function detectors. More...
 
typedef std::map< rose_addr_t,
BlockConfig * > 
BlockConfigMap
 

Static Protected Member Functions

static
InstructionMap::const_iterator 
pattern1 (const InstructionMap &insns, InstructionMap::const_iterator first, Disassembler::AddressSet &exclude)
 Looks for stack frame setup. More...
 
static SgAsmInstruction * isSgAsmInstruction (const Instruction *)
 Augments dynamic casts defined from ROSETTA. More...
 
static SgAsmInstruction * isSgAsmInstruction (SgNode *)
 Augments dynamic casts defined from ROSETTA. More...
 
static SgAsmx86Instruction * isSgAsmx86Instruction (const Instruction *)
 Augments dynamic casts defined from ROSETTA. More...
 
static SgAsmx86Instruction * isSgAsmx86Instruction (SgNode *)
 Augments dynamic casts defined from ROSETTA. More...
 

Protected Attributes

RegionStats * aggregate_mean
 Aggregate statistics returned by get_aggregate_mean(). More...
 
RegionStats * aggregate_variance
 Aggregate statistics returned by get_aggregate_variance(). More...
 
CodeCriteria * code_criteria
 Criteria used to determine if a region contains code or data. More...
 

Detailed Description

Partitions instructions into basic blocks and functions.

The Partitioner classes are responsible for assigning instructions to basic blocks, and basic blocks to functions. A "basic block" is a sequence of instructions where control flow enters at only the first instruction and exits at only the last instruction. The definition can be further restricted to include only those instructions that are contiguous and non-overlapping in virtual memory by setting the set_allow_discontiguous_blocks() property to false. Every instruction belongs to exactly one basic block. A "function" is a collection of basic blocks having a single entry point. Every basic block belongs to exactly one function. If ROSE cannot determine what function a block belongs to, it will be placed in a special function created for the purpose of holding such blocks.

A partitioner can operate in one of two modes: it can use a list of instructions that has been previously disassembled (a.k.a. "passive mode"), or it can drive a Disassembler to obtain instructions as necessary (a.k.a. "active mode"). Each mode has its benefits:

  • Active mode disassembles only what is actually necessary. Although the disassembler is recursive and can follow the flow of control, its analyses are not as thorough or robust as the partitioner. Therefore, in order to be sure that the partitioner has all the instructions it needs in this mode, the disassembler is usually run in a very aggressive mode. In the end, much of what was disassembled is then thrown away.
  • Passive mode allows the partitioner to search for instruction sequences in parts of the specimen address space that the partitioner otherwise would not have disassembled in active mode. Thus, passive mode can detect function entry addresses by searching for function prologues such as the common x86 pair "push ebp; mov ebp, esp".
  • Passive mode delegates all disassembling to some other software layer. This gives the user full control over the disassembly process before partitioning even starts.
  • Passive mode can be used to force the partitioner to omit certain address ranges or instruction sequences from consideration. For example, one could disassemble certain parts of a program while skipping over instructions that were inserted maliciously in order to thwart disassembly. On the other hand, a similar effect can be had by manipulating the MemoryMap used by an active partitioner. The case where this doesn't work is when two instruction streams overlap and we want to exclude one of them from consideration.
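
The difference between the two modes can be pictured with a small stand-alone sketch. It is a toy model, not ROSE code: ToyInsn, ToyDecoder, and ToyInsnCache are hypothetical names, and the decoder callback merely stands in for a Disassembler. The `create` flag loosely mirrors the `create` parameter of find_instruction(), but the real semantics may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy model only: none of these types are ROSE classes.
using Address = std::uint64_t;

struct ToyInsn {
    Address va;
    std::string mnemonic;
};

// Stand-in for a disassembler: returns false when nothing decodes at `va`.
using ToyDecoder = std::function<bool(Address va, ToyInsn &out)>;

class ToyInsnCache {
public:
    // Passive mode: only instructions supplied through add() are visible.
    ToyInsnCache() = default;

    // Active mode: missing instructions are decoded on demand and cached.
    explicit ToyInsnCache(ToyDecoder decoder) : decoder_(std::move(decoder)) {}

    void add(const ToyInsn &insn) { insns_[insn.va] = insn; }

    // Returns the instruction at `va`, or nullptr if it is not cached and
    // cannot (or, with create=false, may not) be decoded.
    const ToyInsn *find(Address va, bool create = true) {
        auto cached = insns_.find(va);
        if (cached != insns_.end())
            return &cached->second;
        if (!create || !decoder_)
            return nullptr;
        ToyInsn decoded{};
        if (!decoder_(va, decoded))
            return nullptr;
        return &(insns_[va] = decoded);
    }

    std::size_t size() const { return insns_.size(); }

private:
    std::map<Address, ToyInsn> insns_;
    ToyDecoder decoder_;
};
```

In this sketch a passive cache can only ever return what the caller loaded beforehand, while an active cache grows as addresses are requested, which is the trade-off the list above describes.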

The partitioner organizes instructions into basic blocks by starting at a particular instruction and then looking at its set of successor addresses. Successor addresses are edges of the eventual control-flow graph (CFG) that are calculated using instruction semantics, and are available to the end user via SgAsmBlock::get_cached_successors(). It's not always possible to statically determine the complete set of successors; in this case, get_cached_successors() will return at least one successor whose isAmbiguous() property is true. Because ROSE performs semantic analysis over all instructions of the basic block (and occasionally related blocks), it can sometimes determine that instructions that are usually considered to be conditional branches are, in fact, unconditional (e.g., a PUSH followed by a RET; or a CALL to a block that discards the return address). ROSE includes unconditional branch instructions and their targets within the same basic block (provided no other CFG edge comes into the middle of the block).
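
As a rough illustration of the block-growing rule, the following stand-alone sketch extends a block while an instruction has exactly one known successor and no other edge enters that successor. It is a simplification under stated assumptions: successor lists are given up front rather than computed by semantic analysis, and ToyInsn, ToyProgram, and growBlock are hypothetical names that do not exist in ROSE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Address = std::uint64_t;

// Toy instruction: only its statically known successor addresses.
struct ToyInsn {
    std::vector<Address> successors;
};
using ToyProgram = std::map<Address, ToyInsn>;

// Collects the addresses of the block that starts at `start`.
std::vector<Address> growBlock(const ToyProgram &program, Address start) {
    // Count incoming edges so that edges into the middle of a block are seen.
    std::map<Address, std::size_t> incoming;
    for (const auto &entry : program)
        for (Address target : entry.second.successors)
            ++incoming[target];

    std::vector<Address> block;
    std::set<Address> visited;
    Address va = start;
    for (;;) {
        auto insn = program.find(va);
        if (insn == program.end() || !visited.insert(va).second)
            break;                          // unknown address or a loop back
        block.push_back(va);
        const std::vector<Address> &succ = insn->second.successors;
        if (succ.size() != 1)
            break;                          // branch point or no known successor
        if (incoming[succ.front()] > 1)
            break;                          // another edge enters the successor
        va = succ.front();                  // fall through or unconditional jump
    }
    return block;
}
```

Note that the sketch follows an unconditional jump to a non-adjacent address, so the resulting block may be discontiguous in memory, which corresponds to leaving discontiguous blocks enabled.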

The partitioner organizes basic blocks into functions in three phases, all three of which can be run by a single call to the partition() method. The first phase considers all disassembled instructions and other available information such as symbol tables and tries to determine which addresses are entry points for functions. For instance, if the symbol table contains function symbols, then the address stored in the symbol table is assumed to be the entry point of a function. ROSE has a variety of these "pre-cfg" detection methods which can be enabled/disabled at runtime with the set_search() method. ROSE also supports user-defined search methods that can be registered with add_function_detector(). The three phases are initialized and influenced by the contents of an optional configuration file specified with the load_config() method.

The second phase for assigning blocks to functions is via analysis of the control-flow graph. In a nutshell, ROSE traverses the CFG starting with the entry address of each function, adding blocks to the function as it goes. When it detects that a block has edges coming in from two different functions, it creates a new function whose entry point is that block (see definition of "function" above; a function can only have one entry point).
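
The traversal described above can be sketched on a toy graph. This is an illustrative model, not the algorithm ROSE actually implements: ToyCfg and assignBlocks are hypothetical names, the CFG is a fixed successor map, and the sketch simply restarts the whole traversal whenever it has to promote a shared block to a new function entry.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Address = std::uint64_t;
using ToyCfg = std::map<Address, std::vector<Address>>;  // block -> successors

// Returns a map from each reachable block to the entry address of the
// function that owns it. `entries` holds the initially known entry points.
std::map<Address, Address> assignBlocks(const ToyCfg &cfg, std::set<Address> entries) {
    for (;;) {
        std::map<Address, Address> owner;
        bool split = false;
        for (auto entry = entries.begin(); entry != entries.end() && !split; ++entry) {
            std::vector<Address> worklist{*entry};
            while (!worklist.empty() && !split) {
                Address block = worklist.back();
                worklist.pop_back();
                if (block != *entry && entries.count(block))
                    continue;               // edge into another function's entry
                auto found = owner.find(block);
                if (found != owner.end()) {
                    if (found->second != *entry) {
                        // Reached from two functions: make it its own entry.
                        entries.insert(block);
                        split = true;
                    }
                    continue;
                }
                owner[block] = *entry;
                auto succ = cfg.find(block);
                if (succ != cfg.end())
                    worklist.insert(worklist.end(), succ->second.begin(), succ->second.end());
            }
        }
        if (!split)
            return owner;                   // stable: every block has one owner
    }
}
```

Restarting after each split keeps the sketch short; each restart adds one entry point, so the loop ends after at most one pass per block.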

The third and final phase, called "post-cfg", makes final adjustments, such as adding SgAsmFunction objects for no-op or zero padding occurring between the previously detected functions. This could also be user-extended to add blocks to functions that ROSE detected during CFG analysis (such as unreferenced basic blocks, no-ops, etc. that occur within the extent of a function).

By default, ROSE constructs a Partitioner to use for partitioning instructions of binary files during frontend() parsing (this happens in Disassembler::disassembleInterpretation()). This Partitioner's settings are controlled by two command-line switches whose documentation can be seen by running any ROSE program with the --help switch. These command-line switches operate by setting property values in the SgFile node and then transferring them to the Partitioner when the Partitioner is constructed.

The results of block and function detection are stored in the Partitioner object itself. One usually retrieves this information via a call to the build_ast() method, which constructs a ROSE AST. Any instructions that were not assigned to blocks of a function can be optionally discarded (see the SgAsmFunction::FUNC_LEFTOVERS bit of set_search(), or the "leftovers" parameter of the "-rose:partitioner_search" command-line switch).

The Partitioner class can easily be subclassed by the user. Some of the Disassembler methods automatically call Partitioner::partition(), using either a temporarily instantiated Partitioner, or a partitioner provided by the user.

NOTE: Some of the methods used by the partitioner are more complex than one might originally imagine. The CFG analysis, for instance, must contend with the fact that the graph nodes (basic blocks) are changing as new edges are discovered (e.g., splitting a large block when we discover an edge coming into the middle). Changes to the nodes result in changes to the edges (e.g., a PUSH/RET pair is an unconditional branch to a known target, but if the block were to be divided then the RET becomes a branch to an unknown address, i.e., the edge disappears).

Another complexity is that the CFG analysis must avoid circular logic. Consider the following instructions:

1: PUSH 2
2: NOP
3: RET

When all three instructions are in the same basic block, as they are initially, the RET is an unconditional branch to the NOP. This splits the block and the RET no longer has known successors (according to block semantic analysis). So the edge from the RET to the NOP disappears and the three instructions would coalesce back into a single basic block. In this situation, ROSE keeps these three instructions as two basic blocks with no CFG edges to the second block.

A third complexity is that the Partitioner cannot rely on the usual ROSE AST traversal mechanisms because it must perform its work before the AST is created. However, the Partitioner benefits from this situation by being able to use data structures and methods that are optimized for performance.

A final complexity is that the Disassembler and Partitioner classes are both designed to be useful in a general way, and independent of each other. These two classes can be called even when the user doesn't have an AST. For instance, the tests/roseTests/binaryTests/disassembleBuffer.C is an example of how the Disassembler and Partitioner classes can be used to disassemble and partition a buffer of instructions obtained outside of ROSE's binary file parsing mechanisms.

Definition at line 116 of file Partitioner.h.

Member Typedef Documentation

typedef std::map<rose_addr_t, Instruction*> Partitioner::InstructionMap
protected

Definition at line 154 of file Partitioner.h.

typedef std::vector<Instruction*> Partitioner::InstructionVector
protected

Definition at line 155 of file Partitioner.h.

typedef std::map<rose_addr_t, BasicBlock*> Partitioner::BasicBlocks
protected

Definition at line 246 of file Partitioner.h.

typedef std::map<rose_addr_t, DataBlock*> Partitioner::DataBlocks
protected

Definition at line 276 of file Partitioner.h.

typedef std::map<rose_addr_t, Function*> Partitioner::Functions
protected

Definition at line 352 of file Partitioner.h.

typedef void(* Partitioner::FunctionDetector)(Partitioner *, SgAsmGenericHeader *)
protected

Data type for user-defined function detectors.

Definition at line 355 of file Partitioner.h.

typedef std::map<rose_addr_t, BlockConfig*> Partitioner::BlockConfigMap
protected

Definition at line 367 of file Partitioner.h.

typedef std::map<rose_addr_t, Disassembler::AddressSet> Partitioner::BasicBlockStarts

Map of basic block starting addresses.

The key is the virtual address of the first instruction in the basic block; the value is the set of all virtual addresses of instructions known to branch to this basic block (i.e., set of all known callers).
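
The shape of such a map can be shown with standard containers. This is a sketch only, assuming that an address set behaves like an ordered set of addresses; ToyAddressSet, ToyBlockStarts, and collectBlockStarts are hypothetical stand-ins, not ROSE types or functions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

using Address = std::uint64_t;
using ToyAddressSet = std::set<Address>;
using ToyBlockStarts = std::map<Address, ToyAddressSet>;

// Builds the map from (branch address, target address) pairs: each target
// becomes a block start whose value is the set of addresses branching to it.
ToyBlockStarts collectBlockStarts(const std::vector<std::pair<Address, Address>> &branches) {
    ToyBlockStarts starts;
    for (const auto &branch : branches)
        starts[branch.second].insert(branch.first);
    return starts;
}
```

Given branches from two different addresses to the same target, the target appears once as a key and both branch addresses appear in its value set.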

Deprecated:
This data type is used only for backward compatibility by detectBasicBlocks() and detectFunctions(). It has been replaced by Partitioner::BasicBlocks.

Definition at line 383 of file Partitioner.h.

typedef std::map<rose_addr_t, FunctionStart> Partitioner::FunctionStarts

Map describing the starting address of each known function.

Deprecated:
This type has been replaced with Partitioner::Functions, which is capable of describing noncontiguous functions.

Definition at line 404 of file Partitioner.h.

typedef RangeMap<Extent, FunctionRangeMapValue> Partitioner::FunctionRangeMap

Range map associating addresses with functions.

Definition at line 673 of file Partitioner.h.

typedef RangeMap<Extent, DataRangeMapValue> Partitioner::DataRangeMap

Range map associating addresses with functions.

Definition at line 696 of file Partitioner.h.

Constructor & Destructor Documentation

Partitioner::Partitioner ( )
inline

Definition at line 418 of file Partitioner.h.

virtual Partitioner::~Partitioner ( )
inlinevirtual

Definition at line 422 of file Partitioner.h.

References clear().

Member Function Documentation

SgAsmInstruction * Partitioner::isSgAsmInstruction ( const Instruction insn)
staticprotected

Augments dynamic casts defined from ROSETTA.

A Partitioner::Instruction used to be just a SgAsmInstruction before we needed to combine it with some additional info for the partitioner. Therefore, there's quite a bit of code (within the partitioner) that treats them as AST nodes. Rather than replace every occurrence of isSgAsmInstruction(N) with something like (N?isSgAsmInstruction(N->node):NULL), we add additional versions of the necessary global functions, but define them only within the partitioner.

Definition at line 33 of file Partitioner.C.

References isSgAsmInstruction(), and Partitioner::Instruction::node.

SgAsmInstruction * Partitioner::isSgAsmInstruction ( SgNode node)
staticprotected

Augments dynamic casts defined from ROSETTA.

A Partitioner::Instruction used to be just a SgAsmInstruction before we needed to combine it with some additional info for the partitioner. Therefore, there's quite a bit of code (within the partitioner) that treats them as AST nodes. Rather than replace every occurrence of isSgAsmInstruction(N) with something like (N?isSgAsmInstruction(N->node):NULL), we add additional versions of the necessary global functions, but define them only within the partitioner.

Definition at line 40 of file Partitioner.C.

References isSgAsmInstruction().

SgAsmx86Instruction * Partitioner::isSgAsmx86Instruction ( const Instruction insn)
staticprotected

Augments dynamic casts defined from ROSETTA.

A Partitioner::Instruction used to be just a SgAsmInstruction before we needed to combine it with some additional info for the partitioner. Therefore, there's quite a bit of code (within the partitioner) that treats them as AST nodes. Rather than replace every occurrence of isSgAsmInstruction(N) with something like (N?isSgAsmInstruction(N->node):NULL), we add additional versions of the necessary global functions, but define them only within the partitioner.

Definition at line 47 of file Partitioner.C.

References isSgAsmx86Instruction(), and Partitioner::Instruction::node.

SgAsmx86Instruction * Partitioner::isSgAsmx86Instruction ( SgNode node)
staticprotected

Augments dynamic casts defined from ROSETTA.

A Partitioner::Instruction used to be just a SgAsmInstruction before we needed to combine it with some additional info for the partitioner. Therefore, there's quite a bit of code (within the partitioner) that treats them as AST nodes. Rather than replace every occurrence of isSgAsmInstruction(N) with something like (N?isSgAsmInstruction(N->node):NULL), we add additional versions of the necessary global functions, but define them only within the partitioner.

Definition at line 54 of file Partitioner.C.

References isSgAsmx86Instruction().

Partitioner::BasicBlockStarts Partitioner::detectBasicBlocks ( const Disassembler::InstructionMap insns) const

Find the beginnings of basic blocks based on instruction type and call targets.

Deprecated:
This function is deprecated. Basic blocks are now represented by Partitioner::BasicBlock.

Definition at line 4223 of file Partitioner.C.

References SgAsmStatement::get_address(), SgAsmInstruction::get_branch_target(), SgAsmx86Instruction::get_kind(), SgAsmInstruction::get_size(), SgAsmInstruction::get_successors(), isSgAsmx86Instruction(), SgAsmInstruction::terminates_basic_block(), x86_call, x86_farcall, and x86_pop.

Partitioner::FunctionStarts Partitioner::detectFunctions ( SgAsmInterpretation ,
const Disassembler::InstructionMap insns,
BasicBlockStarts bb_starts 
) const

Returns a list of the currently defined functions.

Deprecated:
This function has been replaced by pre_cfg(), analyze_cfg(), and post_cfg().

Definition at line 4276 of file Partitioner.C.

virtual void Partitioner::set_search ( unsigned  heuristics)
inlinevirtual

Sets the set of heuristics used by the partitioner.

The heuristics should be a bit mask containing the SgAsmFunction::FunctionReason bits. These same bits are assigned to the "reason" property of the resulting function nodes in the AST, depending on which heuristic detected the function.

Definition at line 432 of file Partitioner.h.

References func_heuristics.

Referenced by Disassembler::disassembleInterpretation(), and disassembleInterpretation().

virtual unsigned Partitioner::get_search ( ) const
inlinevirtual

Returns a bit mask of SgAsmFunction::FunctionReason bits indicating which heuristics would be used by the partitioner.

Definition at line 438 of file Partitioner.h.

References func_heuristics.

void Partitioner::set_allow_discontiguous_blocks ( bool  b)
inline

Enables or disables discontiguous basic blocks.

When set, a basic block may contain instructions that are discontiguous in memory. Such blocks are created when find_bb_containing() encounters an unconditional jump whose only successor is known and the successor would not be part of any other block.

Here's an example of a discontiguous basic block.

0x00473bf0: 83 c0 18 |... | add eax, 0x18
0x00473bf3: 68 e0 84 44 00 |h..D. | push 0x004484e0
0x00473bf8: e9 db 72 fc ff |..r.. | jmp 0x0043aed8
0x0043aed8: c3 |. | ret
0x004484e0: 89 45 f0 |.E. | mov DWORD PTR ss:[ebp + 0xf0(-0x10)], eax
0x004484e3: 8b 45 f0 |.E. | mov eax, DWORD PTR ss:[ebp + 0xf0(-0x10)]
0x004484e6: 8b 40 60 |.@` | mov eax, DWORD PTR ds:[eax + 0x60]
0x004484e9: 03 45 fc |.E. | add eax, DWORD PTR ss:[ebp + 0xfc(-0x04)]
0x004484ec: 89 45 ec |.E. | mov DWORD PTR ss:[ebp + 0xec(-0x14)], eax
0x004484ef: e9 0c 83 02 00 |..... | jmp 0x00470800
0x00470800: 8b 45 ec |.E. | mov eax, DWORD PTR ss:[ebp + 0xec(-0x14)]
0x00470803: 8b 40 18 |.@. | mov eax, DWORD PTR ds:[eax + 0x18]
0x00470806: 48 |H | dec eax
0x00470807: 85 c0 |.. | test eax, eax
0x00470809: 0f 8c 4f 3d 00 00 |..O=..| jl 0x0047455e
(successors: 0x0047080f 0x0047455e)

When this property is disabled, the above single basic block would have been four blocks, ending at the JMP at 0x473bf8, the RET at 0x43aed8, the JMP at 0x4484ef, and the JL at 0x470809.
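
As a rough illustration of where the four blocks come from, the following self-contained sketch splits the listing above into runs of instructions that are adjacent in memory. The InsnExtent type and the contiguous_runs() and example_block() helpers are hypothetical stand-ins and are not part of the Partitioner API. The real partitioner ends blocks based on control flow; for this particular listing those block ends happen to coincide with the breaks in memory adjacency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one line of the listing: address and encoded size.
struct InsnExtent {
    std::uint64_t va;
    std::size_t size;
};

// Splits a sequence of instructions into maximal runs that are adjacent in
// memory and returns the number of instructions in each run.
std::vector<std::size_t> contiguous_runs(const std::vector<InsnExtent> &insns) {
    std::vector<std::size_t> runs;
    for (std::size_t i = 0; i < insns.size(); ++i) {
        assert(insns[i].size > 0);
        bool follows_prev = i > 0 &&
                            insns[i - 1].va + insns[i - 1].size == insns[i].va;
        if (follows_prev) {
            ++runs.back();          // extends the current run
        } else {
            runs.push_back(1);      // starts a new run
        }
    }
    return runs;
}

// The 15 instructions of the discontiguous block shown in the listing.
std::vector<InsnExtent> example_block() {
    return {
        {0x00473bf0, 3}, {0x00473bf3, 5}, {0x00473bf8, 5},   // ends at JMP
        {0x0043aed8, 1},                                     // the RET
        {0x004484e0, 3}, {0x004484e3, 3}, {0x004484e6, 3},
        {0x004484e9, 3}, {0x004484ec, 3}, {0x004484ef, 5},   // ends at JMP
        {0x00470800, 3}, {0x00470803, 3}, {0x00470806, 1},
        {0x00470807, 2}, {0x00470809, 6}                     // ends at JL
    };
}
```

Calling contiguous_runs(example_block()) yields four runs, matching the four blocks described above.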

The default is that blocks are allowed to be discontiguous.

Definition at line 471 of file Partitioner.h.

References allow_discont_blocks.

bool Partitioner::get_allow_discontiguous_blocks ( ) const
inline

Returns an indication of whether discontiguous blocks are allowed.

See set_allow_discontiguous_blocks() for details.

Definition at line 476 of file Partitioner.h.

References allow_discont_blocks.

void Partitioner::set_debug ( FILE *  f)
inline

Sends diagnostics to the specified output stream.

Null (the default) turns off debugging.

Definition at line 481 of file Partitioner.h.

References debug.

Referenced by Disassembler::disassemble().

FILE* Partitioner::get_debug ( ) const
inline

Returns the file currently used for debugging; null implies no debugging.

Definition at line 486 of file Partitioner.h.

References debug.

void Partitioner::set_map ( MemoryMap mmap,
MemoryMap ro_mmap = NULL 
)

Accessors for the memory maps.

The first argument is usually the complete memory map. It should define all memory that holds instructions, either instructions that have already been disassembled and provided to the Partitioner, or instructions that might be disassembled in the course of partitioning. Depending on disassembler flags, the disassembler will probably only look at portions of the map that are marked executable.

The second (optional) map is used to initialize memory in the virtual machine semantics layer and should contain all read-only memory addresses for the specimen. This map normally also includes the parts of the first argument that hold instructions. Things such as dynamic library addresses (i.e., import sections) can also be supplied if they are initialized and not expected to change during the life of the specimen. If a null pointer is specified (the default) then this map is created from all read-only segments of the first argument.

The first map will be stored by the partitioner as a pointer; the other supplied maps are copied.

Definition at line 182 of file Partitioner.C.

References MemoryMap::clear(), MemoryMap::MM_PROT_READ, MemoryMap::MM_PROT_WRITE, and MemoryMap::prune().

MemoryMap* Partitioner::get_map ( ) const
inline

Accessors for the memory maps.

The first argument is usually the complete memory map. It should define all memory that holds instructions, either instructions that have already been disassembled and provided to the Partitioner, or instructions that might be disassembled in the course of partitioning. Depending on disassembler flags, the disassembler will probably only look at portions of the map that are marked executable.

The second (optional) map is used to initialize memory in the virtual machine semantics layer and should contain all read-only memory addresses for the specimen. This map normally also includes the parts of the first argument that hold instructions. Things such as dynamic library addresses (i.e., import sections) can also be supplied if they are initialized and not expected to change during the life of the specimen. If a null pointer is specified (the default) then this map is created from all read-only segments of the first argument.

The first map will be stored by the partitioner as a pointer; the other supplied maps are copied.

Definition at line 507 of file Partitioner.h.

References map.

void Partitioner::set_progress_reporting ( FILE *  output,
unsigned  min_interval 
)

Set progress reporting properties.

A progress report is produced not more than once every min_interval seconds (default is 10) by sending a single line of output to the specified file. Progress reporting can be disabled by supplying a null pointer for the file. Progress report properties are class variables.

Definition at line 66 of file Partitioner.C.

References output().

void Partitioner::add_function_detector ( FunctionDetector  f)
inline

Adds a user-defined function detector to this partitioner.

Any number of detectors can be added and they will be run by pre_cfg() in the order they were added, after the built-in methods run. Each user-defined detector will be called first with the SgAsmGenericHeader pointing to null, then once for each file header. The user-defined methods are run only if the SgAsmFunction::FUNC_USERDEF bit is set (see set_search()), which is the default. The reason for having user-defined function detectors is that the detection of functions influences the shape of the AST, so it is easier to apply those analyses here, before the AST is built, rather than in the mid-end after the AST is built.

Definition at line 528 of file Partitioner.h.

References user_detectors.

unsigned Partitioner::parse_switches ( const std::string &  s,
unsigned  initial_flags 
)
static

Parses a string describing the heuristics and returns the bit vector that can be passed to set_search().

The input string should be a comma-separated list (without white space) of search specifications. Each specification should be an optional qualifier character followed by either an integer or a word. The accepted words are the lower-case versions of the constants enumerated by SgAsmFunction::FunctionReason, but without the leading "FUNC_". The qualifier determines whether the bits specified by the integer or word are added to the return value ("+") or removed from the return value ("-"). The "=" qualifier acts like "+" but first zeros the return value. The default qualifier is "+" except when the word is "default", in which case the specifier is "=". An optional initial bit mask can be specified (defaults to SgAsmFunction::FUNC_DEFAULT).
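
The qualifier rules above can be sketched in a few lines. This is only an illustration of the described behavior, not the code in Partitioner.C: the function name parse_switches_sketch(), the SK_* constants, and their bit values are invented for the example and do not match the real SgAsmFunction::FunctionReason values, and only a handful of words are recognized.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Placeholder bits for illustration only; NOT the real SgAsmFunction values.
enum SketchReason : unsigned {
    SK_ENTRY_POINT = 0x1,
    SK_CALL_TARGET = 0x2,
    SK_SYMBOL      = 0x4,
    SK_PATTERN     = 0x8,
    SK_DEFAULT     = SK_ENTRY_POINT | SK_CALL_TARGET | SK_SYMBOL
};

unsigned parse_switches_sketch(const std::string &s,
                               unsigned initial_flags = SK_DEFAULT) {
    static const std::map<std::string, unsigned> words = {
        {"entry_point", SK_ENTRY_POINT}, {"call_target", SK_CALL_TARGET},
        {"symbol", SK_SYMBOL}, {"pattern", SK_PATTERN}, {"default", SK_DEFAULT}
    };
    unsigned flags = initial_flags;
    std::istringstream in(s);
    std::string spec;
    while (std::getline(in, spec, ',')) {
        // Optional qualifier character.
        char qual = '\0';
        if (!spec.empty() && (spec[0] == '+' || spec[0] == '-' || spec[0] == '=')) {
            qual = spec[0];
            spec.erase(0, 1);
        }
        if (spec.empty())
            throw std::invalid_argument("empty search specification");

        // Either an integer (base auto-detected) or a known word.
        unsigned bits = 0;
        if (std::isdigit(static_cast<unsigned char>(spec[0]))) {
            std::size_t used = 0;
            bits = static_cast<unsigned>(std::stoul(spec, &used, 0));
            if (used != spec.size())
                throw std::invalid_argument("bad integer: " + spec);
        } else {
            auto found = words.find(spec);
            if (found == words.end())
                throw std::invalid_argument("unknown word: " + spec);
            bits = found->second;
        }

        // Default qualifier is "+", except "=" for the word "default".
        if (qual == '\0')
            qual = (spec == "default") ? '=' : '+';
        assert(qual == '+' || qual == '-' || qual == '=');

        if (qual == '+')      flags |= bits;    // add bits
        else if (qual == '-') flags &= ~bits;   // remove bits
        else                  flags = bits;     // zero first, then add
    }
    return flags;
}
```

With these placeholder values, "-symbol" applied to the default mask clears only the symbol bit, while "=pattern,+symbol" discards the initial mask and returns just the pattern and symbol bits.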

Definition at line 99 of file Partitioner.C.

References flags, SgAsmFunction::FUNC_CALL_INSN, SgAsmFunction::FUNC_CALL_TARGET, SgAsmFunction::FUNC_DEFAULT, SgAsmFunction::FUNC_EH_FRAME, SgAsmFunction::FUNC_ENTRY_POINT, SgAsmFunction::FUNC_EXPORT, SgAsmFunction::FUNC_IMPORT, SgAsmFunction::FUNC_INTRABLOCK, SgAsmFunction::FUNC_LEFTOVERS, SgAsmFunction::FUNC_MISCMASK, SgAsmFunction::FUNC_PADDING, SgAsmFunction::FUNC_PATTERN, SgAsmFunction::FUNC_SYMBOL, SgAsmFunction::FUNC_THUNK, and SgAsmFunction::FUNC_USERDEF.

SgAsmBlock * Partitioner::partition ( SgAsmInterpretation interp,
const Disassembler::InstructionMap insns,
MemoryMap mmap = NULL 
)
virtual

Top-level function to run the partitioner on some instructions and build an AST.

The SgAsmInterpretation is optional. If it is null then those function seeding operations that depend on having file headers are not run. The memory map argument is optional only if a memory map has already been attached to this partitioner object with the set_map() method.

Definition at line 4098 of file Partitioner.C.

References SgAsmBlock::BLK_GRAPH1.

Referenced by Disassembler::disassemble(), and disassembleInterpretation().

SgAsmBlock * Partitioner::partition ( SgAsmInterpretation interp,
Disassembler d,
MemoryMap m 
)
virtual

Top-level function to run the partitioner, calling the specified disassembler as necessary to generate instructions.

Definition at line 4127 of file Partitioner.C.

References SgAsmBlock::BLK_GRAPH1.

void Partitioner::clear ( )
virtual

Reset partitioner to initial conditions by discarding all instructions, basic blocks, functions, and configuration file settings and definitions.

Definition at line 613 of file Partitioner.C.

Referenced by ~Partitioner().

void Partitioner::load_config ( const std::string &  filename)
virtual

Loads the specified configuration file.

This should be called before any of the partitioning functions (such as partition()). If an error occurs then a Partitioner::IPDParser::Exception is thrown.

Definition at line 656 of file Partitioner.C.

References Partitioner::IPDParser::parse().

Referenced by Disassembler::disassembleInterpretation(), and disassembleInterpretation().

void Partitioner::add_instructions ( const Disassembler::InstructionMap insns)
virtual

Adds additional instructions to be processed.

New instructions are only added at addresses that don't already have an instruction.

Definition at line 4140 of file Partitioner.C.

Referenced by disassembleInterpretation().

Disassembler::InstructionMap Partitioner::get_instructions ( ) const

Get the list of all instructions.

This includes instructions that were added with add_instructions(), instructions added by a passive partition() call, and instructions added by an active partitioner.

Definition at line 4149 of file Partitioner.C.

const Disassembler::BadMap& Partitioner::get_disassembler_errors ( ) const
inline

Get the list of disassembler errors.

Only active partitioners accumulate this information since only active partitioners call the disassembler to obtain instructions.

Definition at line 569 of file Partitioner.h.

References bad_insns.

void Partitioner::clear_disassembler_errors ( )
inline

Clears errors from the disassembler.

This might be useful in order to cause the partitioner to call the disassembler again for certain addresses. Normally, if the partitioner fails to obtain an instruction at a particular address it remembers the failure and does not try again. The bad map is also cleared by the Partitioner::clear() method, which clears various other things in addition.
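
The remembered-failure behavior can be modeled with a small stand-alone class. This is a simplified sketch under stated assumptions, not the Partitioner implementation: ActiveLookupSketch, its callback type, and disassembler_calls() are hypothetical, and instructions are reduced to bare addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <utility>

// Hypothetical model of an active partitioner's instruction lookup.
class ActiveLookupSketch {
public:
    // The callback stands in for the disassembler: true means an
    // instruction was decoded at the given address.
    explicit ActiveLookupSketch(std::function<bool(std::uint64_t)> d)
        : disassemble_(std::move(d)), calls_(0) {
        assert(disassemble_ != nullptr);
    }

    bool find_instruction(std::uint64_t va, bool create = true) {
        if (insns_.count(va))
            return true;                    // already known
        if (!create || bad_.count(va))
            return false;                   // remembered failure: no retry
        ++calls_;
        if (disassemble_(va)) {
            insns_.insert(va);
            return true;
        }
        bad_.insert(va);                    // remember the failure
        return false;
    }

    // Forget failures so that the next lookup calls the disassembler again.
    void clear_disassembler_errors() { bad_.clear(); }

    unsigned disassembler_calls() const { return calls_; }

private:
    std::function<bool(std::uint64_t)> disassemble_;
    std::set<std::uint64_t> insns_;
    std::set<std::uint64_t> bad_;
    unsigned calls_;
};
```

In this model a second lookup at a failed address does not reach the disassembler until clear_disassembler_errors() has been called.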

Definition at line 577 of file Partitioner.h.

References bad_insns.

Partitioner::Instruction * Partitioner::find_instruction ( rose_addr_t  va,
bool  create = true 
)
virtual

Finds an instruction at the specified address.

If the partitioner is operating in active mode and create is true, then the disassembler will be invoked if necessary to obtain the instruction. This function returns the null pointer if no instruction is available. If the disassembler was called and threw an exception, then we catch the exception and add it to the bad instruction list.

Definition at line 898 of file Partitioner.C.

Referenced by fixup_pointers(), Partitioner::FindInsnPadding::operator()(), Partitioner::FindThunks::operator()(), Partitioner::FindThunkTables::operator()(), and Partitioner::FindPostFunctionInsns::operator()().

Partitioner::Instruction * Partitioner::discard ( Instruction insn,
bool  discard_entire_block = false 
)
virtual

Drop an instruction from consideration.

If the instruction is the beginning of a basic block then drop the entire basic block, returning its subsequent instructions back to the (implied) list of free instructions. If the instruction is in the middle of a basic block, then either drop the entire basic block, or truncate it at the specified instruction depending on whether discard_entire_block is true or false.

This method always returns the null pointer.

Definition at line 853 of file Partitioner.C.

References Partitioner::Instruction::get_address(), and Partitioner::BasicBlock::insns.

Referenced by Partitioner::FindInsnPadding::operator()().

Partitioner::BasicBlock * Partitioner::discard ( BasicBlock bb)
virtual

Drop a basic block from the partitioner.

The specified basic block, which must not belong to any function, is removed from the Partitioner, deleted, and its instructions all returned to the (implied) list of free instructions. This function always returns the null pointer.

Definition at line 874 of file Partitioner.C.

References Partitioner::BasicBlock::address(), Partitioner::Instruction::bblock, Partitioner::BasicBlock::clear_data_blocks(), Partitioner::BasicBlock::function, and Partitioner::BasicBlock::insns.

Partitioner::Function * Partitioner::add_function ( rose_addr_t  entry_va,
unsigned  reasons,
std::string  name = "" 
)
virtual

Adds a new function definition to the partitioner.

New functions can be added at any time, including during the analyze_cfg() call. When this method is called with an entry_va for an existing function, the specified reasons will be merged with the existing function, and the existing function will be given the specified name if it has none.
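
The merge behavior for an existing entry address can be sketched as follows. FunctionSketch and FunctionTableSketch are hypothetical stand-ins for Partitioner::Function and the partitioner's function map, and the reasons are merged with a plain bitwise OR; the real method also references SgAsmFunction::FUNC_MISCMASK, so its merge may treat some bits differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for Partitioner::Function.
struct FunctionSketch {
    std::uint64_t entry_va;
    unsigned reason;
    std::string name;
};

// Hypothetical stand-in for the partitioner's function table.
class FunctionTableSketch {
public:
    FunctionSketch &add_function(std::uint64_t entry_va, unsigned reasons,
                                 const std::string &name = "") {
        auto it = functions_.find(entry_va);
        if (it == functions_.end()) {
            // New function at this entry address.
            it = functions_.insert(
                {entry_va, FunctionSketch{entry_va, reasons, name}}).first;
        } else {
            // Existing function: merge reasons, name it only if unnamed.
            it->second.reason |= reasons;
            if (it->second.name.empty())
                it->second.name = name;
        }
        assert(it->second.entry_va == entry_va);
        return it->second;
    }

    std::size_t size() const { return functions_.size(); }

private:
    std::map<std::uint64_t, FunctionSketch> functions_;
};
```

Adding the same entry address twice leaves one function in the table whose reason mask is the union of both calls.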

Definition at line 1028 of file Partitioner.C.

References Partitioner::Function::entry_va, SgAsmFunction::FUNC_MISCMASK, Partitioner::Function::name, name, and Partitioner::Function::reason.

Referenced by mark_func_patterns(), Partitioner::FindInsnPadding::operator()(), Partitioner::FindThunks::operator()(), Partitioner::FindThunkTables::operator()(), and Partitioner::FindInterPadFunctions::operator()().

Partitioner::Function * Partitioner::find_function ( rose_addr_t  entry_va)
virtual

Looks up a function by address.

Returns the function pointer if found, the null pointer if not found.

Definition at line 1019 of file Partitioner.C.

SgAsmBlock * Partitioner::build_ast ( SgAsmInterpretation interp = NULL)
virtual

Builds the AST describing all the functions.

The return value is an SgAsmBlock node that points to a list of SgAsmFunction nodes (the functions), each of which points to a list of SgAsmBlock nodes (the basic blocks). Any basic blocks that were not assigned to a function by the Partitioner will be added to a function named "***uncategorized blocks***" whose entry address will be the address of the lowest instruction, and whose reasons for existence will include the SgAsmFunction::FUNC_LEFTOVERS bit. However, if the FUNC_LEFTOVERS bit is not turned on (see set_search()) then uncategorized blocks will not appear in the AST.

If an interpretation is supplied, then it will be used to obtain information about where various file sections are mapped into memory. This mapping is used to fix-up various kinds of pointers in the instructions to make them relative to a file section. For instance, a pointer into the ".bss" section will be made relative to the beginning of that section.

Definition at line 3912 of file Partitioner.C.

References SgAsmBlock::BLK_LEFTOVERS, Partitioner::Function::clear_basic_blocks(), Partitioner::Function::clear_data_blocks(), RangeMap< R, T >::contains(), Partitioner::Function::entry_va, SgAsmFunction::FUNC_LEFTOVERS, Partitioner::BasicBlock::function, SgAsmBlock::get_statementList(), and SgNode::set_parent().

void Partitioner::fixup_cfg_edges ( SgNode ast)
virtual

Update control flow graph edge nodes.

This method traverses the specified AST and updates any edge nodes so their block pointers point to actual blocks rather than just containing virtual addresses. The update only happens for edges that don't already have a node pointer.

Definition at line 3771 of file Partitioner.C.

References SgAsmIntegerValueExpression::get_absolute_value(), SgAsmStatement::get_address(), SgAsmIntegerValueExpression::get_base_node(), SgAsmBlock::get_statementList(), SgAsmBlock::get_successors(), isSgAsmBlock(), isSgAsmInstruction(), SgAsmIntegerValueExpression::make_relative_to(), and preorder.

void Partitioner::fixup_pointers ( SgNode ast,
SgAsmInterpretation interp = NULL 
)
virtual

Updates pointers inside instructions.

This method traverses each instruction in the specified AST and looks for integer value expressions that have no base node (i.e., those that have only an absolute value). For each such value it finds, it tries to determine if that value points to code or data. Code pointers are made relative to the instruction or function (for function calls) to which they point; data pointers are made relative to the data to which they point.

The specified interpretation is only used to obtain a list of all mapped sections. The sections are used to determine whether a value is a data pointer even if it doesn't point to any specific data that was discovered during disassembly.

This method is called by build_ast(), but can also be called explicitly. Only pointers that are not already relative to some object are affected.

Definition at line 3820 of file Partitioner.C.

References Partitioner::Instruction::bblock, SgAsmGenericFile::best_section_by_va(), datablock_extent(), find_instruction(), SgAsmFunction::FUNC_LEFTOVERS, Partitioner::BasicBlock::function, SgAsmIntegerValueExpression::get_absolute_value(), Partitioner::Instruction::get_address(), SgAsmStatement::get_address(), SgAsmIntegerValueExpression::get_base_node(), SgAsmFunction::get_entry_va(), SgAsmInterpretation::get_headers(), SgAsmGenericHeaderList::get_headers(), SgAsmGenericSection::get_mapped_xperm(), SgAsmStaticData::get_size(), isSgAsmInstruction(), isSgAsmIntegerValueExpression(), SgAsmIntegerValueExpression::make_relative_to(), Partitioner::Instruction::node, Partitioner::DataBlock::nodes, and Partitioner::Function::reason.

void Partitioner::disassembleInterpretation ( SgAsmInterpretation interp)
static

virtual RegionStats* Partitioner::new_region_stats ( )
inlinevirtual

Create a new region statistics object.

We do it this way because the statistics class is closely tied to the partitioner class, but users might want to augment the statistics. The RegionStats is a virtual class as is this creator.

Definition at line 906 of file Partitioner.h.

virtual CodeCriteria* Partitioner::new_code_criteria ( )
inlinevirtual

Create a new criteria object.

This allows a user to derive a new class from CodeCriteria and have that class be used by the partitioner.

Definition at line 913 of file Partitioner.h.

Referenced by Partitioner::FindFunctionFragments::operator()().

virtual CodeCriteria* Partitioner::new_code_criteria ( const RegionStats mean,
const RegionStats variance,
double  threshold 
)
inlinevirtual

Create a new criteria object.

This allows a user to derive a new class from CodeCriteria and have that class be used by the partitioner.

Definition at line 916 of file Partitioner.h.

virtual RegionStats* Partitioner::region_statistics ( const ExtentMap )
virtual

Computes various statistics over part of an address space.

If no region is supplied then the statistics are calculated over the part of the Partitioner memory map that contains execute permission. The statistics are returned by argument so that subclasses have an easy way to augment them.

Referenced by Partitioner::FindFunctionFragments::operator()().

virtual RegionStats* Partitioner::region_statistics ( Function )
virtual

Computes various statistics over part of an address space.

If no region is supplied then the statistics are calculated over the part of the Partitioner memory map that contains execute permission. The statistics are returned by argument so that subclasses have an easy way to augment them.

virtual RegionStats* Partitioner::region_statistics ( )
virtual

Computes various statistics over part of an address space.

If no region is supplied then the statistics are calculated over the part of the Partitioner memory map that contains execute permission. The statistics are returned by argument so that subclasses have an easy way to augment them.

virtual RegionStats* Partitioner::aggregate_statistics ( bool  do_variance = true)
virtual

Computes aggregate statistics over all known functions.

This method computes region statistics for each individual function (except padding and leftovers) and obtains an average and, optionally, the variance. The average and variance are cached in the partitioner and can be retrieved by get_aggregate_mean() and get_aggregate_variance(). This method also returns the mean regardless of whether it's cached. The values are not recomputed if they are already cached; the cache can be cleared with clear_aggregate_statistics().

Referenced by Partitioner::FindFunctionFragments::operator()().

virtual RegionStats* Partitioner::get_aggregate_mean ( ) const
inlinevirtual

Accessors for cached aggregate statistics.

If the partitioner has aggregated statistics over known functions, then that information is available by this method: get_aggregate_mean() returns the average values over all functions, and get_aggregate_variance() returns the variance. The partitioner normally calculates this information immediately after performing the first CFG analysis, after most instructions are added to most functions, but before data blocks are added. A null pointer is returned if the information is not available. The user is allowed to modify the values, but should not free the objects. New values can be computed by clearing the cache (clear_aggregate_statistics()) and then calling a function that computes them again, such as aggregate_statistics() or is_code().

Definition at line 945 of file Partitioner.h.

References aggregate_mean.

virtual RegionStats* Partitioner::get_aggregate_variance ( ) const
inlinevirtual

Accessors for cached aggregate statistics.

If the partitioner has aggregated statistics over known functions, then that information is available by this method: get_aggregate_mean() returns the average values over all functions, and get_aggregate_variance() returns the variance. The partitioner normally calculates this information immediately after performing the first CFG analysis, after most instructions are added to most functions, but before data blocks are added. A null pointer is returned if the information is not available. The user is allowed to modify the values, but should not free the objects. New values can be computed by clearing the cache (clear_aggregate_statistics()) and then calling a function that computes them again, such as aggregate_statistics() or is_code().

Definition at line 946 of file Partitioner.h.

References aggregate_variance.

Referenced by Partitioner::FindFunctionFragments::operator()().

virtual void Partitioner::clear_aggregate_statistics ( )
inlinevirtual

Causes the partitioner to forget statistics.

The statistics aggregated over known functions are discarded, and subsequent calls to get_aggregate_mean() and get_aggregate_variance() will return null pointers until the data is recalculated (if ever).

Definition at line 952 of file Partitioner.h.

References aggregate_mean, and aggregate_variance.

virtual size_t Partitioner::count_kinds ( const InstructionMap )
virtual

Counts the number of distinct kinds of instructions.

The counting is based on the instructions' get_kind() method.

virtual size_t Partitioner::count_kinds ( )
inlinevirtual

Counts the number of distinct kinds of instructions.

The counting is based on the instructions' get_kind() method.

Definition at line 960 of file Partitioner.h.

References count_privileged(), and insns.

virtual size_t Partitioner::count_privileged ( const InstructionMap )
virtual

Counts the number of privileged instructions.

Such instructions generally don't appear in normal code.

virtual size_t Partitioner::count_privileged ( )
inlinevirtual

Counts the number of privileged instructions.

Such instructions generally don't appear in normal code.

Definition at line 966 of file Partitioner.h.

References count_privileged(), and insns.

Referenced by count_kinds(), count_privileged(), and ratio_privileged().

virtual double Partitioner::ratio_privileged ( )
inlinevirtual

Counts the number of privileged instructions.

Such instructions generally don't appear in normal code.

Definition at line 967 of file Partitioner.h.

References count_privileged(), insns, and NAN.

virtual size_t Partitioner::count_floating_point ( const InstructionMap )
virtual

Counts the number of floating point instructions.

virtual size_t Partitioner::count_floating_point ( )
inlinevirtual

Counts the number of floating point instructions.

Definition at line 973 of file Partitioner.h.

References count_floating_point(), and insns.

Referenced by count_floating_point(), and ratio_floating_point().

virtual double Partitioner::ratio_floating_point ( )
inlinevirtual

Counts the number of floating point instructions.

Definition at line 974 of file Partitioner.h.

References count_floating_point(), insns, and NAN.

virtual size_t Partitioner::count_registers ( const InstructionMap ,
double *  mean = NULL,
double *  variance = NULL 
)
virtual

Counts the number of register references.

Returns the total number of register reference expressions, but the real value of this method is that it also computes an average register reference size and variance. Register sizes are represented as a power of two in an attempt to weight common register sizes equally. In other words, a 16-bit program with a couple of 8-bit values should have a variance that's close to a similar sized 32-bit program with a couple of 16-bit values.
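
The effect of the power-of-two representation can be seen in a small sketch. It assumes that each register size is replaced by its base-2 exponent (8 bits becomes 3, 16 becomes 4, and so on) and that the variance is the population variance of those exponents; both are assumptions about the implementation, and SizeStats, log2_exact(), and register_size_stats() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: reference count plus mean and variance of exponents.
struct SizeStats {
    std::size_t count;
    double mean;
    double variance;
};

// Returns e such that bits == 2^e; bits must be a power of two.
unsigned log2_exact(unsigned bits) {
    assert(bits != 0 && (bits & (bits - 1)) == 0);
    unsigned e = 0;
    while (bits > 1) {
        bits >>= 1;
        ++e;
    }
    return e;
}

// Mean and population variance of the size exponents (assumed definition).
SizeStats register_size_stats(const std::vector<unsigned> &bit_widths) {
    SizeStats st = {bit_widths.size(), 0.0, 0.0};
    if (bit_widths.empty())
        return st;
    for (unsigned w : bit_widths)
        st.mean += log2_exact(w);
    st.mean /= static_cast<double>(st.count);
    for (unsigned w : bit_widths) {
        double d = log2_exact(w) - st.mean;
        st.variance += d * d;
    }
    st.variance /= static_cast<double>(st.count);
    return st;
}
```

Under these assumptions, three 16-bit references plus one 8-bit reference produce the same variance as three 32-bit references plus one 16-bit reference, even though the means differ by one.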

Referenced by ratio_registers().

virtual size_t Partitioner::count_registers ( double *  mean = NULL,
double *  variance = NULL 
)
inlinevirtual

Counts the number of register references.

Returns the total number of register reference expressions, but the real value of this method is that it also computes an average register reference size and variance. Register sizes are represented as a power of two in an attempt to weight common register sizes equally. In other words, a 16-bit program with a couple of 8-bit values should have a variance that's close to a similar sized 32-bit program with a couple of 16-bit values.

Definition at line 984 of file Partitioner.h.

References count_registers(), and insns.

Referenced by count_registers().

virtual double Partitioner::ratio_registers ( double *  mean = NULL,
double *  variance = NULL 
)
inlinevirtual

Counts the number of register references.

Returns the total number of register reference expressions, but the real value of this method is that it also computes an average register reference size and variance. Register sizes are represented as a power of two in an attempt to weight common register sizes equally. In other words, a 16-bit program with a couple of 8-bit values should have a variance that's close to a similar sized 32-bit program with a couple of 16-bit values.

Definition at line 985 of file Partitioner.h.

References count_registers(), insns, and NAN.

virtual double Partitioner::count_size_variance ( const InstructionMap insns)
virtual

Returns the variance of instruction bit widths.

The variance is computed over the instruction size, the address size, and the operand size. The sizes 16-, 32-, and 64-bit are mapped to the integers 0, 1, and 2 respectively and the mean is computed. The variance is the sum of squares of the difference between each data point and the mean. Returns NAN if the instruction map is empty. Most valid code has a variance of less than 0.05.
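The mapping and the empty-input behavior described above can be sketched as follows. This is only an illustration: size_code() and size_variance() are hypothetical names, the input is a flat list of bit widths rather than an InstructionMap, and the sum of squared differences is divided by the number of data points, which is an assumption made so that the result does not grow with the number of instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Maps a bit width to the small integer used in the statistic (hypothetical helper).
int size_code(int bits) {
    switch (bits) {
        case 16: return 0;
        case 32: return 1;
        case 64: return 2;
        default: return -1;   // width not covered by the description
    }
}

// Variance of the mapped widths; NaN when there are no data points.
// Assumption: the sum of squared differences is normalized by the count.
double size_variance(const std::vector<int> &widths) {
    if (widths.empty())
        return std::numeric_limits<double>::quiet_NaN();
    double mean = 0.0;
    for (int w : widths) {
        assert(size_code(w) >= 0);
        mean += size_code(w);
    }
    mean /= static_cast<double>(widths.size());
    double sum_sq = 0.0;
    for (int w : widths) {
        double d = size_code(w) - mean;
        sum_sq += d * d;
    }
    return sum_sq / static_cast<double>(widths.size());
}
```

A list in which every width is the same yields zero, which matches the expectation that valid code rarely mixes sizes.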

virtual double Partitioner::count_size_variance ( )
inline virtual

Returns the variance of instruction bit widths.

The variance is computed over the instruction size, the address size, and the operand size. The sizes 16-, 32-, and 64-bit are mapped to the integers 0, 1, and 2 respectively and the mean is computed. The variance is the sum of squares of the difference between each data point and the mean. Returns NAN if the instruction map is empty. Most valid code has a variance of less than 0.05.

Definition at line 996 of file Partitioner.h.

References count_size_variance(), and insns.

Referenced by count_size_variance().

virtual bool Partitioner::is_code ( const ExtentMap region,
double *  raw_vote_ptr = NULL,
std::ostream *  debug = NULL 
)
virtual

Determines if a region contains code.

The determination is made by computing aggregate statistics over each of the functions that are already known, then building a CodeCriteria object. The same analysis is run over the region in question and the result is compared with the CodeCriteria object. The criteria object is then discarded.

If the partitioner's get_aggregate_mean() and get_aggregate_variance() return non-null values, then those statistics are used in favor of computing new ones. If new statistics are computed, they will be cached for those methods to return later.

If a raw_vote_ptr is supplied, then upon return it will hold a value between zero and one, inclusive, which is the weighted average of the votes from the individual analyses. The raw vote is the value compared against the code criteria threshold to obtain a Boolean result.
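As a rough sketch of how a raw vote relates to the Boolean result, the following combines per-analysis votes into a weighted average and compares it with a threshold. The Vote struct, the function names, and the use of a greater-or-equal comparison are assumptions for illustration and are not taken from CodeCriteria.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the result of one analysis.
struct Vote {
    double value;    // 0.0 = "looks like data", 1.0 = "looks like code"
    double weight;   // relative importance of this analysis (non-negative)
};

// Weighted average of the individual votes; 0.0 if no analysis carries weight.
double raw_vote(const std::vector<Vote> &votes) {
    double sum = 0.0, total_weight = 0.0;
    for (const Vote &v : votes) {
        assert(v.value >= 0.0 && v.value <= 1.0 && v.weight >= 0.0);
        sum += v.value * v.weight;
        total_weight += v.weight;
    }
    return total_weight > 0.0 ? sum / total_weight : 0.0;
}

// Boolean decision; optionally reports the raw vote through raw_vote_ptr.
bool looks_like_code(const std::vector<Vote> &votes, double threshold,
                     double *raw_vote_ptr = nullptr) {
    double raw = raw_vote(votes);
    if (raw_vote_ptr)
        *raw_vote_ptr = raw;
    return raw >= threshold;   // assumption: meeting the threshold counts as code
}
```

Because the weights are normalized, the raw vote stays within the inclusive range zero to one whenever each individual vote does.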

virtual CodeCriteria* Partitioner::get_code_criteria ( ) const
inline virtual

Accessors for code criteria.

A CodeCriteria object can be associated with the Partitioner, in which case the partitioner does not compute statistics over the known functions, but rather uses the code criteria directly. The caller is responsible for allocating and freeing the criteria. If no criteria object is supplied, then one is created as necessary by calling new_code_criteria() and passing it either the average and variance computed over all the functions (excluding leftovers and padding) or the values cached in the partitioner.

Definition at line 1018 of file Partitioner.h.

References code_criteria.

virtual void Partitioner::set_code_criteria ( CodeCriteria cc)
inline virtual

Accessors for code criteria.

A CodeCriteria object can be associated with the Partitioner, in which case the partitioner does not compute statistics over the known functions, but rather uses the code criteria directly. The caller is responsible for allocating and freeing the criteria. If no criteria object is supplied, then one is created as necessary by calling new_code_criteria() and passing it either the average and variance computed over all the functions (excluding leftovers and padding) or the values cached in the partitioner.

Definition at line 1019 of file Partitioner.h.

References code_criteria.

void Partitioner::scan_contiguous_insns ( InstructionMap  insns,
InsnRangeCallbacks cblist,
Instruction insn_prev,
Instruction insn_end 
)
virtual

Scans contiguous sequences of instructions.

The specified callbacks are invoked for each contiguous sequence of instructions in the specified instruction map. At each iteration of the loop, we choose the instruction with the lowest address and the subsequent instructions that are contiguous in memory, build up the callback argument list, invoke the callbacks on the list, and remove those instructions from consideration by subsequent iterations of the loop.

The callback arguments are built from the supplied values of insn_prev and insn_end. The insn_begin member is the instruction with the lowest address in this iteration and ninsns is the number of contiguous instructions.
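The loop described above can be illustrated with a simplified model in which an instruction is reduced to its size in bytes. InsnMap, Run, and contiguous_runs() are hypothetical names for this sketch; the real method works on InstructionMap and invokes the callbacks where the sketch merely records each run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical simplification: starting address -> instruction size in bytes.
using InsnMap = std::map<std::uint64_t, std::size_t>;

struct Run {
    std::uint64_t begin_va;   // address of the lowest instruction in the run
    std::size_t ninsns;       // number of contiguous instructions in the run
};

// Splits the map into maximal runs of instructions that are adjacent in memory.
std::vector<Run> contiguous_runs(InsnMap insns) {    // taken by value: consumed below
    std::vector<Run> runs;
    while (!insns.empty()) {
        auto it = insns.begin();                     // lowest remaining address
        Run run{it->first, 0};
        std::uint64_t next_va = it->first;
        while (it != insns.end() && it->first == next_va) {
            assert(it->second > 0);                  // instructions occupy at least one byte
            next_va += it->second;                   // where a contiguous successor must start
            ++run.ninsns;
            it = insns.erase(it);                    // remove from later iterations
        }
        runs.push_back(run);                         // a real scan would invoke callbacks here
    }
    return runs;
}
```

Each iteration of the outer loop corresponds to one callback invocation, with begin_va playing the role of insn_begin.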

Definition at line 1635 of file Partitioner.C.

References ROSE_Callbacks::List< T >::apply(), and Partitioner::Instruction::get_address().

Referenced by scan_contiguous_insns().

void Partitioner::scan_contiguous_insns ( const InstructionMap insns,
InsnRangeCallback callback,
Instruction insn_prev,
Instruction insn_end 
)
inline

Scans contiguous sequences of instructions.

The specified callbacks are invoked for each contiguous sequence of instructions in the specified instruction map. At each iteration of the loop, we choose the instruction with the lowest address and the subsequent instructions that are contiguous in memory, build up the callback argument list, invoke the callbacks on the list, and remove those instructions from consideration by subsequent iterations of the loop.

The callback arguments are built from the supplied values of insn_prev and insn_end. The insn_begin member is the instruction with the lowest address in this iteration and ninsns is the number of contiguous instructions.

Definition at line 1087 of file Partitioner.h.

References scan_contiguous_insns().

void Partitioner::scan_unassigned_insns ( InsnRangeCallbacks callbacks)
virtual

Scans ranges of unassigned instructions.

Scans through the list of existing instructions that are not assigned to any function and invokes all of the specified callbacks on each range of such instructions. The ranges of unassigned instructions are not necessarily contiguous or non-overlapping but are bounded by the insn_begin (inclusive) and insn_end (exclusive, or null) callback arguments. The callbacks are invoked via the scan_contiguous_insns() method with a different insn_begin for each call.

Callbacks are allowed to disassemble additional instructions and/or assign/break associations between instructions and functions. Only the instructions that are already disassembled at the beginning of this call are considered by the iterators, but the instruction/function associations may change during the iteration.

All callbacks should honor their "enabled" argument and do nothing if it is clear. This feature is used by some of the other instruction scanning methods to filter out certain ranges of instructions. For instance, scan_intrafunc_insns() sets "enabled" to true only for ranges of unassigned instructions whose closest surrounding assigned instructions both belong to the same function.

Definition at line 1656 of file Partitioner.C.

References ROSE_Callbacks::List< T >::empty(), and Partitioner::BasicBlock::function.

Referenced by scan_unassigned_insns().

void Partitioner::scan_unassigned_insns ( InsnRangeCallback callback)
inline

Scans ranges of unassigned instructions.

Scans through the list of existing instructions that are not assigned to any function and invokes all of the specified callbacks on each range of such instructions. The ranges of unassigned instructions are not necessarily contiguous or non-overlapping but are bounded by the insn_begin (inclusive) and insn_end (exclusive, or null) callback arguments. The callbacks are invoked via the scan_contiguous_insns() method with a different insn_begin for each call.

Callbacks are allowed to disassemble additional instructions and/or assign/break associations between instructions and functions. Only the instructions that are already disassembled at the beginning of this call are considered by the iterators, but the instruction/function associations may change during the iteration.

All callbacks should honor their "enabled" argument and do nothing if it is clear. This feature is used by some of the other instruction scanning methods to filter out certain ranges of instructions. For instance, scan_intrafunc_insns() sets "enabled" to true only for ranges of unassigned instructions whose closest surrounding assigned instructions both belong to the same function.

Definition at line 1111 of file Partitioner.h.

References scan_unassigned_insns().

void Partitioner::scan_intrafunc_insns ( InsnRangeCallbacks callbacks)
virtual

Scans the unassigned instructions within a function.

The specified callbacks are invoked for each range of unassigned instructions whose closest surrounding assigned instructions both belong to the same function. This can be used, for example, to discover instructions that should probably be considered part of the same function as the surrounding instructions.

This method operates by making a temporary copy of callbacks, prepending a filtering callback, and then invoking scan_unassigned_insns(). Therefore, the callbacks supplied by the user should all honor their "enabled" argument.

Definition at line 1714 of file Partitioner.C.

References ROSE_Callbacks::List< T >::empty(), Partitioner::BasicBlock::function, and ROSE_Callbacks::List< T >::prepend().

Referenced by scan_intrafunc_insns().

void Partitioner::scan_intrafunc_insns ( InsnRangeCallback callback)
inline

Scans the unassigned instructions within a function.

The specified callbacks are invoked for each range of unassigned instructions whose closest surrounding assigned instructions both belong to the same function. This can be used, for example, to discover instructions that should probably be considered part of the same function as the surrounding instructions.

This method operates by making a temporary copy of callbacks, prepending a filtering callback, and then invoking scan_unassigned_insns(). Therefore, the callbacks supplied by the user should all honor their "enabled" argument.

Definition at line 1127 of file Partitioner.h.

References scan_intrafunc_insns().

void Partitioner::scan_interfunc_insns ( InsnRangeCallbacks callbacks)
virtual

Scans the instructions between functions.

The specified callbacks are invoked for each set of instructions (not necessarily contiguous in memory) that fall "between" two functions. Instruction I(x) at address x is between two functions, Fa and Fb, if there exists a lower address a<x such that I(a) belongs to Fa and there exists a higher address b>x such that I(b) belongs to Fb; and for all instructions I(y) for a<y<b, I(y) does not belong to any function.

Additionally, if no I(a) exists that belongs to a function, and/or no I(b) exists that belongs to a function, then I(x) is also considered part of an inter-function region and the lower and/or upper functions are undefined. In other words, instructions appearing before all functions or after all functions are also considered to be between functions, and all instructions are considered to be between functions if there are no functions.

Only instructions that have already been disassembled are considered.
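The definition above can be sketched over a list of instruction owners in address order. NO_FUNC, Gap, and find_gaps() are hypothetical names. The sketch treats a gap whose two neighbors belong to the same function as intra-function rather than inter-function; that distinction is an assumption borrowed from the description of scan_intrafunc_insns().

```cpp
#include <cstddef>
#include <vector>

constexpr int NO_FUNC = -1;   // hypothetical marker: instruction not assigned to any function

struct Gap {
    std::size_t first;   // index of the first unassigned instruction in the gap
    std::size_t count;   // number of unassigned instructions in the gap
    bool inter;          // true if the gap lies between functions
};

// owners[i] is the id of the function owning the i-th instruction in address order.
std::vector<Gap> find_gaps(const std::vector<int> &owners) {
    std::vector<Gap> gaps;
    std::size_t i = 0;
    while (i < owners.size()) {
        if (owners[i] != NO_FUNC) {
            ++i;
            continue;
        }
        std::size_t first = i;
        while (i < owners.size() && owners[i] == NO_FUNC)
            ++i;
        int lower = first > 0 ? owners[first - 1] : NO_FUNC;
        int upper = i < owners.size() ? owners[i] : NO_FUNC;
        // A missing neighbor on either side also counts as inter-function.
        bool inter = lower == NO_FUNC || upper == NO_FUNC || lower != upper;
        gaps.push_back(Gap{first, i - first, inter});
    }
    return gaps;
}
```

With no functions at all, the whole input forms a single inter-function gap, as the text states.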

Definition at line 1687 of file Partitioner.C.

References ROSE_Callbacks::List< T >::empty(), Partitioner::BasicBlock::function, and ROSE_Callbacks::List< T >::prepend().

Referenced by scan_interfunc_insns().

void Partitioner::scan_interfunc_insns ( InsnRangeCallback callback)
inline

Scans the instructions between functions.

The specified callbacks are invoked for each set of instructions (not necessarily contiguous in memory) that fall "between" two functions. Instruction I(x) at address x is between two functions, Fa and Fb, if there exists a lower address a<x such that I(a) belongs to Fa and there exists a higher address b>x such that I(b) belongs to Fb; and for all instructions I(y) for a<y<b, I(y) does not belong to any function.

Additionally, if no I(a) exists that belongs to a function, and/or no I(b) exists that belongs to a function, then I(x) is also considered part of an inter-function region and the lower and/or upper functions are undefined. In other words, instructions appearing before all functions or after all functions are also considered to be between functions, and all instructions are considered to be between functions if there are no functions.

Only instructions that have already been disassembled are considered.

Definition at line 1148 of file Partitioner.h.

References scan_interfunc_insns().

void Partitioner::scan_unassigned_bytes ( ByteRangeCallbacks callbacks,
MemoryMap restrict_map = NULL 
)
virtual

Scans ranges of the address space that have not been assigned to any function.

For each contiguous range of address space that is not associated with any function, each of the specified callbacks is invoked in turn until one of them returns false. The determination of what parts of the address space belong to functions is made before any of the callbacks are invoked and not updated for the duration of this function. The determination is made by calling Partitioner::function_extent() across all known functions, and then passing that mapping to each of the callbacks.

If a restrict_map MemoryMap is specified then only addresses that are also defined in the map are considered.
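Finding the unassigned parts of the address space amounts to inverting the set of function-owned ranges. The sketch below shows that inversion on a plain std::map; Range and unassigned() are hypothetical names, the ranges are assumed to be sorted and non-overlapping, and the optional restriction map is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

struct Range {
    std::uint64_t first, last;   // inclusive bounds (hypothetical type)
};

// assigned maps the first address of each function-owned range to its last address.
// Returns the holes that remain inside the inclusive address space `space`.
std::vector<Range> unassigned(const std::map<std::uint64_t, std::uint64_t> &assigned,
                              Range space) {
    assert(space.first <= space.last);
    std::vector<Range> holes;
    std::uint64_t cursor = space.first;      // lowest address not yet classified
    for (const auto &r : assigned) {
        if (r.second < cursor)
            continue;                        // range lies entirely below the cursor
        if (r.first > space.last)
            break;                           // range lies entirely above the space
        if (r.first > cursor)
            holes.push_back({cursor, r.first - 1});
        if (r.second >= space.last)
            return holes;                    // range covers the rest of the space
        cursor = r.second + 1;
    }
    holes.push_back({cursor, space.last});   // tail after the last assigned range
    return holes;
}
```

Because the holes are computed once up front, later changes to function ownership do not affect them, matching the behavior described for this method.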

Definition at line 1739 of file Partitioner.C.

References ROSE_Callbacks::List< T >::apply(), RangeMap< R, T >::begin(), ROSE_Callbacks::List< T >::empty(), RangeMap< R, T >::end(), RangeMap< R, T >::erase_ranges(), RangeMap< R, T >::invert(), and MemoryMap::va_extents().

Referenced by scan_unassigned_bytes().

void Partitioner::scan_unassigned_bytes ( ByteRangeCallback callback,
MemoryMap restrict_map = NULL 
)
inline

Scans ranges of the address space that have not been assigned to any function.

For each contiguous range of address space that is not associated with any function, each of the specified callbacks is invoked in turn until one of them returns false. The determination of what parts of the address space belong to functions is made before any of the callbacks are invoked and not updated for the duration of this function. The determination is made by calling Partitioner::function_extent() across all known functions, and then passing that mapping to each of the callbacks.

If a restrict_map MemoryMap is specified then only addresses that are also defined in the map are considered.

Definition at line 1164 of file Partitioner.h.

References scan_unassigned_bytes().

void Partitioner::scan_intrafunc_bytes ( ByteRangeCallbacks callbacks,
MemoryMap restrict_map = NULL 
)
virtual

Scans unassigned ranges of the address space within a function.

The specified callbacks are invoked for each range of the address space whose closest surrounding assigned addresses both belong to the same function. This can be used, for example, to discover static data or unreachable instructions (by static analysis) that should probably belong to the surrounding function.

If a restrict_map MemoryMap is specified then only addresses that are also defined in the map are considered.

Definition at line 1760 of file Partitioner.C.

References ROSE_Callbacks::List< T >::empty(), Range< rose_addr_t >::maximum(), Range< rose_addr_t >::minimum(), and ROSE_Callbacks::List< T >::prepend().

Referenced by scan_intrafunc_bytes().

void Partitioner::scan_intrafunc_bytes ( ByteRangeCallback callback,
MemoryMap restrict_map = NULL 
)
inline

Scans unassigned ranges of the address space within a function.

The specified callbacks are invoked for each range of the address space whose closest surrounding assigned addresses both belong to the same function. This can be used, for example, to discover static data or unreachable instructions (by static analysis) that should probably belong to the surrounding function.

If a restrict_map MemoryMap is specified then only addresses that are also defined in the map are considered.

Definition at line 1179 of file Partitioner.h.

References scan_intrafunc_bytes().

void Partitioner::scan_interfunc_bytes ( ByteRangeCallbacks callbacks,
MemoryMap restrict_map = NULL 
)
virtual

Scans unassigned ranges of the address space between functions.

The specified callbacks are invoked for each range of addresses that fall "between" two functions. An address is between two functions if the next lower assigned address belongs to one function and the next higher assigned address belongs to some other function, or if there is no assigned lower address and/or no assigned higher address.

If a restrict_map MemoryMap is specified then only addresses that are also defined in the map are considered.

Definition at line 1793 of file Partitioner.C.

References ROSE_Callbacks::List< T >::empty(), Range< rose_addr_t >::maximum(), Range< rose_addr_t >::minimum(), and ROSE_Callbacks::List< T >::prepend().

Referenced by scan_interfunc_bytes().

void Partitioner::scan_interfunc_bytes ( ByteRangeCallback callback,
MemoryMap restrict_map = NULL 
)
inline

Scans unassigned ranges of the address space between functions.

The specified callbacks are invoked for each range of addresses that fall "between" two functions. An address is between two functions if the next lower assigned address belongs to one function and the next higher assigned address belongs to some other function, or if there is no assigned lower address and/or no assigned higher address.

If a restrict_map MemoryMap is specified then only addresses that are also defined in the map are considered.

Definition at line 1194 of file Partitioner.h.

References scan_interfunc_bytes().

Partitioner::InstructionMap::const_iterator Partitioner::pattern1 ( const InstructionMap insns,
InstructionMap::const_iterator  first,
Disassembler::AddressSet exclude 
)
static protected

Looks for stack frame setup.

Tries to match "(mov rdi,rdi)?; push rbp; mov rbp,rsp" (or the 32-bit equivalent). The first MOV instruction is a two-byte no-op used for hot patching of executables (single instruction rather than two NOP instructions so that no thread is executing at the second byte when the MOV is replaced by a JMP). The PUSH and second MOV are the standard way to set up the stack frame.
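The pattern can be illustrated on a simplified instruction model. The Kind, Reg, and Insn types and matches_frame_setup() are hypothetical; the real method inspects SgAsmx86Instruction operands and also takes an exclusion set, which this sketch omits. Register widths are ignored, so the same test covers the 32-bit and 64-bit forms.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical decoded-instruction model.
enum class Kind { Mov, Push, Other };
enum class Reg { None, BP, SP, DI };

struct Insn {
    Kind kind;
    Reg dst;   // destination (or sole) register operand
    Reg src;   // source register operand, Reg::None if absent
};

// Returns true if the instructions starting at index `first` match
// "(mov di,di)?; push bp; mov bp,sp".
bool matches_frame_setup(const std::vector<Insn> &insns, std::size_t first) {
    auto is = [&insns](std::size_t k, Kind kind, Reg dst, Reg src) {
        return k < insns.size() && insns[k].kind == kind &&
               insns[k].dst == dst && insns[k].src == src;
    };
    std::size_t i = first;
    if (is(i, Kind::Mov, Reg::DI, Reg::DI))
        ++i;                                   // optional two-byte hot-patch no-op
    return is(i, Kind::Push, Reg::BP, Reg::None) &&
           is(i + 1, Kind::Mov, Reg::BP, Reg::SP);
}
```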

Definition at line 1411 of file Partitioner.C.

References SgAsmRegisterReferenceExpression::get_descriptor(), SgAsmx86Instruction::get_kind(), RegisterDescriptor::get_major(), RegisterDescriptor::get_minor(), SgAsmInstruction::get_operandList(), SgAsmOperandList::get_operands(), SgAsmInstruction::get_size(), isSgAsmx86Instruction(), isSgAsmx86RegisterReferenceExpression(), x86_gpr_bp, x86_gpr_di, x86_gpr_sp, x86_mov, x86_push, and x86_regclass_gpr.

void Partitioner::append ( BasicBlock bb,
DataBlock db,
unsigned  reason 
)
virtual

Associate a data block with a basic block.

Any basic block can point to zero or more data blocks. The data block will then be kept with the same function as the basic block. This is typically used for things like jump tables, where the last instruction of the basic block is an indirect jump, and the data block contains the jump table. When a basic block is truncated, it loses its data blocks.

A data block's explicit function assignment (i.e., its "function" member) overrides its assignment via a basic block. A data block can be assigned to at most one basic block.
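The precedence rule can be sketched as a small lookup. The structs below are stripped-down stand-ins that keep only the members named in this section (function and basic_block), and effective_function() is a hypothetical helper, not part of the Partitioner interface.

```cpp
// Stand-in types; only the members needed for the lookup are modeled.
struct Function {};

struct BasicBlock {
    Function *function = nullptr;        // function that owns this basic block, if any
};

struct DataBlock {
    Function *function = nullptr;        // explicit function assignment, if any
    BasicBlock *basic_block = nullptr;   // the (at most one) associated basic block
};

// Function a data block ends up in: the explicit assignment wins; otherwise the
// data block follows its basic block.
Function *effective_function(const DataBlock &db) {
    if (db.function)
        return db.function;
    return db.basic_block ? db.basic_block->function : nullptr;
}
```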

The reason argument is a bit vector of SgAsmBlock::Reason bits that are added to the data block's reasons for existing.

Definition at line 731 of file Partitioner.C.

References Partitioner::DataBlock::basic_block, Partitioner::BasicBlock::data_blocks, and Partitioner::DataBlock::reason.

void Partitioner::append ( Function f,
BasicBlock bb,
unsigned  reason,
bool  keep = false 
)
virtual

Append basic block to function.

This method is a bit of a misnomer because the order in which blocks are appended to a function is irrelevant: the blocks are stored in a map ordered by block entry address. The block being appended must not already belong to some other function, but it's fine if the block already belongs to the function to which it is being appended (it is not added a second time).

Whenever a block is added to a function, a reason for adding it should be supplied. The reason argument is a bit vector of those reasons; the bits are from the SgAsmBlock::Reason enum.

If the keep argument is true, then the block's entry address is also added to the function's list of control flow graph (CFG) heads. These are the addresses of blocks which are used to start the recursive CFG analysis phase of function block discovery. The function's entry address is always considered a CFG head even if it doesn't appear in the set of heads.

Definition at line 756 of file Partitioner.C.

References Partitioner::BasicBlock::address(), Partitioner::Function::basic_blocks, Partitioner::BasicBlock::cache, Partitioner::BasicBlock::function, Partitioner::BlockAnalysisCache::function_return, Partitioner::Function::heads, Partitioner::Function::promote_may_return(), Partitioner::BasicBlock::reason, and SgAsmFunction::RET_SOMETIMES.

void Partitioner::append ( Function func,
DataBlock block,
unsigned  reason,
bool  force = false 
)
virtual

Append data region to function.

This method is a bit of a misnomer because the order in which data blocks are appended to the function is irrelevant: the blocks are stored in a map ordered by block address. The data block being appended must not already belong to some other function, but it's fine if the block already belongs to the function to which it is being appended (it is not added a second time).

Whenever a block is added to a function, a reason for adding it should be supplied. The reason argument is a bit vector of those reasons; the bits are from the SgAsmBlock::Reason enum.

If force is true then the data block is first removed from any basic block or function to which it already belongs.

Definition at line 793 of file Partitioner.C.

References Partitioner::DataBlock::address(), Partitioner::DataBlock::basic_block, Partitioner::Function::data_blocks, Partitioner::DataBlock::function, and Partitioner::DataBlock::reason.

void Partitioner::remove ( Function f,
BasicBlock bb 
)
virtual

Remove a basic block from a function.

The block and function continue to exist; only the association between them is broken.

Definition at line 817 of file Partitioner.C.

References Partitioner::BasicBlock::address(), Partitioner::Function::basic_blocks, and Partitioner::BasicBlock::function.

void Partitioner::remove ( Function f,
DataBlock db 
)
virtual

Remove a data block from a function.

The block and function continue to exist; only the association between them is broken. The data block might also be associated with a basic block, in which case the data block will ultimately belong to the same function as the basic block.

Definition at line 830 of file Partitioner.C.

References Partitioner::DataBlock::address(), Partitioner::Function::data_blocks, and Partitioner::DataBlock::function.

void Partitioner::remove ( BasicBlock bb,
DataBlock db 
)
virtual

Remove a data block from a basic block.

The blocks continue to exist; only the association between them is broken. The data block might still be associated with a function, in which case it will ultimately end up in that function.

Definition at line 842 of file Partitioner.C.

References Partitioner::DataBlock::basic_block, and Partitioner::BasicBlock::data_blocks.

Partitioner::BasicBlock * Partitioner::find_bb_containing ( rose_addr_t  va,
bool  create = true 
)
virtual

Finds a basic block containing the specified instruction address.

If no basic block exists and create is set, then a new block is created which starts at the specified address. The return value, in the case when a block already exists, may be a block where the specified virtual address is either the beginning of the block or somewhere inside the block. In any case, the virtual address will always represent a function.

If no instruction can be found at the specified address then no block is created and a null pointer is returned.

Blocks are created by adding the initial instruction to the block, then repeatedly attempting to add more instructions as follows: if the block successors can all be statically determined, and there is exactly one successor, and that successor is not already part of a block, then the successor is appended to the block.

Block creation is recursive in nature since the computation of a (partial) block's successors might require creation of other blocks. Consider the case of an x86 CALL instruction: after a CALL is appended to a block, the successors are calculated by looking at the target of the CALL. If the target is known and it can be proved that the target block (recursively constructed) discards the return address, then the fall-through address of the CALL is not a direct successor.
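The extension rule (keep appending while there is exactly one statically known successor that is not yet in a block) can be sketched as follows. Insn, build_block(), and the claimed set are hypothetical simplifications: successor sets are given as input rather than computed, so the recursive call analysis described above is not modeled, and an address that already belongs to a block yields an empty result instead of the existing block.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Addr = std::uint64_t;

// Hypothetical instruction record: successor addresses plus whether the set is fully known.
struct Insn {
    std::vector<Addr> successors;
    bool complete;
};

// Grows a block starting at va. claimed holds addresses already in some block and is
// updated as instructions are added. Returns the addresses of the new block's instructions.
std::vector<Addr> build_block(const std::map<Addr, Insn> &insns,
                              std::set<Addr> &claimed, Addr va) {
    std::vector<Addr> block;
    auto it = insns.find(va);
    // Stop when there is no instruction at the address or it is already in a block.
    while (it != insns.end() && claimed.insert(it->first).second) {
        block.push_back(it->first);
        const Insn &insn = it->second;
        if (!insn.complete || insn.successors.size() != 1)
            break;                        // unknown or multiple successors end the block
        it = insns.find(insn.successors.front());
    }
    return block;
}
```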

See also, set_allow_discontiguous_blocks().

Definition at line 933 of file Partitioner.C.

References Partitioner::Instruction::bblock, SgAsmFunction::FUNC_CALL_TARGET, Partitioner::Instruction::get_size(), and Partitioner::Instruction::terminates_basic_block().

Referenced by Partitioner::FindInsnPadding::operator()(), Partitioner::FindFunctionFragments::operator()(), Partitioner::FindThunks::operator()(), Partitioner::FindThunkTables::operator()(), and Partitioner::FindPostFunctionInsns::operator()().

Partitioner::BasicBlock * Partitioner::find_bb_starting ( rose_addr_t  va,
bool  create = true 
)
virtual

Makes sure the block at the specified address exists.

This is similar to find_bb_containing() except it makes sure that va starts a new basic block if it was previously in the middle of a block. If an existing block had to be truncated to start this new block then the original block's function is marked as pending rediscovery.

Definition at line 981 of file Partitioner.C.

References Partitioner::BasicBlock::address(), Partitioner::BasicBlock::function, Partitioner::BasicBlock::insns, and Partitioner::Function::pending.

Referenced by Partitioner::FindInsnPadding::operator()(), Partitioner::FindFunctionFragments::operator()(), Partitioner::FindThunks::operator()(), and Partitioner::FindThunkTables::operator()().

Partitioner::DataBlock * Partitioner::find_db_starting ( rose_addr_t  start_va,
size_t  size 
)
virtual

Finds (or creates) a data block.

Finds a data block starting at the specified address. If size is non-zero then the existing data block must contain all bytes in the range start_va (inclusive) to start_va + size (exclusive), and if it doesn't then a new SgAsmStaticData node is created and appended to either a new data block or a data block that already begins at the specified address. If size is zero and no block exists, then the null pointer is returned. The size of the existing block does not matter if size is zero.

Definition at line 2094 of file Partitioner.C.

References RangeMap< R, T >::empty(), RangeMap< R, T >::erase_ranges(), RangeMap< R, T >::insert(), MemoryMap::MM_PROT_NONE, Partitioner::DataBlock::nodes, and SgAsmStatement::set_address().

Referenced by Partitioner::FindDataPadding::operator()(), Partitioner::FindData::operator()(), and Partitioner::FindInsnPadding::operator()().

Disassembler::AddressSet Partitioner::successors ( BasicBlock bb,
bool *  complete = NULL 
)
virtual

Returns known successors of a basic block.

There are two types of successor analyses: one is an analysis that depends only on the instructions of the basic block for which successors are being calculated. It is safe to cache these based on properties of the block itself (e.g., the number of instructions in the block).

The other category is analyses that depend on other blocks, such as determining whether the target of an x86 CALL instruction returns to the instruction after the CALL site. The results of these analyses cannot be cached at the block that needs them and must be recomputed for each call. However, they can be cached at either the block or function that's analyzed, so recomputing them here in this block is probably not too expensive.

All successor addresses are translated according to the alias_for links in existing blocks via calls to canonic_block().

Definition at line 354 of file Partitioner.C.

References Partitioner::BasicBlock::cache, Partitioner::BasicBlock::function, Partitioner::Instruction::get_address(), Partitioner::Instruction::get_size(), Partitioner::BlockAnalysisCache::is_function_call, Partitioner::BasicBlock::last_insn(), Partitioner::Function::possible_may_return(), Partitioner::BlockAnalysisCache::sucs, and Partitioner::BlockAnalysisCache::sucs_complete.

Referenced by Partitioner::FindThunkTables::operator()().

rose_addr_t Partitioner::call_target ( BasicBlock bb)
virtual

Returns call target if block could be a function call.

If the specified block looks like it could be a function call (using only local analysis) then return the call target address. If the block does not look like a function call or the target address cannot be statically computed, then return Partitioner::NO_TARGET.

Definition at line 407 of file Partitioner.C.

References Partitioner::BasicBlock::cache, and Partitioner::BlockAnalysisCache::call_target.

void Partitioner::truncate ( BasicBlock bb,
rose_addr_t  va 
)
virtual

Reduces the size of a basic block by truncating its list of instructions.

The new block contains the initial instructions up to, but not including, the instruction at the specified virtual address. The addresses of the instructions (aside from the instruction at the specified split point) are irrelevant, since the choice of where to split is based on relative positions in the basic block's instruction vector rather than on instruction addresses.

If this basic block's size decreased, then any data blocks associated with this basic block are no longer associated with this basic block.
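A position-based truncation can be sketched on a vector of instruction addresses. Block and its members are hypothetical stand-ins, and silently returning when the address is not found in the block is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a basic block.
struct Block {
    std::vector<std::uint64_t> insns;   // instruction addresses in block order
    std::vector<int> data_blocks;       // ids of associated data blocks
};

// Drops the instruction at va and everything after it in the instruction vector.
void truncate(Block &bb, std::uint64_t va) {
    auto split = std::find(bb.insns.begin(), bb.insns.end(), va);
    if (split == bb.insns.end())
        return;                          // assumption: unknown address leaves the block as-is
    bb.insns.erase(split, bb.insns.end());
    bb.data_blocks.clear();              // the block shrank, so it loses its data blocks
}
```

Since the split point is located by position, a block whose instructions are not in ascending address order is still cut at the right place.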

Definition at line 688 of file Partitioner.C.

References Partitioner::Instruction::bblock, Partitioner::BasicBlock::clear_data_blocks(), and Partitioner::BasicBlock::insns.

void Partitioner::discover_first_block ( Function func)
virtual

Adds first basic block to empty function before we start discovering blocks of any other functions.

This protects against cases where one function simply falls through to another within a basic block, such as:

    08048460 <foo>:
     8048460:  55                    push   ebp
     8048461:  89 e5                 mov    ebp,esp
     8048463:  83 ec 08              sub    esp,0x8
     8048466:  c7 04 24 d4 85 04 08  mov    DWORD PTR [esp],0x80485d4
     804846d:  e8 8e fe ff ff        call   8048300 <puts>
     8048472:  c7 04 24 00 00 00 00  mov    DWORD PTR [esp],0x0
     8048479:  e8 a2 fe ff ff        call   8048320 <_exit>
     804847e:  89 f6                 mov    esi,esi

    08048480 <handler>:
     8048480:  55                    push   ebp
     8048481:  89 e5                 mov    ebp,esp
     8048483:  83 ec 08              sub    esp,0x8

Definition at line 2742 of file Partitioner.C.

References Partitioner::BasicBlock::address(), SgAsmBlock::BLK_ENTRY_POINT, Partitioner::Function::entry_va, Partitioner::BasicBlock::function, Partitioner::BasicBlock::insns, Partitioner::Function::name, Partitioner::Function::pending, Partitioner::Function::reason, SgAsmFunction::reason_str(), and Partitioner::Function::show_properties().

void Partitioner::discover_blocks ( Function f,
unsigned  reason 
)
virtual
void Partitioner::discover_blocks ( Function f,
rose_addr_t  va,
unsigned  reason 
)
virtual

Discover the basic blocks that belong to the current function.

This function recursively adds basic blocks to function f by following the successors of each block. If a successor is an instruction belonging to some other function then it's either a function call (if it branches to the entry point of that function) or it's a collision. Collisions are resolved by discarding and rediscovering the blocks of the other function.
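The traversal can be sketched on a graph of block addresses. Graph and discover() are hypothetical names, functions are reduced to integer ids, and a collision is only recorded here; the discard-and-rediscover step described above is left out.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Addr = std::uint64_t;

// Hypothetical, simplified view of the partitioner's state.
struct Graph {
    std::map<Addr, std::vector<Addr>> successors;   // block address -> successor blocks
    std::map<Addr, int> owner;                      // block address -> owning function id
    std::map<int, Addr> entry;                      // function id -> entry address
};

// Adds the blocks reachable from va to function f. Functions whose non-entry blocks
// are reached are collected in collisions.
void discover(Graph &g, int f, Addr va, std::set<int> &collisions) {
    auto own = g.owner.find(va);
    if (own != g.owner.end()) {
        if (own->second == f)
            return;                                 // already part of this function
        if (g.entry.at(own->second) == va)
            return;                                 // branch to another function's entry: a call
        collisions.insert(own->second);             // landed inside another function
        return;
    }
    g.owner[va] = f;
    auto s = g.successors.find(va);
    if (s == g.successors.end())
        return;
    for (Addr next : s->second)
        discover(g, f, next, collisions);
}
```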

Definition at line 2790 of file Partitioner.C.

References Partitioner::BasicBlock::address(), Partitioner::Function::entry_va, SgAsmFunction::FUNC_CALL_TARGET, SgAsmFunction::FUNC_GRAPH, Partitioner::BasicBlock::function, Partitioner::BasicBlock::insns, Partitioner::Function::name, and Partitioner::Function::pending.

bool Partitioner::pops_return_address ( rose_addr_t  va)
virtual

Determines if a block pops the stack without returning.

Definition at line 422 of file Partitioner.C.

References SgAsmx86Instruction::get_kind(), isSgAsmx86Instruction(), x86_ret, and x86_segreg_ss.

void Partitioner::update_analyses ( BasicBlock bb)
virtual
rose_addr_t Partitioner::canonic_block ( rose_addr_t  va)
virtual

Follow alias links in basic blocks.

Follows alias_for links in basic blocks.

The input value is the virtual address of a basic block (which need not exist). We recursively look up the specified block and follow its alias_for link until either the block does not exist or it has no alias_for.
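The lookup described above can be sketched as follows. This is a minimal illustration, not the Partitioner's code: the Block struct, the BlockMap type, the has_alias flag, and the hop limit are all assumptions made for the example, and the loop is iterative rather than recursive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

typedef std::uint64_t addr_t;          // stand-in for rose_addr_t

// Hypothetical, simplified block record: only the alias link is modeled.
struct Block {
    bool has_alias;                    // true if this block is an alias for another block
    addr_t alias_for;                  // address of the aliased block when has_alias is true
};

typedef std::map<addr_t, Block> BlockMap;

// Follow alias links starting at va until the block does not exist or has no alias.
// The hop limit is an assumption of this sketch; it guards against cyclic links.
addr_t canonic_block_sketch(const BlockMap &blocks, addr_t va, std::size_t max_hops = 1024) {
    assert(max_hops > 0);
    for (std::size_t i = 0; i < max_hops; ++i) {
        BlockMap::const_iterator found = blocks.find(va);
        if (found == blocks.end() || !found->second.has_alias)
            return va;                 // block missing or not an alias: va is canonical
        va = found->second.alias_for;  // hop to the aliased block
    }
    return va;                         // gave up after max_hops links
}
```

An address with no block in the map is returned unchanged, which matches the statement that the input block need not exist.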

Definition at line 1005 of file Partitioner.C.

References Partitioner::BlockAnalysisCache::alias_for, and Partitioner::BasicBlock::cache.

bool Partitioner::is_function_call ( BasicBlock bb,
rose_addr_t target_va 
)
virtual

Returns true if basic block appears to end with a function call.

If the call target can be determined and target_va is non-null, then target_va will be initialized to contain the virtual address of the call target; otherwise it will contain the constant NO_TARGET.

Definition at line 334 of file Partitioner.C.

References Partitioner::BasicBlock::cache, Partitioner::BlockAnalysisCache::call_target, and Partitioner::BlockAnalysisCache::is_function_call.

bool Partitioner::is_thunk ( Function func)
virtual

Determines if function is a thunk.

A thunk is a small piece of code (a function) whose only purpose is to branch to another function. This predicate should not be confused with the SgAsmFunction::FUNC_THUNK reason bit; the latter is only an indication of why the function was originally created. A thunk (as defined by this predicate) might not have the FUNC_THUNK reason bit set if this function was detected by other means (such as being a target of a function call). Conversely, a function that has the FUNC_THUNK reason bit set might not qualify as being a thunk by the definition implemented in this predicate (additional blocks or instructions might have been discovered that disqualify this function even though it was originally thought to be a thunk).

Definition at line 2396 of file Partitioner.C.

References Partitioner::Function::basic_blocks, SgAsmFunction::FUNC_LEFTOVERS, SgAsmFunction::FUNC_PADDING, SgAsmx86Instruction::get_kind(), Partitioner::BasicBlock::insns, isSgAsmx86Instruction(), Partitioner::Function::reason, x86_farjmp, and x86_jmp.

Partitioner::Function * Partitioner::effective_function ( DataBlock dblock)
virtual

Returns the function to which this data block is effectively assigned.

This returns, in this order, the function to which this data block is explicitly assigned, the function to which this block is implicitly assigned via an association with a basic block, or a null pointer.
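The precedence can be sketched in a few lines. The structs below are stripped-down stand-ins that keep only the pointer members named in the references (DataBlock::function, DataBlock::basic_block, BasicBlock::function); the function name effective_function_sketch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins; the real classes carry much more state.
struct Function   { const char *name; };
struct BasicBlock { Function *function; };       // owning function, or null
struct DataBlock {
    Function   *function;                        // explicit assignment, or null
    BasicBlock *basic_block;                     // associated basic block, or null
};

// Explicit owner first, then the owner of the associated basic block, then null.
Function *effective_function_sketch(const DataBlock *dblock) {
    assert(dblock != NULL);
    if (dblock->function)
        return dblock->function;
    if (dblock->basic_block)
        return dblock->basic_block->function;
    return NULL;
}
```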

Definition at line 506 of file Partitioner.C.

References Partitioner::DataBlock::basic_block, Partitioner::BasicBlock::function, and Partitioner::DataBlock::function.

void Partitioner::mark_call_insns ( )
virtual

Naive marking of CALL instruction targets as functions.

Definition at line 1620 of file Partitioner.C.

References SgAsmFunction::FUNC_CALL_TARGET.

void Partitioner::mark_export_entries ( SgAsmGenericHeader fhdr)
virtual
void Partitioner::mark_func_patterns ( )
virtual

Seeds functions according to byte and instruction patterns.

Note that the instruction pattern matcher looks only at existing instructions–it does not actively disassemble new instructions. In other words, this matcher is intended mostly for passive-mode partitioners where the disassembler has already disassembled everything it can. The byte pattern matcher works whether or not instructions are available.

Definition at line 1560 of file Partitioner.C.

References add_function(), and SgAsmFunction::FUNC_PATTERN.

void Partitioner::name_plt_entries ( SgAsmGenericHeader fhdr)
virtual

Gives names to dynamic linking trampolines for ELF.

This method gives names to the dynamic linking trampolines in the .plt section if the Partitioner detected them as functions. If mark_elf_plt_entries() was called then they all would have been marked as functions and given names. Otherwise, ROSE might have detected some of them in other ways (like CFG analysis) and this function will give them names.

Definition at line 2525 of file Partitioner.C.

References SgAsmGenericHeader::get_base_va(), SgAsmElfRelocEntryList::get_entries(), SgAsmGenericSection::get_mapped_preferred_rva(), SgAsmGenericSection::get_mapped_size(), SgAsmGenericSymbol::get_name(), SgAsmElfRelocEntry::get_r_offset(), SgAsmGenericHeader::get_section_by_name(), SgAsmGenericHeader::get_sections(), SgAsmGenericSectionList::get_sections(), SgAsmGenericString::get_string(), SgAsmElfRelocEntry::get_sym(), SgAsmElfSymbolSection::get_symbols(), SgAsmElfSymbolList::get_symbols(), SgAsmGenericSection::is_mapped(), isSgAsmElfFileHeader(), isSgAsmElfRelocSection(), isSgAsmElfSymbolSection(), and isSgAsmx86Instruction().

void Partitioner::name_import_entries ( SgAsmGenericHeader fhdr)
virtual

Gives names to dynamic linking thunks for PE.

This method gives names to thunks for imported functions. The thunks must have already been detected by the partitioner; this method does not create new functions. The algorithm scans the list of unnamed functions looking for functions whose entry instruction is an indirect jump. When one is found, the method checks whether the jump goes through a memory address that is part of an import address table. If so, it uses the corresponding import name as the name of this function and appends "@import". The "@import" suffix distinguishes the thunk that jumps to a function from the actual function whose name is given in the PE Import Section. It's possible to have multiple thunks that all jump to the same imported function and thus all have the same name.

Definition at line 2604 of file Partitioner.C.

References Partitioner::Function::entry_va, SgAsmMemoryReferenceExpression::get_address(), SgAsmx86Instruction::get_kind(), SgAsmInstruction::get_operandList(), SgAsmOperandList::get_operands(), isSgAsmMemoryReferenceExpression(), isSgAsmPEFileHeader(), isSgAsmPEImportItem(), isSgAsmValueExpression(), isSgAsmx86Instruction(), Partitioner::Function::name, name, preorder, x86_farjmp, and x86_jmp.

void Partitioner::find_pe_iat_extents ( SgAsmGenericHeader hdr)
virtual

Find the addresses for all PE Import Address Tables.

Adds them to Partitioner::pe_iat_extents.

Definition at line 2663 of file Partitioner.C.

References SgAsmGenericHeader::get_sections_by_name().

size_t Partitioner::function_extent ( FunctionRangeMap extents)
virtual

Adds extents for all defined functions.

Scans across all known functions and adds their extents to the specified RangeMap argument. Returns the sum of the return values from the single-function function_extent() method.

Definition at line 3590 of file Partitioner.C.

Referenced by Partitioner::FindFunctionFragments::operator()().

size_t Partitioner::function_extent ( Function func,
FunctionRangeMap extents = NULL,
rose_addr_t lo_addr = NULL,
rose_addr_t hi_addr = NULL 
)
virtual

Returns information about the function addresses.

Every non-empty function has a minimum (inclusive) and maximum (exclusive) address which are returned by reference, but not all functions own all the bytes within that range of addresses. Therefore, the exact bytes are returned by adding them to the optional ExtentMap argument. This function returns the number of nodes (instructions and static data items) in the function. If the function contains no nodes then the extent map is not modified and the low and high addresses are both set to zero.

See also: SgAsmFunction::get_extent(), which calculates the same information but can be used only after we've constructed the AST for the function.
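As an illustration of the low/high address convention, the following sketch computes an inclusive minimum and exclusive maximum over a list of nodes. It is not the ROSE implementation: the Node pair and extent_sketch name are assumptions, and the extent map is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

typedef std::uint64_t addr_t;                    // stand-in for rose_addr_t
typedef std::pair<addr_t, addr_t> Node;          // (starting address, size in bytes)

// Computes the inclusive low and exclusive high address over all nodes and
// returns the node count. Either output pointer may be null. With no nodes,
// both addresses are set to zero.
std::size_t extent_sketch(const std::vector<Node> &nodes, addr_t *lo_addr, addr_t *hi_addr) {
    addr_t lo = 0, hi = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        addr_t begin = nodes[i].first;
        addr_t end = begin + nodes[i].second;    // one past the last byte
        lo = (i == 0) ? begin : std::min(lo, begin);
        hi = (i == 0) ? end : std::max(hi, end);
    }
    assert(lo <= hi);
    if (lo_addr) *lo_addr = lo;
    if (hi_addr) *hi_addr = hi;
    return nodes.size();
}
```

Note that the range [lo, hi) may include bytes the function does not own, which is why the real method also records exact bytes in the extent map.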

Definition at line 3599 of file Partitioner.C.

References Partitioner::Function::basic_blocks, RangeMap< R, T >::begin(), Partitioner::BasicBlock::data_blocks, Partitioner::Function::data_blocks, RangeMap< R, T >::end(), RangeMap< R, T >::insert(), Partitioner::BasicBlock::insns, and max.

size_t Partitioner::datablock_extent ( DataBlock db,
DataRangeMap extents = NULL,
rose_addr_t lo_addr = NULL,
rose_addr_t hi_addr = NULL 
)
virtual

Returns information about the datablock addresses.

Every data block has a minimum (inclusive) and maximum (exclusive) address which are returned by reference, but some of the addresses in that range might not be owned by the specified data block. Therefore, the exact bytes are returned by adding them to the optional ExtentMap argument. This function returns the number of nodes (static data items) in the data block. If the data block contains no nodes then the extent map is not modified, the low and high addresses are both set to zero, and the return value is zero.

Definition at line 3695 of file Partitioner.C.

References SgAsmStatement::get_address(), SgAsmStaticData::get_size(), RangeMap< R, T >::insert(), max, and Partitioner::DataBlock::nodes.

Referenced by fixup_pointers().

size_t Partitioner::datablock_extent ( DataRangeMap extent)
virtual

Adds assigned datablocks to extent.

Scans across all known data blocks and for any block that's assigned to a function, adds that block's extents to the supplied RangeMap. Return value is the number of data blocks added.

Definition at line 3681 of file Partitioner.C.

size_t Partitioner::padding_extent ( DataRangeMap extent)
virtual

Adds padding datablocks to extent.

Scans across all known data blocks, and for any padding block that's assigned to a function, adds that block's extents to the supplied RangeMap. Return value is the number of padding blocks added.

Definition at line 3667 of file Partitioner.C.

References SgAsmBlock::BLK_PADDING, and Partitioner::DataBlock::reason.

Referenced by Partitioner::FindData::operator()(), and Partitioner::FindInterPadFunctions::operator()().

bool Partitioner::is_contiguous ( Function func,
bool  strict = false 
)
virtual

Returns an indication of whether a function is contiguous.

All empty functions are contiguous. If strict is true, then a function is contiguous if it owns all bytes in a contiguous range of the address space. If strict is false then the definition is relaxed so that the instructions need not be contiguous in memory as long as no other function owns any of the bytes between this function's low and high address range.
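The difference between the strict and relaxed definitions can be sketched with a flat list of owned byte ranges. The Range struct, the integer owner ids, and is_contiguous_sketch are hypothetical; the real method works on the Partitioner's own range maps.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::uint64_t addr_t;                    // stand-in for rose_addr_t

// A half-open byte range [begin, end) owned by the function with id `owner`.
struct Range { addr_t begin; addr_t end; int owner; };

static bool by_begin(const Range &a, const Range &b) { return a.begin < b.begin; }

bool is_contiguous_sketch(std::vector<Range> ranges, int func, bool strict = false) {
    std::sort(ranges.begin(), ranges.end(), by_begin);

    // Find the low (inclusive) and high (exclusive) addresses owned by func.
    bool seen = false;
    addr_t lo = 0, hi = 0;
    for (std::size_t i = 0; i < ranges.size(); ++i) {
        const Range &r = ranges[i];
        assert(r.begin <= r.end);
        if (r.owner != func) continue;
        lo = seen ? std::min(lo, r.begin) : r.begin;
        hi = seen ? std::max(hi, r.end) : r.end;
        seen = true;
    }
    if (!seen) return true;                      // empty functions are contiguous

    addr_t covered_to = lo;                      // bytes [lo, covered_to) are owned by func
    for (std::size_t i = 0; i < ranges.size(); ++i) {
        const Range &r = ranges[i];
        if (r.end <= lo || r.begin >= hi) continue;            // outside [lo, hi)
        if (r.owner != func) return false;                     // another function intrudes
        if (strict && r.begin > covered_to) return false;      // unowned gap
        covered_to = std::max(covered_to, r.end);
    }
    return true;
}
```

A function with an unowned gap between two of its ranges passes the relaxed test but fails the strict one.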

Definition at line 3732 of file Partitioner.C.

References Partitioner::BasicBlock::function, RangeMap< R, T >::lower_bound(), max, Partitioner::DataBlock::nodes, and RangeMap< R, T >::size().

Referenced by Partitioner::FindFunctionFragments::operator()().

rose_addr_t Partitioner::get_indirection_addr ( SgAsmInstruction g_insn,
rose_addr_t  offset 
)
static

Return the virtual address that holds the branch target for an indirect branch.

For example, when called with these instructions:

jmp DWORD PTR ds:[0x80496b0] -> (x86) returns 80496b0
jmp QWORD PTR ds:[rip+0x200b52] -> (amd64) returns 200b52 + address following instruction
jmp DWORD PTR ds:[ANY_GPR+0x18] -> (x86) returns offset+0x18
// anything else return zero

We only handle instructions that appear as the first instruction in an ELF .plt entry.
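The three cases listed above reduce to simple address arithmetic. The sketch below models an already-decoded operand with a hypothetical IndirectJump struct and BaseKind enum; it does not use ROSE's instruction or expression classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint64_t addr_t;                    // stand-in for rose_addr_t

// Kind of base register in the memory operand (hypothetical classification).
enum BaseKind { BASE_NONE, BASE_IP, BASE_GPR, BASE_OTHER };

// A decoded indirect jump: where it is, how long it is, and its memory operand.
struct IndirectJump {
    addr_t      insn_va;                         // address of the jump instruction
    std::size_t insn_size;                       // size of the instruction in bytes
    BaseKind    base;                            // base register kind
    addr_t      displacement;                    // constant part of the operand
};

addr_t indirection_addr_sketch(const IndirectJump &jmp, addr_t offset) {
    assert(jmp.insn_size > 0);
    switch (jmp.base) {
        case BASE_NONE:                          // jmp [0x80496b0]
            return jmp.displacement;
        case BASE_IP:                            // jmp [rip+disp]: relative to next instruction
            return jmp.insn_va + jmp.insn_size + jmp.displacement;
        case BASE_GPR:                           // jmp [reg+disp]: caller supplies the base
            return offset + jmp.displacement;
        default:
            return 0;                            // anything else
    }
}
```

For example, a 6-byte `jmp QWORD PTR ds:[rip+0x200b52]` located at 0x400500 yields 0x400506 + 0x200b52 = 0x601058.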

Definition at line 2487 of file Partitioner.C.

References SgAsmStatement::get_address(), SgAsmMemoryReferenceExpression::get_address(), SgAsmRegisterReferenceExpression::get_descriptor(), SgAsmBinaryExpression::get_lhs(), RegisterDescriptor::get_major(), SgAsmInstruction::get_operandList(), SgAsmOperandList::get_operands(), SgAsmBinaryExpression::get_rhs(), SgAsmInstruction::get_size(), isSgAsmBinaryExpression(), isSgAsmMemoryReferenceExpression(), isSgAsmValueExpression(), isSgAsmx86Instruction(), isSgAsmx86RegisterReferenceExpression(), offset, x86_regclass_gpr, and x86_regclass_ip.

rose_addr_t Partitioner::value_of ( SgAsmValueExpression e)
static

Returns the integer value of a value expression since there's no virtual method for doing this.

(FIXME)

Definition at line 2474 of file Partitioner.C.

References SgAsmIntegerValueExpression::get_value(), and isSgAsmIntegerValueExpression().

void Partitioner::progress ( FILE *  debug,
const char *  fmt,
  ... 
) const

Conditionally prints a progress report.

If progress reporting is enabled and the required amount of time has elapsed since the previous report, then the supplied report is emitted. Also, if debugging is enabled, the report is emitted to the debugging file regardless of the elapsed time. The arguments are the same as fprintf().
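A rate-limited reporter along these lines might look like the sketch below. ProgressState and progress_sketch are hypothetical, the current time is passed in explicitly so the behavior is easy to test, and the return value (number of streams written) is an addition of this example.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <ctime>

// Hypothetical state mirroring progress_interval, progress_time and progress_file.
struct ProgressState {
    std::time_t interval;                        // minimum seconds between reports
    std::time_t last;                            // time of last report, zero if none yet
    std::FILE  *file;                            // report destination; null disables reporting
};

// Emits the report to st.file if enough time has passed, and always to `debug`
// when it is non-null. Returns the number of streams written.
int progress_sketch(ProgressState &st, std::time_t now, std::FILE *debug, const char *fmt, ...) {
    assert(fmt != NULL);
    int written = 0;
    if (st.file && now - st.last >= st.interval) {
        va_list ap;
        va_start(ap, fmt);
        std::vfprintf(st.file, fmt, ap);
        va_end(ap);
        st.last = now;                           // restart the interval
        ++written;
    }
    if (debug) {
        va_list ap;
        va_start(ap, fmt);
        std::vfprintf(debug, fmt, ap);           // debugging output ignores the interval
        va_end(ap);
        ++written;
    }
    return written;
}
```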

Definition at line 74 of file Partitioner.C.

References time.

size_t Partitioner::detach_thunks ( )
virtual

Splits thunks off of the start of functions.

Splits as many thunks as possible from the front of all known functions. Returns the number of thunks split off from functions. It's not important that this be done, but doing so results in functions that more closely match what some other disassemblers do when provided with debug info.

Definition at line 3127 of file Partitioner.C.

bool Partitioner::detach_thunk ( Function func)
virtual

Splits one thunk off the start of a function if possible.

Since the partitioner constructs functions according to the control flow graph, thunks (JMP to start of function) often become part of the function to which they jump. This can happen if the real function has no direct callers and was not detected as a function entry point due to any pattern or symbol. The detach_thunks() function traverses all defined functions and looks for cases where the thunk is attached to the jumped-to function, and splits them into two functions.

Definition at line 3139 of file Partitioner.C.

References Partitioner::Function::basic_blocks, SgAsmBlock::BLK_ENTRY_POINT, Partitioner::Function::data_blocks, Partitioner::Function::entry_va, SageInterface::find(), SgAsmFunction::FUNC_THUNK, Partitioner::BasicBlock::function, SgAsmx86Instruction::get_kind(), Partitioner::Function::get_may_return(), Partitioner::Function::heads, Partitioner::BasicBlock::insns, isSgAsmx86Instruction(), Partitioner::Function::name, Partitioner::Function::pending, Partitioner::BasicBlock::reason, Partitioner::DataBlock::reason, Partitioner::Function::reason, Partitioner::Function::set_may_return(), x86_farjmp, and x86_jmp.

bool Partitioner::is_pe_dynlink_thunk ( Instruction insn)

Returns true if the basic block is a PE dynamic linking thunk.

If the argument is a basic block, then the only requirement is that the basic block contains a single instruction, which in the case of x86, is an indirect JMP through an Import Address Table. If the argument is a function, then the function must contain a single basic block which is a dynamic linking thunk. The addresses of the IATs must have been previously initialized by pre_cfg() or other.

Definition at line 2918 of file Partitioner.C.

References SgAsmMemoryReferenceExpression::get_address(), SgAsmx86Instruction::get_kind(), SgAsmInstruction::get_operandList(), SgAsmOperandList::get_operands(), isSgAsmIntegerValueExpression(), isSgAsmMemoryReferenceExpression(), isSgAsmx86Instruction(), Partitioner::Instruction::node, and x86_jmp.

bool Partitioner::is_pe_dynlink_thunk ( BasicBlock bb)

Returns true if the basic block is a PE dynamic linking thunk.

If the argument is a basic block, then the only requirement is that the basic block contains a single instruction, which in the case of x86, is an indirect JMP through an Import Address Table. If the argument is a function, then the function must contain a single basic block which is a dynamic linking thunk. The addresses of the IATs must have been previously initialized by pre_cfg() or other.

Definition at line 2931 of file Partitioner.C.

References Partitioner::BasicBlock::insns.

bool Partitioner::is_pe_dynlink_thunk ( Function func)

Returns true if the basic block is a PE dynamic linking thunk.

If the argument is a basic block, then the only requirement is that the basic block contains a single instruction, which in the case of x86, is an indirect JMP through an Import Address Table. If the argument is a function, then the function must contain a single basic block which is a dynamic linking thunk. The addresses of the IATs must have been previously initialized by pre_cfg() or other.

Definition at line 2937 of file Partitioner.C.

References Partitioner::Function::basic_blocks, and Partitioner::Function::entry_basic_block().

void Partitioner::name_pe_dynlink_thunks ( SgAsmInterpretation interp)

Gives names to PE dynamic linking thunks if possible.

Mark PE dynamic linking thunks as thunks and give them a name if possible.

The names come from the PE Import Table if an interpretation is supplied as an argument. This also marks such functions as being thunks.

Definition at line 3395 of file Partitioner.C.

References Partitioner::Function::entry_basic_block(), SgAsmFunction::FUNC_THUNK, SgAsmInterpretation::get_headers(), SgAsmGenericHeaderList::get_headers(), isSgAsmPEImportItem(), Partitioner::Function::name, name, preorder, Partitioner::Function::reason, and rva.

void Partitioner::adjust_padding ( )
virtual

Adjusts ownership of padding data blocks.

Each padding data block should be owned by the prior function in the address space. This is normally the case, but when functions are moved around, split, etc., the padding data blocks can get mixed up. This method puts them all back where they belong.
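The reassignment rule can be sketched with an ordered map. This simplified version keys each function by its lowest address only, which is an assumption of the example (the real method consults the functions' extents); PaddingBlock and adjust_padding_sketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

typedef std::uint64_t addr_t;                    // stand-in for rose_addr_t

// A padding block at address va, owned by function id `owner` (-1 if unowned).
struct PaddingBlock { addr_t va; int owner; };

// `functions` maps each function's lowest address to its id.
// Returns the number of padding blocks whose owner changed.
std::size_t adjust_padding_sketch(const std::map<addr_t, int> &functions,
                                  std::vector<PaddingBlock> &padding) {
    std::size_t changed = 0;
    for (std::size_t i = 0; i < padding.size(); ++i) {
        // First function that starts strictly after the padding block...
        std::map<addr_t, int>::const_iterator prior = functions.upper_bound(padding[i].va);
        if (prior == functions.begin())
            continue;                            // no function precedes this padding
        --prior;                                 // ...stepped back to the prior function
        assert(prior->first <= padding[i].va);
        if (padding[i].owner != prior->second) {
            padding[i].owner = prior->second;
            ++changed;
        }
    }
    return changed;
}
```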

Definition at line 3238 of file Partitioner.C.

References RangeMap< R, T >::begin(), RangeMap< R, T >::end(), RangeMap< R, T >::erase(), RangeMap< R, T >::find_prior(), and Partitioner::DataBlock::reason.

void Partitioner::merge_function_fragments ( )
virtual

Merge function fragments.

The partitioner sometimes breaks functions into smaller and smaller parts. This method attempts to merge all those parts after the partitioner's function detection has completed. A function fragment is any function whose only reason code is one of the GRAPH codes (function detected by graph analysis and the rule that every function has only one entry point).

Definition at line 3262 of file Partitioner.C.

References Partitioner::Function::basic_blocks, SgAsmFunction::FUNC_GRAPH, Partitioner::BasicBlock::function, name, Partitioner::Function::reason, and SgAsmFunction::reason_str().

void Partitioner::merge_functions ( Function parent,
Function other 
)
virtual
Disassembler::AddressSet Partitioner::discover_jump_table ( BasicBlock bb,
bool  do_create = true,
ExtentMap table_addresses = NULL 
)

Looks for a jump table.

This method looks at the specified basic block and tries to discover if the last instruction is an indirect jump through memory. If it is, then the entries of the jump table are returned by value (i.e., the control flow successors of the given basic block), and the addresses of the table are added to the optional extent map. It is possible for the jump table to be discontiguous, but this is not usually the case. If do_create is true then data blocks are created for the jump table and added to the basic block.
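To make the idea of collecting jump table entries concrete, the sketch below reads consecutive little-endian 32-bit entries from a byte buffer. Everything here is an assumption for illustration: the buffer-based memory model, the fixed entry size, and the stopping rule (stop at the first entry that does not point into the executable range) are not taken from the ROSE source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

typedef std::uint64_t addr_t;                    // stand-in for rose_addr_t

// mem holds the bytes mapped at mem_base; the table starts at table_va.
// Entries are accepted while they fall inside the executable range [exec_lo, exec_hi).
std::set<addr_t> jump_table_sketch(const std::vector<unsigned char> &mem, addr_t mem_base,
                                   addr_t table_va, addr_t exec_lo, addr_t exec_hi) {
    assert(exec_lo <= exec_hi);
    std::set<addr_t> targets;
    const std::size_t entry_size = 4;            // 32-bit entries, an assumption
    if (table_va < mem_base)
        return targets;                          // table is not inside the buffer
    for (std::size_t off = table_va - mem_base; off + entry_size <= mem.size(); off += entry_size) {
        addr_t target = 0;
        for (std::size_t i = 0; i < entry_size; ++i)
            target |= static_cast<addr_t>(mem[off + i]) << (8 * i);   // little-endian decode
        if (target < exec_lo || target >= exec_hi)
            break;                               // not a code address: end of table
        targets.insert(target);                  // duplicates collapse in the set
    }
    return targets;
}
```

Returning a set mirrors the fact that the result is used as the control flow successors of the block, where repeated targets carry no extra information.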

Definition at line 199 of file Partitioner.C.

References SgAsmBlock::BLK_JUMPTABLE, RegisterDictionary::dictionary_amd64(), RangeMap< R, T >::insert(), Partitioner::BasicBlock::insns, isSgAsmMemoryReferenceExpression(), isSgAsmRegisterReferenceExpression(), isSgAsmx86Instruction(), Partitioner::BasicBlock::last_insn(), MemoryMap::MM_PROT_EXEC, x86_farjmp, and x86_jmp.

Member Data Documentation

RegionStats* Partitioner::aggregate_mean
protected

Aggregate statistics returned by get_region_stats_mean().

Definition at line 1023 of file Partitioner.h.

Referenced by clear_aggregate_statistics(), and get_aggregate_mean().

RegionStats* Partitioner::aggregate_variance
protected

Aggregate statistics returned by get_region_stats_variance().

Definition at line 1024 of file Partitioner.h.

Referenced by clear_aggregate_statistics(), and get_aggregate_variance().

CodeCriteria* Partitioner::code_criteria
protected

Criteria used to determine if a region contains code or data.

Definition at line 1025 of file Partitioner.h.

Referenced by get_code_criteria(), and set_code_criteria().

Disassembler* Partitioner::disassembler

Optional disassembler to call when an instruction is needed.

Definition at line 1833 of file Partitioner.h.

InstructionMap Partitioner::insns

Instruction cache, filled in by user or populated by disassembler.

Definition at line 1834 of file Partitioner.h.

Referenced by count_floating_point(), count_kinds(), count_privileged(), count_registers(), count_size_variance(), ratio_floating_point(), ratio_privileged(), and ratio_registers().

MemoryMap* Partitioner::map

Memory map used for disassembly if disassembler is present.

Definition at line 1835 of file Partitioner.h.

Referenced by get_map().

MemoryMap Partitioner::ro_map

The read-only parts of 'map', used for insn semantics mem reads.

Definition at line 1836 of file Partitioner.h.

Referenced by Partitioner::FindDataPadding::operator()().

ExtentMap Partitioner::pe_iat_extents

Virtual addresses for all PE Import Address Tables.

Definition at line 1837 of file Partitioner.h.

Disassembler::BadMap Partitioner::bad_insns

Captured disassembler exceptions.

Definition at line 1838 of file Partitioner.h.

Referenced by clear_disassembler_errors(), and get_disassembler_errors().

BasicBlocks Partitioner::basic_blocks

All known basic blocks.

Definition at line 1840 of file Partitioner.h.

Functions Partitioner::functions

All known functions, pending and complete.

Definition at line 1841 of file Partitioner.h.

Referenced by Partitioner::FindThunks::operator()().

DataBlocks Partitioner::data_blocks

Blocks that point to static data.

Definition at line 1843 of file Partitioner.h.

unsigned Partitioner::func_heuristics

Bit mask of SgAsmFunction::FunctionReason bits.

Definition at line 1845 of file Partitioner.h.

Referenced by get_search(), and set_search().

std::vector<FunctionDetector> Partitioner::user_detectors

List of user-defined function detection methods.

Definition at line 1846 of file Partitioner.h.

Referenced by add_function_detector().

bool Partitioner::allow_discont_blocks

Allow basic blocks to be discontiguous in virtual memory.

Definition at line 1849 of file Partitioner.h.

Referenced by get_allow_discontiguous_blocks(), and set_allow_discontiguous_blocks().

BlockConfigMap Partitioner::block_config

IPD configuration info for basic blocks.

Definition at line 1850 of file Partitioner.h.

time_t Partitioner::progress_interval = 10
static

Minimum interval between progress reports.

Definition at line 1852 of file Partitioner.h.

time_t Partitioner::progress_time = 0
static

Time of last report, or zero if no report has been generated.

Definition at line 1853 of file Partitioner.h.

FILE * Partitioner::progress_file = stderr
static

File to which reports are made.

Null disables reporting.

Definition at line 1854 of file Partitioner.h.

const rose_addr_t Partitioner::NO_TARGET = (rose_addr_t)-1
static

Definition at line 1857 of file Partitioner.h.

Referenced by Partitioner::BlockAnalysisCache::clear().


The documentation for this class was generated from the following files: