ROSE  0.9.6a
AssemblerX86 Class Reference

This class contains methods for assembling x86 instructions (SgAsmx86Instruction). More...

#include <AssemblerX86.h>


Classes

class  InsnDefn
 Defines static characteristics of an instruction used by the assembler and disassembler. More...
 

Public Member Functions

 AssemblerX86 ()
 
virtual ~AssemblerX86 ()
 
virtual SgUnsignedCharList assembleOne (SgAsmInstruction *)
 Assemble an instruction (SgAsmInstruction) into byte code. More...
 
void set_honor_operand_types (bool b)
 Causes the assembler to honor (if true) or disregard (if false) the data types of operands when assembling. More...
 
bool get_honor_operand_types () const
 Returns true if the assembler is honoring operand data types, or false if the assembler is using the smallest possible encoding. More...
 
virtual SgUnsignedCharList assembleProgram (const std::string &source)
 Assemble an x86 program from assembly source code using the nasm assembler. More...
 
- Public Member Functions inherited from Assembler
 Assembler ()
 
virtual ~Assembler ()
 
SgUnsignedCharList assembleBlock (SgAsmBlock *)
 Assembles a single basic block of instructions, packing them together and adjusting their virtual addresses. More...
 
SgUnsignedCharList assembleBlock (const std::vector< SgAsmInstruction * > &insns, rose_addr_t starting_rva)
 Assembles a single basic block of instructions like the version that takes an SgAsmBlock pointer. More...
 
void set_encoding_type (EncodingType et)
 Controls how the assembleOne() method determines which encoding to return. More...
 
EncodingType get_encoding_type () const
 Returns the encoding type employed by this assembler. More...
 
void set_debug (FILE *f)
 Sends assembler diagnostics to the specified output stream. More...
 
FILE * get_debug () const
 Returns the file currently used for debugging; null implies no debugging. More...
 

Private Types

enum  OperandDefn {
  od_none,
  od_AL,
  od_AX,
  od_EAX,
  od_RAX,
  od_DX,
  od_CS,
  od_DS,
  od_ES,
  od_FS,
  od_GS,
  od_SS,
  od_rel8,
  od_rel16,
  od_rel32,
  od_rel64,
  od_ptr16_16,
  od_ptr16_32,
  od_ptr16_64,
  od_r8,
  od_r16,
  od_r32,
  od_r64,
  od_imm8,
  od_imm16,
  od_imm32,
  od_imm64,
  od_r_m8,
  od_r_m16,
  od_r_m32,
  od_r_m64,
  od_m,
  od_m8,
  od_m16,
  od_m32,
  od_m64,
  od_m128,
  od_m16_16,
  od_m16_32,
  od_m16_64,
  od_m16a16,
  od_m16a32,
  od_m32a32,
  od_m16a64,
  od_moffs8,
  od_moffs16,
  od_moffs32,
  od_moffs64,
  od_sreg,
  od_m32fp,
  od_m64fp,
  od_m80fp,
  od_st0,
  od_st1,
  od_st2,
  od_st3,
  od_st4,
  od_st5,
  od_st6,
  od_st7,
  od_sti,
  od_mm,
  od_mm_m32,
  od_mm_m64,
  od_xmm,
  od_xmm_m16,
  od_xmm_m32,
  od_xmm_m64,
  od_xmm_m128,
  od_XMM0,
  od_0,
  od_1,
  od_m80,
  od_dec,
  od_m80bcd,
  od_m2byte,
  od_m14_28byte,
  od_m94_108byte,
  od_m512byte,
  od_r16_m16,
  od_r32_m8,
  od_r32_m16,
  od_r64_m16,
  od_CR0,
  od_CR7,
  od_CR8,
  od_CR0CR7,
  od_DR0DR7,
  od_reg,
  od_CL
}
 Operand types, from Intel "Instruction Set Reference, A-M" section 3.1.1.2, Vol. 2A 3-3. More...
 
enum  MemoryReferencePattern {
  mrp_unknown,
  mrp_disp,
  mrp_index,
  mrp_index_disp,
  mrp_base,
  mrp_base_disp,
  mrp_base_index,
  mrp_base_index_disp
}
 
typedef std::vector< const InsnDefn * > DictionaryPage
 Instruction assembly definitions for a single kind of instruction. More...
 
typedef std::map< X86InstructionKind, DictionaryPage > InsnDictionary
 Instruction assembly definitions for all kinds of instructions. More...
 

Private Member Functions

SgUnsignedCharList fixup_prefix_bytes (SgAsmx86Instruction *insn, SgUnsignedCharList source)
 Rewrites the prefix bytes stored in the original source to be in the same order (and same repeat counts) as stored in the target, or p_raw_bytes data member of the instruction. More...
 
SgUnsignedCharList assemble (SgAsmx86Instruction *insn, const InsnDefn *defn)
 Low-level method to assemble a single instruction using the specified definition from the assembly dictionary. More...
 
void matches (const InsnDefn *defn, SgAsmx86Instruction *insn, int64_t *disp, int64_t *imm) const
 Attempts to match an instruction with a definition. More...
 
bool matches (OperandDefn, SgAsmExpression *, SgAsmInstruction *, int64_t *disp, int64_t *imm) const
 Attempts to match an instruction operand with a definition operand. More...
 
uint8_t build_modrm (const InsnDefn *, SgAsmx86Instruction *, size_t argno, uint8_t *sib, int64_t *displacement, uint8_t *rex) const
 Builds the ModR/M byte, SIB byte. More...
 
void build_modreg (const InsnDefn *, SgAsmx86Instruction *, size_t argno, uint8_t *modrm, uint8_t *rex) const
 Adjusts the "reg" field of the ModR/M byte and adjusts the REX prefix byte if necessary. More...
 
uint8_t segment_override (SgAsmx86Instruction *)
 Calculates the segment override from the instruction operands rather than obtaining it from the p_segmentOverride data member. More...
 

Static Private Member Functions

static size_t od_e_val (unsigned opcode_mods)
 Returns value of En modification. More...
 
static uint8_t od_rex_byte (unsigned opcode_mods)
 
static uint8_t build_modrm (unsigned mod, unsigned reg, unsigned rm)
 Returns a ModR/M byte constructed from the three standard fields: mode, register, and register/memory. More...
 
static unsigned modrm_mod (uint8_t modrm)
 Returns the mode field of a ModR/M byte. More...
 
static unsigned modrm_reg (uint8_t modrm)
 Returns the register field of a ModR/M byte. More...
 
static unsigned modrm_rm (uint8_t modrm)
 Returns the register/memory field of a ModR/M byte. More...
 
static uint8_t build_sib (unsigned ss, unsigned index, unsigned base)
 Returns a SIB byte constructed from the three standard fields: scale, index, and base. More...
 
static unsigned sib_ss (uint8_t sib)
 Returns the scale field of a SIB byte. More...
 
static unsigned sib_index (uint8_t sib)
 Returns the index field of a SIB byte. More...
 
static unsigned sib_base (uint8_t sib)
 Returns the base field of a SIB byte. More...
 
static void initAssemblyRules ()
 Build the dictionary used by the x86 assemblers. More...
 
static void initAssemblyRules_part1 ()
 
static void initAssemblyRules_part2 ()
 
static void initAssemblyRules_part3 ()
 
static void initAssemblyRules_part4 ()
 
static void initAssemblyRules_part5 ()
 
static void initAssemblyRules_part6 ()
 
static void initAssemblyRules_part7 ()
 
static void initAssemblyRules_part8 ()
 
static void initAssemblyRules_part9 ()
 
static void define (const InsnDefn *d)
 Adds a definition to the assembly dictionary. More...
 
static std::string to_str (X86InstructionKind)
 Returns the string version of the instruction kind sans "x86_" prefix. More...
 
static bool matches_rel (SgAsmInstruction *, int64_t val, size_t nbytes)
 Determines whether a call/jump target can be represented as an IP-relative displacement of the specified size. More...
 
static MemoryReferencePattern parse_memref (SgAsmInstruction *insn, SgAsmMemoryReferenceExpression *expr, SgAsmx86RegisterReferenceExpression **base_reg, SgAsmx86RegisterReferenceExpression **index_reg, SgAsmValueExpression **scale, SgAsmValueExpression **displacement)
 Parses memory reference expressions and returns the address BASE_REG + (INDEX_REG * SCALE) + DISPLACEMENT, where BASE_REG and INDEX_REG are optional register reference expressions and SCALE and DISPLACEMENT are optional value expressions. More...
 

Private Attributes

bool honor_operand_types
 If true, operand types rather than values determine assembled form. More...
 

Static Private Attributes

static const unsigned od_e_mask = 0x00000070
 Indicates that the ModR/M byte of the instruction uses only the r/m (register or memory) operand. More...
 
static const unsigned od_e_pres = 0x00000080
 
static const unsigned od_e0 = 0x00000000 | od_e_pres
 
static const unsigned od_e1 = 0x00000010 | od_e_pres
 
static const unsigned od_e2 = 0x00000020 | od_e_pres
 
static const unsigned od_e3 = 0x00000030 | od_e_pres
 
static const unsigned od_e4 = 0x00000040 | od_e_pres
 
static const unsigned od_e5 = 0x00000050 | od_e_pres
 
static const unsigned od_e6 = 0x00000060 | od_e_pres
 
static const unsigned od_e7 = 0x00000070 | od_e_pres
 
static const unsigned od_rex_pres = 0x00000001
 Indicates the use of a REX prefix that affects operand size or instruction semantics. More...
 
static const unsigned od_rex_mask = 0x00000f00
 
static const unsigned od_rex = 0x00000000 | od_rex_pres
 
static const unsigned od_rexb = 0x00000100 | od_rex_pres
 
static const unsigned od_rexx = 0x00000200 | od_rex_pres
 
static const unsigned od_rexxb = 0x00000300 | od_rex_pres
 
static const unsigned od_rexr = 0x00000400 | od_rex_pres
 
static const unsigned od_rexrb = 0x00000500 | od_rex_pres
 
static const unsigned od_rexrx = 0x00000600 | od_rex_pres
 
static const unsigned od_rexrxb = 0x00000700 | od_rex_pres
 
static const unsigned od_rexw = 0x00000800 | od_rex_pres
 
static const unsigned od_rexwb = 0x00000900 | od_rex_pres
 
static const unsigned od_rexwx = 0x00000a00 | od_rex_pres
 
static const unsigned od_rexwxb = 0x00000b00 | od_rex_pres
 
static const unsigned od_rexwr = 0x00000c00 | od_rex_pres
 
static const unsigned od_rexwrb = 0x00000d00 | od_rex_pres
 
static const unsigned od_rexwrx = 0x00000e00 | od_rex_pres
 
static const unsigned od_rexwrxb = 0x00000f00 | od_rex_pres
 
static const unsigned od_modrm = 0x00000002
 Indicates that the ModR/M byte of the instruction contains a register operand and an r/m operand. More...
 
static const unsigned od_c_mask = 0x00007000
 A 1-byte (CB), 2-byte (CW), 4-byte (CD), 6-byte (CP), 8-byte (CO), or 10-byte (CT) value follows the opcode. More...
 
static const unsigned od_cb = 0x00001000
 
static const unsigned od_cw = 0x00002000
 
static const unsigned od_cd = 0x00003000
 
static const unsigned od_cp = 0x00004000
 
static const unsigned od_co = 0x00005000
 
static const unsigned od_ct = 0x00006000
 
static const unsigned od_i_mask = 0x00070000
 A 1-byte (IB), 2-byte (IW), 4-byte (ID), or 8-byte (IO) little-endian immediate operand to the instruction follows the opcode, ModR/M bytes or scale-indexing bytes. More...
 
static const unsigned od_ib = 0x00010000
 
static const unsigned od_iw = 0x00020000
 
static const unsigned od_id = 0x00030000
 
static const unsigned od_io = 0x00040000
 
static const unsigned od_r_mask = 0x00700000
 A register code, from 0 through 7, added to a byte of the opcode. More...
 
static const unsigned od_rb = 0x00100000
 
static const unsigned od_rw = 0x00200000
 
static const unsigned od_rd = 0x00300000
 
static const unsigned od_ro = 0x00400000
 
static const unsigned od_i = 0x00000004
 A number used in floating-point instructions when one of the operands is ST(i) from the FPU register stack. More...
 
static const unsigned COMPAT_LEGACY = 0x01
 These bits define the compatibility of an instruction to 32- and 64-bit modes. More...
 
static const unsigned COMPAT_64 = 0x02
 Definition is compatible with 64-bit architectures. More...
 
static InsnDictionary defns
 Instruction assembly definitions organized by X86InstructionKind. More...
 

Additional Inherited Members

- Public Types inherited from Assembler
enum  EncodingType {
  ET_SHORTEST,
  ET_LONGEST,
  ET_MATCHES
}
 Assemblers can often assemble a single instruction various ways. More...
 
- Static Public Member Functions inherited from Assembler
static Assembler * create (SgAsmInterpretation *interp)
 Creates an assembler that is appropriate for assembling instructions in the specified interpretation. More...
 
static Assembler * create (SgAsmGenericHeader *)
 Creates an assembler that is appropriate for assembling instructions in the specified header. More...
 
- Protected Attributes inherited from Assembler
FILE * p_debug
 Set to non-null to get debugging info. More...
 
EncodingType p_encoding_type
 Which encoding should be returned by assembleOne. More...
 

Detailed Description

This class contains methods for assembling x86 instructions (SgAsmx86Instruction).

End users will generally not need to use the AssemblerX86 class directly. Instead, they will call Assembler::create() to create an assembler that's appropriate for a particular binary file header or interpretation and then use that assembler to assemble instructions.

The assembler itself is quite small compared to the disassembler (about one third the size) and doesn't actually know about any instructions; it only knows how to encode various prefixes and operand addressing modes. For each instruction to be assembled, the assembler consults a dictionary of assembly definitions. The instruction is looked up in this dictionary and the chosen definition then drives the assembly. If the instruction being assembled matches multiple definitions then each valid definition is tried and the "best" one (see Assembler::set_encoding_type()) is returned.
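
The lookup-and-select flow described above can be sketched as follows. This is a minimal illustration, not the ROSE implementation: Kind, Defn, Page, Dictionary and assemble_best are hypothetical stand-ins, each definition simply carries the bytes it would produce, and only the shortest and longest policies are modeled (ET_MATCHES is left out).

```cpp
#include <stdint.h>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins; none of these are the real ROSE types.
enum Kind { kind_push, kind_ret };
enum EncodingType { ET_SHORTEST, ET_LONGEST };
typedef std::vector<uint8_t> Bytes;

struct Defn {
    Bytes encoding;  // bytes this definition would produce
    bool applies;    // whether the instruction matches this definition
};

typedef std::vector<Defn> Page;           // all definitions for one kind
typedef std::map<Kind, Page> Dictionary;  // one page per instruction kind

// Try every definition on the page for `kind` and keep the best encoding.
Bytes assemble_best(const Dictionary &dict, Kind kind, EncodingType et) {
    Dictionary::const_iterator page = dict.find(kind);
    if (page == dict.end())
        throw std::runtime_error("no definitions for instruction kind");

    Bytes best;
    bool found = false;
    for (const Defn &d : page->second) {
        if (!d.applies)
            continue;  // the instruction does not match this definition
        const bool better = !found ||
            (et == ET_SHORTEST ? d.encoding.size() < best.size()
                               : d.encoding.size() > best.size());
        if (better) {
            best = d.encoding;
            found = true;
        }
    }
    if (!found)
        throw std::runtime_error("no matching definition");
    return best;
}
```

In this sketch a definition that does not apply is skipped; the real assembler signals a mismatch with an exception, as described for assemble() and matches().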

The dictionary is generated directly from the Intel "Instruction Set Reference" PDF documentation as augmented by a small text file in this directory. The IntelAssemblyBuilder perl script generates AssemblerX86Init.h and AssemblerX86Init.C, which contain the X86InstructionKind enumeration, a function to initialize the dictionary (AssemblerX86::initAssemblyRules()), and a function for converting an X86InstructionKind constant to a string (AssemblerX86::to_str()).

Definition at line 26 of file AssemblerX86.h.

Member Typedef Documentation

typedef std::vector<const InsnDefn*> AssemblerX86::DictionaryPage
private

Instruction assembly definitions for a single kind of instruction.

Definition at line 395 of file AssemblerX86.h.

typedef std::map<X86InstructionKind, DictionaryPage> AssemblerX86::InsnDictionary
private

Instruction assembly definitions for all kinds of instructions.

Definition at line 398 of file AssemblerX86.h.

Member Enumeration Documentation

enum AssemblerX86::OperandDefn
private

Operand types, from Intel "Instruction Set Reference, A-M" section 3.1.1.2, Vol. 2A 3-3.

Enumerator
od_none 

Operand is not present as part of the instruction.

od_AL 

AL register.

od_AX 

AX register.

od_EAX 

EAX register.

od_RAX 

RAX register.

od_DX 

DX register.

od_CS 

CS register.

od_DS 

DS register.

od_ES 

ES register.

od_FS 

FS register.

od_GS 

GS register.

od_SS 

SS register.

od_rel8 

A relative address in the range from 128 bytes before the end of the instruction to 127 bytes after the end of the instruction.

od_rel16 

A relative address in the same code segment as the instruction assembled, with an operand size attribute of 16 bits.

od_rel32 

A relative address in the same code segment as the instruction assembled, with an operand size attribute of 32 bits.

od_rel64 

A relative address in the same code segment as the instruction assembled, with an operand size attribute of 64 bits.

od_ptr16_16 

A far pointer, typically to a code segment different from that of the instruction.

The notation 16:16 indicates that the value of the pointer has two parts. The value to the left of the colon is a 16-bit selector or value destined for the code segment register. The value to the right corresponds to the offset within the destination segment. The ptr16:16 symbol is used when the instruction's operand-size attribute is 16 bits.

od_ptr16_32 

A far pointer, typically to a code segment different from that of the instruction.

The notation 16:32 indicates that the value of the pointer has two parts. The value to the left of the colon is a 16-bit selector or value destined for the code segment register. The value to the right corresponds to the offset within the destination segment. The ptr16:32 symbol is used when the instruction's operand-size attribute is 32 bits.

od_ptr16_64 

A far pointer, typically to a code segment different from that of the instruction.

The notation 16:64 indicates that the value of the pointer has two parts. The value to the left of the colon is a 16-bit selector or value destined for the code segment register. The value to the right corresponds to the offset within the destination segment. The ptr16:64 symbol is used when the instruction's operand-size attribute is 64 bits.

od_r8 

One of the byte general-purpose registers: AL, CL, DL, BL, AH, CH, DH, BH, BPL, SPL, DIL and SIL; or one of the byte registers (R8L-R15L) available when using REX.R and 64-bit mode.

od_r16 

One of the word general-purpose registers: AX, CX, DX, BX, SP, BP, SI, DI; or one of the word registers (R8-R15) available when using REX.R and 64-bit mode.

od_r32 

One of the doubleword general-purpose registers: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI; or one of the doubleword registers (R8D-R15D) available when using REX.R and 64-bit mode.

od_r64 

One of the quadword general-purpose registers: RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8-R15.

These are available when using REX.R and 64-bit mode.

od_imm8 

An immediate byte value, a signed number between -128 and +127, inclusive.

For instructions in which imm8 is combined with a word or doubleword operand, the immediate value is sign-extended to form a word or doubleword (the upper byte of the word is filled with the topmost bit of the immediate value).

od_imm16 

An immediate word value used for instructions whose operand-size attribute is 16 bits.

This is a number between -32,768 and +32,767 inclusive.

od_imm32 

An immediate doubleword value used for instructions whose operand-size attribute is 32 bits.

It allows the use of a number between -2,147,483,648 and +2,147,483,647 inclusive.

od_imm64 

An immediate quadword value used for instructions whose operand-size attribute is 64 bits.

The value allows the use of a number between -9,223,372,036,854,775,808 and +9,223,372,036,854,775,807 inclusive.

od_r_m8 

A byte operand that is either the contents of a byte general-purpose register (AL, CL, DL, BL, AH, CH, DH, BH, BPL, SPL, DIL and SIL) or a byte from memory.

Byte registers R8L-R15L are available using REX.R in 64-bit mode. This is indicated as "r/m8" in Intel documentation.

od_r_m16 

A word general-purpose register or memory operand used for instructions whose operand-size attribute is 16 bits.

The word general-purpose registers are AX, CX, DX, BX, SP, BP, SI, and DI. The contents of memory are found at the address provided by the effective address computation. Word registers R8W-R15W are available using REX.R in 64-bit mode. This is indicated as "r/m16" in Intel documentation.

od_r_m32 

A doubleword general-purpose register or memory operand used for instructions whose operand-size attribute is 32 bits.

The doubleword general-purpose registers are EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI. The contents of memory are found at the address provided by the effective address computation. Doubleword registers R8D-R15D are available when using REX.R in 64-bit mode. This is indicated as "r/m32" in Intel documentation.

od_r_m64 

A quadword general-purpose register or memory operand used for instructions whose operand-size attribute is 64 bits when using REX.W.

Quadword general-purpose registers are RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8-R15; these are available only in 64-bit mode. The contents of memory are found at the address provided by the effective address computation. This is indicated as "r/m64" in Intel documentation.

od_m 

A 16-, 32-, or 64-bit operand in memory.

od_m8 

A byte operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers.

In 64-bit mode, it is pointed to by the RSI or RDI registers.

od_m16 

A word operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers.

This nomenclature is used only with the string instructions.

od_m32 

A doubleword operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers.

This nomenclature is used only with the string instructions.

od_m64 

A quadword operand in memory.

od_m128 

A double quadword operand in memory.

od_m16_16 

A memory operand containing a far pointer composed of two numbers.

In the notation 16:16, the number to the left of the colon corresponds to the pointer's segment selector while the number to the right corresponds to its offset.

od_m16_32 

A memory operand containing a far pointer composed of two numbers.

In the notation 16:32, the number to the left of the colon corresponds to the pointer's segment selector while the number to the right corresponds to its offset.

od_m16_64 

A memory operand containing a far pointer composed of two numbers.

In the notation 16:64, the number to the left of the colon corresponds to the pointer's segment selector while the number to the right corresponds to its offset.

od_m16a16 

A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the "a" (normally written "m16&16" in Intel manuals).

All memory addressing modes are allowed. This operand definition is used by the BOUND instruction to provide an operand containing an upper and lower bound for array indices.

od_m16a32 

A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the "a" (normally written "m16&32" in Intel manuals).

All memory addressing modes are allowed. This operand definition is used by the LIDT and LGDT to provide a word with which to load the limit field, and a doubleword with which to load the base field of the corresponding GDTR and IDTR registers.

od_m32a32 

A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the "a" (normally written "m32&32" in Intel manuals).

All memory addressing modes are allowed. This operand definition is used by the BOUND instruction to provide an operand containing an upper and lower bound for array indices.

od_m16a64 

A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the "a" (normally written "m16&64" in Intel manuals).

All memory addressing modes are allowed. This operand definition is used by the LIDT and LGDT in 64-bit mode to provide a word with which to load the limit field, and a quadword with which to load the base field of the corresponding GDTR and IDTR registers.

od_moffs8 

A simple memory variable (memory offset) of type byte used by some variants of the MOV instruction.

The actual address is given by a simple offset relative to the segment base. No ModR/M byte is used in the instruction. This is used by instructions with an 8-bit address size attribute.

od_moffs16 

A simple memory variable (memory offset) of type word used by some variants of the MOV instruction.

The actual address is given by a simple offset relative to the segment base. No ModR/M byte is used in the instruction. This is used by instructions with a 16-bit address size attribute.

od_moffs32 

A simple memory variable (memory offset) of type doubleword used by some variants of the MOV instruction.

The actual address is given by a simple offset relative to the segment base. No ModR/M byte is used in the instruction. This is used by instructions with a 32-bit address size attribute.

od_moffs64 

A simple memory variable (memory offset) of type quadword used by some variants of the MOV instruction.

The actual address is given by a simple offset relative to the segment base. No ModR/M byte is used in the instruction. This is used by instructions with a 64-bit address size attribute.

od_sreg 

A segment register.

The segment register bit assignments are ES=0, CS=1, SS=2, DS=3, FS=4, and GS=5.

od_m32fp 

A single-precision floating-point operand in memory used as operands for x87 FPU floating-point instructions.

od_m64fp 

A double-precision floating-point operand in memory used as operands for x87 FPU floating-point instructions.

od_m80fp 

A double extended-precision floating-point operand in memory used as operands for x87 FPU floating-point instructions.

od_st0 

The 0th (top) element of the FPU register stack.

od_st1 

The 1st (second-from-top) element of the FPU register stack.

od_st2 

The 2nd element of the FPU register stack.

od_st3 

The 3rd element of the FPU register stack.

od_st4 

The 4th element of the FPU register stack.

od_st5 

The 5th element of the FPU register stack.

od_st6 

The 6th element of the FPU register stack.

od_st7 

The 7th (bottom) element of the FPU register stack.

od_sti 

Any element of the FPU register stack.

od_mm 

An MMX register.

The 64-bit MMX registers are MM0 through MM7.

od_mm_m32 

The low-order 32 bits of an MMX register or a 32-bit memory operand.

The contents of memory are found at the address provided by the effective address computation. This definition is called "mm/m32" in Intel documentation.

od_mm_m64 

An MMX register or a 64-bit memory operand.

The contents of memory are found at the address provided by the effective address computation. This definition is called "mm/m64" in Intel documentation.

od_xmm 

An XMM register.

The 128-bit XMM registers are XMM0 through XMM7; XMM8 through XMM15 are available using REX.R in 64-bit mode.

od_xmm_m16 

See PMOVSXBQ.

od_xmm_m32 

An XMM register or a 32-bit memory operand.

The 128-bit XMM registers are XMM0 through XMM7; XMM8 through XMM15 are available using REX.R in 64-bit mode. The contents of memory are found at the address provided by the effective address computation.

od_xmm_m64 

An XMM register or a 64-bit memory operand.

The 128-bit SIMD floating-point registers are XMM0 through XMM7; XMM8 through XMM15 are available using REX.R in 64-bit mode. The contents of memory are found at the address provided by the effective address computation.

od_xmm_m128 

An XMM register or a 128-bit memory operand.

The 128-bit SIMD floating-point registers are XMM0 through XMM7; XMM8 through XMM15 are available using REX.R in 64-bit mode. The contents of memory are found at the address provided by the effective address computation.

od_XMM0 

See BLENDVPD.

od_0 

See ENTER.

od_1 

See ENTER.

od_m80 

See FBLD.

od_dec 

See FBLD.

od_m80bcd 

See FBSTP.

od_m2byte 

See FLDCW.

od_m14_28byte 

See FLDENV.

od_m94_108byte 

See FRSTOR.

od_m512byte 

See FXRSTOR.

od_r16_m16 

See LAR.

od_r32_m8 

See PINSRB.

od_r32_m16 

See LAR.

od_r64_m16 

See SLDT.

od_CR0 

See MOV.

od_CR7 

See MOV.

od_CR8 

See MOV.

od_CR0CR7 

See MOV.

od_DR0DR7 

See MOV.

od_reg 

See MOVMSKPD.

od_CL 

See SAR.

Definition at line 141 of file AssemblerX86.h.
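
The od_imm8 entry above notes that a byte immediate is sign-extended when it is combined with a word or doubleword operand. A minimal sketch of that widening follows; the function names are hypothetical and are not part of AssemblerX86.

```cpp
#include <stdint.h>

// Widen an 8-bit immediate to 16 bits: the upper byte is filled with
// copies of the immediate's topmost bit.
inline uint16_t sign_extend_8_to_16(uint8_t imm8) {
    return static_cast<uint16_t>(static_cast<int16_t>(static_cast<int8_t>(imm8)));
}

// Same idea for a doubleword operand: the upper three bytes are filled.
inline uint32_t sign_extend_8_to_32(uint8_t imm8) {
    return static_cast<uint32_t>(static_cast<int32_t>(static_cast<int8_t>(imm8)));
}
```

For example, the byte 0x80 (-128) widens to 0xff80 as a word, while 0x7f (+127) widens to 0x007f.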

enum AssemblerX86::MemoryReferencePattern
private

Enumerator
mrp_unknown 
mrp_disp 
mrp_index 
mrp_index_disp 
mrp_base 
mrp_base_disp 
mrp_base_index 
mrp_base_index_disp 

Definition at line 382 of file AssemblerX86.h.

Constructor & Destructor Documentation

AssemblerX86::AssemblerX86 ( )
inline

Definition at line 28 of file AssemblerX86.h.

References defns, and initAssemblyRules().

virtual AssemblerX86::~AssemblerX86 ( )
inline virtual

Definition at line 34 of file AssemblerX86.h.

Member Function Documentation

virtual SgUnsignedCharList AssemblerX86::assembleOne ( SgAsmInstruction * )
virtual

Assemble an instruction (SgAsmInstruction) into byte code.

The new bytes are added to the end of the vector.

Implements Assembler.

void AssemblerX86::set_honor_operand_types ( bool  b)
inline

Causes the assembler to honor (if true) or disregard (if false) the data types of operands when assembling.

For instance, when honoring operand data types, if an operand is a 32-bit SgAsmIntegerValueExpression then the assembler will attempt to encode it as four bytes even if its value could be encoded as a single byte. This is turned on automatically if the encoding type is set to Assembler::ET_MATCHES (see Assembler::set_encoding_type()), but can also be turned on independently.

Definition at line 44 of file AssemblerX86.h.

References honor_operand_types.
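
The effect of this flag on an immediate operand can be sketched as below. This is an illustration under stated assumptions, not the ROSE code: smallest_signed_width and immediate_width are hypothetical helpers, and declared_bytes stands for the width of the operand's value expression.

```cpp
#include <stddef.h>
#include <stdint.h>

// Smallest of 1, 2, 4 or 8 bytes that can hold `value` as a signed integer.
inline size_t smallest_signed_width(int64_t value) {
    if (value >= INT8_MIN && value <= INT8_MAX) return 1;
    if (value >= INT16_MIN && value <= INT16_MAX) return 2;
    if (value >= INT32_MIN && value <= INT32_MAX) return 4;
    return 8;
}

// Hypothetical helper: bytes used to encode an immediate whose expression
// type is `declared_bytes` wide. When operand types are honored the
// declared width wins; otherwise the value alone decides.
inline size_t immediate_width(int64_t value, size_t declared_bytes,
                              bool honor_operand_types) {
    return honor_operand_types ? declared_bytes : smallest_signed_width(value);
}
```

With this sketch, the value 1 held in a 32-bit expression takes four bytes when the flag is true and one byte when it is false.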

bool AssemblerX86::get_honor_operand_types ( ) const
inline

Returns true if the assembler is honoring operand data types, or false if the assembler is using the smallest possible encoding.

Definition at line 50 of file AssemblerX86.h.

References honor_operand_types.

virtual SgUnsignedCharList AssemblerX86::assembleProgram ( const std::string &  source)
virtual

Assemble an x86 program from assembly source code using the nasm assembler.

Implements Assembler.

static size_t AssemblerX86::od_e_val ( unsigned  opcode_mods)
inline static private

Returns value of En modification.

Definition at line 79 of file AssemblerX86.h.

References od_e_mask.

static uint8_t AssemblerX86::od_rex_byte ( unsigned  opcode_mods)
inline static private

Definition at line 102 of file AssemblerX86.h.

References od_rex_mask.
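
The od_e* and od_rex* constants listed among the static private attributes suggest how these two helpers could decode an opcode_mods word. The sketch below copies a few of those constants and shows one plausible decoding; the bodies of e_val and rex_byte are assumptions inferred from the constant values, not the contents of AssemblerX86.h.

```cpp
#include <stddef.h>
#include <stdint.h>

// Constants copied from the attribute list on this page.
const unsigned od_e_mask   = 0x00000070;
const unsigned od_e_pres   = 0x00000080;
const unsigned od_e3       = 0x00000030 | od_e_pres;
const unsigned od_rex_pres = 0x00000001;
const unsigned od_rex_mask = 0x00000f00;
const unsigned od_rexw     = 0x00000800 | od_rex_pres;
const unsigned od_rexwb    = 0x00000900 | od_rex_pres;

// Assumed behavior: recover n from an od_eN modifier (bits 4 through 6).
inline size_t e_val(unsigned opcode_mods) {
    return (opcode_mods & od_e_mask) >> 4;
}

// Assumed behavior: move the W, R, X and B bits (bits 8 through 11) under
// the fixed 0100 high nibble that marks a REX prefix byte.
inline uint8_t rex_byte(unsigned opcode_mods) {
    return static_cast<uint8_t>(0x40 | ((opcode_mods & od_rex_mask) >> 8));
}
```

Under this reading, od_rexw yields the familiar REX.W prefix 0x48 and od_rexwb yields 0x49.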

static uint8_t AssemblerX86::build_modrm ( unsigned  mod,
unsigned  reg,
unsigned  rm 
)
inline static private

Returns a ModR/M byte constructed from the three standard fields: mode, register, and register/memory.

Definition at line 328 of file AssemblerX86.h.

static unsigned AssemblerX86::modrm_mod ( uint8_t  modrm)
inline static private

Returns the mode field of a ModR/M byte.

Definition at line 333 of file AssemblerX86.h.

static unsigned AssemblerX86::modrm_reg ( uint8_t  modrm)
inline static private

Returns the register field of a ModR/M byte.

Definition at line 336 of file AssemblerX86.h.

static unsigned AssemblerX86::modrm_rm ( uint8_t  modrm)
inline static private

Returns the register/memory field of a ModR/M byte.

Definition at line 339 of file AssemblerX86.h.
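
The ModR/M helpers above pack and unpack the standard x86 layout: mod in bits 7-6, reg in bits 5-3, and r/m in bits 2-0. A self-contained sketch of that layout follows; the function names are hypothetical so they are not mistaken for the private members of AssemblerX86.

```cpp
#include <stdint.h>

// Pack the three fields into a ModR/M byte, masking each to its width.
inline uint8_t pack_modrm(unsigned mod, unsigned reg, unsigned rm) {
    return static_cast<uint8_t>(((mod & 3) << 6) | ((reg & 7) << 3) | (rm & 7));
}

// Field extractors, the inverse of pack_modrm().
inline unsigned modrm_mod_of(uint8_t modrm) { return modrm >> 6; }
inline unsigned modrm_reg_of(uint8_t modrm) { return (modrm >> 3) & 7; }
inline unsigned modrm_rm_of(uint8_t modrm)  { return modrm & 7; }
```

For example, mod=3, reg=0, rm=1 (a register-direct form) packs to 0xC1.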

static uint8_t AssemblerX86::build_sib ( unsigned  ss,
unsigned  index,
unsigned  base 
)
inline static private

Returns a SIB byte constructed from the three standard fields: scale, index, and base.

Definition at line 342 of file AssemblerX86.h.

static unsigned AssemblerX86::sib_ss ( uint8_t  sib)
inline static private

Returns the scale field of a SIB byte.

Definition at line 347 of file AssemblerX86.h.

static unsigned AssemblerX86::sib_index ( uint8_t  sib)
inline static private

Returns the index field of a SIB byte.

Definition at line 350 of file AssemblerX86.h.

static unsigned AssemblerX86::sib_base ( uint8_t  sib)
inline static private

Returns the base field of a SIB byte.

Definition at line 353 of file AssemblerX86.h.
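
The SIB helpers above follow the standard layout: scale in bits 7-6, index in bits 5-3, and base in bits 2-0, where the scale field holds the base-2 logarithm of the multiplier. The sketch below uses hypothetical names, and scale_to_ss is an added convenience that the class does not declare.

```cpp
#include <stdint.h>
#include <stdexcept>

// Pack the three fields into a SIB byte, masking each to its width.
inline uint8_t pack_sib(unsigned ss, unsigned index, unsigned base) {
    return static_cast<uint8_t>(((ss & 3) << 6) | ((index & 7) << 3) | (base & 7));
}

// Field extractors, the inverse of pack_sib().
inline unsigned sib_ss_of(uint8_t sib)    { return sib >> 6; }
inline unsigned sib_index_of(uint8_t sib) { return (sib >> 3) & 7; }
inline unsigned sib_base_of(uint8_t sib)  { return sib & 7; }

// Convert a scale multiplier (1, 2, 4 or 8) to the 2-bit scale field.
inline unsigned scale_to_ss(unsigned scale) {
    switch (scale) {
        case 1: return 0;
        case 2: return 1;
        case 4: return 2;
        case 8: return 3;
        default: throw std::invalid_argument("scale must be 1, 2, 4 or 8");
    }
}
```

For example, the address EAX + ECX*4 uses base=0 (EAX), index=1 (ECX) and scale field 2, which packs to 0x88.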

static void AssemblerX86::initAssemblyRules ( )
static private

Build the dictionary used by the x86 assemblers.

All x86 assemblers share a common dictionary.

Referenced by AssemblerX86().

static void AssemblerX86::initAssemblyRules_part1 ( )
static private

static void AssemblerX86::initAssemblyRules_part2 ( )
static private

static void AssemblerX86::initAssemblyRules_part3 ( )
static private

static void AssemblerX86::initAssemblyRules_part4 ( )
static private

static void AssemblerX86::initAssemblyRules_part5 ( )
static private

static void AssemblerX86::initAssemblyRules_part6 ( )
static private

static void AssemblerX86::initAssemblyRules_part7 ( )
static private

static void AssemblerX86::initAssemblyRules_part8 ( )
static private

static void AssemblerX86::initAssemblyRules_part9 ( )
static private

static void AssemblerX86::define ( const InsnDefn * d)
inline static private

Adds a definition to the assembly dictionary.

All x86 assemblers share a common dictionary.

Definition at line 413 of file AssemblerX86.h.

References defns, and AssemblerX86::InsnDefn::kind.

static std::string AssemblerX86::to_str ( X86InstructionKind  )
static private

Returns the string version of the instruction kind sans "x86_" prefix.

This is not necessarily the same as the mnemonic since occasionally multiple kinds will map to a single mnemonic (e.g., RET maps to both x86_ret and x86_retf).

SgUnsignedCharList AssemblerX86::fixup_prefix_bytes ( SgAsmx86Instruction * insn,
SgUnsignedCharList  source 
)
private

Rewrites the prefix bytes stored in the original source to be in the same order (and same repeat counts) as stored in the target, or p_raw_bytes data member of the instruction.

The source should contain only prefix bytes from groups 1 through 4 as listed in section 2.1.1 of the Intel Instruction Set Reference. It should not contain the REX byte. Any source prefix that does not appear in the original instruction will be placed at the end of the result; any prefix that appears in the original instruction but not the source will be dropped.
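
The reordering rule described above can be sketched as follows. This is an illustration only: reorder_prefixes is a hypothetical free function, and it assumes the caller has already isolated the prefix bytes of the original encoding (target_prefixes) rather than taking them from an instruction node.

```cpp
#include <stdint.h>
#include <algorithm>
#include <vector>

typedef std::vector<uint8_t> Bytes;

// `source` holds the prefixes the assembler wants to emit;
// `target_prefixes` holds the prefixes of the original encoding, in order.
Bytes reorder_prefixes(const Bytes &source, const Bytes &target_prefixes) {
    Bytes result;
    // Follow the original order and repeat counts, dropping any original
    // prefix that the source does not ask for.
    for (uint8_t p : target_prefixes) {
        if (std::find(source.begin(), source.end(), p) != source.end())
            result.push_back(p);
    }
    // Append source prefixes that never appeared in the original.
    for (uint8_t p : source) {
        if (std::find(target_prefixes.begin(), target_prefixes.end(), p) ==
            target_prefixes.end())
            result.push_back(p);
    }
    return result;
}
```

Given source prefixes 66 F3 2E and original prefixes F3 F3 66 67, the sketch produces F3 F3 66 2E: the original order and the doubled F3 are kept, 67 is dropped, and 2E goes last.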

SgUnsignedCharList AssemblerX86::assemble ( SgAsmx86Instruction insn,
const InsnDefn defn 
)
private

Low-level method to assemble a single instruction using the specified definition from the assembly dictionary.

An Assembler::Exception is thrown if the instruction is not compatible with the definition.

void AssemblerX86::matches ( const InsnDefn defn,
SgAsmx86Instruction insn,
int64_t *  disp,
int64_t *  imm 
) const
private

Attempts to match an instruction with a definition.

An exception is thrown if the instruction and definition do not match. If the disp or imm arguments are non-null pointers then the operands of the instruction are also checked, and the value of any operand that is an IP-relative displacement or an immediate is returned through the corresponding argument.

bool AssemblerX86::matches ( OperandDefn  ,
SgAsmExpression ,
SgAsmInstruction ,
int64_t *  disp,
int64_t *  imm 
) const
private

Attempts to match an instruction operand with a definition operand.

Returns true if they match, false otherwise. The disp and imm pointers are used to return values if the operand is an IP-relative displacement or immediate value.

static bool AssemblerX86::matches_rel ( SgAsmInstruction ,
int64_t  val,
size_t  nbytes 
)
staticprivate

Determines whether a call/jump target can be represented as an IP-relative displacement of the specified size.
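
A check of this kind can be sketched as follows. The helpers fits_signed and reachable_rel are hypothetical and take plain addresses rather than an SgAsmInstruction; the sketch relies only on the fact that x86 relative branch displacements are signed and measured from the address of the next instruction.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `val` fits in a signed two's-complement
// field that is `nbytes` bytes wide.
bool fits_signed(int64_t val, size_t nbytes) {
    if (nbytes == 0)
        return false;
    if (nbytes >= 8)
        return true;  // every int64_t fits in eight or more bytes
    const int64_t limit = int64_t{1} << (8 * nbytes - 1);
    return val >= -limit && val < limit;
}

// Hypothetical helper: can a branch located at `insn_va` and encoded in
// `insn_size` bytes reach `target_va` with an `nbytes`-wide displacement?
bool reachable_rel(uint64_t insn_va, size_t insn_size,
                   uint64_t target_va, size_t nbytes) {
    // The displacement is relative to the end of the branch instruction.
    const int64_t disp = static_cast<int64_t>(target_va - (insn_va + insn_size));
    return fits_signed(disp, nbytes);
}
```

A two-byte short jump at 0x1000 can therefore reach targets from 0x0F82 through 0x1081 with a one-byte displacement.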

static MemoryReferencePattern AssemblerX86::parse_memref ( SgAsmInstruction insn,
SgAsmMemoryReferenceExpression expr,
SgAsmx86RegisterReferenceExpression **  base_reg,
SgAsmx86RegisterReferenceExpression **  index_reg,
SgAsmValueExpression **  scale,
SgAsmValueExpression **  displacement 
)
staticprivate

Parses memory reference expressions and returns the address BASE_REG + (INDEX_REG * SCALE) + DISPLACEMENT, where BASE_REG and INDEX_REG are optional register reference expressions and SCALE and DISPLACEMENT are optional value expressions.
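
The address form can be illustrated with concrete values. The struct MemRefParts and the function effective_address below are hypothetical stand-ins for the four optional components; they are not ROSE types, and register contents are passed in as plain integers.

```cpp
#include <cstdint>

// Hypothetical stand-in for the components of a memory reference.
// An absent base or index contributes nothing; an absent scale is 1.
struct MemRefParts {
    bool     has_base = false;
    uint64_t base = 0;          // value held in BASE_REG
    bool     has_index = false;
    uint64_t index = 0;         // value held in INDEX_REG
    uint64_t scale = 1;         // SCALE
    int64_t  displacement = 0;  // DISPLACEMENT
};

// Computes BASE_REG + (INDEX_REG * SCALE) + DISPLACEMENT.
uint64_t effective_address(const MemRefParts& p) {
    uint64_t addr = static_cast<uint64_t>(p.displacement);
    if (p.has_base)
        addr += p.base;
    if (p.has_index)
        addr += p.index * p.scale;
    return addr;
}
```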

uint8_t AssemblerX86::build_modrm ( const InsnDefn ,
SgAsmx86Instruction ,
size_t  argno,
uint8_t *  sib,
int64_t *  displacement,
uint8_t *  rex 
) const
private

Builds the ModR/M byte and the SIB byte.

Also adjusts the REX prefix byte and returns any displacement value.
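
For orientation, the bit layout of the two bytes can be sketched as follows. The helpers make_modrm and make_sib are hypothetical and only pack fields that have already been chosen; they do not reproduce the operand analysis that build_modrm() performs.

```cpp
#include <cstdint>

// ModR/M layout: mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0.
constexpr uint8_t make_modrm(unsigned mod, unsigned reg, unsigned rm) {
    return static_cast<uint8_t>(((mod & 3u) << 6) | ((reg & 7u) << 3) | (rm & 7u));
}

// SIB layout: scale in bits 7-6 (log2 of the factor 1, 2, 4, or 8),
// index in bits 5-3, base in bits 2-0.
constexpr uint8_t make_sib(unsigned scale_log2, unsigned index, unsigned base) {
    return static_cast<uint8_t>(((scale_log2 & 3u) << 6) | ((index & 7u) << 3) | (base & 7u));
}
```

Only the low three bits of each register number fit in these fields; the fourth bit of a register numbered 8 through 15 is carried by the REX prefix, which is why the method also takes a rex argument.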

void AssemblerX86::build_modreg ( const InsnDefn ,
SgAsmx86Instruction ,
size_t  argno,
uint8_t *  modrm,
uint8_t *  rex 
) const
private

Adjusts the "reg" field of the ModR/M byte and adjusts the REX prefix byte if necessary.

uint8_t AssemblerX86::segment_override ( SgAsmx86Instruction )
private

Calculates the segment override from the instruction operands rather than obtaining it from the p_segmentOverride data member.

Returns zero if no segment override is necessary.

Member Data Documentation

const unsigned AssemblerX86::od_e_mask = 0x00000070
staticprivate

Indicates that the ModR/M byte of the instruction uses only the r/m (register or memory) operand.

The reg field contains n, providing an extension to the instruction's opcode. This form is written as "/0", "/1", etc. in the Intel documentation.

Definition at line 69 of file AssemblerX86.h.

Referenced by od_e_val().

const unsigned AssemblerX86::od_e_pres = 0x00000080
staticprivate

Definition at line 70 of file AssemblerX86.h.

const unsigned AssemblerX86::od_e0 = 0x00000000 | od_e_pres
staticprivate

Definition at line 71 of file AssemblerX86.h.

const unsigned AssemblerX86::od_e1 = 0x00000010 | od_e_pres
staticprivate

Definition at line 72 of file AssemblerX86.h.

const unsigned AssemblerX86::od_e2 = 0x00000020 | od_e_pres
staticprivate

Definition at line 73 of file AssemblerX86.h.

const unsigned AssemblerX86::od_e3 = 0x00000030 | od_e_pres
staticprivate

Definition at line 74 of file AssemblerX86.h.

const unsigned AssemblerX86::od_e4 = 0x00000040 | od_e_pres
staticprivate

Definition at line 75 of file AssemblerX86.h.

const unsigned AssemblerX86::od_e5 = 0x00000050 | od_e_pres
staticprivate

Definition at line 76 of file AssemblerX86.h.

const unsigned AssemblerX86::od_e6 = 0x00000060 | od_e_pres
staticprivate

Definition at line 77 of file AssemblerX86.h.

const unsigned AssemblerX86::od_e7 = 0x00000070 | od_e_pres
staticprivate

Definition at line 78 of file AssemblerX86.h.
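
The od_e0 through od_e7 values above can be decoded with the mask and presence bit. This is a minimal sketch: the constants are copied from this page, while the function names has_opcode_extension and opcode_extension are hypothetical and need not match od_e_val().

```cpp
// Values copied from the member data documented above.
const unsigned od_e_mask = 0x00000070;
const unsigned od_e_pres = 0x00000080;

// Hypothetical helper: does the definition use the "/n" form?
bool has_opcode_extension(unsigned opcode_mods) {
    return (opcode_mods & od_e_pres) != 0;
}

// Hypothetical helper: the digit n (0 through 7) that goes into the
// reg field of the ModR/M byte.
unsigned opcode_extension(unsigned opcode_mods) {
    return (opcode_mods & od_e_mask) >> 4;
}
```

The separate presence bit is what distinguishes "/0" (od_e0, value 0x80) from a definition that has no opcode extension at all.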

const unsigned AssemblerX86::od_rex_pres = 0x00000001
staticprivate

Indicates the use of a REX prefix that affects operand size or instruction semantics.

The ordering of the REX prefix and other optional/mandatory instruction prefixes is discussed in Chapter 2 of the Intel "Instruction Set Reference, A-M".

Definition at line 84 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rex_mask = 0x00000f00
staticprivate

Definition at line 85 of file AssemblerX86.h.

Referenced by od_rex_byte().

const unsigned AssemblerX86::od_rex = 0x00000000 | od_rex_pres
staticprivate

Definition at line 86 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexb = 0x00000100 | od_rex_pres
staticprivate

Definition at line 87 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexx = 0x00000200 | od_rex_pres
staticprivate

Definition at line 88 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexxb = 0x00000300 | od_rex_pres
staticprivate

Definition at line 89 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexr = 0x00000400 | od_rex_pres
staticprivate

Definition at line 90 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexrb = 0x00000500 | od_rex_pres
staticprivate

Definition at line 91 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexrx = 0x00000600 | od_rex_pres
staticprivate

Definition at line 92 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexrxb = 0x00000700 | od_rex_pres
staticprivate

Definition at line 93 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexw = 0x00000800 | od_rex_pres
staticprivate

Definition at line 94 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexwb = 0x00000900 | od_rex_pres
staticprivate

Definition at line 95 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexwx = 0x00000a00 | od_rex_pres
staticprivate

Definition at line 96 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexwxb = 0x00000b00 | od_rex_pres
staticprivate

Definition at line 97 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexwr = 0x00000c00 | od_rex_pres
staticprivate

Definition at line 98 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexwrb = 0x00000d00 | od_rex_pres
staticprivate

Definition at line 99 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexwrx = 0x00000e00 | od_rex_pres
staticprivate

Definition at line 100 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rexwrxb = 0x00000f00 | od_rex_pres
staticprivate

Definition at line 101 of file AssemblerX86.h.
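
The od_rex* values above place the W, R, X, and B bits in bits 11 through 8, so the prefix byte can be recovered with a shift. The sketch below copies the constants from this page; the function name rex_byte and the convention of returning zero when no REX prefix is requested are assumptions, and od_rex_byte() may behave differently.

```cpp
#include <cstdint>

// Values copied from the member data documented above.
const unsigned od_rex_pres = 0x00000001;
const unsigned od_rex_mask = 0x00000f00;

// Hypothetical helper: build the REX prefix byte 0100WRXB, or return 0
// (an assumed convention) when the definition does not call for REX.
uint8_t rex_byte(unsigned opcode_mods) {
    if ((opcode_mods & od_rex_pres) == 0)
        return 0;
    return static_cast<uint8_t>(0x40u | ((opcode_mods & od_rex_mask) >> 8));
}
```

With this layout od_rexw yields 0x48, the familiar REX.W prefix for 64-bit operand size.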

const unsigned AssemblerX86::od_modrm = 0x00000002
staticprivate

Indicates that the ModR/M byte of the instruction contains a register operand and an r/m operand.

This form is written as "/r" in the Intel documentation.

Definition at line 106 of file AssemblerX86.h.

const unsigned AssemblerX86::od_c_mask = 0x00007000
staticprivate

A 1-byte (CB), 2-byte (CW), 4-byte (CD), 6-byte (CP), 8-byte (CO), or 10-byte (CT) value follows the opcode.

This value is used to specify a code offset and possibly a new value for the code segment register.

Definition at line 110 of file AssemblerX86.h.

const unsigned AssemblerX86::od_cb = 0x00001000
staticprivate

Definition at line 111 of file AssemblerX86.h.

const unsigned AssemblerX86::od_cw = 0x00002000
staticprivate

Definition at line 112 of file AssemblerX86.h.

const unsigned AssemblerX86::od_cd = 0x00003000
staticprivate

Definition at line 113 of file AssemblerX86.h.

const unsigned AssemblerX86::od_cp = 0x00004000
staticprivate

Definition at line 114 of file AssemblerX86.h.

const unsigned AssemblerX86::od_co = 0x00005000
staticprivate

Definition at line 115 of file AssemblerX86.h.

const unsigned AssemblerX86::od_ct = 0x00006000
staticprivate

Definition at line 116 of file AssemblerX86.h.

const unsigned AssemblerX86::od_i_mask = 0x00070000
staticprivate

A 1-byte (IB), 2-byte (IW), 4-byte (ID), or 8-byte (IO) little-endian immediate operand to the instruction follows the opcode, ModR/M bytes or scale-indexing bytes.

The opcode determines if the operand is a signed value.

Definition at line 120 of file AssemblerX86.h.

const unsigned AssemblerX86::od_ib = 0x00010000
staticprivate

Definition at line 121 of file AssemblerX86.h.

const unsigned AssemblerX86::od_iw = 0x00020000
staticprivate

Definition at line 122 of file AssemblerX86.h.

const unsigned AssemblerX86::od_id = 0x00030000
staticprivate

Definition at line 123 of file AssemblerX86.h.

const unsigned AssemblerX86::od_io = 0x00040000
staticprivate

Definition at line 124 of file AssemblerX86.h.
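
The od_i* values above select the width of the immediate, which is then emitted least significant byte first. A minimal sketch, with the constants copied from this page and the hypothetical helpers immediate_size and encode_immediate:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Values copied from the member data documented above.
const unsigned od_i_mask = 0x00070000;
const unsigned od_ib     = 0x00010000;
const unsigned od_iw     = 0x00020000;
const unsigned od_id     = 0x00030000;
const unsigned od_io     = 0x00040000;

// Hypothetical helper: number of immediate bytes, or 0 if none.
size_t immediate_size(unsigned opcode_mods) {
    switch (opcode_mods & od_i_mask) {
        case od_ib: return 1;
        case od_iw: return 2;
        case od_id: return 4;
        case od_io: return 8;
        default:    return 0;
    }
}

// Hypothetical helper: little-endian encoding of the low `nbytes`
// bytes of `value`; `nbytes` must not exceed 8.
std::vector<uint8_t> encode_immediate(uint64_t value, size_t nbytes) {
    std::vector<uint8_t> bytes;
    for (size_t i = 0; i < nbytes && i < 8; ++i)
        bytes.push_back(static_cast<uint8_t>(value >> (8 * i)));
    return bytes;
}
```

For instance, a 4-byte immediate 0x12345678 is emitted as the bytes 78 56 34 12.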

const unsigned AssemblerX86::od_r_mask = 0x00700000
staticprivate

A register code, from 0 through 7, added to a byte of the opcode.

This form is written as "+rb" in the Intel documentation.

Definition at line 128 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rb = 0x00100000
staticprivate

Definition at line 129 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rw = 0x00200000
staticprivate

Definition at line 130 of file AssemblerX86.h.

const unsigned AssemblerX86::od_rd = 0x00300000
staticprivate

Definition at line 131 of file AssemblerX86.h.

const unsigned AssemblerX86::od_ro = 0x00400000
staticprivate

Definition at line 132 of file AssemblerX86.h.
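
The "+r" forms above fold the register number into the opcode byte itself. A minimal sketch with hypothetical helpers opcode_plus_reg and needs_rex_b; the opcode values in the note that follows come from the Intel encoding of MOV and PUSH, not from this class.

```cpp
#include <cstdint>

// Hypothetical helper: add the low three bits of the register number
// to the base opcode byte.
constexpr uint8_t opcode_plus_reg(uint8_t base_opcode, unsigned reg) {
    return static_cast<uint8_t>(base_opcode + (reg & 7u));
}

// Hypothetical helper: registers 8 through 15 carry their fourth bit
// in REX.B, since only three bits fit in the opcode byte.
constexpr bool needs_rex_b(unsigned reg) {
    return reg >= 8;
}
```

MOV r32, imm32 is encoded as B8+rd, so loading ECX (register 1) uses opcode 0xB9; PUSH r32 is 50+rd, so pushing EBP (register 5) uses 0x55.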

const unsigned AssemblerX86::od_i = 0x00000004
staticprivate

A number used in floating-point instructions when one of the operands is ST(i) from the FPU register stack.

The number i (which can range from 0 to 7) is added to the opcode byte to form a single opcode byte. This form is written as "+i" in the Intel documentation.

Definition at line 137 of file AssemblerX86.h.

const unsigned AssemblerX86::COMPAT_LEGACY = 0x01
staticprivate

These bits define the compatibility of an instruction to 32- and 64-bit modes.

Definition is compatible with non-64-bit architectures.

Definition at line 324 of file AssemblerX86.h.

const unsigned AssemblerX86::COMPAT_64 = 0x02
staticprivate

Definition is compatible with 64-bit architectures.

Definition at line 325 of file AssemblerX86.h.

InsnDictionary AssemblerX86::defns
staticprivate

Instruction assembly definitions organized by X86InstructionKind.

Definition at line 464 of file AssemblerX86.h.

Referenced by AssemblerX86(), and define().

bool AssemblerX86::honor_operand_types
private

If true, operand types rather than values determine assembled form.

Definition at line 465 of file AssemblerX86.h.

Referenced by get_honor_operand_types(), and set_honor_operand_types().


The documentation for this class was generated from the following file: