# CS 261 Fall 2021

Mike Lam, Professor




#### x86-64 Data Movement and Arithmetic

# Topics

- Data movement
- Instruction validity
- Stack operations
- Arithmetic and logical operations

#### von Neumann architecture



(repeat)

#### Data movement

- Primary data movement instruction: "mov"
  - Copies data from first operand to second operand
- There are no "types" in assembly code
  - You must know how many bytes you want to move
    - Information = Bits + Context
  - Often, a "class" of machine instructions (e.g., "mov\_") will perform similar operations on different sizes of data
- Historical artifact: "word" in x86 is 16 bits
  - 1 byte (8 bits) = "byte" (b suffix)
  - 2 bytes (16 bits) = "word" (w suffix)
  - 4 bytes (32 bits) = "double/long word" (l suffix)
  - 8 bytes (64 bits) = "quad word" (q suffix)

#### Data movement

- Primary data movement instruction: "mov"
  - Copies data from first operand to second operand
  - Multiple suffixes:
    - movb, movw, movl, movq, movabsq
    - movabsq is the only form that takes a 64-bit immediate
- Zero-extension variant: "movz"
  - movzbw, movzbl, movzwl, movzbq, movzwq
  - Note lack of movzlq; just use movl, which sets the upper 32 bits of the destination register to zero
- Sign-extension variant: "movs"
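The difference between the zero-extending and sign-extending variants can be sketched in C. This is only a model of the data transformation, not of the instructions themselves; the helper names `zero_extend_byte`, `sign_extend_byte`, and `write_low32` are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/* Models movzbl: widen a byte to 32 bits, filling the upper 24 bits with zeros. */
uint32_t zero_extend_byte(uint8_t b) {
    return (uint32_t)b;
}

/* Models movsbl: widen a byte to 32 bits, copying bit 7 (the sign bit) upward. */
int32_t sign_extend_byte(uint8_t b) {
    int32_t wide = (int32_t)b;              /* value in the range 0..255 */
    return (b & 0x80) ? wide - 256 : wide;  /* reinterpret as two's complement */
}

/* Models a 32-bit register write such as movl: the old 64-bit contents are
 * discarded and the upper 32 bits of the result are zero. */
uint64_t write_low32(uint64_t old_reg, uint32_t value) {
    (void)old_reg;                          /* old contents do not survive */
    return (uint64_t)value;
}
```

For the byte 0xF0, the zero-extended result is 0x000000F0, while the sign-extended result is 0xFFFFFFF0 (that is, -16), because bit 7 of the byte is set.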

# Registers

- Multiple names per register
  - Refers to different data sizes
  - **e**XX = lower 32-bits (e.g., **e**ax)
  - rXX = full 64 bits (e.g., rax)
- Instruction suffixes and operand sizes must match!
  - E.g., movq \$1, %<u>r</u>ax is valid but movq \$1, %<u>e</u>ax is not

| Register number | High 8-bit | Low 8-bit | 16-bit | 32-bit | 64-bit |
|-----------------|------------|-----------|--------|--------|--------|
| 0               | AH*        | AL        | AX     | EAX    | RAX    |
| 3               | BH*        | BL        | BX     | EBX    | RBX    |
| 1               | CH*        | CL        | CX     | ECX    | RCX    |
| 2               | DH*        | DL        | DX     | EDX    | RDX    |
| 6               |            | SIL**     | SI     | ESI    | RSI    |
| 7               |            | DIL**     | DI     | EDI    | RDI    |
| 5               |            | BPL**     | BP     | EBP    | RBP    |
| 4               |            | SPL**     | SP     | ESP    | RSP    |
| 8               |            | R8B       | R8W    | R8D    | R8     |
| 9               |            | R9B       | R9W    | R9D    | R9     |
| 10              |            | R10B      | R10W   | R10D   | R10    |
| 11              |            | R11B      | R11W   | R11D   | R11    |
| 12              |            | R12B      | R12W   | R12D   | R12    |
| 13              |            | R13B      | R13W   | R13D   | R13    |
| 14              |            | R14B      | R14W   | R14D   | R14    |
| 15              |            | R15B      | R15W   | R15D   | R15    |

- Bit ranges: high 8-bit = bits 15-8, low 8-bit = bits 7-0, 16-bit = bits 15-0, 32-bit = bits 31-0, 64-bit = bits 63-0
- Bits 63-32 are zero-extended for 32-bit operands
- The figure also shows the 64-bit RFLAGS and RIP registers
- \* Not addressable when a REX prefix is used
- \*\* Only addressable when a REX prefix is used
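The relationship between the register names in the table can be sketched in C by treating a 64-bit value as the full register and extracting the narrower views. The helper names below are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* View of the low 32 bits of a 64-bit register (e.g., %eax within %rax). */
uint32_t low32(uint64_t reg) { return (uint32_t)reg; }

/* View of the low 16 bits (e.g., %ax within %rax). */
uint16_t low16(uint64_t reg) { return (uint16_t)reg; }

/* View of the low 8 bits, bits 7-0 (e.g., %al within %rax). */
uint8_t low8(uint64_t reg) { return (uint8_t)reg; }

/* View of bits 15-8 (e.g., %ah within %rax). */
uint8_t high8_of_low16(uint64_t reg) { return (uint8_t)(reg >> 8); }
```

If %rax held 0x1122334455667788, then under this model %eax would be 0x55667788, %ax would be 0x7788, %al would be 0x88, and %ah would be 0x77.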

# Memory addressing modes

- Absolute: addr
  - Effective address: addr
- Indirect: (reg)
  - Effective address: **R**[*reg*]
- Base + displacement: offset(reg)
  - Effective address: *offset* + R[*reg*]
- Indexed: offset(reg<sub>base</sub>, reg<sub>index</sub>)
  - Effective address: offset + R[reg<sub>base</sub>] + R[reg<sub>index</sub>]
- Scaled indexed: offset(reg<sub>base</sub>, reg<sub>index</sub>, s)
  - Effective address: offset + R[reg<sub>base</sub>] + R[reg<sub>index</sub>] · s
  - Scale (s) must be 1, 2, 4, or 8

**R**[**reg**] = value of register **reg** 
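The effective-address formulas above all reduce to the scaled indexed form, with missing pieces treated as zero (or a scale of 1). A minimal C sketch of that calculation follows; the function names are hypothetical, and the register values are passed in as plain integers.

```c
#include <stdint.h>

/* Returns 1 if s is a legal scale factor (1, 2, 4, or 8), otherwise 0. */
int scale_is_valid(uint64_t s) {
    return s == 1 || s == 2 || s == 4 || s == 8;
}

/* Computes offset + R[base] + R[index] * s using 64-bit wraparound arithmetic.
 * base and index are the current values of the registers, not register names. */
uint64_t effective_address(int64_t offset, uint64_t base, uint64_t index, uint64_t s) {
    return (uint64_t)offset + base + index * s;
}
```

For example, an operand written as 8(%rbx, %rcx, 4) with R[%rbx] = 0x1000 and R[%rcx] = 3 would give 8 + 0x1000 + 12 = 0x1014. The simpler modes fall out as special cases: indirect is `effective_address(0, base, 0, 1)`.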



## Memory operands

- Addresses in x86-64 are always 32 or 64 bits
  - Thus, the registers used to calculate the effective address of a memory operand must be 32 or 64 bits
    - E.g., movw %ax, (%ebp) is valid
    - E.g., movw %ax, (%rbp) is valid
    - E.g., movw %ax, (%bp) is not valid!
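    - E.g., **movw** %ax, %rbp is **not valid**! (Here %rbp is a 64-bit register operand, not part of a memory address, so it does not match the w suffix)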
- The size of data moved is determined by the size of the register operand or the instruction suffix
  - NOT the size of the register(s) used to calculate the effective address
  - Memory locations have no "type" in assembly/machine code
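The point that the data size is independent of the address size can be sketched in C. In this rough analogy to movw %ax, (%rbp), the pointer is a full-width address, yet only two bytes of memory change. The function name `store16` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Writes exactly two bytes at addr, regardless of how wide the pointer is.
 * memcpy is used so the store has no alignment or type requirements,
 * mirroring the idea that memory has no "type". */
void store16(unsigned char *addr, uint16_t value) {
    memcpy(addr, &value, sizeof value);
}
```

Storing 0 into a four-byte buffer filled with 0xFF clears only the first two bytes; the remaining two keep their old contents.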

# Validity summary

- Is an instruction valid?
  - Is the opcode valid?
  - Are all of the operands valid?
    - For immediate operands, is it a source operand?
      - (cannot write to immediates!)
    - For register operands, is it a valid register?
      - (and does it match the width suffix?)
    - For memory operands, is it a valid addressing mode?
      - (and are all registers used 32 or 64 bits wide?)
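Two of the checks in this summary are mechanical enough to sketch in C: matching a register's width against the suffix, and checking the width of registers used in an address. This is a simplified sketch, not a real assembler check; the function names are hypothetical, and register widths are passed in as bit counts rather than parsed from names.

```c
/* Width in bits implied by an operand-size suffix, or 0 if the suffix is unknown. */
int suffix_bits(char suffix) {
    switch (suffix) {
        case 'b': return 8;
        case 'w': return 16;
        case 'l': return 32;
        case 'q': return 64;
        default:  return 0;
    }
}

/* A register operand is acceptable only if its width matches the suffix. */
int reg_operand_ok(char suffix, int reg_bits) {
    int bits = suffix_bits(suffix);
    return bits != 0 && bits == reg_bits;
}

/* A register used to form a memory address must be 32 or 64 bits wide. */
int addr_reg_ok(int reg_bits) {
    return reg_bits == 32 || reg_bits == 64;
}
```

Applied to the earlier examples: movw %ax, (%rbp) passes both checks (`reg_operand_ok('w', 16)` and `addr_reg_ok(64)`), movw %ax, (%bp) fails the address check, and movw %ax, %rbp fails the register-width check.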

# Question

- Which of the following are valid x86-64 movement instructions?
  - A) movb %eax, %ecx
  - B) movl %eax, %ecx
  - C) movl \$8, %edx
  - D) movl \$8, %rdx
  - E) movw \$0x24, 0x4(%rsp)
  - F) movl \$0x24, 0x4(%esp)

## Aside: suffixes

- Is the operand size suffix mandatory?
  - E.g., the "l" or "q" in "movl" or "movq"
- Technically, it is only required if it cannot be inferred
  - E.g., mov %eax, %edi is not ambiguous
    - We can infer that this is a 32-bit move because of the destination
  - However, mov \$2, (%rdx) is ambiguous
    - Is it an 8-bit move? 32 bits? 64 bits?
    - A suffix is required here (e.g., movl \$2, (%rdx) for 32 bits)
  - Generally, it is safer always to include the suffix



 T/F: "movl (%rax), (%rdx)" is a valid x86-64 assembly instruction.

## Aside: memory operands

- In x86-64, most opcodes have no memory -> memory form
  - You can't encode two memory operands in the same instruction
  - Invalid: movl (%rax), (%rdx)
- Solution: use a temporary register
  - movl (%rax), %ecx
  - movl %ecx, (%rdx)
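The same two-step pattern is what a C compiler typically has to produce for a pointer-to-pointer copy. A minimal sketch, with a hypothetical function name and the local variable standing in for the temporary register:

```c
#include <stdint.h>

/* Copies one 32-bit value from *src to *dst through a temporary,
 * mirroring the load-then-store instruction pair. */
void copy32(const uint32_t *src, uint32_t *dst) {
    uint32_t tmp = *src;   /* like: movl (%rax), %ecx */
    *dst = tmp;            /* like: movl %ecx, (%rdx) */
}
```

Which register a real compiler picks for the temporary is its own choice; %ecx here simply follows the example above.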

# **Stack operations**

- The system stack holds 8-byte (quadword) slots, growing downward from high addresses to low addresses
  - Stack Pointer (SP) register stores address of "top" of stack
    - i.e., a pointer to the last value pushed (lowest address)
    - On x86-64, it is %rsp b/c addresses are 64 bits
  - pushq *reg* instruction
    - Subtract 8 from stack pointer
    - Store value of *reg* at (%rsp)
  - popq *reg* instruction
    - Retrieve value at (%rsp)
      - Save value in the given register
    - Increment stack pointer by 8
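The push and pop steps listed above can be modeled with a small C sketch. The `Machine` type, its fixed 16-slot stack, and the function names are all assumptions made for illustration; there is no bounds checking, so it only works for short, balanced sequences.

```c
#include <stdint.h>

#define STACK_SLOTS 16

/* Toy model: a small region of 8-byte slots plus a stack pointer.
 * rsp is a byte offset into the region and starts at the high end. */
typedef struct {
    uint64_t slots[STACK_SLOTS];
    uint64_t rsp;
} Machine;

void machine_init(Machine *m) {
    m->rsp = STACK_SLOTS * 8;        /* empty stack: pointer at the highest address */
}

/* pushq: subtract 8 from the stack pointer, then store the value there. */
void pushq(Machine *m, uint64_t value) {
    m->rsp -= 8;
    m->slots[m->rsp / 8] = value;
}

/* popq: read the value at the stack pointer, then add 8 to the pointer. */
uint64_t popq(Machine *m) {
    uint64_t value = m->slots[m->rsp / 8];
    m->rsp += 8;
    return value;
}
```

Because the pointer moves down on a push and up on a pop, values come back in the reverse of the order in which they were pushed.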



- Given the following register state, what will the values of the registers be after the following instruction sequence?
  - pushq %rax
  - pushq %rcx
  - pushq %rbx
  - pushq %rdx
  - popq %rax
  - popq %rbx
  - popq %rcx
  - popq %rdx

#### **Registers**

| <u>Name</u> | <u>Value</u> |
|-------------|--------------|
| %rax        | 0xAA         |
| %rbx        | 0xBB         |
| %rcx        | 0xCC         |
| %rdx        | 0xDD         |

- Answer: values are popped in the reverse of the order in which they were pushed, so the first pop (popq %rax) retrieves the last value pushed (0xDD, from %rdx)

| <u>Name</u> | <u>Value</u> |
|-------------|--------------|
| %rax        | 0xDD         |
| %rbx        | 0xBB         |
| %rcx        | 0xCC         |
| %rdx        | 0xAA         |

# Arithmetic and logic operations

| Instruction | Operands | Effect                      | Description              |
|-------------|----------|-----------------------------|--------------------------|
| leaq        | S, D     | $D \leftarrow \&S$          | Load effective address   |
| INC         | D        | $D \leftarrow D+1$          | Increment                |
| DEC         | D        | $D \leftarrow D-1$          | Decrement                |
| NEG         | D        | $D \leftarrow -D$           | Negate                   |
| NOT         | D        | $D \leftarrow \sim D$       | Complement               |
| ADD         | S, D     | $D \leftarrow D + S$        | Add                      |
| SUB         | S, D     | $D \leftarrow D - S$        | Subtract                 |
| IMUL        | S, D     | $D \leftarrow D * S$        | Multiply                 |
| XOR         | S, D     | $D \leftarrow D \oplus S$   | Exclusive-or             |
| OR          | S, D     | $D \leftarrow D \mid S$     | Or                       |
| AND         | S, D     | $D \leftarrow D \& S$       | And                      |
| SAL         | k, D     | $D \leftarrow D << k$       | Left shift               |
| SHL         | k, D     | $D \leftarrow D << k$       | Left shift (same as SAL) |
| SAR         | k, D     | $D \leftarrow D >>_{A} k$   | Arithmetic right shift   |
| SHR         | k, D     | $D \leftarrow D >>_{L} k$   | Logical right shift      |
Figure 3.10 Integer arithmetic operations. The load effective address (leaq) instruction is commonly used to perform simple arithmetic. The remaining ones are more standard unary or binary operations. We use the notation  $>>_A$  and  $>>_L$  to denote arithmetic and logical right shift, respectively. Note the nonintuitive ordering of the operands with ATT-format assembly code.
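The two right shifts differ only in what fills the vacated high bits: SHR fills with zeros, while SAR fills with copies of the original sign bit. A minimal C sketch of that difference follows. The function names are hypothetical, and the arithmetic shift is built from unsigned operations because right-shifting a negative signed value is implementation-defined in C. Shift amounts are assumed to be in the range 0 to 63.

```c
#include <stdint.h>

/* Models SHR: logical right shift, vacated high bits become 0. */
uint64_t shr(uint64_t d, unsigned k) {
    return d >> k;
}

/* Models SAR: arithmetic right shift, vacated high bits copy the sign bit. */
uint64_t sar(uint64_t d, unsigned k) {
    uint64_t shifted = d >> k;
    if (k > 0 && (d >> 63)) {
        shifted |= ~(uint64_t)0 << (64 - k);   /* set the top k bits to 1 */
    }
    return shifted;
}
```

For a value whose top bit is clear, such as 0xF0, both shifts agree. For 0x8000000000000000 shifted by 4, the logical shift gives 0x0800000000000000 and the arithmetic shift gives 0xF800000000000000.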

- Given the following register state, what are the values of the destination registers after each of the following instructions executes in sequence?
  - addq %rax, %rax
  - subq %rax, %rbx
  - imulq %rcx, %rax
  - andq %rbx, %rdx
  - shrq \$4, %rdx

#### **Registers**

| <u>Name</u> | <u>Value</u> |
|-------------|--------------|
| %rax        | 0x12         |
| %rbx        | 0x56         |
| %rcx        | 0x02         |
| %rdx        | 0xF0         |

- Values of the destination registers after each instruction:
  - addq %rax, %rax (%rax = 0x24)
  - subq %rax, %rbx (%rbx = 0x32)
  - imulq %rcx, %rax (%rax = 0x48)
  - andq %rbx, %rdx (%rdx = 0x30)
  - shrq \$4, %rdx (%rdx = 0x03)
- Final state: %rax = 0x48, %rbx = 0x32, %rcx = 0x02, %rdx = 0x03
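The exercise above can be checked by translating each instruction into the C expression given in the Effect column (destination on the left, source on the right). This is a sketch using a hypothetical `Regs` struct and `run_sequence` function; it models only the register values, not condition flags.

```c
#include <stdint.h>

/* Hypothetical snapshot of the four registers used in the exercise. */
typedef struct {
    uint64_t rax, rbx, rcx, rdx;
} Regs;

/* Applies the five instructions in order and returns the resulting state. */
Regs run_sequence(Regs r) {
    r.rax = r.rax + r.rax;   /* addq  %rax, %rax */
    r.rbx = r.rbx - r.rax;   /* subq  %rax, %rbx */
    r.rax = r.rax * r.rcx;   /* imulq %rcx, %rax */
    r.rdx = r.rdx & r.rbx;   /* andq  %rbx, %rdx */
    r.rdx = r.rdx >> 4;      /* shrq  $4, %rdx (logical shift on an unsigned value) */
    return r;
}
```

Note that the second instruction uses the already-doubled %rax (0x24), which is why %rbx becomes 0x56 - 0x24 = 0x32 rather than 0x56 - 0x12.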

- What does the following instruction do if %rax = 0x100?
  - leaq (%rax, %rax, 2), %rax
- Answer: %rax = 0x300 (multiply by three)
  - Effective address: R[%rax] + R[%rax] · 2 = 0x100 + 0x200 = 0x300
- Note: leaq does not actually read/write memory!
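Since leaq only computes an address and never dereferences it, its effect can be written in C as plain integer arithmetic. The sketch below uses a hypothetical function name and takes the register values as arguments.

```c
#include <stdint.h>

/* Models leaq (base, index, scale), dst: returns base + index * scale.
 * No memory is read or written; the "address" is just a number. */
uint64_t leaq_value(uint64_t base, uint64_t index, uint64_t scale) {
    return base + index * scale;
}
```

With both base and index set to the same register value x, scales of 2, 4, and 8 yield 3x, 5x, and 9x respectively, which is why leaq is commonly used for these small multiplications.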