Y86 Intro

This is an introduction to Y86 that is designed to guide you toward working on P3 and later P4. As we mentioned in class, Y86 is an instruction set architecture (ISA) that is similar to but far simpler than x86-64. The first thing you will want to do is to print or save a copy of the Y86 reference sheet.

Readings and Videos

You will also want to read section 4.1 in your textbook. In addition, we recommend watching the following YouTube videos, courtesy of Dr. Kirkpatrick:

If you watch the videos, you should note a few minor changes in our version of Y86 from the version Kirkpatrick discusses: 1) we have seven additional general registers (%r8-%r14), 2) data movement instructions do not set the CC flags, and 3) we will be using the standard x86-64 calling conventions for procedures.

Getting Started with Y86

Y86 will be the assembly language we will focus on for the remainder of the semester in P3 and P4. In fact, Mini-ELF (which you have worked with in P1 and P2) is really just a format for storing a Y86 machine code program in a file along with the other information needed to load and execute that program (e.g., program headers). After loading a Mini-ELF file into memory in P2, you are now ready to begin treating the loaded file as a Y86 program. In P3, you will be disassembling the program contents by parsing and rendering the machine code using assembly language. In P4, you will be interpreting the Y86 program by simulating a Y86 CPU as it executes the machine code instructions.

At this point, therefore, you will also need to begin writing programs in Y86. We recommend starting with a web-based simulator.

You can enter your Y86 program into this simulator and then click "Assemble" to convert it into object code. Then, you can step through it instruction-by-instruction (similar to gdb), and you can watch the contents of the registers and memory as they change.

Differences from x86-64

Most of what you have learned about x86-64 should translate directly to Y86. However, data movement is vastly simplified; instead of the many addressing modes for the mov instruction, there are now four individual movement instructions: irmovq, rrmovq, rmmovq, and mrmovq.

Each of these new instructions corresponds to a particular combination of immediate, register, and memory operands as described by the first two letters of the instruction; e.g., irmovq moves an immediate value into a register, and rmmovq moves a value in a register to a memory location. In addition, ALL memory operands are now indirect (e.g., a base register with an integer offset). These are the only valid movement commands. Here is a summary:

MNEMONIC   DESCRIPTION             EXAMPLE
irmovq     immediate-to-register   irmovq $16, %rax
rrmovq     register-to-register    rrmovq %rax, %rcx
rmmovq     register-to-memory      rmmovq %rax, (%rdx)
mrmovq     memory-to-register      mrmovq (%rdx), %rax
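As a rough illustration of the four instructions working together, here is a small sketch that stores a value to memory and reads it back. The base address 0x200 and the offset 8 are arbitrary choices for this example, and the .pos lines and _start label are the standard wrapper described later on this page.

```y86
.pos 0 code
    jmp _start

.pos 0x100 code
_start:
    irmovq $16, %rax        # immediate -> register: %rax = 16
    irmovq $0x200, %rdx     # base address for the memory operand (arbitrary choice)
    rmmovq %rax, 8(%rdx)    # register -> memory: store %rax at address 0x208
    mrmovq 8(%rdx), %rcx    # memory -> register: load the same 8 bytes into %rcx
    rrmovq %rcx, %rbx       # register -> register: copy %rcx into %rbx
    halt
```

Stepping through this in a simulator should show 16 (0x10) arriving in %rax, then in memory at 0x208, then in %rcx and %rbx.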

Another important difference is that some instructions that you’ve become used to (e.g., cmp, test, and imul) are not present in Y86, and you will need to use alternatives (e.g., subq for cmp, andq for test, or a loop of additions for imul). This may require some creativity or temporary registers, because subq and andq both overwrite their destination operand with the result, whereas cmp and test only set the condition codes.
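The substitutions above might look something like the following sketch. It assumes the textbook's mnemonics (addq, subq, andq) and that arithmetic instructions take only register operands, so constants are first loaded with irmovq. The register choices and the values 6, 7, and 42 are arbitrary, and the loop assumes a non-negative multiplier.

```y86
.pos 0 code
    jmp _start

.pos 0x100 code
_start:
    irmovq $6, %rdi         # multiplicand
    irmovq $7, %rsi         # multiplier (assumed non-negative; consumed by the loop)

    # imul replacement: %rax = %rdi * %rsi by repeated addition
    irmovq $0, %rax         # running product
    irmovq $1, %rcx         # constant 1 for the decrement
    andq   %rsi, %rsi       # test replacement: sets ZF if %rsi is 0, leaves %rsi unchanged
    je     check            # multiplying by zero: skip the loop
loop:
    addq   %rdi, %rax       # add the multiplicand once more
    subq   %rcx, %rsi       # count down; sets ZF when %rsi reaches 0
    jne    loop

check:
    # cmp replacement: compare %rax with 42 without overwriting %rax
    irmovq $42, %rdx
    rrmovq %rax, %rbx       # temporary copy, because subq overwrites its destination
    subq   %rdx, %rbx       # %rbx = %rax - 42; sets the flags cmp would have set
    je     equal
    irmovq $0, %rsi         # not equal
    halt
equal:
    irmovq $1, %rsi         # equal
    halt
```

Note that andq with the same register as both operands needs no temporary, since the result equals the original value, while the subq comparison does need one.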

Code Regions

In Y86 you can use .pos directives to tell the assembler the address where a segment should begin as well as the type of the segment (code, data, or stack). It is customary for segment boundaries to be divisible by 256 bytes (i.e., the last two hex digits of the address are zero). For instance, the following directive specifies that a code segment should begin at memory address 0x100:

.pos 0x100 code

All Y86 programs must contain at least one code region, but unfortunately tools differ on the address where execution begins. Most of the simulators begin execution at address zero, but this is somewhat problematic because zero is often used as a “null” address indicator. Thus, when a Y86 program is assembled into Mini-ELF format, we mandate that there must be an entry point (recorded in the Mini-ELF header, as you will recall from P1) indicated by a _start label (note the underscore at the beginning), which should NOT be at address zero.
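To show how code and data regions can sit side by side, here is a small sketch with one of each. The label name value, the address 0x200, and the use of a .quad directive to emit an 8-byte constant are assumptions for this example (check the reference sheet for the directives your assembler accepts). The jump at address zero is there only so that simulators that start at zero still reach _start.

```y86
.pos 0 code
    jmp _start              # for simulators that begin execution at address 0

.pos 0x100 code
_start:
    irmovq value, %rdx      # load the address of the data label
    mrmovq (%rdx), %rax     # read the 8-byte value stored there
    halt

.pos 0x200 data
value:
    .quad 0x2a              # assumed directive: one 8-byte constant
```

Both region addresses end in two zero hex digits, following the 256-byte boundary custom.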

Minimal Y86 Programs

Here is a minimal Y86 program that is valid for both the web-based simulator and the Mini-ELF assembler on stu:

.pos 0 code
    jmp _start

.pos 0x100 code
_start:
    # your code goes here
    halt

Here is an expanded Y86 program that sets up a stack and then calls main:

.pos 0 code
    jmp _start

.pos 0x100 code
_start:
    irmovq _stack, %rsp
    call main
    halt
main:
    # your code goes here
    ret

.pos 0xf00 stack
_stack:

We must initialize the stack pointer manually, usually by loading %rsp with a predefined memory address through a label (it is customary to use _stack). The stack begins empty, so the stack segment is zero bytes long in the Mini-ELF file. Because the Y86 address space is only 0x1000 (4K) bytes, it is customary to start the stack at 0xf00.
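Filling in main with an actual procedure call might look like the sketch below. The helper name twice and the value 21 are made up for this example. The argument is passed in %rdi and the result returned in %rax, following the x86-64 calling conventions mentioned earlier.

```y86
.pos 0 code
    jmp _start

.pos 0x100 code
_start:
    irmovq _stack, %rsp     # the stack pointer must be valid before any call
    call main
    halt

main:
    irmovq $21, %rdi        # first argument goes in %rdi
    call twice              # pushes the return address onto the stack
    ret                     # result is left in %rax

twice:                      # hypothetical helper: returns 2 * %rdi
    rrmovq %rdi, %rax
    addq   %rdi, %rax
    ret

.pos 0xf00 stack
_stack:
```

When the program halts, %rax should hold 42 (0x2a) and %rsp should be back at 0xf00, since every call has been matched by a ret.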

Y86 on the Student Server

To set up your account on stu to be able to run the Y86 assembler as well as the reference y86 interpreter, run the following script:

/cs/students/cs261/y86/install.sh

You will probably need to log out and back in for the environment changes to take effect. After you do this, save your Y86 code in a file with a .ys extension and run the assembler as follows:

yas <file.ys>

If there are no syntax errors, this will create a Mini-ELF .o file that you can load with your y86 project. We have also distributed a reference implementation of the y86 project, and you can run it as follows:

y86ref [options] <file.o>

In particular, you can use the -d and -D options to produce the P3 output. This is useful for debugging by comparing against your own output. You can also use -e or -E to execute or trace the Y86 program.

Please experiment with these techniques and write some Y86 programs of your own; you will want to be able to create test cases as you work on P3 and P4. If you have questions, post them on Piazza or come to office hours.