Chip-8 on the COSMAC VIP: The Call Routine (Fetch and Decode)

This is part of a series of posts analysing the Chip-8 interpreter on the RCA COSMAC VIP computer. These posts may be useful if you are building a Chip-8 interpreter on another platform or if you have an interest in the operation of the COSMAC VIP. For other posts in the series refer to the index or instruction index.

In a previous post I explained how the initialisation section of the Chip-8 interpreter worked. In this post I’ll look at the Call loop (which is the equivalent of a fetch and decode sequence for a machine language).

This part of the interpreter is responsible for fetching the next Chip-8 instruction from memory, decoding it by analysing its parts and then calling the appropriate routine to deal with each instruction type. To understand how this works, we first need to understand how a Chip-8 instruction is structured.

Each Chip-8 instruction is two bytes. Each byte can be broken down further into two nibbles. The first nibble of the first byte is used to indicate one of sixteen possible instruction groups. These are shown in the table below.

| First instruction nibble | Instruction group |
| --- | --- |
| 0 | Call machine code instructions |
| 1 | Branch instructions |
| 2 | Call Chip-8 subroutine instructions |
| 3 | Skip if variable equal to immediate operand instructions |
| 4 | Skip if variable not equal to immediate operand instructions |
| 5 | Skip if variable equal to register instructions |
| 6 | Load variable with immediate operand instructions |
| 7 | Add immediate operand to variable instructions |
| 8 | Other arithmetic and logic instructions |
| 9 | Skip if register not equal to register instructions |
| A | Memory indexing instructions |
| B | Branch with offset instructions |
| C | Random number generation instructions |
| D | Display instructions |
| E | Skip if key pressed or not pressed instructions |
| F | Various I/O (timers and keyboard) and memory instructions |

So the first thing the interpreter must do after it has fetched the first byte of the instruction is decode the most significant four bits to determine which of these instruction groups is to be executed.
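As a rough illustration of this nibble structure, here is a minimal Python sketch that splits the two bytes of an instruction into four nibbles. The function name `split_nibbles` and the example instruction bytes are made up for this illustration and do not come from the interpreter.

```python
def split_nibbles(first_byte, second_byte):
    """Return the four nibbles of a two-byte Chip-8 instruction, most significant first."""
    return (
        first_byte >> 4,     # instruction group code (high nibble of first byte)
        first_byte & 0x0F,   # low nibble of first byte
        second_byte >> 4,    # high nibble of second byte
        second_byte & 0x0F,  # low nibble of second byte
    )


# Example with an arbitrary instruction whose two bytes are 0xD1 and 0x25.
group, n2, n3, n4 = split_nibbles(0xD1, 0x25)
print(f"group {group:X}, remaining nibbles {n2:X} {n3:X} {n4:X}")
# prints "group D, remaining nibbles 1 2 5"
```

The right shift by four bits does the same job as the four SHR instructions in the interpreter listing.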

Group 0 is dealt with as a special case. These instructions are used to call machine language subroutines from within a Chip-8 programme. If one of these instructions is detected, the handling routine is executed immediately at this point. It forms an address in two steps: it masks off the most significant digit of the first byte of the Chip-8 instruction and uses the result as the high order byte of the address, then it uses the second byte of the Chip-8 instruction as the low order byte. The routine at that address is then called. This means Chip-8 can call any address in the on-card RAM (i.e. any address from 0x0000 to 0x0FFF), but routines in any extended RAM cannot be called directly. I’ll look further at machine code integration in a future post.
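The address calculation for group 0 can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above. The function name `machine_code_address` and the example bytes are assumptions made for the sketch.

```python
def machine_code_address(first_byte, second_byte):
    """Form the 12-bit address called by a group 0 instruction."""
    high = first_byte & 0x0F  # mask off the group digit, leaving the high order byte
    low = second_byte         # the second byte is used unchanged as the low order byte
    return (high << 8) | low


# Arbitrary example: an instruction with bytes 0x02 and 0xA4.
print(f"0x{machine_code_address(0x02, 0xA4):04X}")  # prints "0x02A4"
```

Because the high order byte can never exceed 0x0F, the largest address this can produce is 0x0FFF, which matches the limit noted above.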

For the remaining instruction groups, the interpreter next sets up two variable pointers. For some instructions the low order nibble of the first byte is used to select a variable, designated VX. For some of these instructions a second variable, designated VY, is indicated by the high order nibble of the second byte. The interpreter gets these values and uses them to construct pointers to where the relevant variables are stored in memory.
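The pointer construction can be sketched as follows. This assumes, as the code listing later in this post shows, that the low order byte of each pointer is formed by ORing the variable number with 0xF0. The function name `variable_pointers`, the `page_high` parameter and the example page value 0x0E are placeholders for this illustration, not a statement about where a particular VIP keeps its variables.

```python
def variable_pointers(first_byte, second_byte, page_high):
    """Return (VX pointer, VY pointer) for a two-byte Chip-8 instruction.

    page_high is the high order byte of the page holding the variables
    (a hypothetical parameter for this sketch).
    """
    vx_low = (first_byte & 0x0F) | 0xF0  # low nibble of first byte selects VX
    vy_low = (second_byte >> 4) | 0xF0   # high nibble of second byte selects VY
    return (page_high << 8) | vx_low, (page_high << 8) | vy_low


# Arbitrary example: instruction bytes 0x8A and 0xB4 select VA and VB.
vx, vy = variable_pointers(0x8A, 0xB4, page_high=0x0E)
print(f"VX at 0x{vx:04X}, VY at 0x{vy:04X}")  # prints "VX at 0x0EFA, VY at 0x0EFB"
```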

Note that these pointers are established even if the instruction will not make use of them. This makes some instructions a little less efficient than they could be in terms of execution time, but the trade-off is that the interpreter is more straightforward and compact. In a 2K system, efficient memory use was more important than getting the best possible execution speed.

The interpreter now uses the instruction group code to index a couple of lookup tables that are used to find the address of the handling routine for each instruction group. The routine at this address is then called. When execution is returned to the Chip-8 call routine, it loops back round to fetch the next instruction. Here’s a flowchart of the sequence:

[Figure: Chip-8 fetch and decode flowchart]

Now here’s the code from the interpreter:

| Address (hex) | Code (hex) | Labels | Assembly | Comments |
| --- | --- | --- | --- | --- |
| 001B | 96 | FETCH_DECODE_LOOP: | GHI 6 | Get the high order byte of the VX pointer … |
| 001C | B7 | | PHI 7 | … and copy this to the high order byte of the VY pointer |
| 001D | E2 | | SEX 2 | Use the stack pointer (R2) for indirect register addressing operations |
| 001E | 94 | | GHI 4 | Get the high order byte of the CALL routine pointer (R4) … |
| 001F | BC | | PHI C | … and copy it to RC (RC will be used later as a pointer into a pair of lookup tables that hold the addresses of the routines that handle each instruction group) |
| 0020 | 45 | | LDA 5 | Get the first byte of the next Chip-8 instruction and advance the instruction pointer (R5) |
| 0021 | AF | | PLO F | Copy first byte of Chip-8 instruction to RF.0 |
| 0022 | F6 | | SHR | The next four instructions move the most significant digit of the Chip-8 instruction (first byte) – the instruction group code – to the position of the least significant digit. The least significant digit is discarded |
| 0023 | F6 | | SHR | |
| 0024 | F6 | | SHR | |
| 0025 | F6 | | SHR | |
| 0026 | 32 44 | | BZ FIRST_DIGIT_0 | If the instruction group code indicates a machine language call (code 0), jump directly to the routine that handles this |
| 0028 | F9 50 | | ORI 0x50 | Apply a mask to the instruction group code to turn it into the low-order part of an address that points to an entry in a lookup table (This table is stored from 0x0051 to 0x005F) |
| 002A | AC | | PLO C | RC now points to the correct entry in a lookup table for the instruction group of the current instruction – this table holds the high order byte of the address of the routine that handles that instruction group |
| 002B | 8F | | GLO F | Retrieve the unaltered copy of the first byte of the Chip-8 instruction from RF.0 |
| 002C | FA 0F | | ANI 0x0F | Mask the first byte of the Chip-8 instruction to leave only the least significant digit |
| 002E | F9 F0 | | ORI 0xF0 | Apply a mask to the least significant digit of the first byte of the Chip-8 instruction to form the low order byte of a pointer to the relevant variable (These variables are stored in the final page of on-card RAM from 0x0XF0 to 0x0XFF) |
| 0030 | A6 | | PLO 6 | The VX pointer (R6) now points to the correct variable for this instruction |
| 0031 | 05 | | LDN 5 | Get the second byte of the Chip-8 instruction (do not advance the instruction pointer) |
| 0032 | F6 | | SHR | The next four instructions move the most significant digit of the Chip-8 instruction (second byte) – VY – to the position of the least significant digit. The least significant digit is discarded |
| 0033 | F6 | | SHR | |
| 0034 | F6 | | SHR | |
| 0035 | F6 | | SHR | |
| 0036 | F9 F0 | | ORI 0xF0 | Apply a mask to the VY part of the Chip-8 instruction to form the low order byte of a pointer to the relevant variable (These variables are stored in the final page of on-card RAM from 0x0XF0 to 0x0XFF) |
| 0038 | A7 | | PLO 7 | The VY pointer (R7) now points to the correct variable for this instruction |
| 0039 | 4C | | LDA C | Get high-order byte of routine from look-up table |
| 003A | B3 | | PHI 3 | Store this in the high order byte of the interpreter programme counter (R3) |
| 003B | 8C | | GLO C | Get the low order byte of the address currently pointed to by RC – this will have been moved on by 1 by the LDA instruction… |
| 003C | FC 0F | | ADI 0x0F | … so, as the corresponding entries in each table are placed 16 bytes apart, it’s just necessary to add 0x0F to the address … |
| 003E | AC | | PLO C | … so that RC now points to the correct place in the second look up table |
| 003F | 0C | | LDN C | Get the low order byte of the address from the lookup table |
| 0040 | A3 | CALL_SUBROUTINE: | PLO 3 | And use this to set the low order byte of the interpreter programme counter (R3) |
| 0041 | D3 | | SEP 3 | Now call the interpreter subroutine to handle this instruction group |
| 0042 | 30 1B | | BR FETCH_DECODE_LOOP | On return from the subroutine, loop back and get the next Chip-8 instruction |
| 0044 | 8F | FIRST_DIGIT_0: | GLO F | This routine is entered when the first digit of the instruction is 0x0. This indicates a call to the machine code routine whose address is given by the remaining three digits of the instruction. The routine starts by retrieving the original first byte of the Chip-8 instruction from RF.0 |
| 0045 | FA 0F | | ANI 0x0F | Use a mask to remove the first digit of the instruction (leaving the high order byte of the address to be called) |
| 0047 | B3 | | PHI 3 | Use this to set the high order byte of the interpreter programme counter (R3), as this is also used as the programme counter for machine code routines called with this instruction |
| 0048 | 45 | | LDA 5 | Get the low-order byte of the address to be called directly from memory using the Chip-8 programme counter (R5) and then advance this |
| 0049 | 30 40 | | BR CALL_SUBROUTINE | Now return to the main fetch and decode loop and call the relevant subroutine |
| 004B-004E | | | | There is a short routine here to turn on the COSMAC VIP’s display. I’ll analyse this in a future post |
| 004F | 00 00 | | DB 0x00, 0x00 | This is filler before the subroutine address lookup tables so that the last digit of the address for each entry corresponds to the digit that indicates the instruction group (i.e. the entry for instruction group 1 is found at 0x0051, the entry for instruction group 2 at 0x0052, etc.) |
| 0051 | 01 01 01 01 01 01 01 01 01 01 01 01 00 01 01 | | DB 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x01, 0x01 | A lookup table holding the high order bytes of the addresses of the subroutines for Chip-8 instruction groups 1 through F |
| 0060 | 00 | | DB 0x00 | This is filler between the tables so that the second table is also aligned to instruction group numbers (i.e. 1 is at 0x0061, 2 at 0x0062, etc.) |
| 0061 | 7C 75 83 8B 95 B4 B7 BC 91 EB A4 D9 70 99 05 | | DB 0x7C, 0x75, 0x83, 0x8B, 0x95, 0xB4, 0xB7, 0xBC, 0x91, 0xEB, 0xA4, 0xD9, 0x70, 0x99, 0x05 | A lookup table holding the low-order bytes of the addresses of the subroutines for Chip-8 instruction groups 1 through F |

So the completed addresses for each digit are:
0x1: 0x017C
0x2: 0x0175
0x3: 0x0183
0x4: 0x018B
0x5: 0x0195
0x6: 0x01B4
0x7: 0x01B7
0x8: 0x01BC
0x9: 0x0191
0xA: 0x01EB
0xB: 0x01A4
0xC: 0x01D9
0xD: 0x0070
0xE: 0x0199
0xF: 0x0105
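To check how the two tables combine, the following Python sketch mimics the pointer arithmetic on RC for a non-zero instruction group. It is a simulation written for this illustration, not the interpreter's code. The table contents are chosen to agree with the completed addresses listed above, and the names `memory` and `handler_address` are hypothetical.

```python
# High order bytes live at 0x0051-0x005F, low order bytes at 0x0061-0x006F.
HIGH_BYTES = [0x01] * 12 + [0x00, 0x01, 0x01]
LOW_BYTES = [0x7C, 0x75, 0x83, 0x8B, 0x95, 0xB4, 0xB7, 0xBC,
             0x91, 0xEB, 0xA4, 0xD9, 0x70, 0x99, 0x05]

memory = bytearray(0x100)  # only the first page is needed for this sketch
memory[0x51:0x60] = bytes(HIGH_BYTES)
memory[0x61:0x70] = bytes(LOW_BYTES)


def handler_address(first_byte):
    """Return the handler address for the instruction group in first_byte."""
    group = first_byte >> 4          # four SHR instructions
    if group == 0:
        raise ValueError("group 0 bypasses the lookup tables")
    rc = 0x50 | group                # ORI 0x50, PLO C
    high = memory[rc]                # LDA C reads the high order byte ...
    rc += 1                          # ... and advances RC by one
    rc = (rc + 0x0F) & 0xFF          # GLO C, ADI 0x0F, PLO C
    low = memory[rc]                 # LDN C reads the low order byte
    return (high << 8) | low


for first_byte in (0x12, 0x40, 0xD0):
    print(f"group {first_byte >> 4:X}: 0x{handler_address(first_byte):04X}")
# prints "group 1: 0x017C", "group 4: 0x018B" and "group D: 0x0070"
```

Adding 1 (from LDA) and then 0x0F moves RC forward by exactly 16 bytes, so it lands on the entry for the same group in the second table.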

Execution times for the fetch and decode loop are 40 machine cycles (181.6 microseconds) for group 0 instructions and 68 machine cycles (308.72 microseconds) for all other Chip-8 instructions. Note that these are the execution times for the fetch and decode loop only – the execution time for the called routine needs to be added to this to get the total execution time for the instruction.
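These figures can be reproduced with a little arithmetic. The sketch below assumes two machine cycles per instruction, which holds for the 1802 instructions used in this loop, and a machine cycle of 4.54 microseconds, which is the value implied by the figures above (181.6 / 40). The instruction counts of 20 and 34 come from counting the instructions on each path through the listing. The names are hypothetical.

```python
MICROSECONDS_PER_MACHINE_CYCLE = 4.54  # implied by the figures quoted above
CYCLES_PER_INSTRUCTION = 2             # true of every instruction in this loop


def loop_time(instruction_count):
    """Return (machine cycles, microseconds) for a path through the loop."""
    cycles = instruction_count * CYCLES_PER_INSTRUCTION
    return cycles, cycles * MICROSECONDS_PER_MACHINE_CYCLE


# 20 instructions are executed for group 0, 34 for every other group.
for label, count in (("group 0", 20), ("other groups", 34)):
    cycles, microseconds = loop_time(count)
    print(f"{label}: {cycles} cycles, {microseconds:.2f} microseconds")
```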

Contemporary interpreters may not use this fetch and decode algorithm. For example, a fairly common way to select routines for each instruction group is to use a switch statement. Other interpreters may use function pointers, which is closer to the algorithm used here. It also may not be necessary to set up variable pointers this early in a contemporary interpreter.
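As one possible shape for the function pointer approach, here is a minimal Python sketch that dispatches on the instruction group through a table of handlers. Only groups 6 and 7 are filled in, following their descriptions in the table near the top of this post. The class and function names are hypothetical, and the start address of 0x200 is the conventional Chip-8 programme start, assumed here rather than taken from this post.

```python
class Machine:
    """A very small slice of Chip-8 state: memory, programme counter, variables."""

    def __init__(self, program, start=0x200):
        self.memory = bytearray(4096)
        self.memory[start:start + len(program)] = program
        self.pc = start
        self.v = [0] * 16


def op_load_immediate(machine, x, kk):
    machine.v[x] = kk                        # group 6: load variable with operand


def op_add_immediate(machine, x, kk):
    machine.v[x] = (machine.v[x] + kk) & 0xFF  # group 7: add operand, wrap at 8 bits


def op_unimplemented(machine, x, kk):
    raise NotImplementedError("instruction group not covered by this sketch")


# One handler per instruction group, indexed by the first nibble.
HANDLERS = [op_unimplemented] * 16
HANDLERS[0x6] = op_load_immediate
HANDLERS[0x7] = op_add_immediate


def step(machine):
    """Fetch one instruction, decode its group and call the matching handler."""
    first = machine.memory[machine.pc]
    second = machine.memory[machine.pc + 1]
    machine.pc += 2
    HANDLERS[first >> 4](machine, first & 0x0F, second)


machine = Machine(bytes([0x63, 0x0A, 0x73, 0x05]))  # load V3 with 0x0A, then add 0x05
step(machine)
step(machine)
print(f"V3 = 0x{machine.v[3]:02X}")  # prints "V3 = 0x0F"
```

Unlike the VIP interpreter, this sketch decodes the operands inside `step` only as far as every handler here needs them. A fuller interpreter might pass the raw bytes and let each handler extract what it uses.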

In future posts I’ll analyse each instruction group, starting with group 0 for machine code integration.
