ISA
This file describes the ISA of the falcon microprocessor; the processor itself is described in Introduction.
Registers
There are 16 32-bit GPRs, $r0-$r15. There are also a dozen or so special registers:
Index | Name | Present on | Description |
---|---|---|---|
$sr0 | $iv0 | all units | Interrupt 0 vector |
$sr1 | $iv1 | all units | Interrupt 1 vector |
$sr3 | $tv | all units | Trap vector |
$sr4 | $sp | all units | Stack pointer |
$sr5 | $pc | all units | Program counter |
$sr6 | $xcbase | all units | Code xfer external base |
$sr7 | $xdbase | all units | Data xfer external base |
$sr8 | $flags | all units | Misc flags |
$sr9 | $cx | crypto units | Crypt xfer mode |
$sr10 | $cauth | crypto units | Crypt auth code selection |
$sr11 | $xtargets | all units | Xfer port selection |
$sr12 | $tstatus | v3+ units | Trap status |
$flags register
The $flags [$sr8] register contains various flags controlling the operation of the falcon microprocessor. It is split into the following bitfields:
Bits | Name | Present on | Description |
---|---|---|---|
0-7 | $p0-$p7 | all units | General-purpose predicates |
8 | c | all units | Carry flag |
9 | o | all units | Signed overflow flag |
10 | s | all units | Sign/negative flag |
11 | z | all units | Zero flag |
16 | ie0 | all units | Interrupt 0 enable |
17 | ie1 | all units | Interrupt 1 enable |
18 | ??? | v4+ units | ??? |
20 | is0 | all units | Interrupt 0 saved enable |
21 | is1 | all units | Interrupt 1 saved enable |
22 | ??? | v4+ units | ??? |
24 | ta | all units | Trap handler active |
26-28 | ??? | v4+ units | ??? |
29-31 | ??? | v4+ units | ??? |
Todo
figure out v4+ stuff
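The bitfield layout above can be sketched as a small decoder. This is only an illustration: the names `FLAG_BITS` and `decode_flags` are hypothetical, the input value is made up, and the undocumented v4+ bits are deliberately left out.

```python
# Bit positions copied from the $flags table; the v4+ "???" bits are omitted.
FLAG_BITS = {
    "c": 8, "o": 9, "s": 10, "z": 11,
    "ie0": 16, "ie1": 17,
    "is0": 20, "is1": 21,
    "ta": 24,
}


def decode_flags(value):
    """Split a 32-bit $flags value into named single-bit fields.

    Hypothetical helper, not part of any existing tool.
    """
    fields = {name: (value >> bit) & 1 for name, bit in FLAG_BITS.items()}
    # Bits 0-7 hold the general-purpose predicates $p0-$p7.
    fields["p"] = [(value >> i) & 1 for i in range(8)]
    return fields


# Made-up example value: $p0, $p2, c, z and ie0 set.
flags = decode_flags(0x00010905)
```

Here `flags["p"]` lists the predicates in order from $p0 to $p7, and every other key maps to 0 or 1.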
$p predicates
$flags.p0-p7 are general-purpose single-bit flags that can be used to store single-bit variables. They can be modified by the bset, bclr, btgl, and setp instructions, read by the xbit instruction, and checked by the sleep and bra instructions.
Instructions
Instructions are 2, 3, or 4 bytes long. The first byte of an instruction determines its length and format. The high 2 bits of the first byte determine the instruction’s operand size: 00 means 8-bit, 01 means 16-bit, 10 means 32-bit, and 11 means an instruction that doesn’t use operand sizing. The set of available opcodes varies greatly with the instruction format.
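As a minimal sketch of the operand-size rule, the high 2 bits of byte 0 can be mapped to a size as follows. The function names are hypothetical; `None` is used here as an arbitrary marker for instructions that don't use operand sizing.

```python
# High 2 bits of byte 0 -> operand size in bits (None: no operand sizing).
OPERAND_SIZES = {0b00: 8, 0b01: 16, 0b10: 32, 0b11: None}


def operand_size(byte0):
    """Return the operand size selected by the first instruction byte."""
    return OPERAND_SIZES[(byte0 >> 6) & 0b11]


def opcode_low6(byte0):
    """Return the low 6 bits of byte 0, which select the format of sized opcodes."""
    return byte0 & 0x3F
```

For example, `operand_size(0xbd)` gives 32 and `opcode_low6(0xbd)` gives 0x3d.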
The subopcode can be stored in one of the following places:
- O1: subopcode goes to low 4 bits of byte 0
- O2: subopcode goes to low 4 bits of byte 1
- OL: subopcode goes to low 6 bits of byte 1
- O3: subopcode goes to low 4 bits of byte 2
The operands are denoted as follows:
- R1x: register encoded in low 4 bits of byte 1
- R2x: register encoded in high 4 bits of byte 1
- R3x: register encoded in high 4 bits of byte 2
- RxS: register used as source
- RxD: register used as destination
- RxSD: register used as both source and destination
- I8: 8-bit immediate encoded in byte 2
- I16: 16-bit immediate encoded in bytes 2 [low part] and 3 [high part]
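The field positions listed above can be sketched as plain bit extraction on the instruction bytes. All helper names are hypothetical, the example bytes are chosen only to exercise the helpers, and immediates are returned as raw unsigned values (any sign extension is left out).

```python
def subopcode(insn, place):
    """Extract the subopcode from one of the places O1, O2, OL, O3."""
    if place == "O1":
        return insn[0] & 0x0F  # low 4 bits of byte 0
    if place == "O2":
        return insn[1] & 0x0F  # low 4 bits of byte 1
    if place == "OL":
        return insn[1] & 0x3F  # low 6 bits of byte 1
    if place == "O3":
        return insn[2] & 0x0F  # low 4 bits of byte 2
    raise ValueError("unknown subopcode place: " + place)


def reg1(insn):
    return insn[1] & 0x0F  # R1x: low 4 bits of byte 1


def reg2(insn):
    return insn[1] >> 4  # R2x: high 4 bits of byte 1


def reg3(insn):
    return insn[2] >> 4  # R3x: high 4 bits of byte 2


def imm8(insn):
    return insn[2]  # I8: byte 2


def imm16(insn):
    return insn[2] | (insn[3] << 8)  # I16: byte 2 is low, byte 3 is high


# Illustrative bytes only, not taken from real falcon code.
insn = bytes([0xBC, 0x21, 0x30])
regs = (reg1(insn), reg2(insn), reg3(insn))
```

With these bytes, `regs` is `(1, 2, 3)` and `subopcode(insn, "O3")` is 0.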
Sized
Sized opcodes are [low 6 bits of opcode]:
- 0x: O1 R2S R1S I8
- 1x: O1 R1D R2S I8
- 2x: O1 R1D R2S I16
- 30: O2 R2S I8
- 31: O2 R2S I16
- 34: O2 R2D I8
- 36: O2 R2SD I8
- 37: O2 R2SD I16
- 38: O3 R2S R1S
- 39: O3 R1D R2S
- 3a: O3 R2D R1S
- 3b: O3 R2SD R1S
- 3c: O3 R3D R2S R1S
- 3d: O2 R2SD
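The format list above can be turned into a lookup on the low 6 bits of byte 0. The sketch below is hypothetical throughout (`sized_form`, `form_length`); in particular, the length computation is an inference from the highest byte index that any field of the format occupies, not something the list states directly.

```python
SIZED_RANGE_FORMS = ("O1 R2S R1S I8", "O1 R1D R2S I8", "O1 R1D R2S I16")  # 0x, 1x, 2x
SIZED_FORMS = {
    0x30: "O2 R2S I8", 0x31: "O2 R2S I16", 0x34: "O2 R2D I8",
    0x36: "O2 R2SD I8", 0x37: "O2 R2SD I16", 0x38: "O3 R2S R1S",
    0x39: "O3 R1D R2S", 0x3A: "O3 R2D R1S", 0x3B: "O3 R2SD R1S",
    0x3C: "O3 R3D R2S R1S", 0x3D: "O2 R2SD",
}


def sized_form(byte0):
    """Look up the format string of a sized opcode from its first byte."""
    if byte0 >> 6 == 0b11:
        raise ValueError("not a sized opcode")
    low6 = byte0 & 0x3F
    if low6 < 0x30:
        return SIZED_RANGE_FORMS[low6 >> 4]
    if low6 not in SIZED_FORMS:
        raise ValueError("no sized format listed for 0x%02x" % low6)
    return SIZED_FORMS[low6]


def form_length(form):
    """Inferred length: one past the highest byte index used by any field."""
    last_byte = {"O1": 0, "O2": 1, "OL": 1, "O3": 2, "I8": 2, "I16": 3}
    highest = 0
    for token in form.split():
        if token in last_byte:
            index = last_byte[token]
        elif token.startswith("R3"):
            index = 2  # R3x lives in byte 2
        else:
            index = 1  # R1x and R2x live in byte 1
        highest = max(highest, index)
    return highest + 1
```

Under that inference, every listed format comes out at 2, 3, or 4 bytes, consistent with the instruction lengths given earlier.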
Todo
long call/branch
The subopcodes are as follows:
Instruction | 0x | 1x | 2x | 30 | 31 | 34 | 36 | 37 | 38 | 39 | 3a | 3b | 3c | 3d | imm | flg0 | flg3+ | Cycles | Present on | Description |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
st | 0 | 0 | U | - | - | 1 | all units | store | ||||||||||||
st [sp] | 1 | 1 | U | - | - | all units | store | |||||||||||||
cmpu | 4 | 4 | 4 | U | CZ | CZ | 1 | all units | unsigned compare | |||||||||||
cmps | 5 | 5 | 5 | S | CZ | CZ | 1 | all units | signed compare | |||||||||||
cmp | 6 | 6 | 6 | S | N/A | COSZ | 1 | v3+ units | compare | |||||||||||
add | 0 | 0 | 0 | 0 | 0 | 0 | U | COSZ | COSZ | 1 | all units | add | ||||||||
adc | 1 | 1 | 1 | 1 | 1 | 1 | U | COSZ | COSZ | 1 | all units | add with carry | ||||||||
sub | 2 | 2 | 2 | 2 | 2 | 2 | U | COSZ | COSZ | 1 | all units | subtract | ||||||||
sbb | 3 | 3 | 3 | 3 | 3 | 3 | U | COSZ | COSZ | 1 | all units | subtract with borrow | ||||||||
shl | 4 | 4 | 4 | 4 | U | C | COSZ | 1 | all units | shift left | ||||||||||
shr | 5 | 5 | 5 | 5 | U | C | COSZ | 1 | all units | shift right | ||||||||||
sar | 7 | 7 | 7 | 7 | U | C | COSZ | 1 | all units | shift right with sign | ||||||||||
ld | 8 | 8 | U | - | - | 1 | all units | load | ||||||||||||
shlc | c | c | c | c | U | C | COSZ | 1 | all units | shift left with carry | ||||||||||
shrc | d | d | d | d | U | C | COSZ | 1 | all units | shift right with carry | ||||||||||
ld [sp] | 0 | 0 | U | - | - | all units | load | |||||||||||||
not | 0 | 0 | OSZ | OSZ | 1 | all units | bitwise not | |||||||||||||
neg | 1 | 1 | OSZ | OSZ | 1 | all units | sign negation | |||||||||||||
movf | 2 | 2 | OSZ | N/A | 1 | v0 units | move | |||||||||||||
mov | 2 | 2 | N/A | - | 1 | v3+ units | move | |||||||||||||
hswap | 3 | 3 | OSZ | OSZ | 1 | all units | swap halves | |||||||||||||
clear | 4 | - | - | 1 | all units | set to 0 | ||||||||||||||
setf | 5 | N/A | OSZ | 1 | v3+ units | set flags from value |
Unsized
Unsized opcodes are:
- cx: O1 R1D R2S I8
- dx: O1 R2S R1S I8
- ex: O1 R1D R2S I16
- f0: O2 R2SD I8
- f1: O2 R2SD I16
- f2: O2 R2S I8
- f4: OL I8
- f5: OL I16
- f8: O2
- f9: O2 R2S
- fa: O3 R2S R1S
- fc: O2 R2D
- fd: O3 R2SD R1S
- fe: O3 R1D R2S
- ff: O3 R3D R2S R1S
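Unsized opcodes are identified by the whole first byte, so a lookup for the list above could look like the following sketch. The names `UNSIZED_FORMS`, `UNSIZED_RANGE_FORMS` and `unsized_form` are hypothetical.

```python
UNSIZED_RANGE_FORMS = {
    0xC: "O1 R1D R2S I8",   # cx
    0xD: "O1 R2S R1S I8",   # dx
    0xE: "O1 R1D R2S I16",  # ex
}
UNSIZED_FORMS = {
    0xF0: "O2 R2SD I8", 0xF1: "O2 R2SD I16", 0xF2: "O2 R2S I8",
    0xF4: "OL I8", 0xF5: "OL I16", 0xF8: "O2", 0xF9: "O2 R2S",
    0xFA: "O3 R2S R1S", 0xFC: "O2 R2D", 0xFD: "O3 R2SD R1S",
    0xFE: "O3 R1D R2S", 0xFF: "O3 R3D R2S R1S",
}


def unsized_form(byte0):
    """Look up the format string of an unsized opcode from its first byte."""
    if byte0 < 0xC0:
        raise ValueError("not an unsized opcode")
    if byte0 < 0xF0:
        return UNSIZED_RANGE_FORMS[byte0 >> 4]
    if byte0 not in UNSIZED_FORMS:
        raise ValueError("no unsized format listed for 0x%02x" % byte0)
    return UNSIZED_FORMS[byte0]
```

Note that f4 and f5 are the only formats using the 6-bit OL subopcode, so their subopcode must be masked with 0x3f rather than 0xf.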
The subopcodes are as follows:
Instruction | cx | dx | ex | f0 | f1 | f2 | f4 | f5 | f8 | f9 | fa | fc | fd | fe | ff | imm | flg0 | flg3+ | cycles | Present on | Description |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
mulu | 0 | 0 | 0 | 0 | 0 | 0 | U | - | - | 1 | all units | unsigned 16x16 -> 32 multiply | |||||||||
muls | 1 | 1 | 1 | 1 | 1 | 1 | S | - | - | 1 | all units | signed 16x16 -> 32 multiply | |||||||||
sext | 2 | 2 | 2 | 2 | U | SZ | SZ | 1 | all units | sign extend | |||||||||||
extrs | 3 | 3 | 3 | U | N/A | SZ | 1 | v3+ units | extract signed bitfield | ||||||||||||
sethi | 3 | 3 | H | - | - | 1 | all units | set high 16 bits | |||||||||||||
and | 4 | 4 | 4 | 4 | 4 | 4 | U | - | COSZ | 1 | all units | bitwise and | |||||||||
or | 5 | 5 | 5 | 5 | 5 | 5 | U | - | COSZ | 1 | all units | bitwise or | |||||||||
xor | 6 | 6 | 6 | 6 | 6 | 6 | U | - | COSZ | 1 | all units | bitwise xor | |||||||||
extr | 7 | 7 | 7 | U | N/A | SZ | 1 | v3+ units | extract bitfield | ||||||||||||
mov | 7 | 7 | S | - | - | 1 | all units | move | |||||||||||||
xbit | 8 | 8 | U | - | SZ | 1 | all units | extract single bit | |||||||||||||
bset | 9 | 9 | U | - | - | 1 | all units | set single bit | |||||||||||||
bclr | a | a | U | - | - | 1 | all units | clear single bit | |||||||||||||
btgl | b | b | U | - | - | 1 | all units | toggle single bit | |||||||||||||
ins | b | b | U | N/A | - | 1 | v3+ units | insert bitfield | |||||||||||||
xbit[fl] | c | c | U | - | SZ | all units | extract single bit | ||||||||||||||
div | c | c | c | U | N/A | - | 30-33 | v3+ units | divide | ||||||||||||
mod | d | d | d | U | N/A | - | 30-33 | v3+ units | modulus | ||||||||||||
??? | e | e | U | - | - | all units | ??? IO port | ||||||||||||||
iord | f | f | U | - | - | ~1-x | all units | read IO port | |||||||||||||
iowr | 0 | 0 | U | - | - | 1-x | all units | write IO port asynchronous | |||||||||||||
iowrs | 1 | 1 | U | N/A | - | 9-x | v3+ units | write IO port synchronous | |||||||||||||
xcld | 4 | - | - | all units | code xfer to falcon | ||||||||||||||||
xdld | 5 | - | - | all units | data xfer to falcon | ||||||||||||||||
xdst | 6 | - | - | all units | data xfer from falcon | ||||||||||||||||
setp | 8 | 8 | - | - | all units | set predicate | |||||||||||||||
ccmd | c | 3c | 3c | - | - | crypto units | crypto coprocessor command | ||||||||||||||
bra | 0x | 0x | S | - | - | 5 | all units | branch relative conditional | |||||||||||||
bra | 1x | 1x | S | - | - | 5 | all units | branch relative conditional | |||||||||||||
jmp | 20 | 20 | 4 | U | - | - | 4-5 | all units | branch absolute | ||||||||||||
call | 21 | 21 | 5 | U | - | - | 4-5 | all units | call subroutine | ||||||||||||
sleep | 28 | U | - | - | NA | all units | sleep until interrupt | ||||||||||||||
add [sp] | 30 | 30 | 1 | S | - | - | 1 | all units | add to stack pointer | ||||||||||||
bset[fl] | 31 | 9 | U | - | - | all units | set single bit | ||||||||||||||
bclr[fl] | 32 | a | U | - | - | all units | clear single bit | ||||||||||||||
btgl[fl] | 33 | b | U | - | - | all units | toggle single bit | ||||||||||||||
ret | 0 | - | - | 5-6 | all units | return from subroutine | |||||||||||||||
iret | 1 | - | - | all units | return from interrupt handler | ||||||||||||||||
exit | 2 | - | - | all units | halt microcontroller | ||||||||||||||||
xdwait | 3 | - | - | all units | wait for data xfer | ||||||||||||||||
??? | 6 | - | - | all units | ??? | ||||||||||||||||
xcwait | 7 | - | - | all units | wait for code xfer | ||||||||||||||||
trap 0 | 8 | N/A | - | v3+ units | trigger software trap | ||||||||||||||||
trap 1 | 9 | N/A | - | v3+ units | trigger software trap | ||||||||||||||||
trap 2 | a | N/A | - | v3+ units | trigger software trap | ||||||||||||||||
trap 3 | b | N/A | - | v3+ units | trigger software trap | ||||||||||||||||
push | 0 | - | - | 1 | all units | push onto stack | |||||||||||||||
itlb | 8 | N/A | - | v3+ units | drop TLB entry | ||||||||||||||||
pop | 0 | - | - | 1 | all units | pop from stack | |||||||||||||||
mov[>sr] | 0 | - | - | all units | move to special register | ||||||||||||||||
mov[<sr] | 1 | - | - | all units | move from special register | ||||||||||||||||
ptlb | 2 | N/A | - | v3+ units | lookup TLB by phys address | ||||||||||||||||
vtlb | 3 | N/A | - | v3+ units | lookup TLB by virt address |
Code segment
falcon has separate code and data spaces. The code segment, like the data segment, is located in a small piece of SRAM in the microcontroller. Its size in bytes can be determined by reading MMIO address falcon+0x108, taking bits 0-8, and shifting them left by 8.
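The size computation amounts to a mask and a shift. In the sketch below, `code_segment_size` is a hypothetical helper that takes the 32-bit value already read from falcon+0x108; how the register is actually read is outside its scope.

```python
CODE_SIZE_REG_OFFSET = 0x108  # offset from the falcon MMIO base


def code_segment_size(reg_value):
    """Return the code segment size: bits 0-8 of the register, shifted left by 8."""
    return (reg_value & 0x1FF) << 8
```

For instance, a register value whose bits 0-8 equal 0x40 corresponds to a 0x4000-byte code segment.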
Code is byte-oriented, but can only be accessed in 32-bit words from outside, and can only be modified in 0x100-byte [page] units.
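To illustrate the page and word granularity, a byte-oriented code image can be split into 0x100-byte pages of 32-bit words. This is a sketch under two assumptions that the text above does not state: words are taken as little-endian, and the last page is padded with zero bytes. The name `to_pages_of_words` is hypothetical.

```python
import struct

PAGE_SIZE = 0x100                 # code is modified in 0x100-byte pages
WORDS_PER_PAGE = PAGE_SIZE // 4   # 64 32-bit words per page


def to_pages_of_words(code):
    """Split a code image into pages, each a list of 64 32-bit words.

    Assumptions: little-endian word order, zero padding of the final page.
    """
    padded = code + bytes(-len(code) % PAGE_SIZE)
    pages = []
    for offset in range(0, len(padded), PAGE_SIZE):
        page = padded[offset:offset + PAGE_SIZE]
        pages.append(list(struct.unpack("<%dI" % WORDS_PER_PAGE, page)))
    return pages
```

A 0x104-byte image therefore occupies two pages, with the second page almost entirely padding.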
On v0, the code segment is just a flat piece of RAM, except for the per-page secret flag. See v0 code/data upload registers for information on uploading code and data.
On v3+, the code segment is paged with virtual -> physical translation and needs special handling. See IO space for details.
Code execution is started by the host via MMIO from an arbitrary entry point, and is stopped either by the host or by the microcode itself; see Halting microcode execution: exit and Processor execution control registers.
Invalid opcode handling
When an invalid opcode is hit, $pc is left unmodified and a trap is generated. On v3+, the $tstatus reason field is set to 8. v0 engines don’t have the $tstatus register, but this is the only trap type they support anyway.