ISA

This file describes the ISA used by the falcon microprocessor, which is introduced in Introduction.

Registers

There are 16 32-bit GPRs, $r0-$r15. There are also a dozen or so special registers:

Index  Name       Present on    Description
$sr0   $iv0       all units     Interrupt 0 vector
$sr1   $iv1       all units     Interrupt 1 vector
$sr3   $tv        all units     Trap vector
$sr4   $sp        all units     Stack pointer
$sr5   $pc        all units     Program counter
$sr6   $xcbase    all units     Code xfer external base
$sr7   $xdbase    all units     Data xfer external base
$sr8   $flags     all units     Misc flags
$sr9   $cx        crypto units  Crypt xfer mode
$sr10  $cauth     crypto units  Crypt auth code selection
$sr11  $xtargets  all units     Xfer port selection
$sr12  $tstatus   v3+ units     Trap status

$flags register

The $flags [$sr8] register contains various flags controlling the operation of the falcon microprocessor. It is split into the following bitfields:

Bits   Name     Present on  Description
0-7    $p0-$p7  all units   General-purpose predicates
8      c        all units   Carry flag
9      o        all units   Signed overflow flag
10     s        all units   Sign/negative flag
11     z        all units   Zero flag
16     ie0      all units   Interrupt 0 enable
17     ie1      all units   Interrupt 1 enable
18     ???      v4+ units   ???
20     is0      all units   Interrupt 0 saved enable
21     is1      all units   Interrupt 1 saved enable
22     ???      v4+ units   ???
24     ta       all units   Trap handler active
26-28  ???      v4+ units   ???
29-31  ???      v4+ units   ???
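
As an illustration of the bitfield layout, the following sketch splits a raw 32-bit $flags value into named fields. The helper name decode_flags and the sample value are made up for this example; the bit positions are the ones listed in the table, and the unknown v4+ fields are left out.

```python
# Bit positions of the single-bit $flags fields, as listed in the table.
# The v4+ "???" fields are omitted because their meaning is unknown.
FLAG_BITS = {
    "c": 8, "o": 9, "s": 10, "z": 11,
    "ie0": 16, "ie1": 17,
    "is0": 20, "is1": 21,
    "ta": 24,
}

def decode_flags(value):
    """Split a 32-bit $flags value into named fields (hypothetical helper)."""
    fields = {name: (value >> bit) & 1 for name, bit in FLAG_BITS.items()}
    # $p0-$p7 occupy bits 0-7, one predicate per bit.
    fields["p"] = [(value >> i) & 1 for i in range(8)]
    return fields

# Made-up sample value: $p0, $p2, c, z and ie0 set.
flags = decode_flags(0x00010905)
print(flags["p"], flags["c"], flags["z"], flags["ie0"])
```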

Todo

figure out v4+ stuff

$p predicates

$flags.p0-p7 are general-purpose single-bit flags that can be used to store single-bit variables. They can be set via the bset, bclr, btgl, and setp instructions. They can be read by the xbit instruction, or checked by the sleep and bra instructions.
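
The effect of these instructions on a predicate bit can be modelled with plain bit operations. This is only a rough model under the assumption that each instruction touches a single bit of $flags; the Python functions borrow the instruction names for readability and do not reproduce the real operand forms or flag side effects.

```python
MASK32 = 0xFFFFFFFF  # $flags is a 32-bit register

def bset(flags, bit):
    """Model of bset: set one bit."""
    return (flags | (1 << bit)) & MASK32

def bclr(flags, bit):
    """Model of bclr: clear one bit."""
    return flags & ~(1 << bit) & MASK32

def btgl(flags, bit):
    """Model of btgl: toggle one bit."""
    return (flags ^ (1 << bit)) & MASK32

def xbit(flags, bit):
    """Model of xbit: read one bit."""
    return (flags >> bit) & 1

flags = 0
flags = bset(flags, 3)  # $p3 = 1
flags = btgl(flags, 0)  # $p0 = 1
flags = bclr(flags, 3)  # $p3 = 0
print(xbit(flags, 0), xbit(flags, 3))  # 1 0
```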

Instructions

Instructions are 2, 3, or 4 bytes long. The first byte of an instruction determines its length and format. The high 2 bits of the first byte determine the instruction’s operand size: 00 means 8-bit, 01 means 16-bit, 10 means 32-bit, and 11 means an instruction that doesn’t use operand sizing. The set of available opcodes varies greatly with the instruction format.
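
A minimal sketch of the operand size decoding described above. The function name operand_size is hypothetical, and None is used here to stand for "no operand sizing".

```python
# Mapping of the high 2 bits of the first byte to the operand size in bits.
# None marks instructions that do not use operand sizing (high bits 11).
OPERAND_SIZE = {0b00: 8, 0b01: 16, 0b10: 32, 0b11: None}

def operand_size(first_byte):
    """Return the operand size in bits, or None for unsized instructions."""
    return OPERAND_SIZE[(first_byte >> 6) & 0x3]

print(operand_size(0x10), operand_size(0x50), operand_size(0x90), operand_size(0xF4))
# 8 16 32 None
```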

The subopcode can be stored in one of the following places:

  • O1: subopcode goes to low 4 bits of byte 0
  • O2: subopcode goes to low 4 bits of byte 1
  • OL: subopcode goes to low 6 bits of byte 1
  • O3: subopcode goes to low 4 bits of byte 2

The operands are denoted as follows:

  • R1x: register encoded in low 4 bits of byte 1
  • R2x: register encoded in high 4 bits of byte 1
  • R3x: register encoded in high 4 bits of byte 2
  • RxS: register used as source
  • RxD: register used as destination
  • RxSD: register used as both source and destination
  • I8: 8-bit immediate encoded in byte 2
  • I16: 16-bit immediate encoded in bytes 2 [low part] and 3 [high part]
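
The operand positions above can be sketched as a small field extractor. The function name decode_fields and the sample byte sequences are invented for illustration. Which of the extracted fields is meaningful depends on the instruction format, so the sketch simply returns every field that the instruction is long enough to hold.

```python
def decode_fields(insn):
    """Extract raw operand fields from instruction bytes (hypothetical helper).

    Not every field is meaningful for every format; for example byte 2
    holds either R3x (high 4 bits) or I8, depending on the opcode.
    """
    fields = {
        "R1": insn[1] & 0xF,   # low 4 bits of byte 1
        "R2": insn[1] >> 4,    # high 4 bits of byte 1
    }
    if len(insn) >= 3:
        fields["R3"] = insn[2] >> 4  # high 4 bits of byte 2
        fields["I8"] = insn[2]       # 8-bit immediate in byte 2
    if len(insn) >= 4:
        # 16-bit immediate: low part in byte 2, high part in byte 3
        fields["I16"] = insn[2] | (insn[3] << 8)
    return fields

print(decode_fields(bytes([0x90, 0x21, 0x10])))
print(decode_fields(bytes([0xA0, 0x21, 0x34, 0x12]))["I16"] == 0x1234)
```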

Sized

The sized opcodes, given as the low 6 bits of the opcode byte, are:

  • 0x: O1 R2S R1S I8
  • 1x: O1 R1D R2S I8
  • 2x: O1 R1D R2S I16
  • 30: O2 R2S I8
  • 31: O2 R2S I16
  • 34: O2 R2D I8
  • 36: O2 R2SD I8
  • 37: O2 R2SD I16
  • 38: O3 R2S R1S
  • 39: O3 R1D R2S
  • 3a: O3 R2D R1S
  • 3b: O3 R2SD R1S
  • 3c: O3 R3D R2S R1S
  • 3d: O2 R2SD
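
The format list can be turned into a lookup table. In the sketch below the names SIZED_FORMATS, sized_format and format_length are hypothetical, and the length rule is an inference from the field positions given earlier (an I16 occupies bytes 2 and 3, an I8 or O3 occupies byte 2), not something stated explicitly here.

```python
# Format strings keyed by the low 6 bits of the opcode byte.
# The 0x, 1x and 2x groups are keyed by 0x00, 0x10 and 0x20.
SIZED_FORMATS = {
    0x00: "O1 R2S R1S I8",
    0x10: "O1 R1D R2S I8",
    0x20: "O1 R1D R2S I16",
    0x30: "O2 R2S I8",
    0x31: "O2 R2S I16",
    0x34: "O2 R2D I8",
    0x36: "O2 R2SD I8",
    0x37: "O2 R2SD I16",
    0x38: "O3 R2S R1S",
    0x39: "O3 R1D R2S",
    0x3A: "O3 R2D R1S",
    0x3B: "O3 R2SD R1S",
    0x3C: "O3 R3D R2S R1S",
    0x3D: "O2 R2SD",
}

def sized_format(first_byte):
    """Return the format string of a sized opcode, or None if not listed."""
    if first_byte >> 6 == 0b11:
        return None  # unsized instruction
    low6 = first_byte & 0x3F
    key = low6 if low6 >= 0x30 else low6 & 0x30
    return SIZED_FORMATS.get(key)

def format_length(fmt):
    """Infer the instruction length in bytes from the highest byte used."""
    fields = fmt.split()
    if "I16" in fields:
        return 4
    if "I8" in fields or "O3" in fields:
        return 3
    return 2

fmt = sized_format(0x90)
print(fmt, format_length(fmt))  # O1 R1D R2S I8 3
```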

Todo

long call/branch

The subopcodes are as follows:

Instruction  0x 1x 2x 30 31 34 36 37 38 39 3a 3b 3c 3d  imm  flg0  flg3+  Cycles  Present on  Description
st           0                       0                  U    -     -      1       all units   store
st [sp]               1              1                  U    -     -              all units   store
cmpu                  4  4           4                  U    CZ    CZ     1       all units   unsigned compare
cmps                  5  5           5                  S    CZ    CZ     1       all units   signed compare
cmp                   6  6           6                  S    N/A   COSZ   1       v3+ units   compare
add             0  0           0  0           0  0      U    COSZ  COSZ   1       all units   add
adc             1  1           1  1           1  1      U    COSZ  COSZ   1       all units   add with carry
sub             2  2           2  2           2  2      U    COSZ  COSZ   1       all units   subtract
sbb             3  3           3  3           3  3      U    COSZ  COSZ   1       all units   subtract with borrow
shl             4              4              4  4      U    C     COSZ   1       all units   shift left
shr             5              5              5  5      U    C     COSZ   1       all units   shift right
sar             7              7              7  7      U    C     COSZ   1       all units   shift right with sign
ld              8                                8      U    -     -      1       all units   load
shlc            c              c              c  c      U    C     COSZ   1       all units   shift left with carry
shrc            d              d              d  d      U    C     COSZ   1       all units   shift right with carry
ld [sp]                     0              0            U    -     -              all units   load
not                                     0           0        OSZ   OSZ    1       all units   bitwise not
neg                                     1           1        OSZ   OSZ    1       all units   sign negation
movf                                    2           2        OSZ   N/A    1       v0 units    move
mov                                     2           2        N/A   -      1       v3+ units   move
hswap                                   3           3        OSZ   OSZ    1       all units   swap halves
clear                                               4        -     -      1       all units   set to 0
setf                                                5        N/A   OSZ    1       v3+ units   set flags from value
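
To show how the size bits, format and subopcode combine, here is a sketch that encodes an instruction in the 1x format (O1 R1D R2S I8), using the subopcodes from the 1x column of the table. The function encode_1x and its argument names are hypothetical, and the resulting bytes follow only from the rules in this file; they have not been checked against a real assembler.

```python
SIZE_BITS = {8: 0b00, 16: 0b01, 32: 0b10}

# Subopcodes from the 1x column of the sized table.
SUBOPS_1X = {
    "add": 0x0, "adc": 0x1, "sub": 0x2, "sbb": 0x3,
    "shl": 0x4, "shr": 0x5, "sar": 0x7, "ld": 0x8,
    "shlc": 0xC, "shrc": 0xD,
}

def encode_1x(mnemonic, size, rd, rs, imm8):
    """Encode a 1x-format instruction: O1 R1D R2S I8 (hypothetical helper)."""
    # Byte 0: size in the high 2 bits, 0x10 format group, subopcode in low 4 bits.
    byte0 = (SIZE_BITS[size] << 6) | 0x10 | SUBOPS_1X[mnemonic]
    # Byte 1: R2 (source) in the high 4 bits, R1 (destination) in the low 4 bits.
    byte1 = (rs << 4) | rd
    return bytes([byte0, byte1, imm8 & 0xFF])

# 32-bit add with destination $r1, source $r2 and immediate 0x10.
print(encode_1x("add", 32, 1, 2, 0x10).hex())  # 902110
```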

Unsized

Unsized opcodes are:

  • cx: O1 R1D R2S I8
  • dx: O1 R2S R1S I8
  • ex: O1 R1D R2S I16
  • f0: O2 R2SD I8
  • f1: O2 R2SD I16
  • f2: O2 R2S I8
  • f4: OL I8
  • f5: OL I16
  • f8: O2
  • f9: O2 R2S
  • fa: O3 R2S R1S
  • fc: O2 R2D
  • fd: O3 R2SD R1S
  • fe: O3 R1D R2S
  • ff: O3 R3D R2S R1S

The subopcodes are as follows:

Instruction  cx dx ex f0 f1 f2 f4 f5 f8 f9 fa fc fd fe ff  imm  flg0  flg3+  Cycles  Present on    Description
mulu         0     0  0  0                       0     0   U    -     -      1       all units     unsigned 16x16 -> 32 multiply
muls         1     1  1  1                       1     1   S    -     -      1       all units     signed 16x16 -> 32 multiply
sext         2        2                          2     2   U    SZ    SZ     1       all units     sign extend
extrs        3     3                                   3   U    N/A   SZ     1       v3+ units     extract signed bitfield
sethi                 3  3                                 H    -     -      1       all units     set high 16 bits
and          4     4  4  4                       4     4   U    -     COSZ   1       all units     bitwise and
or           5     5  5  5                       5     5   U    -     COSZ   1       all units     bitwise or
xor          6     6  6  6                       6     6   U    -     COSZ   1       all units     bitwise xor
extr         7     7                                   7   U    N/A   SZ     1       v3+ units     extract bitfield
mov                   7  7                                 S    -     -      1       all units     move
xbit         8                                         8   U    -     SZ     1       all units     extract single bit
bset                  9                          9         U    -     -      1       all units     set single bit
bclr                  a                          a         U    -     -      1       all units     clear single bit
btgl                  b                          b         U    -     -      1       all units     toggle single bit
ins          b     b                                       U    N/A   -      1       v3+ units     insert bitfield
xbit[fl]              c                             c      U    -     SZ             all units     extract single bit
div          c     c                                   c   U    N/A   -      30-33   v3+ units     divide
mod          d     d                                   d   U    N/A   -      30-33   v3+ units     modulus
???          e                                         e   U    -     -              all units     ??? IO port
iord         f                                         f   U    -     -      ~1-x    all units     read IO port
iowr            0                          0               U    -     -      1-x     all units     write IO port asynchronous
iowrs           1                          1               U    N/A   -      9-x     v3+ units     write IO port synchronous
xcld                                       4                    -     -              all units     code xfer to falcon
xdld                                       5                    -     -              all units     data xfer to falcon
xdst                                       6                    -     -              all units     data xfer from falcon
setp                        8              8                    -     -              all units     set predicate
ccmd                        c  3c 3c                            -     -              crypto units  crypto coprocessor command
bra                            0x 0x                       S    -     -      5       all units     branch relative conditional
bra                            1x 1x                       S    -     -      5       all units     branch relative conditional
jmp                            20 20    4                  U    -     -      4-5     all units     branch absolute
call                           21 21    5                  U    -     -      4-5     all units     call subroutine
sleep                          28                          U    -     -      NA      all units     sleep until interrupt
add [sp]                       30 30    1                  S    -     -      1       all units     add to stack pointer
bset[fl]                       31       9                  U    -     -              all units     set single bit
bclr[fl]                       32       a                  U    -     -              all units     clear single bit
btgl[fl]                       33       b                  U    -     -              all units     toggle single bit
ret                                  0                          -     -      5-6     all units     return from subroutine
iret                                 1                          -     -              all units     return from interrupt handler
exit                                 2                          -     -              all units     halt microcontroller
xdwait                               3                          -     -              all units     wait for data xfer
???                                  6                          -     -              all units     ???
xcwait                               7                          -     -              all units     wait for code xfer
trap 0                               8                          N/A   -              v3+ units     trigger software trap
trap 1                               9                          N/A   -              v3+ units     trigger software trap
trap 2                               a                          N/A   -              v3+ units     trigger software trap
trap 3                               b                          N/A   -              v3+ units     trigger software trap
push                                    0                       -     -      1       all units     push onto stack
itlb                                    8                       N/A   -              v3+ units     drop TLB entry
pop                                           0                 -     -      1       all units     pop from stack
mov[>sr]                                            0           -     -              all units     move to special register
mov[<sr]                                            1           -     -              all units     move from special register
ptlb                                                2           N/A   -              v3+ units     lookup TLB by phys address
vtlb                                                3           N/A   -              v3+ units     lookup TLB by virt address
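
As an example of reading the table, the f4 and f5 opcodes use the OL encoding, where the subopcode is the low 6 bits of byte 1. The sketch below maps such a subopcode to a mnemonic using the f4 and f5 columns. The names OL_SUBOPS and classify_ol are hypothetical, and the version and unit restrictions from the "Present on" column are not modelled.

```python
# OL subopcodes from the f4 and f5 columns: mnemonic and the opcodes
# (0xf4 = I8 form, 0xf5 = I16 form) in which the table lists it.
OL_SUBOPS = {
    0x20: ("jmp", (0xF4, 0xF5)),
    0x21: ("call", (0xF4, 0xF5)),
    0x28: ("sleep", (0xF4,)),
    0x30: ("add [sp]", (0xF4, 0xF5)),
    0x31: ("bset[fl]", (0xF4,)),
    0x32: ("bclr[fl]", (0xF4,)),
    0x33: ("btgl[fl]", (0xF4,)),
    0x3C: ("ccmd", (0xF4, 0xF5)),
}

def classify_ol(opcode, byte1):
    """Return the mnemonic for an f4/f5 instruction, or None if not listed."""
    if opcode not in (0xF4, 0xF5):
        return None
    subop = byte1 & 0x3F  # OL: low 6 bits of byte 1
    if subop < 0x20:
        return "bra"  # subopcodes 0x and 1x are conditional branches
    name, opcodes = OL_SUBOPS.get(subop, (None, ()))
    return name if opcode in opcodes else None

print(classify_ol(0xF4, 0x0B), classify_ol(0xF5, 0x21), classify_ol(0xF5, 0x28))
# bra call None
```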

Code segment

falcon has separate code and data spaces. The code segment, like the data segment, is located in a small piece of SRAM in the microcontroller. Its size in bytes can be determined by reading MMIO address falcon+0x108 and shifting bits 0-8 left by 8.
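
The size computation can be written out as follows. The function name code_size_bytes and the sample register values are made up; only the bit range and the shift come from the text above.

```python
def code_size_bytes(reg_value):
    """Code segment size from the value read at falcon+0x108 (hypothetical helper)."""
    # Bits 0-8 of the register, shifted left by 8.
    return (reg_value & 0x1FF) << 8

# Made-up register value with 0x40 in bits 0-8.
print(hex(code_size_bytes(0x00000040)))  # 0x4000
```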

Code is byte-oriented, but can only be accessed from outside in 32-bit words, and can only be modified in 0x100-byte [page] units.

On v0, the code segment is just a flat piece of RAM, except for the per-page secret flag. See v0 code/data upload registers for information on uploading code and data.

On v3+, the code segment is paged with virtual -> physical translation and needs special handling. See IO space for details.

Code execution is started by the host via MMIO from an arbitrary entry point, and is stopped either by the host or by the microcode itself. See Halting microcode execution: exit and Processor execution control registers.

Invalid opcode handling

When an invalid opcode is hit, $pc is left unmodified and a trap is generated. On v3+, the $tstatus reason field is set to 8. v0 engines don’t have the $tstatus register, but this is the only trap type they support anyway.