# Arithmetic instructions

Contents

- Arithmetic instructions
- Introduction
- $flags result bits
- Pseudocode conventions
- Comparison: cmpu, cmps, cmp
- Addition/subtraction: add, adc, sub, sbb
- Shifts: shl, shr, sar, shlc, shrc
- Unary operations: not, neg, mov, movf, hswap
- Loading immediates: mov, sethi
- Clearing registers: clear
- Setting flags from a value: setf
- Multiplication: mulu, muls
- Sign extension: sext
- Bitfield extraction: extr, extrs
- Bitfield insertion: ins
- Bitwise operations: and, or, xor
- Bit extraction: xbit
- Bit manipulation: bset, bclr, btgl
- Division and remainder: div, mod
- Setting predicates: setp

## Introduction

The arithmetic/logical instructions operate on the $r0-$r15 GPRs, sometimes setting bits in the $flags register according to the result. Instructions are either “sized” or “unsized”. Sized instructions have 8-bit, 16-bit, and 32-bit variants. Unsized instructions have no variants and always operate on full 32-bit registers. For 8-bit and 16-bit sized instructions, the high 24 or 16 bits of the destination register, respectively, are left unmodified.

## $flags result bits

The $flags bits often affected by ALU instructions are:

- bit 8: c, carry flag. Set by addition instructions iff a carry out of the high bit (or, equivalently, unsigned overflow) has occurred. Likewise set by subtraction instructions iff a borrow into the high bit (or unsigned overflow) has occurred. Also used by shift instructions to store the last shifted out bit. Used as the less-than condition by the older `cmpu`/`cmps` comparisons.
- bit 9: o, signed overflow flag. Set by addition, subtraction, comparison, and negation instructions iff a signed overflow occurred. Set to 0 by some other instructions.
- bit 10: s, sign flag. Set according to the high bit of the result by most arithmetic instructions.
- bit 11: z, zero flag. Set by most arithmetic instructions iff the result was equal to 0.

Also, a few ALU instructions operate on the $flags register as a whole.

## Pseudocode conventions

`sz`, for sized instructions, is the selected size of operation: 8, 16, or 32.

`S(x)` evaluates to `(x >> (sz - 1) & 1)`, i.e. the sign bit of `x`. If the instruction is unsized, assume `sz == 32`.

`C(a, b, c)`, where `a, b, c` are booleans, is the carry flag for an addition where the two inputs have high bits of `a` and `b`, and the result has a high bit of `c`. It is computed as follows:

```
bool C(bool a, bool b, bool c) {
    // a and b both set - there is always carry out.
    if (a && b)
        return 1;
    // Exactly one of a and b is set - there is carry out iff result has
    // high bit 0.
    if ((a || b) && !c)
        return 1;
    // Otherwise (a and b both clear, or one of them set and the result
    // has high bit 1), there is no carry out.
    return 0;
}
```

Also, `!C(a, !b, c)` is the borrow flag for a subtraction where the two inputs have high bits of `a` and `b`, and the result has a high bit of `c`.

Likewise, `O(a, b, c)` is defined as the signed overflow flag for an addition:

```
bool O(bool a, bool b, bool c) {
    return a == b && a != c;
    // equivalent definition (check it yourself):
    // return a ^ b ^ c ^ C(a, b, c);
}
```

Similarly, `O(a, !b, c)` is the signed overflow flag for subtraction.
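
The conventions above can be checked against ordinary arithmetic. The following is a minimal sketch, assuming `sz == 8`; the helper names (`carry`, `overflow`, `sign8`, `check_all_8bit`) are illustrative and not part of the original pseudocode. It compares `C` and `O` with reference values computed in wider arithmetic for every pair of 8-bit inputs.

```c
#include <stdbool.h>
#include <stdint.h>

// Carry out of an addition, from the high bits of both inputs (a, b)
// and of the result (c); same truth table as C(a, b, c).
static bool carry(bool a, bool b, bool c) {
    return (a && b) || ((a || b) && !c);
}

// Signed overflow of an addition; same truth table as O(a, b, c).
static bool overflow(bool a, bool b, bool c) {
    return a == b && a != c;
}

// High (sign) bit of an 8-bit value, i.e. S(x) with sz == 8.
static bool sign8(uint8_t x) {
    return (x >> 7) & 1;
}

// Compare the helpers with reference results for all 8-bit input pairs.
// Returns the number of mismatches found.
static int check_all_8bit(void) {
    int mismatches = 0;
    for (int x = 0; x < 256; x++) {
        for (int y = 0; y < 256; y++) {
            uint8_t sum = (uint8_t)(x + y);
            uint8_t diff = (uint8_t)(x - y);
            bool a = sign8((uint8_t)x), b = sign8((uint8_t)y);
            // Signed interpretations of x and y.
            int sx = x < 128 ? x : x - 256;
            int sy = y < 128 ? y : y - 256;
            // Reference flags computed with wider arithmetic.
            bool ref_carry = (x + y) > 0xff;
            bool ref_borrow = x < y;
            bool ref_oadd = (sx + sy) < -128 || (sx + sy) > 127;
            bool ref_osub = (sx - sy) < -128 || (sx - sy) > 127;
            mismatches += carry(a, b, sign8(sum)) != ref_carry;
            mismatches += !carry(a, !b, sign8(diff)) != ref_borrow;
            mismatches += overflow(a, b, sign8(sum)) != ref_oadd;
            mismatches += overflow(a, !b, sign8(diff)) != ref_osub;
        }
    }
    return mismatches;
}
```

Calling `check_all_8bit()` walks all 65536 input pairs; the same structure should carry over to 16-bit and 32-bit sizes by changing the sign-bit position.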

## Comparison: cmpu, cmps, cmp

Compare two values, setting flags according to the result of the comparison. `cmp` sets the usual set of 4 flags, and behaves identically to a subtraction instruction that doesn’t write its destination register. `cmpu` sets only `c` and `z`, but otherwise behaves like `cmp`, so it is only useful for unsigned comparisons. `cmps` sets `z` normally, but sets `c` iff `SRC1` is less than `SRC2` when both are treated as signed numbers (thus using unsigned condition codes to store the result of a signed comparison).

`cmpu`/`cmps` are the only comparison instructions available on Falcon v0. Both of them set only the `c` and `z` flags, with `cmps` setting the `c` flag in an unusual way to enable signed comparisons while using unsigned flags and condition codes. To do an unsigned comparison, use `cmpu` and the unsigned branch conditions [`b/a/e`]. To do a signed comparison, use `cmps`, also with unsigned branch conditions.

The `cmp` instruction, new on Falcon v3+, sets the full set of flags. To do an unsigned comparison on v3+, use `cmp` and the unsigned branch conditions. To do a signed comparison, use `cmp` and the signed branch conditions [`l/g/e`].

- Instructions:

  | Name | Description | Present on | Subopcode |
  |------|-------------|------------|-----------|
  | cmpu | compare unsigned | all units | 4 |
  | cmps | compare signed | all units | 5 |
  | cmp | compare | v3+ units | 6 |

- Instruction class: sized
- Execution time: 1 cycle
- Operands: SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R2, I8 | 30 |
  | R2, I16 | 31 |
  | R2, R1 | 38 |

- Immediates:
  - cmpu: zero-extended
  - cmps: sign-extended
  - cmp: sign-extended
- Operation:

  ```
  uint<sz>_t diff = SRC1 - SRC2;
  $flags.z = (diff == 0);
  if (op == cmps)
      $flags.c = O(S(SRC1), !S(SRC2), S(diff)) ^ S(diff);
  else if (op == cmpu)
      $flags.c = !C(S(SRC1), !S(SRC2), S(diff));
  else if (op == cmp) {
      $flags.c = !C(S(SRC1), !S(SRC2), S(diff));
      $flags.o = O(S(SRC1), !S(SRC2), S(diff));
      $flags.s = S(diff);
  }
  ```
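
To illustrate how `cmps` folds a signed comparison into the `c` flag, here is a small sketch for the 32-bit case. The `cmp_flags` struct and the function names are hypothetical stand-ins for $flags and the instructions; only the flag formulas are taken from the Operation pseudocode.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical container for the four result bits of $flags.
struct cmp_flags { bool c, o, s, z; };

// Flags produced by a 32-bit `cmp`, following the Operation pseudocode.
static struct cmp_flags cmp32(uint32_t src1, uint32_t src2) {
    uint32_t diff = src1 - src2;
    bool a = src1 >> 31, nb = !(src2 >> 31), r = diff >> 31;
    struct cmp_flags f;
    f.c = !((a && nb) || ((a || nb) && !r)); // borrow: !C(a, !b, r)
    f.o = (a == nb) && (a != r);             // O(a, !b, r)
    f.s = r;
    f.z = diff == 0;
    return f;
}

// c flag as set by a 32-bit `cmps`: signed less-than, computed as o ^ s.
static bool cmps32_c(uint32_t src1, uint32_t src2) {
    struct cmp_flags f = cmp32(src1, src2);
    return f.o ^ f.s;
}
```

For example, with `src1 = 0xffffffff` and `src2 = 1`, `cmp32` leaves `c` clear (unsigned, the first operand is larger), while `cmps32_c` returns true (signed, -1 is less than 1).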

## Addition/subtraction: add, adc, sub, sbb

Add or subtract two values, possibly with carry/borrow. The full set of arithmetic flags is always written.

- Instructions:

  | Name | Description | Subopcode |
  |------|-------------|-----------|
  | add | add | 0 |
  | adc | add with carry | 1 |
  | sub | subtract | 2 |
  | sbb | subtract with borrow | 3 |

- Instruction class: sized
- Execution time: 1 cycle
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | 10 |
  | R1, R2, I16 | 20 |
  | R2, R2, I8 | 36 |
  | R2, R2, I16 | 37 |
  | R2, R2, R1 | 3b |
  | R3, R2, R1 | 3c |

- Immediates: zero-extended
- Operation:

  ```
  uint<sz>_t res;
  if (op == add)
      res = SRC1 + SRC2;
  else if (op == adc)
      res = SRC1 + SRC2 + $flags.c;
  else if (op == sub)
      res = SRC1 - SRC2;
  else if (op == sbb)
      res = SRC1 - SRC2 - $flags.c;
  if (op == add || op == adc) {
      $flags.c = C(S(SRC1), S(SRC2), S(res));
      $flags.o = O(S(SRC1), S(SRC2), S(res));
  } else {
      $flags.c = !C(S(SRC1), !S(SRC2), S(res));
      $flags.o = O(S(SRC1), !S(SRC2), S(res));
  }
  DST = res;
  $flags.s = S(res);
  $flags.z = (res == 0);
  ```
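
A typical use of the carry flag is chaining `add` and `adc` for wider arithmetic. The sketch below models a 64-bit addition as a 32-bit `add` of the low halves followed by an `adc` of the high halves. The function names and the `bool *carry` parameter (a stand-in for $flags.c) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

// 32-bit `adc`-style addition: returns the sum and updates *carry the
// way C(S(SRC1), S(SRC2), S(res)) would. With *carry == false on entry
// this behaves like a plain `add`.
static uint32_t add_with_carry(uint32_t a, uint32_t b, bool *carry) {
    uint32_t res = a + b + (*carry ? 1u : 0u);
    bool ha = a >> 31, hb = b >> 31, hr = res >> 31;
    *carry = (ha && hb) || ((ha || hb) && !hr);
    return res;
}

// 64-bit addition built from a low `add` and a high `adc`.
static uint64_t add64(uint64_t x, uint64_t y) {
    bool carry = false;
    uint32_t lo = add_with_carry((uint32_t)x, (uint32_t)y, &carry);
    uint32_t hi = add_with_carry((uint32_t)(x >> 32), (uint32_t)(y >> 32), &carry);
    return ((uint64_t)hi << 32) | lo;
}
```

The same pattern should apply to `sub` followed by `sbb`, with the borrow taking the place of the carry.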

## Shifts: shl, shr, sar, shlc, shrc

Shift a value. For `shl/shr`, the extra bits “shifted in” are 0. For `sar`, they’re equal to the sign bit of the source. For `shlc/shrc`, the first such bit is taken from the carry flag, the rest are 0. On Falcon v3+, these instructions set all 4 arithmetic flags: `s` and `z` are set as usual, `o` is always set to 0, and `c` is set to the value of the last shifted out bit, or 0 if the shift count was 0. On Falcon v0, only `c` is set.

The shift count is always masked to 3 bits for 8-bit shift instructions, 4 bits for 16-bit shift instructions, and 5 bits for 32-bit shift instructions.

- Instructions:

  | Name | Description | Subopcode |
  |------|-------------|-----------|
  | shl | shift left | 4 |
  | shr | shift right | 5 |
  | sar | shift right with sign bit | 6 |
  | shlc | shift left with carry in | c |
  | shrc | shift right with carry in | d |

- Instruction class: sized
- Execution time: 1 cycle
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | 10 |
  | R2, R2, I8 | 36 |
  | R2, R2, R1 | 3b |
  | R3, R2, R1 | 3c |

- Immediates: truncated
- Operation:

  ```
  unsigned shcnt;
  if (sz == 8)
      shcnt = SRC2 & 7;
  else if (sz == 16)
      shcnt = SRC2 & 0xf;
  else // sz == 32
      shcnt = SRC2 & 0x1f;
  uint<sz>_t res;
  if (op == shl || op == shlc) {
      res = SRC1 << shcnt;
      if (op == shlc && shcnt != 0)
          res |= $flags.c << (shcnt - 1);
      if (shcnt == 0)
          $flags.c = 0;
      else
          $flags.c = SRC1 >> (sz - shcnt) & 1;
  } else { // shr, sar, shrc
      res = SRC1 >> shcnt;
      if (op == shrc && shcnt != 0)
          res |= $flags.c << (sz - shcnt);
      if (op == sar && S(SRC1))
          res |= ~0 << (sz - shcnt);
      if (shcnt == 0)
          $flags.c = 0;
      else
          $flags.c = SRC1 >> (shcnt - 1) & 1;
  }
  DST = res;
  if (falcon_version != 0) {
      $flags.o = 0;
      $flags.s = S(DST);
      $flags.z = (DST == 0);
  }
  ```
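
The count masking and the `c` flag rule can be seen in a concrete 32-bit model. This is a sketch only: the `shift_result` struct and the function names are hypothetical, and only `shl` and `sar` are covered.

```c
#include <stdbool.h>
#include <stdint.h>

// Result of a 32-bit shift together with the c flag it would produce.
struct shift_result { uint32_t value; bool c; };

// 32-bit `shl`: count masked to 5 bits, c = last bit shifted out
// (0 if the masked count is 0).
static struct shift_result shl32(uint32_t src, uint32_t count) {
    unsigned shcnt = count & 0x1f;
    struct shift_result r = { src << shcnt, false };
    if (shcnt != 0)
        r.c = (src >> (32 - shcnt)) & 1;
    return r;
}

// 32-bit `sar`: vacated high bits are filled with the sign bit of src.
static struct shift_result sar32(uint32_t src, uint32_t count) {
    unsigned shcnt = count & 0x1f;
    struct shift_result r = { src >> shcnt, false };
    if (shcnt != 0) {
        if (src >> 31)
            r.value |= ~0u << (32 - shcnt);
        r.c = (src >> (shcnt - 1)) & 1;
    }
    return r;
}
```

Note that a count of 32 is masked to 0 in this model, so `shl32(x, 32)` returns `x` unchanged with `c` clear rather than shifting everything out.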

## Unary operations: not, neg, mov, movf, hswap

`not` flips all bits in a value. `neg` negates a value. `mov` and `movf` move a value from one register to another. `mov` is the v3+ variant, which just does the move. `movf` is the v0 variant, which additionally sets flags according to the moved value. `hswap` rotates a value by half its size. All instructions except `mov` set 3 flags: `s` and `z` (which are set as usual), as well as `o` (which is set iff signed overflow occurred for `neg`, and always set to 0 for other instructions).

- Instructions:

  | Name | Description | Present on | Subopcode |
  |------|-------------|------------|-----------|
  | not | bitwise complement | all units | 0 |
  | neg | negate a value | all units | 1 |
  | movf | move a value and set flags | v0 units | 2 |
  | mov | move a value | v3+ units | 2 |
  | hswap | Swap halves | all units | 3 |

- Instruction class: sized
- Execution time: 1 cycle
- Operands: DST, SRC
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2 | 39 |
  | R2, R2 | 3d |

- Operation:

  ```
  if (op == not) {
      DST = ~SRC;
      $flags.o = 0;
  } else if (op == neg) {
      DST = -SRC;
      $flags.o = (DST == 1 << (sz - 1));
  } else if (op == movf) {
      DST = SRC;
      $flags.o = 0;
  } else if (op == mov) {
      DST = SRC;
  } else if (op == hswap) {
      DST = SRC >> (sz / 2) | SRC << (sz / 2);
      $flags.o = 0;
  }
  if (op != mov) {
      $flags.s = S(DST);
      $flags.z = (DST == 0);
  }
  ```
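
Two of these operations are less obvious than the rest: the half-size rotation done by `hswap` and the single overflow case of `neg`. A minimal sketch, with illustrative function names that are not part of the original pseudocode:

```c
#include <stdint.h>

// 32-bit `hswap`: rotate by 16 bits, i.e. swap the two 16-bit halves.
static uint32_t hswap32(uint32_t src) {
    return (src >> 16) | (src << 16);
}

// 8-bit `hswap`: rotate by 4 bits, i.e. swap the two nibbles.
static uint8_t hswap8(uint8_t src) {
    return (uint8_t)((src >> 4) | (src << 4));
}

// o flag of an 8-bit `neg`: set only when the result is 0x80, which
// happens when negating -128 (it has no positive counterpart in 8 bits).
static int neg8_overflow(uint8_t src) {
    uint8_t dst = (uint8_t)(0u - src);
    return dst == 0x80;
}
```

For instance, `hswap32(0x12345678)` yields `0x56781234`, and `neg8_overflow` is nonzero only for the input `0x80`.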

## Loading immediates: mov, sethi

`mov` sets a register to an immediate. `sethi` sets the high 16 bits of a register to an immediate, leaving the low bits untouched. `mov` can thus be used to load small [16-bit signed] immediates, while `mov`+`sethi` can be used to load any 32-bit immediate.

- Instructions:

  | Name | Description | Subopcode |
  |------|-------------|-----------|
  | mov | Load an immediate | 7 |
  | sethi | Set high bits | 3 |

- Instruction class: unsized
- Execution time: 1 cycle
- Operands: DST, SRC
- Forms:

  | Form | Opcode |
  |------|--------|
  | R2, I8 | f0 |
  | R2, I16 | f1 |

- Immediates:
  - mov: sign-extended
  - sethi: zero-extended
- Operation:

  ```
  if (op == mov)
      DST = SRC;
  else if (op == sethi)
      DST = DST & 0xffff | SRC << 16;
  ```
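
The `mov`+`sethi` pairing works even when the low half has its top bit set, because `sethi` overwrites whatever the sign extension put in the high half. The sketch below models this with hypothetical helper names (`mov_i16`, `sethi`, `load_imm32`), assuming the I16 forms of both instructions.

```c
#include <stdint.h>

// `mov` with a 16-bit immediate: the immediate is sign-extended to 32 bits.
static uint32_t mov_i16(uint16_t imm) {
    return (imm & 0x8000) ? (0xffff0000u | imm) : imm;
}

// `sethi`: replace the high 16 bits of dst, keeping the low 16 bits.
static uint32_t sethi(uint32_t dst, uint16_t imm) {
    return (dst & 0xffffu) | ((uint32_t)imm << 16);
}

// Load an arbitrary 32-bit constant as `mov` of the low half followed
// by `sethi` of the high half.
static uint32_t load_imm32(uint32_t value) {
    uint32_t reg = mov_i16((uint16_t)(value & 0xffff));
    return sethi(reg, (uint16_t)(value >> 16));
}
```

With `value = 0x12348000`, the intermediate register holds `0xffff8000` after the `mov`, and the `sethi` then restores the intended high half.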

## Clearing registers: clear

Sets a register to 0.

- Instructions:

  | Name | Description | Subopcode |
  |------|-------------|-----------|
  | clear | Clear a register | 4 |

- Instruction class: sized
- Operands: DST
- Forms:

  | Form | Opcode |
  |------|--------|
  | R2 | 3d |

- Operation:

  ```
  DST = 0;
  ```

## Setting flags from a value: setf

Sets `z` and `s` flags according to a value, sets `o` flag to 0.

- Instructions:

  | Name | Description | Present on | Subopcode |
  |------|-------------|------------|-----------|
  | setf | Set flags according to a value | v3+ units | 5 |

- Instruction class: sized
- Execution time: 1 cycle
- Operands: SRC
- Forms:

  | Form | Opcode |
  |------|--------|
  | R2 | 3d |

- Operation:

  ```
  $flags.o = 0;
  $flags.s = S(SRC);
  $flags.z = (SRC == 0);
  ```

## Multiplication: mulu, muls

Does a 16x16 -> 32 multiplication. The inputs are unsigned for `mulu`, signed for `muls`. Sets no flags.

- Instructions:

  | Name | Description | Subopcode |
  |------|-------------|-----------|
  | mulu | Multiply unsigned | 0 |
  | muls | Multiply signed | 1 |

- Instruction class: unsized
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | c0 |
  | R1, R2, I16 | e0 |
  | R2, R2, I8 | f0 |
  | R2, R2, I16 | f1 |
  | R2, R2, R1 | fd |
  | R3, R2, R1 | ff |

- Immediates:
  - mulu: zero-extended
  - muls: sign-extended
- Operation:

  ```
  s1 = SRC1 & 0xffff;
  s2 = SRC2 & 0xffff;
  if (op == muls) {
      if (s1 & 0x8000)
          s1 |= 0xffff0000;
      if (s2 & 0x8000)
          s2 |= 0xffff0000;
  }
  DST = s1 * s2;
  ```
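
Since only the low 16 bits of each source take part, the upper halves of the inputs are ignored. The following sketch mirrors the pseudocode in plain C; the function names are simply the mnemonics, used here for illustration.

```c
#include <stdint.h>

// `mulu`: multiply the low 16 bits of each source as unsigned values.
static uint32_t mulu(uint32_t src1, uint32_t src2) {
    return (src1 & 0xffff) * (src2 & 0xffff);
}

// `muls`: multiply the low 16 bits of each source as signed values.
static uint32_t muls(uint32_t src1, uint32_t src2) {
    uint32_t s1 = src1 & 0xffff, s2 = src2 & 0xffff;
    if (s1 & 0x8000)
        s1 |= 0xffff0000u; // sign-extend to 32 bits
    if (s2 & 0x8000)
        s2 |= 0xffff0000u;
    return s1 * s2;        // wraps modulo 2^32, matching the pseudocode
}
```

As an example of the difference, `0xffff * 0xffff` gives `0xfffe0001` with `mulu` but `1` with `muls`, where both inputs are read as -1.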

## Sign extension: sext

Does a sign-extension of the low (X+1) bits of a value, where X is the second argument masked to 5 bits. Sets `s` and `z` flags according to the result. X is the bit index (counting from LSB) which contains the new sign bit: the result will be equal to the source with all bits higher than that replaced with a copy of the sign bit.

- Instructions:

  | Name | Description | Subopcode |
  |------|-------------|-----------|
  | sext | Sign-extend | 2 |

- Instruction class: unsized
- Execution time: 1 cycle
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | c0 |
  | R2, R2, I8 | f0 |
  | R2, R2, R1 | fd |
  | R3, R2, R1 | ff |

- Immediates: truncated
- Operation:

  ```
  bit = SRC2 & 0x1f;
  if (SRC1 & 1 << bit) {
      DST = SRC1 & ((1 << bit) - 1) | -(1 << bit);
  } else {
      DST = SRC1 & ((1 << bit) - 1);
  }
  $flags.s = S(DST);
  $flags.z = (DST == 0);
  ```
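
A short C model of the value computed by `sext` may help when choosing the second operand. It is a sketch that follows the pseudocode, leaves out the flag updates, and uses the mnemonic as an illustrative function name.

```c
#include <stdint.h>

// `sext`: sign-extend src1 from bit (src2 & 0x1f), counting from the LSB.
static uint32_t sext(uint32_t src1, uint32_t src2) {
    unsigned bit = src2 & 0x1f;
    uint32_t low_mask = ((uint32_t)1 << bit) - 1; // bits below the sign bit
    if (src1 & ((uint32_t)1 << bit))
        return (src1 & low_mask) | ~low_mask;     // fill bits bit..31 with ones
    return src1 & low_mask;                       // fill bits bit..31 with zeros
}
```

To sign-extend a byte, the second operand is 7 (the index of the byte's sign bit), not 8: `sext(0xff, 7)` yields `0xffffffff`.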

## Bitfield extraction: extr, extrs

Extracts a bitfield. The bitfield to extract is given as a pair of (low bit index, size in bits - 1) packed in a single 10-bit source, with each part taking 5 bits. The value of the bitfield is returned in the low bits of the destination register. `extr` extracts an unsigned bitfield, setting the remaining destination bits to 0, while `extrs` extracts a signed bitfield, setting the remaining bits to a copy of the sign bit (i.e. the highest bit of the bitfield).

Both instructions set `s` and `z` flags. While `z` is set as usual, `s` is set to the “fill” bit used for high bits of the destination, thus it is always `0` for `extr`.

- Instructions:

  | Name | Description | Present on | Subopcode |
  |------|-------------|------------|-----------|
  | extrs | Extract signed bitfield | v3+ units | 3 |
  | extr | Extract unsigned bitfield | v3+ units | 7 |

- Instruction class: unsized
- Execution time: 1 cycle
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | c0 |
  | R1, R2, I16 | e0 |
  | R3, R2, R1 | ff |

- Immediates: zero-extended
- Operation:

  ```
  int low = SRC2 & 0x1f;
  int sizem1 = (SRC2 >> 5 & 0x1f);
  uint32_t bf = (SRC1 >> low) & ((2 << sizem1) - 1);
  bool fill_bit;
  if (op == extr) {
      fill_bit = 0;
  } else if (op == extrs) {
      // depending on the mask is probably a bad idea.
      int signbit = (low + sizem1) & 0x1f;
      fill_bit = SRC1 >> signbit & 1;
  }
  if (fill_bit)
      bf |= -(2 << sizem1);
  DST = bf;
  $flags.s = fill_bit;
  $flags.z = (DST == 0);
  ```
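
The packed (low, size - 1) operand is easy to get wrong by one, so here is a sketch that builds it and models both extractions. `bitfield_spec` is a hypothetical helper that takes the size in bits; `extr` and `extrs` follow the pseudocode but omit the flag updates, and a 64-bit intermediate is used so that a 32-bit field does not overflow the mask computation in C.

```c
#include <stdint.h>

// Pack (low bit index, size in bits) into the 10-bit SRC2 encoding:
// bits 0-4 hold the low index, bits 5-9 hold size - 1.
static uint32_t bitfield_spec(unsigned low, unsigned size) {
    return (low & 0x1f) | (((size - 1) & 0x1f) << 5);
}

// `extr`: unsigned bitfield extraction.
static uint32_t extr(uint32_t src1, uint32_t src2) {
    unsigned low = src2 & 0x1f, sizem1 = (src2 >> 5) & 0x1f;
    uint32_t mask = (uint32_t)(((uint64_t)2 << sizem1) - 1);
    return (src1 >> low) & mask;
}

// `extrs`: signed bitfield extraction; the fill bit is read from
// bit (low + size - 1) of the source, with the index masked to 5 bits.
static uint32_t extrs(uint32_t src1, uint32_t src2) {
    unsigned low = src2 & 0x1f, sizem1 = (src2 >> 5) & 0x1f;
    uint32_t bf = extr(src1, src2);
    unsigned signbit = (low + sizem1) & 0x1f;
    if ((src1 >> signbit) & 1)
        bf |= ~(uint32_t)(((uint64_t)2 << sizem1) - 1);
    return bf;
}
```

For example, an 8-bit field starting at bit 8 is encoded as `bitfield_spec(8, 8)`, which is `0xe8`, and `extr(0xabcd1234, 0xe8)` returns `0x12`.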

## Bitfield insertion: ins

Inserts a bitfield, which is specified like for `extr/extrs`. Sets no flags.

- Instructions:

  | Name | Description | Present on | Subopcode |
  |------|-------------|------------|-----------|
  | ins | Insert a bitfield | v3+ units | b |

- Instruction class: unsized
- Execution time: 1 cycle
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | c0 |
  | R1, R2, I16 | e0 |

- Immediates: zero-extended
- Operation:

  ```
  low = SRC2 & 0x1f;
  size = (SRC2 >> 5 & 0x1f) + 1;
  if (low + size <= 32) { // nop if bitfield out of bounds - I wouldn't depend on it, though...
      DST &= ~(((1 << size) - 1) << low); // clear the current contents of the bitfield
      bf = SRC1 & ((1 << size) - 1);
      DST |= bf << low;
  }
  ```
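
As a sketch of the insertion, the function below takes the old destination value and returns the new one. The function signature is an assumption made for illustration; the logic follows the pseudocode, including the out-of-bounds case that it advises against relying on.

```c
#include <stdint.h>

// `ins`: insert the low `size` bits of src1 into dst at bit `low`.
// A field that would extend past bit 31 leaves dst unchanged.
static uint32_t ins(uint32_t dst, uint32_t src1, uint32_t src2) {
    unsigned low = src2 & 0x1f;
    unsigned size = ((src2 >> 5) & 0x1f) + 1;
    if (low + size > 32)
        return dst;
    // 64-bit intermediate so that size == 32 yields an all-ones mask.
    uint32_t mask = (uint32_t)(((uint64_t)1 << size) - 1);
    return (dst & ~(mask << low)) | ((src1 & mask) << low);
}
```

For example, inserting a 4-bit zero field at bit 8 of `0xffffffff` (`src2 = (3 << 5) | 8`) gives `0xfffff0ff`.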

## Bitwise operations: and, or, xor

Ands, ors, or xors two operands. On Falcon v0, sets no flags. On Falcon v3+, sets all flags: `s` and `z` are set as usual, `c` and `o` are always set to 0.

- Instructions:

  | Name | Description | Subopcode |
  |------|-------------|-----------|
  | and | Bitwise and | 4 |
  | or | Bitwise or | 5 |
  | xor | Bitwise xor | 6 |

- Instruction class: unsized
- Execution time: 1 cycle
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | c0 |
  | R1, R2, I16 | e0 |
  | R2, R2, I8 | f0 |
  | R2, R2, I16 | f1 |
  | R2, R2, R1 | fd |
  | R3, R2, R1 | ff |

- Immediates: zero-extended
- Operation:

  ```
  if (op == and)
      DST = SRC1 & SRC2;
  if (op == or)
      DST = SRC1 | SRC2;
  if (op == xor)
      DST = SRC1 ^ SRC2;
  if (falcon_version != 0) {
      $flags.c = 0;
      $flags.o = 0;
      $flags.s = S(DST);
      $flags.z = (DST == 0);
  }
  ```

## Bit extraction: xbit

Extracts a single bit of a specified register. On Falcon v0, the bit is stored to bit 0 of DST, while other destination bits are unmodified, and no flags are set. On Falcon v3+, the bit is stored to bit 0 of DST, all other bits of DST are set to 0, `s` flag is set to 0, and `z` flag is set iff the extracted bit was 0 (behaving exactly like an `extr` instruction with size 1). In both cases, the bit index is masked off to 5 bits.

- Instructions:

  | Name | Description | Subopcode - opcodes c0, ff | Subopcode - opcodes f0, fe |
  |------|-------------|----------------------------|----------------------------|
  | xbit | Extract a bit | 8 | c |

- Instruction class: unsized
- Execution time: 1 cycle
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | c0 |
  | R3, R2, R1 | ff |
  | R2, $flags, I8 | f0 |
  | R1, $flags, R2 | fe |

- Immediates: truncated
- Operation:

  ```
  bit = SRC2 & 0x1f;
  if (falcon_version == 0) {
      DST = DST & ~1 | (SRC1 >> bit & 1);
  } else {
      DST = SRC1 >> bit & 1;
      $flags.s = 0;
      $flags.z = (DST == 0);
  }
  ```

## Bit manipulation: bset, bclr, btgl

Set, clear, or flip a specified bit of a register. The requested bit index is masked off to 5 bits. No flags are set.

- Instructions:

  | Name | Description | Subopcode - opcodes f0, fd, f9 | Subopcode - opcode f4 |
  |------|-------------|--------------------------------|------------------------|
  | bset | Set a bit | 9 | 31 |
  | bclr | Clear a bit | a | 32 |
  | btgl | Flip a bit | b | 33 |

- Instruction class: unsized
- Execution time: 1 cycle
- Operands: DST, SRC
- Forms:

  | Form | Opcode |
  |------|--------|
  | R2, I8 | f0 |
  | R2, R1 | fd |
  | $flags, I8 | f4 |
  | $flags, R2 | f9 |

- Immediates: truncated
- Operation:

  ```
  bit = SRC & 0x1f;
  if (op == bset)
      DST |= 1 << bit;
  else if (op == bclr)
      DST &= ~(1 << bit);
  else // op == btgl
      DST ^= 1 << bit;
  ```

## Division and remainder: div, mod

Does unsigned 32-bit division / modulus. Sets no flags. If a division by 0 is requested, no exception happens: the division result is always `0xffffffff` in this case, and the modulus result is equal to the first source.

- Instructions:

  | Name | Description | Present on | Subopcode |
  |------|-------------|------------|-----------|
  | div | Divide | v3+ units | c |
  | mod | Take modulus | v3+ units | d |

- Instruction class: unsized
- Execution time: 30-33 cycles
- Operands: DST, SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R1, R2, I8 | c0 |
  | R1, R2, I16 | e0 |
  | R3, R2, R1 | ff |

- Immediates: zero-extended
- Operation:

  ```
  if (SRC2 == 0) {
      dres = 0xffffffff;
  } else {
      dres = SRC1 / SRC2;
  }
  if (op == div)
      DST = dres;
  else // op == mod
      DST = SRC1 - dres * SRC2;
  ```
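
The division-by-zero behaviour falls out of the pseudocode directly: with `SRC2 == 0` the product `dres * SRC2` is 0, so the modulus is `SRC1` itself. A minimal sketch, with the hypothetical names `falcon_div` and `falcon_mod` chosen to avoid clashing with the C library's `div`:

```c
#include <stdint.h>

// `div`: unsigned 32-bit division; dividing by 0 yields 0xffffffff.
static uint32_t falcon_div(uint32_t src1, uint32_t src2) {
    return src2 == 0 ? 0xffffffffu : src1 / src2;
}

// `mod`: computed as src1 - dres * src2, so a zero divisor returns src1.
static uint32_t falcon_mod(uint32_t src1, uint32_t src2) {
    return src1 - falcon_div(src1, src2) * src2;
}
```

Code ported from C should keep this in mind, since plain C division by zero is undefined rather than returning these values.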

## Setting predicates: setp

Sets bit #SRC2 in $flags to bit 0 of SRC1. The bit index is masked off to 5 bits.

- Instructions:

  | Name | Description | Subopcode |
  |------|-------------|-----------|
  | setp | Set predicate | 8 |

- Instruction class: unsized
- Execution time: 1 cycle
- Operands: SRC1, SRC2
- Forms:

  | Form | Opcode |
  |------|--------|
  | R2, I8 | f2 |
  | R2, R1 | fa |

- Immediates: truncated
- Operation:

  ```
  bit = SRC2 & 0x1f;
  $flags = ($flags & ~(1 << bit)) | (SRC1 & 1) << bit;
  ```
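
As a small illustration, the update can be modelled as a pure function over a $flags value. The signature, which passes the old flags in and returns the new ones, is an assumption made so that the sketch is self-contained.

```c
#include <stdint.h>

// `setp`: copy bit 0 of src1 into bit (src2 & 0x1f) of flags and
// return the updated $flags value.
static uint32_t setp(uint32_t flags, uint32_t src1, uint32_t src2) {
    unsigned bit = src2 & 0x1f;
    return (flags & ~((uint32_t)1 << bit)) | ((src1 & 1) << bit);
}
```

Only bit 0 of the first source matters here, so `setp(0xff, 2, 0)` clears bit 0 and yields `0xfe`.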