falcon is a class of general-purpose microprocessor units, used in multiple instances on nvidia GPUs starting from G98. Originally developed as the controlling logic for VP3 video decoding engines as a replacement for xtensa used on VP2, it was later used in many other places, whenever a microprocessor of some sort was needed.

A single falcon unit is made of:

  • the core microprocessor with its code and data SRAM [see Processor control]
  • an IO space containing control registers of all subunits, accessible from the host as well as from the code running on the falcon microprocessor [see IO space]
  • common support logic:
  • optionally, FIFO interface logic, for falcon units used as PFIFO engines and some others [see FIFO interface]
  • optionally, common memory interface logic [see Memory interface]. However, some engines have their own type of memory interface.
  • optionally, a cryptographic AES coprocessor. A falcon unit with such coprocessor is called a “secretful” unit. [see Cryptographic coprocessor]
  • any unit-specific logic the microprocessor is supposed to control


figure out remaining circuitry

The base falcon hardware comes in several different revisions:

  • version 0: used on G98, MCP77, MCP79
  • version 3: used on GT215+, adds a crude VM system for the code segment, edge/level interrupt modes, new instructions [division, software traps, bitfield manipulation, …], and other features
  • version 4: used on GF119+ for some engines [others are still version 3]: adds support for 24-bit code addressing, debugging and ???
  • version 4.1: used on GK110+ for some engines, changes unknown
  • version 5: used on GK208+ for some engines, redesigned ISA encoding


figure out v4 new stuff


figure out v4.1 new stuff


figure out v5 new stuff

The falcon units present on nvidia cards are: