PGRAPH overview

Introduction

Todo

write me

Todo

WAIT_FOR_IDLE and PM_TRIGGER

NV1/NV3 graph object types

The following graphics objects exist on NV1:NV4:

id variants name description
0x01 all BETA sets beta factor for blending
0x02 all ROP sets raster operation
0x03 all CHROMA sets color for color key
0x04 all PLANE sets the plane mask
0x05 all CLIP sets clipping rectangle
0x06 all PATTERN sets pattern, ie. a small repeating image used as one of the inputs to a raster operation or blending
0x07 NV3:NV4 RECT renders solid rectangles
0x08 all POINT renders single points
0x09 all LINE renders solid lines
0x0a all LIN renders solid lins [ie. lines missing a pixel on one end]
0x0b all TRI renders solid triangles
0x0c NV1:NV3 RECT renders solid rectangles
0x0c NV3:NV4 GDI renders Windows 95 primitives: rectangles and characters, with font read from a DMA object
0x0d NV1:NV3 TEXLIN renders quads with linearly mapped textures
0x0d NV3:NV4 M2MF copies data from one DMA object to another
0x0e NV1:NV3 TEXQUAD renders quads with quadratically mapped textures
0x0e NV3:NV4 SIFM Scaled Image From Memory, like NV1’s IFM, but with scaling
0x10 all BLIT copies rectangles of pixels from one place in framebuffer to another
0x11 all IFC Image From CPU, uploads a rectangle of pixels via methods
0x12 all BITMAP uploads and expands a bitmap [ie. 1bpp image] via methods
0x13 NV1:NV3 IFM Image From Memory, uploads a rectangle of pixels from a DMA object to framebuffer
0x14 all ITM Image To Memory, downloads a rectangle of pixels to a DMA object from framebuffer
0x15 NV3:NV4 SIFC Stretched Image From CPU, like IFC, but with image stretching
0x17 NV3:NV4 D3D Direct3D 5 textured triangles
0x18 NV3:NV4 ZPOINT renders single points to a surface with depth buffer
0x1c NV3:NV4 SURF sets rendering surface parameters
0x1d NV1:NV3 TEXLINBETA renders lit quads with linearly mapped textures
0x1e NV1:NV3 TEXQUADBETA renders lit quads with quadratically mapped textures

Todo

check Direct3D version

NV4+ graph object classes

Not really graph objects, but usable as parameters for some object-bind methods [all NV4:GF100]:

class name description
0x0030 NV1_NULL does nothing
0x0002 NV1_DMA_R DMA object for reading
0x0003 NV1_DMA_W DMA object for writing
0x003d NV3_DMA read/write DMA object

Todo

document NV1_NULL

NV1-style operation objects [all NV4:NV5]:

class name description
0x0010 NV1_OP_CLIP clipping
0x0011 NV1_OP_BLEND_AND blending
0x0013 NV1_OP_ROP_AND raster operation
0x0015 NV1_OP_CHROMA color key
0x0064 NV1_OP_SRCCOPY_AND source copy with 0-alpha discard
0x0065 NV3_OP_SRCCOPY source copy
0x0066 NV4_OP_SRCCOPY_PREMULT pre-multiplying copy
0x0067 NV4_OP_BLEND_PREMULT pre-multiplied blending

Memory to memory copy objects:

class variants name description
0x0039 NV4:G80 NV3_M2MF copies data from one buffer to another
0x5039 G80:GF100 G80_M2MF copies data from one buffer to another
0x9039 GF100:GK104 GF100_M2MF copies data from one buffer to another
0xa040 GK104:GK110 GK20A GK104_P2MF copies data from FIFO to memory buffer
0xa140 GK110:GK20A GM107- GK110_P2MF copies data from FIFO to memory buffer

Context objects:

class variants name description
0x0012 NV4:G84 NV1_BETA sets beta factor for blending
0x0017 NV4:G80 NV1_CHROMA sets color for color key
0x0057 NV4:G84 NV4_CHROMA sets color for color key
0x0018 NV4:G80 NV1_PATTERN sets pattern for raster op
0x0044 NV4:G84 NV1_PATTERN sets pattern for raster op
0x0019 NV4:G84 NV1_CLIP sets user clipping rectangle
0x0043 NV4:G84 NV1_ROP sets raster operation
0x0072 NV4:G84 NV4_BETA4 sets component beta factors for pre-multiplied blending
0x0058 NV4:G80 NV3_SURF_DST sets the 2d destination surface
0x0059 NV4:G80 NV3_SURF_SRC sets the 2d blit source surface
0x005a NV4:G80 NV3_SURF_COLOR sets the 3d color surface
0x005b NV4:G80 NV3_SURF_ZETA sets the 3d zeta surface
0x0052 NV4:G80 NV4_SWZSURF sets 2d swizzled destination surface
0x009e NV10:G80 NV10_SWZSURF sets 2d swizzled destination surface
0x039e NV30:NV40 NV30_SWZSURF sets 2d swizzled destination surface
0x309e NV40:G80 NV30_SWZSURF sets 2d swizzled destination surface
0x0042 NV4:G80 NV4_SURF2D sets 2d destination and source surfaces
0x0062 NV10:G80 NV10_SURF2D sets 2d destination and source surfaces
0x0362 NV30:NV40 NV30_SURF2D sets 2d destination and source surfaces
0x3062 NV40:G80 NV30_SURF2D sets 2d destination and source surfaces
0x5062 G80:G84 G80_SURF2D sets 2d destination and source surfaces
0x0053 NV4:NV20 NV4_SURF3D sets 3d color and zeta surfaces
0x0093 NV10:NV20 NV10_SURF3D sets 3d color and zeta surfaces

Solids rendering objects:

class variants name description
0x001c NV4:NV40 NV1_LIN renders a lin
0x005c NV4:G80 NV4_LIN renders a lin
0x035c NV30:NV40 NV30_LIN renders a lin
0x305c NV40:G84 NV30_LIN renders a lin
0x001d NV4:NV40 NV1_TRI renders a triangle
0x005d NV4:G84 NV4_TRI renders a triangle
0x001e NV4:NV40 NV1_RECT renders a rectangle
0x005e NV4:NV40 NV4_RECT renders a rectangle

Image upload from CPU objects:

class variants name description
0x0021 NV4:NV40 NV1_IFC image from CPU
0x0061 NV4:G80 NV4_IFC image from CPU
0x0065 NV5:G80 NV5_IFC image from CPU
0x008a NV10:G80 NV10_IFC image from CPU
0x038a NV30:NV40 NV30_IFC image from CPU
0x308a NV40:G84 NV40_IFC image from CPU
0x0036 NV4:G80 NV1_SIFC stretched image from CPU
0x0076 NV4:G80 NV4_SIFC stretched image from CPU
0x0066 NV5:G80 NV5_SIFC stretched image from CPU
0x0366 NV30:NV40 NV30_SIFC stretched image from CPU
0x3066 NV40:G84 NV40_SIFC stretched image from CPU
0x0060 NV4:G80 NV4_INDEX indexed image from CPU
0x0064 NV5:G80 NV5_INDEX indexed image from CPU
0x0364 NV30:NV40 NV30_INDEX indexed image from CPU
0x3064 NV40:G84 NV40_INDEX indexed image from CPU
0x007b NV10:G80 NV10_TEXTURE texture from CPU
0x037b NV30:NV40 NV30_TEXTURE texture from CPU
0x307b NV40:G80 NV40_TEXTURE texture from CPU

Todo

figure out wtf is the deal with TEXTURE objects

Other 2d source objects:

class variants name description
0x001f NV4:G80 NV1_BLIT blits inside framebuffer
0x005f NV4:G84 NV4_BLIT blits inside framebuffer
0x009f NV15:G80 NV15_BLIT blits inside framebuffer
0x0037 NV4:G80 NV3_SIFM scaled image from memory
0x0077 NV4:G80 NV4_SIFM scaled image from memory
0x0063 NV10:G80 NV5_SIFM scaled image from memory
0x0089 NV10:NV40 NV10_SIFM scaled image from memory
0x0389 NV30:NV40 NV30_SIFM scaled image from memory
0x3089 NV40:G80 NV30_SIFM scaled image from memory
0x5089 G80:G84 G80_SIFM scaled image from memory
0x004b NV4:NV40 NV3_GDI draws GDI primitives
0x004a NV4:G80 NV4_GDI draws GDI primitives

YCbCr two-source blending objects:

class variants name
0x0038 NV4:G80 NV4_DVD_SUBPICTURE
0x0088 NV10:G80 NV10_DVD_SUBPICTURE

Todo

find better name for these two

Unified 2d objects:

class variants name
0x502d G80:GF100 G80_2D
0x902d GF100- GF100_2D

NV3-style 3d objects:

class variants name description
0x0048 NV4:NV15 NV3_D3D Direct3D textured triangles
0x0054 NV4:NV20 NV4_D3D5 Direct3D 5 textured triangles
0x0094 NV10:NV20 NV10_D3D5 Direct3D 5 textured triangles
0x0055 NV4:NV20 NV4_D3D6 Direct3D 6 multitextured triangles
0x0095 NV10:NV20 NV10_D3D6 Direct3D 6 multitextured triangles

Todo

check NV3_D3D version

NV10-style 3d objects:

class variants name description
0x0056 NV10:NV30 NV10_3D Celsius Direct3D 7 engine
0x0096 NV15:NV30 NV15_3D Celsius Direct3D 7 engine
0x0098 NV17:NV20 NV11_3D Celsius Direct3D 7 engine
0x0099 NV17:NV20 NV17_3D Celsius Direct3D 7 engine
0x0097 NV20:NV34 NV20_3D Kelvin Direct3D 8 SM 1 engine
0x0597 NV25:NV40 NV25_3D Kelvin Direct3D 8 SM 1 engine
0x0397 NV30:NV40 NV30_3D Rankine Direct3D 9 SM 2 engine
0x0497 NV35:NV34 NV35_3D Rankine Direct3D 9 SM 2 engine
0x3597 NV40:NV41 NV35_3D Rankine Direct3D 9 SM 2 engine
0x0697 NV34:NV40 NV34_3D Rankine Direct3D 9 SM 2 engine
0x4097 NV40:G80 !TC NV40_3D Curie Direct3D 9 SM 3 engine
0x4497 NV40:G80 TC NV44_3D Curie Direct3D 9 SM 3 engine
0x5097 G80:G200 G80_3D Tesla Direct3D 10 engine
0x8297 G84:G200 G84_3D Tesla Direct3D 10 engine
0x8397 G200:GT215 G200_3D Tesla Direct3D 10 engine
0x8597 GT215:MCP89 GT215_3D Tesla Direct3D 10.1 engine
0x8697 MCP89:GF100 MCP89_3D Tesla Direct3D 10.1 engine
0x9097 GF100:GK104 GF100_3D Fermi Direct3D 11 engine
0x9197 GF108:GK104 GF108_3D Fermi Direct3D 11 engine
0x9297 GF110:GK104 GF110_3D Fermi Direct3D 11 engine
0xa097 GK104:GK110 GK104_3D Kepler Direct3D 11.1 engine
0xa197 GK110:GK20A GK110_3D Kepler Direct3D 11.1 engine
0xa297 GK20A:GM107 GK20A_3D Kepler Direct3D 11.1 engine
0xb097 GM107- GM107_3D Maxwell Direct3D 12 engine

And the compute objects:

class variants name description
0x50c0 G80:GF100 G80_COMPUTE CUDA 1.x engine
0x85c0 GT215:GF100 GT215_COMPUTE CUDA 1.x engine
0x90c0 GF100:GK104 GF100_COMPUTE CUDA 2.x engine
0x91c0 GF110:GK104 GF110_COMPUTE CUDA 2.x engine
0xa0c0 GK104:GK110 GK20A:GM107 GK104_COMPUTE CUDA 3.x engine
0xa1c0 GK110:GK20A GK110_COMPUTE CUDA 3.x engine
0xb0c0 GM107:GM204 GM107_COMPUTE CUDA 4.x engine
0xb1c0 GM204- GM200_COMPUTE CUDA 4.x engine

The NULL object

Todo

write me

The graphics context

Todo

write something here

Channel context

The following information makes up the non-volatile graphics context. This state is per-channel and thus applies to all objects on the channel, unless software does trap-swap-restart trickery with object switches. It is guaranteed to be unaffected by subchannel switches and object binds. Some of this state can be set by submitting methods on the context objects, some can only be set by accessing PGRAPH context registers.

  • the beta factor - set by BETA object

  • the 8-bit raster operation - set by ROP object

  • the A1R10G10B10 color for chroma key - set by CHROMA object

  • the A1R10G10B10 color for plane mask - set by PLANE object

  • the user clip rectangle - set by CLIP object:

    • ???
  • the pattern state - set by PATTERN object:

    • shape: 8x8, 64x1, or 1x64
    • 2x A8R10G10B10 pattern color
    • the 64-bit pattern itself
  • the NOTIFY DMA object - pointer to DMA object used by NOTIFY methods. NV1 only - moved to graph object options on NV3+. Set by direct PGRAPH access only.

  • the main DMA object - pointer to DMA object used by IFM and ITM objects. NV1 only - moved to graph object options on NV3+. Set by direct PGRAPH access only.

  • On NV1, framebuffer setup - set by direct PGRAPH access only:

    • ???
  • On NV3+, rendering surface setup:

    • ???

    There are 4 copies of this state, one for each surface used by PGRAPH:

    • DST - the 2d destination surface
    • SRC - the 2d source surface [used by BLIT object only]
    • COLOR - the 3d color surface
    • ZETA - the 3d depth surface

    Note that the M2MF source/destination, ITM destination, IFM/SIFM source, and D3D texture don’t count as surfaces - even though they may be configured to access the same data as surfaces on NV3+, they’re accessed through the DMA circuitry, not the surface circuitry, and their setup is part of volatile state.
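The pattern state listed above can be pictured with a small C model. This is only a sketch under loud assumptions: the names `pattern_state` and `pattern_bit` are hypothetical, the bit order within the 64-bit pattern (bit 0 first, row-major for 8x8) is assumed rather than documented here, and the color fields are split into components because A8R10G10B10 does not fit in 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pattern_shape { PATTERN_8X8, PATTERN_64X1, PATTERN_1X64 };

/* Hypothetical container for one A8R10G10B10 pattern color. */
struct pattern_color {
    uint8_t a;        /* 8-bit alpha */
    uint16_t r, g, b; /* 10-bit components, stored in the low bits */
};

/* Hypothetical software view of the per-channel pattern state. */
struct pattern_state {
    enum pattern_shape shape;
    struct pattern_color color[2]; /* the two pattern colors */
    uint64_t bitmap;               /* the 64-bit pattern itself */
};

/* Returns the pattern bit (0 or 1) covering pixel (x, y).
 * ASSUMPTION: bit 0 is the first pixel, and 8x8 is stored row by row. */
int pattern_bit(const struct pattern_state *p, unsigned x, unsigned y)
{
    unsigned idx = 0;

    assert(p != NULL);
    switch (p->shape) {
    case PATTERN_8X8:  idx = (y % 8) * 8 + (x % 8); break;
    case PATTERN_64X1: idx = x % 64;                break;
    case PATTERN_1X64: idx = y % 64;                break;
    }
    return (int)((p->bitmap >> idx) & 1);
}
```

Which of the two colors corresponds to a 0 bit and which to a 1 bit is not stated in this section, so the sketch stops at returning the raw bit.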

Todo

beta factor size

Todo

user clip state

Todo

NV1 framebuffer setup

Todo

NV3 surface setup

Todo

figure out the extra clip stuff, etc.

Todo

update for NV4+

Graph object options

In addition to the per-channel state, there is also per-object non-volatile state, called graph object options. This state is stored in the RAMHT entry for the object [NV1], or in a RAMIN structure [NV3-]. On subchannel switches and object binds, the PFIFO will send this state [NV1] or the pointer to this state [NV3-] to PGRAPH via method 0. On NV1:NV4, this state cannot be modified by any object methods and requires RAMHT/RAMIN access to change. On NV4+, PGRAPH can bind DMA objects on its own when requested via methods, and update the DMA object pointers in RAMIN. On NV5+, PGRAPH can modify most of this state when requested via methods. All NV4+ automatic options modification methods can be disabled by software, if so desired.

The graph options contain the following information:

  • 2d pipeline configuration
  • 2d color and mono format
  • NOTIFY_VALID flag - if set, the NOTIFY method will be enabled. If unset, the NOTIFY method will cause an interrupt. Can be used by the driver to emulate a per-object DMA_NOTIFY setting - this flag will be set on objects whose emulated DMA_NOTIFY value matches the one currently in PGRAPH context, and the interrupt will cause a switch of the PGRAPH context value followed by a method restart.
  • SUBCONTEXT_ID - a single-bit flag that can be used to emulate more than one PGRAPH context on one channel. When an object is bound and its SUBCONTEXT_ID doesn’t match PGRAPH’s current SUBCONTEXT_ID, a context switch interrupt is raised to allow software to load an alternate context.
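The SUBCONTEXT_ID mechanism described above can be modelled in a few lines of C. This is a software sketch only, not the hardware logic: `grobj_options`, `pgraph_model`, `soft_ctx`, `bind_raises_ctxsw` and `sw_swap_subcontext` are all hypothetical names, and the two-field `soft_ctx` merely stands in for whatever channel context the driver would really save and restore.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the graph object options. */
struct grobj_options {
    unsigned subcontext_id : 1;
    unsigned notify_valid : 1;
};

/* Tiny placeholder for a saved channel context. */
struct soft_ctx {
    uint32_t beta;
    uint32_t rop;
};

struct pgraph_model {
    unsigned subcontext_id;   /* PGRAPH's current SUBCONTEXT_ID */
    struct soft_ctx live;     /* context currently loaded in PGRAPH */
    struct soft_ctx saved[2]; /* software copies, one per subcontext */
};

/* Nonzero if binding obj would raise the context switch interrupt. */
int bind_raises_ctxsw(const struct pgraph_model *pg,
                      const struct grobj_options *obj)
{
    return obj->subcontext_id != pg->subcontext_id;
}

/* What an interrupt handler might do: save the live context, load the
 * alternate one, and update the current SUBCONTEXT_ID. */
void sw_swap_subcontext(struct pgraph_model *pg, unsigned new_id)
{
    assert(new_id <= 1);
    pg->saved[pg->subcontext_id] = pg->live;
    pg->live = pg->saved[new_id];
    pg->subcontext_id = new_id;
}
```

After the swap, the same bind no longer mismatches, which corresponds to the driver restarting the bind once the alternate context is loaded.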

Todo

NV3+

See NV1 PGRAPH: graphics engine for detailed format.

Volatile state

In addition to the non-volatile state described above, PGRAPH also has plenty of “volatile” state. This state deals with the currently requested operation and may be destroyed by switching to a new subchannel or binding a new object [though not by full channel switches - the channels are supposed to be independent after all, and the kernel driver is supposed to save/restore all state, including volatile state].

Volatile state is highly object-specific, but common stuff is listed here:

  • the “notifier write pending” flag and requested notification type

Todo

more stuff?

Notifiers

The notifiers are 16-byte memory structures accessed via DMA objects, used for synchronization. Notifiers are written by PGRAPH when certain operations are completed. Software can poll on the memory structure, waiting for it to be written by PGRAPH. The notifier structure is:

base+0x0:
64-bit timestamp - written by PGRAPH with current PTIMER time as of the notifier write. The timestamp is a concatenation of current values of TIME_LOW and TIME_HIGH registers. When big-endian mode is in effect, this becomes a 64-bit big-endian number as expected.
base+0x8:
32-bit word always set to 0 by PGRAPH. This field may be used by software to put a non-0 value for software-written error-caused notifications.
base+0xc:
32-bit word always set to 0 by PGRAPH. This is used for synchronization - the software is supposed to set this field to a non-0 value before submitting the notifier write request, then wait for it to become 0. Since the notifier fields are written in order, it is guaranteed that the whole notifier structure has been written by the time this field is set to 0.
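The structure and the arm-then-poll protocol above can be sketched in C as follows. The names `pgraph_notifier`, `notifier_arm` and `notifier_done` are hypothetical, the "in progress" marker is an arbitrary non-0 value, and the struct assumes the host byte order matches the mode PGRAPH writes in. A real driver would additionally need a mapping of the DMA object's memory and suitable memory barriers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of one 16-byte notifier. */
struct pgraph_notifier {
    uint64_t timestamp; /* base+0x0: PTIMER time at the notifier write */
    uint32_t info;      /* base+0x8: written as 0 by PGRAPH */
    uint32_t status;    /* base+0xc: written as 0 by PGRAPH, last */
};

static_assert(sizeof(struct pgraph_notifier) == 16, "notifier is 16 bytes");
static_assert(offsetof(struct pgraph_notifier, info) == 0x8, "info at +0x8");
static_assert(offsetof(struct pgraph_notifier, status) == 0xc, "status at +0xc");

/* Any non-0 value works; this one is an arbitrary choice. */
#define NOTIFIER_IN_PROGRESS 0xffffffffu

/* Call before submitting the request that will cause the notifier write. */
void notifier_arm(volatile struct pgraph_notifier *n)
{
    n->status = NOTIFIER_IN_PROGRESS;
}

/* Nonzero once PGRAPH has written the notifier. Since the fields are
 * written in order, status == 0 implies the other fields are valid too. */
int notifier_done(const volatile struct pgraph_notifier *n)
{
    return n->status == 0;
}
```

Software would call `notifier_arm`, submit the request, then spin or sleep until `notifier_done` returns nonzero before reading the timestamp.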

Todo

verify big endian on non-G80

There are two types of notifiers: ordinary notifiers [NV1-] and M2MF notifiers [NV3-]. Ordinary notifiers are written when explicitly requested by the NOTIFY method; M2MF notifiers are written on M2MF transfer completion. M2MF notifiers cannot be turned off, thus it’s required to at least set up a notifier DMA object if M2MF is used, even if the software doesn’t wish to use notifiers for synchronization.

Todo

figure out NV20 mysterious warning notifiers

Todo

describe GF100+ notifiers

The notifiers are always written to the currently bound notifier DMA object. The M2MF notifiers share the DMA object with ordinary notifiers. The layout of the DMA object used for notifiers is fixed:

  • 0x00: ordinary notifier #0
  • 0x10: M2MF notifier [NV3-]
  • 0x20: ordinary notifier #2 [NV3:NV4 only]
  • 0x30: ordinary notifier #3 [NV3:NV4 only]
  • 0x40: ordinary notifier #4 [NV3:NV4 only]
  • 0x50: ordinary notifier #5 [NV3:NV4 only]
  • 0x60: ordinary notifier #6 [NV3:NV4 only]
  • 0x70: ordinary notifier #7 [NV3:NV4 only]
  • 0x80: ordinary notifier #8 [NV3:NV4 only]
  • 0x90: ordinary notifier #9 [NV3:NV4 only]
  • 0xa0: ordinary notifier #10 [NV3:NV4 only]
  • 0xb0: ordinary notifier #11 [NV3:NV4 only]
  • 0xc0: ordinary notifier #12 [NV3:NV4 only]
  • 0xd0: ordinary notifier #13 [NV3:NV4 only]
  • 0xe0: ordinary notifier #14 [NV3:NV4 only]
  • 0xf0: ordinary notifier #15 [NV3:NV4 only]

Todo

0x20 - NV20 warning notifier?

Note that the notifiers always have to reside at the very beginning of the DMA object. On NV1 and NV4+, this effectively means that only 1 notifier of each type can be used per DMA object, requiring multiple DMA objects if more than one notifier per type is to be used, and likely requiring a dedicated DMA object for the notifiers. On NV3:NV4, up to 15 ordinary notifiers may be used in a single DMA object, though that DMA object likely still needs to be dedicated for notifiers, and only one of the notifiers supports interrupt generation.
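The fixed layout can be expressed as a pair of offset helpers. This is an illustrative sketch: the `chip_gen` grouping and the function names are hypothetical, and -1 is simply this sketch's way of saying "no such notifier on this hardware".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical coarse grouping of the variants discussed above. */
enum chip_gen { GEN_NV1, GEN_NV3, GEN_NV4_PLUS };

#define NOTIFIER_SIZE        0x10
#define M2MF_NOTIFIER_OFFSET 0x10

static_assert(M2MF_NOTIFIER_OFFSET == 1 * NOTIFIER_SIZE,
              "M2MF notifier occupies slot 1");

/* Byte offset of ordinary notifier #idx inside the notifier DMA object,
 * or -1 if that notifier does not exist on the given hardware. */
int ordinary_notifier_offset(enum chip_gen gen, unsigned idx)
{
    if (idx == 0)
        return 0x00;
    /* Slot 1 belongs to M2MF; slots 2..15 exist on NV3:NV4 only. */
    if (gen == GEN_NV3 && idx >= 2 && idx <= 15)
        return (int)(idx * NOTIFIER_SIZE);
    return -1;
}

/* Byte offset of the M2MF notifier, or -1 on NV1, which has none. */
int m2mf_notifier_offset(enum chip_gen gen)
{
    return gen == GEN_NV1 ? -1 : M2MF_NOTIFIER_OFFSET;
}
```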

NOTIFY method

Ordinary notifiers are requested via the NOTIFY method. Note that the NOTIFY method schedules a notifier write on completion of the method following the NOTIFY - NOTIFY merely sets “a notifier write is pending” state.

It is an error if a NOTIFY method is followed by another NOTIFY method, a DMA_NOTIFY method, an object bind, or a subchannel switch.

In addition to a notifier write, the NOTIFY method may also request a NOTIFY interrupt to be triggered on PGRAPH after the notifier write.

mthd 0x104: NOTIFY [all NV1:GF100 graph objects]

Requests a notifier write and, optionally, an interrupt. The write/interrupt will actually be performed after the next method completes. Possible parameter values are:

  • 0: WRITE - write ordinary notifier #0
  • 1: WRITE_AND_AWAKEN - write ordinary notifier #0, then trigger NOTIFY interrupt [NV3-]
  • 2: WRITE_2 - write ordinary notifier #2 [NV3:NV4]
  • 3: WRITE_3 - write ordinary notifier #3 [NV3:NV4]
  • […]
  • 15: WRITE_15 - write ordinary notifier #15 [NV3:NV4]

Operation::

if (!cur_grobj.NOTIFY_VALID) {
    /* DMA notify object not set, or needs to be swapped in by sw */
    throw(INVALID_NOTIFY);
} else if ((param > 0 && gpu == NV1)
           || (param > 15 && gpu >= NV3 && gpu < NV4)
           || (param > 1 && gpu >= NV4)) {
    /* XXX: what state is changed? */
    throw(INVALID_VALUE);
} else if (NOTIFY_PENDING) {
    /* tried to do two NOTIFY methods in row */
    /* XXX: what state is changed? */
    throw(DOUBLE_NOTIFY);
} else {
    NOTIFY_PENDING = 1;
    NOTIFY_TYPE = param;
}

After every method other than NOTIFY and DMA_NOTIFY, the following is done:

if (NOTIFY_PENDING) {
    int idx = NOTIFY_TYPE;
    if (idx == 1)
        idx = 0;
    dma_write64(NOTIFY_DMA, idx*0x10+0x0, PTIMER.TIME_HIGH << 32 | PTIMER.TIME_LOW);
    dma_write32(NOTIFY_DMA, idx*0x10+0x8, 0);
    dma_write32(NOTIFY_DMA, idx*0x10+0xc, 0);
    if (NOTIFY_TYPE == 1)
        irq_trigger(NOTIFY);
    NOTIFY_PENDING = 0;
}

If a subchannel switch or object bind is done while NOTIFY_PENDING is set, a CTXSW_NOTIFY error is raised.

NOTE: NV1 has a 1-bit NOTIFY_PENDING field, allowing it to do notifier writes with interrupts, but lacks support for setting it via the NOTIFY method. This functionality thus has to be emulated by the driver if needed.
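The NOTIFY flow described in this section can be exercised with a small host-side C model. It is a sketch, not a description of the hardware: `pgraph_model`, `mthd_notify` and `after_method` are hypothetical names, the DMA object is a plain byte array, PTIMER is a single 64-bit field, the notifier is written in host byte order, and error cases are assumed to leave all state unchanged (the pseudocode above marks that as unknown).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum chip_gen { GEN_NV1, GEN_NV3, GEN_NV4_PLUS }; /* hypothetical grouping */
enum notify_err { NOTIFY_OK, INVALID_NOTIFY, INVALID_VALUE, DOUBLE_NOTIFY };

struct pgraph_model {
    enum chip_gen gen;
    int notify_valid;          /* NOTIFY_VALID of the current graph object */
    int notify_pending;
    uint32_t notify_type;
    uint64_t ptimer;           /* stand-in for TIME_HIGH:TIME_LOW */
    uint8_t notify_dma[0x100]; /* stand-in for the notifier DMA object */
    int irq_count;             /* NOTIFY interrupts triggered so far */
};

/* Models mthd 0x104. ASSUMPTION: errors do not modify any state. */
enum notify_err mthd_notify(struct pgraph_model *pg, uint32_t param)
{
    if (!pg->notify_valid)
        return INVALID_NOTIFY;
    if ((param > 0 && pg->gen == GEN_NV1) ||
        (param > 15 && pg->gen == GEN_NV3) ||
        (param > 1 && pg->gen == GEN_NV4_PLUS))
        return INVALID_VALUE;
    if (pg->notify_pending)
        return DOUBLE_NOTIFY;
    pg->notify_pending = 1;
    pg->notify_type = param;
    return NOTIFY_OK;
}

/* Models the work done after every method other than NOTIFY/DMA_NOTIFY. */
void after_method(struct pgraph_model *pg)
{
    if (!pg->notify_pending)
        return;
    uint32_t idx = pg->notify_type == 1 ? 0 : pg->notify_type;
    uint32_t zero = 0;
    assert(idx * 0x10 + 0x10 <= sizeof pg->notify_dma);
    memcpy(&pg->notify_dma[idx * 0x10 + 0x0], &pg->ptimer, 8);
    memcpy(&pg->notify_dma[idx * 0x10 + 0x8], &zero, 4);
    memcpy(&pg->notify_dma[idx * 0x10 + 0xc], &zero, 4);
    if (pg->notify_type == 1)
        pg->irq_count++;
    pg->notify_pending = 0;
}
```

Calling `mthd_notify` followed by `after_method` corresponds to a NOTIFY followed by any other method on the same object.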

DMA_NOTIFY method

On NV4+, the notifier DMA object can be bound by submitting the DMA_NOTIFY method. This functionality can be disabled by the driver in PGRAPH settings registers if not desired.

mthd 0x180: DMA_NOTIFY [all NV4:GF100 graph objects]
Sets the notifier DMA object. When submitted through PFIFO, this method will undergo handle -> address translation via RAMHT.
Operation::

if (DMA_METHODS_ENABLE) {
    /* XXX: list the validation checks */
    NOTIFY_DMA = param;
} else {
    throw(INVALID_METHOD);
}

NOP method

On NV4+ a NOP method was added to enable asking for a notifier write without having to submit an actual method to the object. The NOP method does nothing, but still counts as a graph object method and will thus trigger a notifier write/interrupt if one was previously requested.

mthd 0x100: NOP [all NV4+ graph objects]
Does nothing.
Operation::
/* nothing */
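A typical use of NOP is to follow a NOTIFY with it, so that the notifier is written without any other work being done. The sketch below only builds that two-method sequence in memory. The `method` struct and `emit_notify_nop` are hypothetical, the NOP parameter value of 0 is an assumption, and how the methods are actually submitted through PFIFO is outside the scope of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MTHD_NOP    0x100
#define MTHD_NOTIFY 0x104

/* Hypothetical (method, parameter) pair, not a hardware command format. */
struct method {
    uint32_t mthd;
    uint32_t param;
};

/* Fills out[] with NOTIFY followed by NOP and returns the number of
 * entries written, or 0 if there is no room for both.
 * awaken selects WRITE_AND_AWAKEN (1) instead of WRITE (0). */
size_t emit_notify_nop(struct method *out, size_t cap, int awaken)
{
    assert(out != NULL);
    if (cap < 2)
        return 0;
    out[0].mthd = MTHD_NOTIFY;
    out[0].param = awaken ? 1 : 0;
    out[1].mthd = MTHD_NOP;
    out[1].param = 0; /* ASSUMPTION: the parameter value is ignored */
    return 2;
}
```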

Todo

figure out if this method can be disabled for NV1 compat