PGRAPH overview¶
NV1/NV3 graph object types¶
The following graphics objects exist on NV1:NV4:
Todo
check Direct3D version
NV4+ graph object classes¶
Not really graph objects, but usable as parameters for some object-bind methods [all NV4:GF100]:
class | name | description |
---|---|---|
0x0030 | NV1_NULL | does nothing |
0x0002 | NV1_DMA_R | DMA object for reading |
0x0003 | NV1_DMA_W | DMA object for writing |
0x003d | NV3_DMA | read/write DMA object |
Todo
document NV1_NULL
NV1-style operation objects [all NV4:NV5]:
class | name | description |
---|---|---|
0x0010 | NV1_OP_CLIP | clipping |
0x0011 | NV1_OP_BLEND_AND | blending |
0x0013 | NV1_OP_ROP_AND | raster operation |
0x0015 | NV1_OP_CHROMA | color key |
0x0064 | NV1_OP_SRCCOPY_AND | source copy with 0-alpha discard |
0x0065 | NV3_OP_SRCCOPY | source copy |
0x0066 | NV4_OP_SRCCOPY_PREMULT | pre-multiplying copy |
0x0067 | NV4_OP_BLEND_PREMULT | pre-multiplied blending |
Memory to memory copy objects:
class | variants | name | description |
---|---|---|---|
0x0039 | NV4:G80 | NV3_M2MF | copies data from one buffer to another |
0x5039 | G80:GF100 | G80_M2MF | copies data from one buffer to another |
0x9039 | GF100:GK104 | GF100_M2MF | copies data from one buffer to another |
0xa040 | GK104:GK110 GK20A | GK104_P2MF | copies data from FIFO to memory buffer |
0xa140 | GK110:GK20A GM107- | GK110_P2MF | copies data from FIFO to memory buffer |
Context objects:
Solids rendering objects:
class | variants | name | description |
---|---|---|---|
0x001c | NV4:NV40 | NV1_LIN | renders a lin |
0x005c | NV4:G80 | NV4_LIN | renders a lin |
0x035c | NV30:NV40 | NV30_LIN | renders a lin |
0x305c | NV40:G84 | NV40_LIN | renders a lin |
0x001d | NV4:NV40 | NV1_TRI | renders a triangle |
0x005d | NV4:G84 | NV4_TRI | renders a triangle |
0x001e | NV4:NV40 | NV1_RECT | renders a rectangle |
0x005e | NV4:NV40 | NV4_RECT | renders a rectangle |
Image upload from CPU objects:
class | variants | name | description |
---|---|---|---|
0x0021 | NV4:NV40 | NV1_IFC | image from CPU |
0x0061 | NV4:G80 | NV4_IFC | image from CPU |
0x0065 | NV5:G80 | NV5_IFC | image from CPU |
0x008a | NV10:G80 | NV10_IFC | image from CPU |
0x038a | NV30:NV40 | NV30_IFC | image from CPU |
0x308a | NV40:G84 | NV40_IFC | image from CPU |
0x0036 | NV4:G80 | NV1_SIFC | stretched image from CPU |
0x0076 | NV4:G80 | NV4_SIFC | stretched image from CPU |
0x0066 | NV5:G80 | NV5_SIFC | stretched image from CPU |
0x0366 | NV30:NV40 | NV30_SIFC | stretched image from CPU |
0x3066 | NV40:G84 | NV40_SIFC | stretched image from CPU |
0x0060 | NV4:G80 | NV4_INDEX | indexed image from CPU |
0x0064 | NV5:G80 | NV5_INDEX | indexed image from CPU |
0x0364 | NV30:NV40 | NV30_INDEX | indexed image from CPU |
0x3064 | NV40:G84 | NV40_INDEX | indexed image from CPU |
0x007b | NV10:G80 | NV10_TEXTURE | texture from CPU |
0x037b | NV30:NV40 | NV30_TEXTURE | texture from CPU |
0x307b | NV40:G80 | NV40_TEXTURE | texture from CPU |
Todo
figure out wtf is the deal with TEXTURE objects
Other 2d source objects:
class | variants | name | description |
---|---|---|---|
0x001f | NV4:G80 | NV1_BLIT | blits inside framebuffer |
0x005f | NV4:G84 | NV4_BLIT | blits inside framebuffer |
0x009f | NV15:G80 | NV15_BLIT | blits inside framebuffer |
0x0037 | NV4:G80 | NV3_SIFM | scaled image from memory |
0x0077 | NV4:G80 | NV4_SIFM | scaled image from memory |
0x0063 | NV10:G80 | NV5_SIFM | scaled image from memory |
0x0089 | NV10:NV40 | NV10_SIFM | scaled image from memory |
0x0389 | NV30:NV40 | NV30_SIFM | scaled image from memory |
0x3089 | NV40:G80 | NV40_SIFM | scaled image from memory |
0x5089 | G80:G84 | G80_SIFM | scaled image from memory |
0x004b | NV4:NV40 | NV3_GDI | draws GDI primitives |
0x004a | NV4:G80 | NV4_GDI | draws GDI primitives |
YCbCr two-source blending objects:
class | variants | name |
---|---|---|
0x0038 | NV4:G80 | NV4_DVD_SUBPICTURE |
0x0088 | NV10:G80 | NV10_DVD_SUBPICTURE |
Todo
find better name for these two
Unified 2d objects:
class | variants | name |
---|---|---|
0x502d | G80:GF100 | G80_2D |
0x902d | GF100- | GF100_2D |
NV3-style 3d objects:
class | variants | name | description |
---|---|---|---|
0x0048 | NV4:NV15 | NV3_D3D | Direct3D textured triangles |
0x0054 | NV4:NV20 | NV4_D3D5 | Direct3D 5 textured triangles |
0x0094 | NV10:NV20 | NV10_D3D5 | Direct3D 5 textured triangles |
0x0055 | NV4:NV20 | NV4_D3D6 | Direct3D 6 multitextured triangles |
0x0095 | NV10:NV20 | NV10_D3D6 | Direct3D 6 multitextured triangles |
Todo
check NV3_D3D version
NV10-style 3d objects:
class | variants | name | description |
---|---|---|---|
0x0056 | NV10:NV30 | NV10_3D | Celsius Direct3D 7 engine |
0x0096 | NV15:NV30 | NV15_3D | Celsius Direct3D 7 engine |
0x0098 | NV17:NV20 | NV11_3D | Celsius Direct3D 7 engine |
0x0099 | NV17:NV20 | NV17_3D | Celsius Direct3D 7 engine |
0x0097 | NV20:NV34 | NV20_3D | Kelvin Direct3D 8 SM 1 engine |
0x0597 | NV25:NV40 | NV25_3D | Kelvin Direct3D 8 SM 1 engine |
0x0397 | NV30:NV40 | NV30_3D | Rankine Direct3D 9 SM 2 engine |
0x0497 | NV35:NV34 | NV35_3D | Rankine Direct3D 9 SM 2 engine |
0x3597 | NV40:NV41 | NV35_3D | Rankine Direct3D 9 SM 2 engine |
0x0697 | NV34:NV40 | NV34_3D | Rankine Direct3D 9 SM 2 engine |
0x4097 | NV40:G80 !TC | NV40_3D | Curie Direct3D 9 SM 3 engine |
0x4497 | NV40:G80 TC | NV44_3D | Curie Direct3D 9 SM 3 engine |
0x5097 | G80:G200 | G80_3D | Tesla Direct3D 10 engine |
0x8297 | G84:G200 | G84_3D | Tesla Direct3D 10 engine |
0x8397 | G200:GT215 | G200_3D | Tesla Direct3D 10 engine |
0x8597 | GT215:MCP89 | GT215_3D | Tesla Direct3D 10.1 engine |
0x8697 | MCP89:GF100 | MCP89_3D | Tesla Direct3D 10.1 engine |
0x9097 | GF100:GK104 | GF100_3D | Fermi Direct3D 11 engine |
0x9197 | GF108:GK104 | GF108_3D | Fermi Direct3D 11 engine |
0x9297 | GF110:GK104 | GF110_3D | Fermi Direct3D 11 engine |
0xa097 | GK104:GK110 | GK104_3D | Kepler Direct3D 11.1 engine |
0xa197 | GK110:GK20A | GK110_3D | Kepler Direct3D 11.1 engine |
0xa297 | GK20A:GM107 | GK20A_3D | Kepler Direct3D 11.1 engine |
0xb097 | GM107- | GM107_3D | Maxwell Direct3D 12 engine |
And the compute objects:
class | variants | name | description |
---|---|---|---|
0x50c0 | G80:GF100 | G80_COMPUTE | CUDA 1.x engine |
0x85c0 | GT215:GF100 | GT215_COMPUTE | CUDA 1.x engine |
0x90c0 | GF100:GK104 | GF100_COMPUTE | CUDA 2.x engine |
0x91c0 | GF110:GK104 | GF110_COMPUTE | CUDA 2.x engine |
0xa0c0 | GK104:GK110 GK20A:GM107 | GK104_COMPUTE | CUDA 3.x engine |
0xa1c0 | GK110:GK20A | GK110_COMPUTE | CUDA 3.x engine |
0xb0c0 | GM107:GM204 | GM107_COMPUTE | CUDA 5.x engine |
0xb1c0 | GM204- | GM200_COMPUTE | CUDA 5.x engine |
The NULL object¶
Todo
write me
The graphics context¶
Todo
write something here
Channel context¶
The following information makes up non-volatile graphics context. This state is per-channel and thus will apply to all objects on it, unless software does trap-swap-restart trickery with object switches. It is guaranteed to be unaffected by subchannel switches and object binds. Some of this state can be set by submitting methods on the context objects, some can only be set by accessing PGRAPH context registers.
the beta factor - set by BETA object
the 8-bit raster operation - set by ROP object
the A1R10G10B10 color for chroma key - set by CHROMA object
the A1R10G10B10 color for plane mask - set by PLANE object
the user clip rectangle - set by CLIP object:
- ???
the pattern state - set by PATTERN object:
- shape: 8x8, 64x1, or 1x64
- 2x A8R10G10B10 pattern color
- the 64-bit pattern itself
the NOTIFY DMA object - pointer to DMA object used by NOTIFY methods. NV1 only - moved to graph object options on NV3+. Set by direct PGRAPH access only.
the main DMA object - pointer to DMA object used by IFM and ITM objects. NV1 only - moved to graph object options on NV3+. Set by direct PGRAPH access only.
On NV1, framebuffer setup - set by direct PGRAPH access only:
- ???
On NV3+, rendering surface setup:
- ???
There are 4 copies of this state, one for each surface used by PGRAPH:
- DST - the 2d destination surface
- SRC - the 2d source surface [used by BLIT object only]
- COLOR - the 3d color surface
- ZETA - the 3d depth surface
Note that the M2MF source/destination, ITM destination, IFM/SIFM source, and D3D texture don’t count as surfaces - even though they may be configured to access the same data as surfaces on NV3+, they’re accessed through the DMA circuitry, not the surface circuitry, and their setup is part of volatile state.
Todo
beta factor size
Todo
user clip state
Todo
NV1 framebuffer setup
Todo
NV3 surface setup
Todo
figure out the extra clip stuff, etc.
Todo
update for NV4+
Graph object options¶
In addition to the per-channel state, there is also per-object non-volatile state, called graph object options. This state is stored in the RAMHT entry for the object [NV1], or in a RAMIN structure [NV3-]. On subchannel switches and object binds, the PFIFO will send this state [NV1] or the pointer to this state [NV3-] to PGRAPH via method 0. On NV1:NV4, this state cannot be modified by any object methods and requires RAMHT/RAMIN access to change. On NV4+, PGRAPH can bind DMA objects on its own when requested via methods, and update the DMA object pointers in RAMIN. On NV5+, PGRAPH can modify most of this state when requested via methods. All NV4+ automatic options modification methods can be disabled by software, if so desired.
The graph options contain the following information:
- 2d pipeline configuration
- 2d color and mono format
- NOTIFY_VALID flag - if set, NOTIFY method will be enabled. If unset, NOTIFY method will cause an interrupt. Can be used by the driver to emulate per-object DMA_NOTIFY setting - this flag will be set on objects whose emulated DMA_NOTIFY value matches the one currently in PGRAPH context, and interrupt will cause a switch of the PGRAPH context value followed by a method restart.
- SUBCONTEXT_ID - a single-bit flag that can be used to emulate more than one PGRAPH context on one channel. When an object is bound and its SUBCONTEXT_ID doesn’t match PGRAPH’s current SUBCONTEXT_ID, a context switch interrupt is raised to allow software to load an alternate context.
Todo
NV3+
See NV1 PGRAPH: graphics engine for detailed format.
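The NOTIFY_VALID emulation and the SUBCONTEXT_ID check described above can be sketched in C as follows. This is a minimal software model, not the hardware format: the struct `grobj_options`, its field names, and both helper functions are hypothetical, and the real bit positions inside the options word are deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software view of two graph object option fields.
 * The real encoding is chipset-specific and is not reproduced here. */
struct grobj_options {
    bool notify_valid;        /* NOTIFY_VALID flag */
    unsigned subcontext_id;   /* SUBCONTEXT_ID, 0 or 1 */
    uint32_t emu_dma_notify;  /* driver-side emulated DMA_NOTIFY value */
};

/* Driver side: after changing the DMA_NOTIFY value held in PGRAPH context,
 * mark only the objects whose emulated value matches it as NOTIFY_VALID. */
static void refresh_notify_valid(struct grobj_options *objs, size_t count,
                                 uint32_t ctx_dma_notify)
{
    for (size_t i = 0; i < count; i++)
        objs[i].notify_valid = (objs[i].emu_dma_notify == ctx_dma_notify);
}

/* Bind-time check: true when the object's SUBCONTEXT_ID differs from the
 * current one, i.e. when a context switch interrupt would be raised. */
static bool bind_needs_ctxsw(const struct grobj_options *obj,
                             unsigned cur_subcontext_id)
{
    return (obj->subcontext_id & 1u) != (cur_subcontext_id & 1u);
}
```

A driver following this scheme would call something like `refresh_notify_valid` from its interrupt handler after swapping the PGRAPH context value, then restart the faulting method.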
Volatile state¶
In addition to the non-volatile state described above, PGRAPH also has plenty of “volatile” state. This state deals with the currently requested operation and may be destroyed by switching to a new subchannel or binding a new object [though not by full channel switches - the channels are supposed to be independent after all, and kernel driver is supposed to save/restore all state, including volatile state].
Volatile state is highly object-specific, but common stuff is listed here:
- the “notifier write pending” flag and requested notification type
Todo
more stuff?
Notifiers¶
The notifiers are 16-byte memory structures accessed via DMA objects, used for synchronization. Notifiers are written by PGRAPH when certain operations are completed. Software can poll on the memory structure, waiting for it to be written by PGRAPH. The notifier structure is:
- base+0x0:
- 64-bit timestamp - written by PGRAPH with current PTIMER time as of the notifier write. The timestamp is a concatenation of current values of TIME_LOW and TIME_HIGH registers. When big-endian mode is in effect, this becomes a 64-bit big-endian number as expected.
- base+0x8:
- 32-bit word always set to 0 by PGRAPH. This field may be used by software to put a non-0 value for software-written error-caused notifications.
- base+0xc:
- 32-bit word always set to 0 by PGRAPH. This is used for synchronization - the software is supposed to set this field to a non-0 value before submitting the notifier write request, then wait for it to become 0. Since the notifier fields are written in order, it is guaranteed that the whole notifier structure has been written by the time this field is set to 0.
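The structure and the polling protocol above can be sketched in C as follows. The type and function names (`pgraph_notifier`, `notifier_arm`, `notifier_wait`) and the `NOTIFIER_BUSY` value are hypothetical; the sketch assumes the notifier memory is CPU-mapped and that the host byte order matches the endian mode in effect. A real driver would most likely also need a read barrier between observing the last word and reading the timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of one 16-byte notifier. */
struct pgraph_notifier {
    uint64_t timestamp; /* base+0x0: PTIMER time at the notifier write */
    uint32_t info;      /* base+0x8: written as 0 by PGRAPH */
    uint32_t status;    /* base+0xc: written as 0 by PGRAPH, last */
};

static_assert(sizeof(struct pgraph_notifier) == 16,
              "notifier must be 16 bytes");

/* Any non-0 value works as the "not written yet" marker (assumption). */
#define NOTIFIER_BUSY 1u

/* Call before submitting the notifier write request. */
static void notifier_arm(volatile struct pgraph_notifier *n)
{
    n->status = NOTIFIER_BUSY;
}

/* Polls up to max_polls times. Returns 0 once the last word reads 0,
 * storing the timestamp if requested, or -1 on timeout. */
static int notifier_wait(const volatile struct pgraph_notifier *n,
                         unsigned long max_polls, uint64_t *timestamp)
{
    while (max_polls--) {
        if (n->status == 0) {
            if (timestamp)
                *timestamp = n->timestamp;
            return 0;
        }
    }
    return -1;
}
```

Because the fields are written in order, checking only the word at base+0xc is enough to know the timestamp is valid.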
Todo
verify big endian on non-G80
There are two types of notifiers: ordinary notifiers [NV1-] and M2MF notifiers [NV3-]. Ordinary notifiers are written when explicitly requested by the NOTIFY method, while M2MF notifiers are written on M2MF transfer completion. M2MF notifiers cannot be turned off, thus it’s required to at least set up a notifier DMA object if M2MF is used, even if the software doesn’t wish to use notifiers for synchronization.
Todo
figure out NV20 mysterious warning notifiers
Todo
describe GF100+ notifiers
The notifiers are always written to the currently bound notifier DMA object. The M2MF notifiers share the DMA object with ordinary notifiers. The layout of the DMA object used for notifiers is fixed:
- 0x00: ordinary notifier #0
- 0x10: M2MF notifier [NV3-]
- 0x20: ordinary notifier #2 [NV3:NV4 only]
- 0x30: ordinary notifier #3 [NV3:NV4 only]
- 0x40: ordinary notifier #4 [NV3:NV4 only]
- 0x50: ordinary notifier #5 [NV3:NV4 only]
- 0x60: ordinary notifier #6 [NV3:NV4 only]
- 0x70: ordinary notifier #7 [NV3:NV4 only]
- 0x80: ordinary notifier #8 [NV3:NV4 only]
- 0x90: ordinary notifier #9 [NV3:NV4 only]
- 0xa0: ordinary notifier #10 [NV3:NV4 only]
- 0xb0: ordinary notifier #11 [NV3:NV4 only]
- 0xc0: ordinary notifier #12 [NV3:NV4 only]
- 0xd0: ordinary notifier #13 [NV3:NV4 only]
- 0xe0: ordinary notifier #14 [NV3:NV4 only]
- 0xf0: ordinary notifier #15 [NV3:NV4 only]
Todo
0x20 - NV20 warning notifier?
Note that the notifiers always have to reside at the very beginning of the DMA object. On NV1 and NV4+, this effectively means that only 1 notifier of each type can be used per DMA object, requiring multiple DMA objects if more than one notifier per type is to be used, and likely requiring a dedicated DMA object for the notifiers. On NV3:NV4, up to 15 ordinary notifiers may be used in a single DMA object, though that DMA object likely still needs to be dedicated for notifiers, and only one of the notifiers supports interrupt generation.
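The fixed layout can be expressed as a small offset calculation. The following C sketch only restates the layout listed above; the enum `pgraph_gen` and both function names are hypothetical, and `GEN_NV3` stands for the NV3:NV4 range.

```c
#include <assert.h>

/* Hypothetical coarse generation split matching the ranges used above. */
enum pgraph_gen { GEN_NV1, GEN_NV3, GEN_NV4_PLUS };

/* Byte offset of ordinary notifier #idx inside the notifier DMA object,
 * or -1 if that notifier does not exist on the given generation. */
static int ordinary_notifier_offset(enum pgraph_gen gen, unsigned idx)
{
    if (idx == 0)
        return 0x00;              /* available everywhere */
    if (idx == 1)
        return -1;                /* slot 0x10 belongs to the M2MF notifier */
    if (gen == GEN_NV3 && idx <= 15)
        return (int)(idx * 0x10); /* #2..#15 exist on NV3:NV4 only */
    return -1;
}

/* Byte offset of the M2MF notifier, or -1 where it does not exist. */
static int m2mf_notifier_offset(enum pgraph_gen gen)
{
    return gen == GEN_NV1 ? -1 : 0x10;
}
```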
NOTIFY method¶
Ordinary notifiers are requested via the NOTIFY method. Note that the NOTIFY method schedules a notifier write on completion of the method following the NOTIFY - NOTIFY merely sets “a notifier write is pending” state.
It is an error if a NOTIFY method is followed by another NOTIFY method, a DMA_NOTIFY method, an object bind, or a subchannel switch.
In addition to a notifier write, the NOTIFY method may also request a NOTIFY interrupt to be triggered on PGRAPH after the notifier write.
- mthd 0x104: NOTIFY [all NV1:GF100 graph objects]
  Requests a notifier write and maybe an interrupt. The write/interrupt will be actually performed after the next method completes. Possible parameter values are:
  - 0: WRITE - write ordinary notifier #0
  - 1: WRITE_AND_AWAKEN - write ordinary notifier #0, then trigger NOTIFY interrupt [NV3-]
  - 2: WRITE_2 - write ordinary notifier #2 [NV3:NV4]
  - 3: WRITE_3 - write ordinary notifier #3 [NV3:NV4]
  - […]
  - 15: WRITE_15 - write ordinary notifier #15 [NV3:NV4]
- Operation::
      if (!cur_grobj.NOTIFY_VALID) {
          /* DMA notify object not set, or needs to be swapped in by sw */
          throw(INVALID_NOTIFY);
      } else if ((param > 0 && gpu == NV1)
              || (param > 15 && gpu >= NV3 && gpu < NV4)
              || (param > 1 && gpu >= NV4)) {
          /* XXX: what state is changed? */
          throw(INVALID_VALUE);
      } else if (NOTIFY_PENDING) {
          /* tried to do two NOTIFY methods in row */
          /* XXX: what state is changed? */
          throw(DOUBLE_NOTIFY);
      } else {
          NOTIFY_PENDING = 1;
          NOTIFY_TYPE = param;
      }
After every method other than NOTIFY and DMA_NOTIFY, the following is done:
    if (NOTIFY_PENDING) {
        int idx = NOTIFY_TYPE;
        /* WRITE_AND_AWAKEN uses notifier #0 as well */
        if (idx == 1)
            idx = 0;
        /* widen before shifting so TIME_HIGH is not lost */
        dma_write64(NOTIFY_DMA, idx*0x10+0x0, (uint64_t)PTIMER.TIME_HIGH << 32 | PTIMER.TIME_LOW);
        dma_write32(NOTIFY_DMA, idx*0x10+0x8, 0);
        dma_write32(NOTIFY_DMA, idx*0x10+0xc, 0);
        if (NOTIFY_TYPE == 1)
            irq_trigger(NOTIFY);
        NOTIFY_PENDING = 0;
    }
If a subchannel switch or object bind is done while NOTIFY_PENDING is set, a CTXSW_NOTIFY error is raised.
NOTE: NV1 has a 1-bit NOTIFY_PENDING field, allowing it to do notifier writes with interrupts, but lacks support for setting it via the NOTIFY method. This functionality thus has to be emulated by the driver if needed.
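To experiment with the NOTIFY rules in software, the two pieces of pseudocode in this section can be combined into a small host-side model. The following C sketch is only an illustration: `pgraph_model`, the enums, and the function names are hypothetical, the notifier DMA object is replaced by a plain byte array, and error paths leave the state untouched, which is an assumption since the actual state changes on errors are not documented here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum gpu_gen { GEN_NV1, GEN_NV3, GEN_NV4 }; /* GEN_NV4 stands for NV4+ */
enum notify_result {
    NOTIFY_OK, ERR_INVALID_NOTIFY, ERR_INVALID_VALUE, ERR_DOUBLE_NOTIFY
};

struct pgraph_model {
    enum gpu_gen gpu;
    int notify_valid;          /* NOTIFY_VALID of the current object */
    int notify_pending;
    uint32_t notify_type;
    uint64_t ptimer;           /* stand-in for TIME_HIGH:TIME_LOW */
    int irq_count;             /* number of NOTIFY interrupts raised */
    uint8_t notify_dma[0x100]; /* stand-in for the notifier DMA object */
};

/* Model of method 0x104. */
static enum notify_result mthd_notify(struct pgraph_model *pg, uint32_t param)
{
    if (!pg->notify_valid)
        return ERR_INVALID_NOTIFY;
    if ((param > 0 && pg->gpu == GEN_NV1) ||
        (param > 15 && pg->gpu == GEN_NV3) ||
        (param > 1 && pg->gpu == GEN_NV4))
        return ERR_INVALID_VALUE;
    if (pg->notify_pending)
        return ERR_DOUBLE_NOTIFY;
    pg->notify_pending = 1;
    pg->notify_type = param;
    return NOTIFY_OK;
}

/* Called after every method other than NOTIFY and DMA_NOTIFY. */
static void after_method(struct pgraph_model *pg)
{
    if (!pg->notify_pending)
        return;
    uint32_t idx = pg->notify_type == 1 ? 0 : pg->notify_type;
    uint8_t *slot = pg->notify_dma + idx * 0x10;
    uint32_t zero = 0;
    memcpy(slot + 0x0, &pg->ptimer, 8); /* timestamp, host byte order */
    memcpy(slot + 0x8, &zero, 4);
    memcpy(slot + 0xc, &zero, 4);       /* written last */
    if (pg->notify_type == 1)
        pg->irq_count++;
    pg->notify_pending = 0;
}
```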
DMA_NOTIFY method¶
On NV4+, the notifier DMA object can be bound by submitting the DMA_NOTIFY method. This functionality can be disabled by the driver in PGRAPH settings registers if not desired.
- mthd 0x180: DMA_NOTIFY [all NV4:GF100 graph objects]
  Sets the notifier DMA object. When submitted through PFIFO, this method will undergo handle -> address translation via RAMHT.
- Operation::
      if (DMA_METHODS_ENABLE) {
          /* XXX: list the validation checks */
          NOTIFY_DMA = param;
      } else {
          throw(INVALID_METHOD);
      }
NOP method¶
On NV4+ a NOP method was added to enable asking for a notifier write without having to submit an actual method to the object. The NOP method does nothing, but still counts as a graph object method and will thus trigger a notifier write/interrupt if one was previously requested.
- mthd 0x100: NOP [all NV4+ graph objects]
  Does nothing.
- Operation::
      /* nothing */
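A standalone notifier write on NV4+ is thus a NOTIFY immediately followed by a NOP. The following C sketch builds that two-method sequence into a caller-provided array. The `gr_cmd` struct and `emit_standalone_notify` are hypothetical and say nothing about the real PFIFO submission format, and the NOP parameter value of 0 is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MTHD_NOP     0x100u
#define MTHD_NOTIFY  0x104u
#define NOTIFY_WRITE 0u /* write ordinary notifier #0 */

/* Hypothetical (method, parameter) pair; not a hardware format. */
struct gr_cmd {
    uint32_t mthd;
    uint32_t data;
};

/* Emits NOTIFY followed by NOP. The NOP counts as a graph object method,
 * so it completes the pending notifier write. Returns the number of
 * entries written, or 0 if the buffer is too small. */
static size_t emit_standalone_notify(struct gr_cmd *out, size_t cap)
{
    if (cap < 2)
        return 0;
    out[0] = (struct gr_cmd){ MTHD_NOTIFY, NOTIFY_WRITE };
    out[1] = (struct gr_cmd){ MTHD_NOP, 0 }; /* parameter value assumed */
    return 2;
}
```

Software would set the last word of the notifier to a non-0 value before submitting this sequence, then poll until it reads 0.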
Todo
figure out if this method can be disabled for NV1 compat