I didn't file an Issue as I understand the cause and have a successful patch that I've been running here for several days. gitlab.freedesktop.org is RO for me (ie, I can't fork/stage a PR there), so I committed the absurdity of creating a PR on the public gitlab mirror in order to have a URL to point to: https://gitlab.com/freedesktop-mirror/drm/-/merge_requests/1

Copypasting the commit message:

nv50 IB-mode DMA crash fixes

Address the following nv50 IB-mode DMA code errors:

1) nouveau_drm.h labels NV50_DMA_PUSH_MAX_LENGTH the "Maximum push buffer size". Perhaps that was originally true(?), but nothing suggests, guards, or enforces the assumption, and it's not true in practice. 0x7fffff is not the maximum possible push buffer size submitted for IB-mode DMA transfers, nor is it the maximum legal DMA transfer size (it is three bytes too large according to our reverse-engineering docs).

2) nouveau_gem_ioctl_pushbuf() still implicitly treats bit 23 of the pushbuf buffer[].length field as NOUVEAU_GEM_PUSHBUF_NO_PREFETCH, despite previous commits moving the flag out of the length field at the lower levels, as this is an nv50-hardware-specific arrangement. It also conflicts with point 1 above, since pushbuffers can be 1<<23 bytes or longer in practice. (Does the ioctl code in question have any reason to set or expose no_prefetch?)

3) Bits 0 and 1 of the length field are passed through and placed into the IB entry, but they don't signify length in the DMA request: bit 0 is labeled an unknown flag, and bit 1 selects the non_main transfer counter. DMA transfers can only move whole words, despite pushbuffers not being restricted to whole-word lengths. So there are two ways this can bite: setting flags we don't mean to set, and not transferring the trailing bytes of an incomplete final word.
Changes:

This PR makes the following changes toward the single logical change of "don't lock up Quadros when using CAD":

1) NV50_DMA_PUSH_MAX_LENGTH is reduced from 0x7fffff to 0x7ffffc, the maximum value that can actually be encoded in a DMA request.

2) nouveau_gem_ioctl_pushbuf() no longer looks for NOUVEAU_GEM_PUSHBUF_NO_PREFETCH in bit 23 of the length field, and always passes no_prefetch = 0 to nv50_dma_push() when using IB mode.

3) nouveau_gem_ioctl_pushbuf() computes the DMA transfer length as ((push[].length + 0x3) >> 2) << 2, rounding up to enforce complete-word IB-mode transfers that also send trailing bytes.

4) nouveau_gem_ioctl_pushbuf() splits pushbuffers longer than NV50_DMA_PUSH_MAX_LENGTH (as calculated in 3 above) into multiple calls to nv50_dma_push() when using nv50 IB mode.

Although I understand the cause, the patch raises a few other unknowns-to-me I'd like some commentary on:

1) The nv50 DMA code clearly believed push buffers could not (or should not) exceed NV50_DMA_PUSH_MAX_LENGTH, when my analysis logging shows that they regularly exceed this maximum by an order of magnitude or more. Was the assumption originally true? If so, when/why did it change? Is the more correct fix to enforce individual pushbuf size limits?

2) Are any of the documented bit flags that 'hug' the length field (NON_MAIN, NO_PREFETCH, etc) actually used/useful at the level of the GEM API? Right now, two are exposed by accident (because the code thinks the length is byte-granularity when the lower two bits are actually flags) and the third is broken out explicitly--- but it's actually only ever set by a clobber from push[i].length exceeding 0x7fffff.

Monty