On 2022-12-10 10:32, Mikhail Krylov wrote:
> On Wed, Nov 30, 2022 at 11:07:32AM -0500, Alex Deucher wrote:
>> On Wed, Nov 30, 2022 at 10:42 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:
>>>
>>> On 2022-11-30 14:28, Alex Deucher wrote:
>>>> On Wed, Nov 30, 2022 at 7:54 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:
>>>>>
>>>>> On 2022-11-29 17:11, Mikhail Krylov wrote:
>>>>>> On Tue, Nov 29, 2022 at 11:05:28AM -0500, Alex Deucher wrote:
>>>>>>> On Tue, Nov 29, 2022 at 10:59 AM Mikhail Krylov <sqarert@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> On Tue, Nov 29, 2022 at 09:44:19AM -0500, Alex Deucher wrote:
>>>>>>>>> On Mon, Nov 28, 2022 at 3:48 PM Mikhail Krylov <sqarert@xxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>> On Mon, Nov 28, 2022 at 09:50:50AM -0500, Alex Deucher wrote:
>>>>>>>>>>
>>>>>>>>>>>>> [excessive quoting removed]
>>>>>>>>>>
>>>>>>>>>>>> So, is there any progress on this issue? I do understand it's not a high
>>>>>>>>>>>> priority one, and today I've checked it on 6.0 kernel, and
>>>>>>>>>>>> unfortunately, it still persists...
>>>>>>>>>>>>
>>>>>>>>>>>> I'm considering writing a patch that will allow user to override
>>>>>>>>>>>> need_dma32/dma_bits setting with a module parameter. I'll have some time
>>>>>>>>>>>> after the New Year for that.
>>>>>>>>>>>>
>>>>>>>>>>>> Is it at all possible that such a patch will be merged into kernel?
>>>>>>>>>>>>
>>>>>>>>>>> On Mon, Nov 28, 2022 at 9:31 AM Mikhail Krylov <sqarert@xxxxxxxxx> wrote:
>>>>>>>>>>> Unless someone familiar with HIMEM can figure out what is going wrong
>>>>>>>>>>> we should just revert the patch.
>>>>>>>>>>>
>>>>>>>>>>> Alex
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Okay, I was suggesting that mostly because
>>>>>>>>>>
>>>>>>>>>> a) it works for me with dma_bits = 40 (I understand that's what it is
>>>>>>>>>> without the original patch applied);
>>>>>>>>>>
>>>>>>>>>> b) there's a hint of uncertainity on this line
>>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/radeon/radeon_device.c#n1359
>>>>>>>>>> saying that for AGP dma_bits = 32 is the safest option, so apparently there are
>>>>>>>>>> setups, unlike mine, where dma_bits = 32 is better than 40.
>>>>>>>>>>
>>>>>>>>>> But I'm in no position to argue, just wanted to make myself clear.
>>>>>>>>>> I'm okay with rebuilding the kernel for my machine until the original
>>>>>>>>>> patch is reverted or any other fix is applied.
>>>>>>>>>
>>>>>>>>> What GPU do you have and is it AGP? If it is AGP, does setting
>>>>>>>>> radeon.agpmode=-1 also fix it?
>>>>>>>>>
>>>>>>>>> Alex
>>>>>>>>
>>>>>>>> That is ATI Radeon X1950, and, unfortunately, radeon.agpmode=-1 doesn't
>>>>>>>> help, it just makes 3D acceleration in games such as OpenArena stop
>>>>>>>> working.
>>>>>>>
>>>>>>> Just to confirm, is the board AGP or PCIe?
>>>>>>>
>>>>>>> Alex
>>>>>>
>>>>>> It is AGP. That's an old machine.
>>>>>
>>>>> Can you check whether dma_addressing_limited() is actually returning the
>>>>> expected result at the point of radeon_ttm_init()? Disabling highmem is
>>>>> presumably just hiding whatever problem exists, by throwing away all
>>>>> >32-bit RAM such that use_dma32 doesn't matter.
>>>>
>>>> The device in question only supports a 32 bit DMA mask so
>>>> dma_addressing_limited() should return true. Bounce buffers are not
>>>> really usable on GPUs because they map so much memory. If
>>>> dma_addressing_limited() returns false, that would explain it.
>>>
>>> Right, it appears to be the only part of the offending commit that
>>> *could* reasonably make any difference, so I'm primarily wondering if
>>> dma_get_required_mask() somehow gets confused.
>>
>> Mikhail,
>>
>> Can you see that dma_addressing_limited() and dma_get_required_mask()
>> return in this case?
>>
>> Alex
>>
>>
>>>
>>> Thanks,
>>> Robin.
>
> Hello again, I was able to confirm by adding printk() to the functions
> and recompiling the kernel that dma_addressing_limited() returns
> *false* on the kernel with the bug.
>
> And dma_get_required_mask() returns 0x7fffffff, as I said before.

Yes, dma_addressing_limited() evaluates to "false" in your case, and
this is the correct answer according to the function's comment:

"Return %true if the devices DMA mask is too small to address all
memory in the system, else %false."

In this case the device's DMA mask is 0xFFFFFFFF and the mask for the
1.5 GiB memory is 0x7FFFFFFF, so the static inline returns "false".
(dma_direct_get_required_mask() returns this for your memory size.)

It would appear that dma_addressing_limited() isn't answering the
question which the last parameter to ttm_device_init(), "use GFP_DMA32",
wants answered. Perhaps we should use another method to make sure that
that parameter is set in the scenario in question.

Regards,
Luben
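
For illustration, the mask comparison described above can be sketched in
user-space C. This is a simplified model and not the kernel code: the
helper names (highest_bit, required_mask, addressing_limited) are
hypothetical, RAM is assumed to be contiguous starting at address 0, and
any bus-level DMA limit is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* 1-based index of the highest set bit, 0 if x == 0.
 * Hypothetical stand-in for the kernel's fls64(). */
static int highest_bit(uint64_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Smallest all-ones mask that covers the highest RAM address.
 * Assumes RAM is contiguous from 0 (an assumption of this sketch). */
static uint64_t required_mask(uint64_t highest_addr)
{
	int bits = highest_bit(highest_addr);

	if (bits == 0)
		return 0;
	if (bits >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << bits) - 1;
}

/* True if the device mask cannot reach all of RAM.
 * Simplified: no bus DMA limit is taken into account. */
static bool addressing_limited(uint64_t dev_mask, uint64_t highest_addr)
{
	return dev_mask < required_mask(highest_addr);
}
```

With 1.5 GiB of RAM the highest address is 0x5FFFFFFF, so the required
mask comes out as 0x7FFFFFFF. A device mask of 0xFFFFFFFF is not smaller
than that, so the check reports "not limited", which matches the printk()
results quoted above.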