Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ping

Andrey

On 2/10/21 5:01 PM, Andrey Grodzovsky wrote:

On 2/9/21 10:40 AM, Christian König wrote:
Am 09.02.21 um 15:30 schrieb Andrey Grodzovsky:
[SNIP]
Question - Why can't we just set those PTEs to point to system memory (another RO dummy page)
filled with 1s ?


Then writes are not discarded. E.g. the 1s would change to something else.

Christian.


I see but, what about marking the mappings as RO and discarding the write access page faults continuously until the device is finalized ? Regarding using an unused range behind the upper bridge as Daniel suggested, I wonder will this interfere with the upcoming feature to support BARs movement  during hot plug - https://www.spinics.net/lists/linux-pci/msg103195.html ?

In the picture in my head the address will never be used.

But it doesn't even needs to be an unused range of the root bridge. That one is usually stuffed full by the BIOS.

For my BAR resize work I looked at quite a bunch of NB documentation. At least for AMD CPUs we should always have an MMIO address which is never ever used. That makes this platform/CPU dependent, but the code is just minimal.

The really really nice thing about this approach is that you could unit test and audit the code for problems even without *real* hotplug hardware. E.g. we can swap the kernel PTEs and see how the whole power and display code reacts to that etc....

Christian.


Tried to play with hacking mmio tracer as Daniel suggested but just hanged the machine so...

Can you tell me how to dynamically obtain this kind of unused MMIO address ? I think given such address, writing to which is dropped and reading from return all 1s, I can then do something like bellow, if that what u meant -

for (address = adev->rmmio; address < adev->rmmio_size; adress += PAGE_SIZE) {

    old_pte = get_locked_pte(init_mm, address)
    dummy_pte = pfn_pte(fake_mmio_address, 0);
    set_pte(&old_pte, dummy_pte)

}

flush_tlb ???

P.S I hope to obtain thunderbolt eGPU adapter soon so even if this won't work I still will be able to test how the driver handles all 1s.

Andrey



Andrey





Andrey



Christian.

It's a nifty idea indeed otherwise ...
-Daniel

Regards,
Christian.


But ugh ...

Otoh validating an entire driver like amdgpu without such a trick
against 0xff reads is practically impossible. So maybe you need to add
this as one of the tasks here?
Or I could just for validation purposes return ~0 from all reg reads in the code and ignore writes if drm_dev_unplugged, this could already easily validate a big
portion of the code flow under such scenario.
Hm yeah if your really wrap them all, that should work too. Since
iommappings have __iomem pointer type, as long as amdgpu is sparse
warning free, should be doable to guarantee this.
Problem is that ~0 is not always a valid register value.

You would need to audit every register read that it doesn't use the returned
value blindly as index or similar. That is quite a bit of work.
Yeah that's the entire crux here :-/
-Daniel


_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux