Re: [PATCH v4 11/14] drm/amdgpu: Guard against write accesses after device removal

Andrey Grodzovsky <Andrey.Grodzovsky@xxxxxxx> · Sun, 7 Feb 2021 16:28:17 -0500

On 2/5/21 5:10 PM, Daniel Vetter wrote:
On Fri, Feb 5, 2021 at 5:22 PM Andrey Grodzovsky
<Andrey.Grodzovsky@xxxxxxx> wrote:

Daniel, ping. Also, please refer to the other thread with Bjorn from pci-dev
on the same topic I added you to.

Summarizing my take over there for here plus maybe some more
clarification. There's two problems:

- You must guarantee that after the ->remove callback of your driver
is finished, there's no more mmio or any other hw access. A
combination of stopping stuff and drm_dev_enter/exit can help with
that. This prevents the use-after-free issue.

- For the actual hotunplug time, i.e. anything that can run while your
driver is used up to the point where ->remove callback has finished
stopp hw access you must guarantee that code doesn't blow up when it
gets bogus reads (in the form of 0xff values). drm_dev_enter/exit
can't help you with that. Plus you should make sure that we're not
spending forever waiting for a big pile of mmio access all to time out
because you never bail out - some coarse-grained drm_dev_enter/exit
might help here.

Plus finally the userspace access problem: You must guarantee that
after ->remove has finished that none of the uapi or cross-driver
access points (driver ioctl, dma-buf, dma-fence, anything else that
hangs around) can reach the data structures/memory mappings/whatever
which have been released as part of your ->remove callback.
drm_dev_enter/exit is again the tool of choice here.

So you have to use drm_dev_enter/exit for some of the problems we face
on hotunplug, but it's not the tool that can handle the actual hw
hotunplug race conditions for you.

Unfortunately the hw hotunplug race condition is an utter pain to
test, since essentially you need to validate your driver against
spurious 0xff reads at any moment. And I don't even have a clever idea
to simulate this, e.g. by forcefully replacing the iobar mapping: What
we'd need is a mapping that allows reads (so we can fill a page with
0xff and use that everywhere), but instead of rejecting writes, allows
them, but drops them (so that the 0xff stays intact). Maybe we could
simulate this with some kernel debug tricks (kinda like mmiotrace)
with a read-only mapping and dropping every write every time we fault.

Clarification - as far as I know there are no page fault handlers for kernel
mappings. And we are talking about kernel mappings here, right ?  If there were 
I could solve all those issues the same as I do for user mappings, by
invalidating all existing mappings in the kernel (both kmaps and ioreamps)and 
insert dummy zero or ~0 filled page instead.
Also, I assume forcefully remapping the IO BAR to ~0 filled page would involve
ioremap API and it's not something that I think can be easily done according to
am answer i got to a related topic a few weeks ago 
https://www.spinics.net/lists/linux-pci/msg103396.html (that was the only reply 
i got)

But ugh ...

Otoh validating an entire driver like amdgpu without such a trick
against 0xff reads is practically impossible. So maybe you need to add
this as one of the tasks here?

Or I could just for validation purposes return ~0 from all reg reads in the code
and ignore writes if drm_dev_unplugged, this could already easily validate a big 
portion of the code flow under such scenario.

Andrey

-Daniel

Andrey

On 1/29/21 2:25 PM, Christian König wrote:
Am 29.01.21 um 18:35 schrieb Andrey Grodzovsky:

On 1/29/21 10:16 AM, Christian König wrote:
Am 28.01.21 um 18:23 schrieb Andrey Grodzovsky:

On 1/19/21 1:59 PM, Christian König wrote:
Am 19.01.21 um 19:22 schrieb Andrey Grodzovsky:

On 1/19/21 1:05 PM, Daniel Vetter wrote:
[SNIP]
So say writing in a loop to some harmless scratch register for many times
both for plugged
and unplugged case and measure total time delta ?

I think we should at least measure the following:

1. Writing X times to a scratch reg without your patch.
2. Writing X times to a scratch reg with your patch.
3. Writing X times to a scratch reg with the hardware physically disconnected.

I suggest to repeat that once for Polaris (or older) and once for Vega or
Navi.

The SRBM on Polaris is meant to introduce some delay in each access, so it
might react differently then the newer hardware.

Christian.

See attached results and the testing code. Ran on Polaris (gfx8) and
Vega10(gfx9)

In summary, over 1 million WWREG32 in loop with and without this patch you
get around 10ms of accumulated overhead ( so 0.00001 millisecond penalty for
each WWREG32) for using drm_dev_enter check when writing registers.

P.S Bullet 3 I cannot test as I need eGPU and currently I don't have one.

Well if I'm not completely mistaken that are 100ms of accumulated overhead.
So around 100ns per write. And even bigger problem is that this is a ~67%
increase.

My bad, and 67% from what ? How u calculate ?

My bad, (308501-209689)/209689=47% increase.

I'm not sure how many write we do during normal operation, but that sounds
like a bit much. Ideas?

Well, u suggested to move the drm_dev_enter way up but as i see it the problem
with this is that it increase the chance of race where the
device is extracted after we check for drm_dev_enter (there is also such
chance even when it's placed inside WWREG but it's lower).
Earlier I propsed that instead of doing all those guards scattered all over
the code simply delay release of system memory pages and unreserve of
MMIO ranges to until after the device itself is gone after last drm device
reference is dropped. But Daniel opposes delaying MMIO ranges unreserve to after
PCI remove code because according to him it will upset the PCI subsytem.

Yeah, that's most likely true as well.

Maybe Daniel has another idea when he's back from vacation.

Christian.

Andrey

Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=04%7C01%7CAndrey.Grodzovsky%40amd.com%7C7810d8d6f03443ce2e0408d8ca22ea99%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637481598615581693%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=zTV6FTpL3titmMTVEPxxVT8e5lTKVsLViwZudEsNn%2Bw%3D&amp;reserved=0

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel