On 22.12.21 at 21:53, Daniel Vetter wrote:
On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote:
[SNIP]

Still sounds funky. I think minimally we should have an ack from CRIU developers that this is officially the right way to solve this problem. I really don't want to have random one-off hacks that don't work across the board, for a problem where we (drm subsystem) really shouldn't be the only one with this problem.

Where "this problem" means that the mmap space is per file description, and not per underlying inode or real device or whatever. That part sounds like a CRIU problem, and I expect CRIU folks want a consistent solution across the board for this. Hence please grab an ack from them.
Unfortunately it's a KFD design problem. AMD used a single device node, then mmapped different objects from the same offset to different processes and expected that to work with the rest of the fs subsystem without churn.
So yes, this is indeed because the mmap space is per file descriptor for the use case here.
And thanks for pointing this out, this indeed makes the whole change extremely questionable.
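For contrast, here is a minimal sketch of what the rest of the fs subsystem normally assumes: the same offset of the same inode refers to the same object, no matter which open file it is mapped through. This is purely illustrative and assumes an ordinary temporary file, not a DRM or KFD device node; the helper name `same_offset_views` is made up for this example.

```python
import mmap
import os
import tempfile


def same_offset_views():
    """Map offset 0 of one regular file through two independent open() calls.

    Illustration only: a plain temp file stands in for "one inode"; nothing
    here touches a DRM/KFD device node.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Give the file one page of backing store so it can be mapped.
        os.ftruncate(fd, mmap.PAGESIZE)
        # Second, independent open of the same inode.
        fd2 = os.open(path, os.O_RDWR)
        try:
            # Both mappings are shared and start at offset 0.
            a = mmap.mmap(fd, mmap.PAGESIZE)
            b = mmap.mmap(fd2, mmap.PAGESIZE)
            a[:5] = b"hello"        # write through the first mapping
            seen = bytes(b[:5])     # read back through the second one
            a.close()
            b.close()
        finally:
            os.close(fd2)
        return seen
    finally:
        os.close(fd)
        os.unlink(path)


print(same_offset_views())
```

With a regular file, both mappings see the same bytes. The KFD design described above breaks that assumption: the same offset on the same device node resolves to different objects depending on which process (and open file) does the mmap.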
Regards, Christian.
Cheers, Daniel