On 13/05/2022 06:05, Dixit, Ashutosh wrote:
On Thu, 12 May 2022 00:48:08 -0700, Tvrtko Ursulin wrote:
Hi Tvrtko,
On 12/05/2022 00:15, Dixit, Ashutosh wrote:
On Tue, 10 May 2022 03:41:57 -0700, Andrzej Hajda wrote:
On 10.05.2022 11:48, Tvrtko Ursulin wrote:
On 10/05/2022 10:39, Andrzej Hajda wrote:
On 10.05.2022 10:18, Tvrtko Ursulin wrote:
Was there closure/agreement on the matter of whether or not there is
a potential race between "kfree(gt)" and sysfs access (last put from
sysfs that is)? I've noticed Andrzej and Ashutosh were discussing it
but did not read all the details.
Not really :)
IMO docs are against this practice, Ashutosh shows examples of this
practice in code and according to his analysis it is safe.
I gave up looking for contradictions :) Either it is OK, kobject is
not fully shared object, docs are obsolete and needs update, either
the patch is wrong.
Anyway finally I tend to accept this solution, I failed to prove it is
wrong :)
Like a question of whether hotunplug can be triggered while userspace
is sitting in a sysfs hook? Final kfree then has to be delayed until
userspace exists.
Btw where is the "kfree(gt)" for the tiles on the PCI remove path? I
can't find it.. Do we have a leak?
intel_gt_tile_cleanup ?
Called from intel_gt_release_all, whose only caller is the failure path
of i915_driver_probe. Feels like something is missing?
This is final proof this patch is safe - no kfree, no UAF :)
Apparently it is broken in internal branch as well.
Should I take care of it?
See Daniele's comment here:
https://patchwork.freedesktop.org/patch/478856/?series=101551&rev=1
Yeah we found that same leak yesterday, or the day before in this thread.
We clean up the gt sysfs during PCI device remove (i915_driver_remove ->
i915_driver_unregister -> intel_gt_driver_unregister ->
intel_gt_sysfs_unregister (added in this patch)). But from Daniele's mail
it appears that "kfree(gt)" can only be done from i915_driver_release().
So as long as i915_driver_release() always happens after
i915_driver_remove() (which seems to be the case though I couldn't figure
out why (i.e. who is putting the final reference of the drm device)) there
is no UAF and no race. Thanks!
No worried by the unknown?
Well if release() happens before or during remove() then (as is clear from
Daniele's mail) we have a much bigger problem than sysfs on our hands and
will see UAF crashes during device remove/unbind. But as far as we know no
such crashes have been reported.
I had a quick look whether core_hotunplug tests for sysfs interactions
but couldn't spot it. What I had in mind is userspace stuck in a sysfs
hook (say read into a userfaultfd buffer) with device hotunplug in
parallel. Maybe it is all handled already, not claiming that it isn't.
This is the 20 year old issue mentioned by Andrzej here:
https://lwn.net/Articles/36850/
So I thought I'll try this out today and see what actually happens to
settle this. And you will see it makes perfect sense. So this is what I
did:
* Change IGT to add a 20 second sleep after opening a sysfs file
* In that 20 second period, with an open fd, unbind the device using:
echo -n "0000:03:00.0" > /sys/bus/pci/drivers/i915/unbind
And also rmmod i915.
So this is what we see when we do this:
* As soon as the device is unbound, the complete i915 sysfs tree (under
/sys/card/drm/card0) is cleanly removed (even with the open fd in IGT).
* The fd open in IGT is now orphan/invalid, so when IGT resumes and tries
to use that fd IGT crashes.
* So no problem with device unbind but if IGT is still hanging around rmmod
fails (saying module is in use, most likely due to the still open drm fd)
but after IGT is completely killed rmmod is also fine.
So this confirms all this is correctly handled.
I was unsure what does "IGT crashes" exactly meant so I went to try it
out myself. It's -ENODEV from read(2) it receives so it all indeed seems
handled fine.
Although hotunplug seems generally very unhealthy, at least on
5.18.0-rc8 I tested on.. I'll send my subtest to the mailing list in
case it is consider useful to have it.
Regards,
Tvrtko
Separately, note that kobject_put's introduced in this patch are only
needed for freeing the memory allocated for the kobject's themselves (or
their containing struct's). kobject_put's don't play a role in cleaning up
the sysfs hierarchy itself (which will get cleaned up even without the
kobject_put's). Further, child kobject's take a reference on their parents
so child kobjects need a kobject_put before the parent kobject_put to free
the memory allocated for the parent (i.e. doing a kobject_put on the parent
will not automatically free all the child kobjects).
Thanks.
--
Ashutosh