On 12/05/2022 00:15, Dixit, Ashutosh wrote:
On Tue, 10 May 2022 03:41:57 -0700, Andrzej Hajda wrote:
On 10.05.2022 11:48, Tvrtko Ursulin wrote:
On 10/05/2022 10:39, Andrzej Hajda wrote:
On 10.05.2022 10:18, Tvrtko Ursulin wrote:
Was there closure/agreement on the matter of whether or not there is
a potential race between "kfree(gt)" and sysfs access (last put from
sysfs that is)? I've noticed Andrzej and Ashutosh were discussing it
but did not read all the details.
Not really :)
IMO the docs are against this practice; Ashutosh showed examples of this
practice in the code and, according to his analysis, it is safe.
I gave up looking for contradictions :) Either it is OK (kobject is
not a fully shared object, and the docs are obsolete and need an update),
or the patch is wrong.
Anyway, in the end I tend to accept this solution; I failed to prove it is
wrong :)
Like a question of whether hotunplug can be triggered while userspace
is sitting in a sysfs hook? Final kfree then has to be delayed until
userspace exits.
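
For illustration only, the "delay the final kfree" idea can be sketched in
plain userspace C. This is not i915 or kobject code: `struct obj`,
`obj_create`, `obj_get`, `obj_put` and `obj_release` are hypothetical names,
and the counter is a plain int, so the sketch is single-threaded and models
only the ordering, not the locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object (NOT the kernel kobject API). */
struct obj {
    int refcount;
    int *released;      /* set to 1 when the release callback has run */
};

static struct obj *obj_create(int *released)
{
    struct obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    o->refcount = 1;    /* reference held by the creator (the "driver") */
    o->released = released;
    *released = 0;
    return o;
}

/* Take an extra reference, e.g. while a reader sits in a sysfs hook. */
static void obj_get(struct obj *o)
{
    o->refcount++;
}

/* Release callback: the only place where the memory is freed. */
static void obj_release(struct obj *o)
{
    *o->released = 1;
    free(o);
}

/* Drop one reference; returns 1 if this was the last one and the object was freed. */
static int obj_put(struct obj *o)
{
    if (--o->refcount == 0) {
        obj_release(o);
        return 1;
    }
    return 0;
}
```

In this model, if a reader takes a reference with `obj_get()` before the
driver calls `obj_put()`, the driver's put returns 0 and the free only
happens when the reader drops its own reference.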
Btw where is the "kfree(gt)" for the tiles on the PCI remove path? I
can't find it.. Do we have a leak?
intel_gt_tile_cleanup ?
Called from intel_gt_release_all, whose only caller is the failure path
of i915_driver_probe. Feels like something is missing?
This is final proof this patch is safe - no kfree, no UAF :)
Apparently it is broken in the internal branch as well.
Should I take care of it?
See Daniele's comment here:
https://patchwork.freedesktop.org/patch/478856/?series=101551&rev=1
Yeah we found that same leak yesterday, or the day before in this thread.
We clean up the gt sysfs during PCI device remove (i915_driver_remove ->
i915_driver_unregister -> intel_gt_driver_unregister ->
intel_gt_sysfs_unregister (added in this patch)). But from Daniele's mail
it appears that "kfree(gt)" can only be done from i915_driver_release().
So as long as i915_driver_release() always happens after
i915_driver_remove() (which seems to be the case though I couldn't figure
out why (i.e. who is putting the final reference of the drm device)) there
is no UAF and no race. Thanks!
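
A rough userspace model of that ordering, as a sketch under assumptions
rather than the actual driver flow: `dev_model`, `gt_model`, `dev_probe`,
`dev_remove`, `dev_release` and `sysfs_show` are made-up names, the extra
reference stands in for whoever still holds the drm device (for example an
open file), and there is no locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical models; none of these names exist in i915. */
struct gt_model {
    int value;
};

struct dev_model {
    int refs;               /* models the drm device reference count */
    int sysfs_registered;   /* cleared at "remove" time */
    struct gt_model *gt;
    int *gt_freed;          /* set to 1 once gt has been freed */
};

static struct dev_model *dev_probe(int *gt_freed)
{
    struct dev_model *d = malloc(sizeof(*d));

    if (!d)
        return NULL;
    d->gt = malloc(sizeof(*d->gt));
    if (!d->gt) {
        free(d);
        return NULL;
    }
    d->gt->value = 42;
    d->refs = 1;            /* reference held by the bound driver */
    d->sysfs_registered = 1;
    d->gt_freed = gt_freed;
    *gt_freed = 0;
    return d;
}

/* "release": runs on the last put and is the only place gt is freed. */
static void dev_release(struct dev_model *d)
{
    free(d->gt);
    *d->gt_freed = 1;
    free(d);
}

static void dev_get(struct dev_model *d)
{
    d->refs++;
}

static void dev_put(struct dev_model *d)
{
    if (--d->refs == 0)
        dev_release(d);
}

/* Models a sysfs read: fails with -1 once the files are unregistered. */
static int sysfs_show(struct dev_model *d, int *out)
{
    if (!d->sysfs_registered)
        return -1;
    *out = d->gt->value;
    return 0;
}

/* "remove": unregister sysfs first, then drop the driver's reference. */
static void dev_remove(struct dev_model *d)
{
    d->sysfs_registered = 0;
    dev_put(d);
}
```

With one extra reference still held, `dev_remove()` leaves `gt` allocated and
later `sysfs_show()` calls fail cleanly; `gt` is freed only when that last
reference is dropped. The model says nothing about a reader that is already
inside the hook when remove runs.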
Not worried by the unknown? I had a quick look whether core_hotunplug
tests for sysfs interactions but couldn't spot it. What I had in mind is
userspace stuck in a sysfs hook (say read into a userfaultfd buffer)
with device hotunplug in parallel. Maybe it is all handled already, not
claiming that it isn't.
Regards,
Tvrtko