On Sat, Apr 13, 2024 at 2:11 AM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sat, Apr 13, 2024 at 12:26:07AM +0800, Sam Sun wrote:
> > On Fri, Apr 12, 2024 at 10:40 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > > I suspect the usb_hub_to_struct_hub() call is racing with the
> > > spinlock-protected region in hub_disconnect() (in hub.c).
> > >
> > > > If there is any other thing I could help, please let me know.
> > >
> > > Try the patch below. It should eliminate that race, which hopefully
> > > will fix the problem.
>
> > I applied this patch and tried to execute several times, no more
> > kernel core dump in my environment. I think this bug is fixed by the
> > patch. But I do have one more question about it. Since it is a data
> > race bug, it has reproducibility issues originally. How can I confirm
> > if a racy bug is fixed by test? This kind of bug might still have a
> > race window but is harder to trigger. Just curious, not for this
> > patch. I think this patch eliminates the racy window.
>
> If you don't know what is racing, then testing cannot prove that a race
> is eliminated. However, if you do know where a race occurs then it's
> easy to see how mutual exclusion can prevent the race from happening.
>
> In this case the bug might have had a different cause, something other
> than a race between usb_hub_to_struct_hub() and hub_disconnect(). If
> that's so then testing this patch would not be a definite proof that the
> bug is gone. But if that race _is_ the cause of the bug then this patch
> will fix it -- you can see that just by reading the code with no need
> for testing.
>
> Besides, the patch is needed in any case because that race certainly
> _can_ occur. And maybe not only on this pathway.
>

Thanks for explaining! I will check the related code next time.

> May I add your "Reported-and-tested-by:" to the patch?

Sure, thanks for your help!

Best Regards,
Yue