Re: [PATCH 1/5] md: fix a lock order reversal in md_alloc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 03 Sep 2021, Guoqing Jiang wrote:
> 
> On 9/1/21 7:38 PM, Christoph Hellwig wrote:
> > Commit b0140891a8cea3 ("md: Fix race when creating a new md device.")
> > not only moved assigning mddev->gendisk before calling add_disk, which
> > fixes the races described in the commit log, but also added a
> > mddev->open_mutex critical section over add_disk and creation of the
> > md kobj.  Adding a kobject after add_disk is racy vs deleting the gendisk
> > right after adding it, but md already prevents against that by holding
> > a mddev->active reference.
> 
> Assuming you mean md_open calls mddev_find -> mddev_get
>   -> atomic_inc(&mddev->active), but the path had already existed
> before b0140891a8c,  and md_alloc also called mddev_find at
> that time, not sure how it prevents the race though I probably missed
> something. Cc Neil.

It is a long time since I've looked much at this code, but I *think*
that the reason calling kobject_add() after add_disk() is safe is
because that gendisk is *only* deleted by md_free() which happens when
that kobject is 'put' - in mddev_delayed_delete().  That only gets
called when mddev->active is zero (from mddev_put().
So as Christoph says, the fact that md_alloc() hold a count on
mddev->active (until it calls mddeV_put() at the very end) makes this
safe.

As for the ->open_mutex locking - I have a vague memory that it was to
protect something that might happen *after* md_open() from running
before md_alloc() had completed.
add_disk() itself adds sufficient exclusion via disk->open_mutex, so it
could only be the kobject/sysfs manipulations.

All I can think if is that add_disk() would send an event to udev and udev
might do something before the 'md' directory had been created.
But the udev rules seem safe against that.  In particular they check if
md/array_state exists before checking if it is clear or inactive.

I did struggle with some races like that, but if that was the problem,
I'm surprised that I didn't explain it in the commit message.

So in short: I cannot see why I added that locking, and I cannot see
that removing it would break anything.
  
Reviewed-by: NeilBrown <neilb@xxxxxxx>

NeilBrown


> 
> > On the other hand taking this lock added a lock order reversal with what
> > is not disk->open_mutex (used to be bdev->bd_mutex when the commit was
> > added) for partition devices, which need that lock for the internal open
> > for the partition scan, and a recent commit also takes it for
> > non-partitioned devices, leading to further lockdep splatter.
> >
> > Fixes: b0140891a8ce ("md: Fix race when creating a new md device.")
> > Fixes: d62633873590 ("block: support delayed holder registration")
> 
> IIUC, the issue appeared after d6263387359 (which was for dm issue),
> perhaps stable maintainer should not apply this to any stable kernel
> if it only includes b0140891a8ce.
> 
> Thanks,
> Guoqing
> 
> 




[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux