Re: [PATCH v2] Fix race of "mdadm --add" and "mdadm --incremental"

Martin Wilck <mwilck@xxxxxxxx> · Wed, 06 Sep 2023 15:31:42 +0200

On Wed, 2023-09-06 at 16:51 +0800, Li Xiao Keng wrote:
> 
> 
> On 2023/9/6 3:08, Martin Wilck wrote:
> > On Wed, 2023-09-06 at 00:17 +0800, Coly Li wrote:
> > > Hi Xiao Keng,
> > > 
> > > Thanks for the updated version, I add my comments inline.
> > > 
> > > On Tue, Sep 05, 2023 at 08:02:06PM +0800, Li Xiao Keng wrote:
> > > > When we add a new disk to a raid, it may return -EBUSY.
> > > 
> > > Where is above -EBUSY from? do you mean mdadm command returns
> > > -EBUSY or it is returned by some specific function in mdadm
> > > source code.
> > > 
> 
> Because the new disk is added to the raid by "mdadm --incremental",
> the "mdadm --add" will return the err.
> 
> > > > 
> > > > The main process of --add:
> > > > 1. dev_open
> > > > 2. store_super1(st, di->fd) in write_init_super1
> > > > 3. fsync(di->fd) in write_init_super1
> > > > 4. close(di->fd)
> > > > 5. ioctl(ADD_NEW_DISK)
> > > > 
> > > > However, there will be some udev(change) event after step4.
> > > > Then
> > > > "/usr/sbin/mdadm --incremental ..." will be run, and the new
> > > > disk
> > > > will be add to md device. After that, ioctl will return -EBUSY.
> > > > 
> > > 
> > > Dose returning -EBUSY hurt anything? Or only returns -EBUSY and
> > > other stuffs all work as expected?
> > 
> > IIUC, it does not. The manual --add command will fail. Li Xiao Keng
> > has
> > described the problem in earlier emails.
> Yes！ The disk is add to the raid, but the manual --add command will
> fail.
> We will decide the next action based on the return value.
> 
> >  
> > > > Here we add map_lock before write_init_super in "mdadm --add"
> > > > to fix this race.
> > > > 
> > > 
> > > I am not familiar this part of code, but I see ignoring the
> > > failure
> > > from map_lock() in Assemble() is on purpose by Neil. Therefore I
> > > just guess simply return from Assemble when map_lock() fails
> > > might
> > > not be what you wanted.
> > > 
> > > 
> > > > Signed-off-by: Li Xiao Keng <lixiaokeng@xxxxxxxxxx>
> > > > Signed-off-by: Guanqin Miao <miaoguanqin@xxxxxxxxxx>
> > > > ---
> > > >  Assemble.c |  5 ++++-
> > > >  Manage.c   | 25 +++++++++++++++++--------
> > > >  2 files changed, 21 insertions(+), 9 deletions(-)
> > > > 
> > > > diff --git a/Assemble.c b/Assemble.c
> > > > index 49804941..086890ed 100644
> > > > --- a/Assemble.c
> > > > +++ b/Assemble.c
> > > > @@ -1479,8 +1479,11 @@ try_again:
> > > >          * to our list.  We flag them so that we don't try to
> > > > re-
> > > > add,
> > > >          * but can remove if they turn out to not be wanted.
> > > >          */
> > > > -       if (map_lock(&map))
> > > > +       if (map_lock(&map)) {
> > > >                 pr_err("failed to get exclusive lock on mapfile
> > > > -
> > > > continue anyway...\n");
> > > > +               return 1;
> > > 
> > > Especially when the error message noticed "continue anyway" but a
> > > return 1
> > > followed, the behavior might be still confusing.
> > 
> > Now as you're saying it, I recall I had the same comment last time
> > ;-)
> > 
> I'm very sorry for this stupid mistake. I I find I send v1 patch but
> not
> v2. I will send patch v2 to instead of it.
> 
> -       if (map_lock(&map))
> -               pr_err("failed to get exclusive lock on mapfile -
> continue anyway...\n");
> +       if (map_lock(&map)) {
> +               pr_err("failed to get exclusive lock on mapfile when
> assemble raid.\n");
> +               return 1;
> +       }
> 
> > I might add that "return 1" is dangerous, as it pretends that
> > Manage_add() was successful and actually added a device, which is
> > not
> > the case. In the special case that Li Xiao Keng wants to fix, it's
> > true
> > (sort of) because the asynchronous "mdadm -I" will have added the
> > device already. But there could be other races where Assemble_map()
> > can't obtain the lock and still the device will not be added later.
> > 
> 
> Do I missunstandings
> "AFAICS it would only help if the code snipped above did not only
> pr_err() but exit if it can't get an exclusive lock." ?
> 

> Anyway, map_lock is a blocking function. If it can't get the lock, it
blocks.
> If map_lock() return error, Assemble() return 1. When -add unlock it,
> Assemble() will go ahead but not return at map_lock().

Maybe *I* was misunderstanding. I thought map_lock() returned error if
the lock was held by the other process. What exactly does an error
return from map_lock() mean? If it does not mean "lock held by another
process", why does your patch solve the race issue?

Martin