On Thu, 2023-09-07 at 11:02 +0800, Li Xiao Keng wrote: > > > On 2023/9/6 21:31, Martin Wilck wrote: > > On Wed, 2023-09-06 at 16:51 +0800, Li Xiao Keng wrote: > > > > > > > > > On 2023/9/6 3:08, Martin Wilck wrote: > > > > On Wed, 2023-09-06 at 00:17 +0800, Coly Li wrote: > > > > > Hi Xiao Keng, > > > > > > > > > > Thanks for the updated version, I add my comments inline. > > > > > > > > > > On Tue, Sep 05, 2023 at 08:02:06PM +0800, Li Xiao Keng wrote: > > > > > > When we add a new disk to a raid, it may return -EBUSY. > > > > > > > > > > Where is above -EBUSY from? do you mean mdadm command returns > > > > > -EBUSY or it is returned by some specific function in mdadm > > > > > source code. > > > > > > > > > > > Because the new disk is added to the raid by "mdadm -- > > > incremental", > > > the "mdadm --add" will return the err. > > > > > > > > > > > > > > > The main process of --add: > > > > > > 1. dev_open > > > > > > 2. store_super1(st, di->fd) in write_init_super1 > > > > > > 3. fsync(di->fd) in write_init_super1 > > > > > > 4. close(di->fd) > > > > > > 5. ioctl(ADD_NEW_DISK) > > > > > > > > > > > > However, there will be some udev(change) event after step4. > > > > > > Then > > > > > > "/usr/sbin/mdadm --incremental ..." will be run, and the > > > > > > new > > > > > > disk > > > > > > will be add to md device. After that, ioctl will return - > > > > > > EBUSY. > > > > > > > > > > > > > > > > Dose returning -EBUSY hurt anything? Or only returns -EBUSY > > > > > and > > > > > other stuffs all work as expected? > > > > > > > > IIUC, it does not. The manual --add command will fail. Li Xiao > > > > Keng > > > > has > > > > described the problem in earlier emails. > > > Yes! The disk is add to the raid, but the manual --add command > > > will > > > fail. > > > We will decide the next action based on the return value. > > > > > > > > > > > > > Here we add map_lock before write_init_super in "mdadm -- > > > > > > add" > > > > > > to fix this race. > > > > > > > > > > > > > > > > I am not familiar this part of code, but I see ignoring the > > > > > failure > > > > > from map_lock() in Assemble() is on purpose by Neil. > > > > > Therefore I > > > > > just guess simply return from Assemble when map_lock() fails > > > > > might > > > > > not be what you wanted. > > > > > > > > > > > > > > > > Signed-off-by: Li Xiao Keng <lixiaokeng@xxxxxxxxxx> > > > > > > Signed-off-by: Guanqin Miao <miaoguanqin@xxxxxxxxxx> > > > > > > --- > > > > > > Assemble.c | 5 ++++- > > > > > > Manage.c | 25 +++++++++++++++++-------- > > > > > > 2 files changed, 21 insertions(+), 9 deletions(-) > > > > > > > > > > > > diff --git a/Assemble.c b/Assemble.c > > > > > > index 49804941..086890ed 100644 > > > > > > --- a/Assemble.c > > > > > > +++ b/Assemble.c > > > > > > @@ -1479,8 +1479,11 @@ try_again: > > > > > > * to our list. We flag them so that we don't try > > > > > > to > > > > > > re- > > > > > > add, > > > > > > * but can remove if they turn out to not be > > > > > > wanted. > > > > > > */ > > > > > > - if (map_lock(&map)) > > > > > > + if (map_lock(&map)) { > > > > > > pr_err("failed to get exclusive lock on > > > > > > mapfile > > > > > > - > > > > > > continue anyway...\n"); > > > > > > + return 1; > > > > > > > > > > Especially when the error message noticed "continue anyway" > > > > > but a > > > > > return 1 > > > > > followed, the behavior might be still confusing. > > > > > > > > Now as you're saying it, I recall I had the same comment last > > > > time > > > > ;-) > > > > > > > I'm very sorry for this stupid mistake. I I find I send v1 patch > > > but > > > not > > > v2. I will send patch v2 to instead of it. > > > > > > - if (map_lock(&map)) > > > - pr_err("failed to get exclusive lock on mapfile - > > > continue anyway...\n"); > > > + if (map_lock(&map)) { > > > + pr_err("failed to get exclusive lock on mapfile > > > when > > > assemble raid.\n"); > > > + return 1; > > > + } > > > > > > > I might add that "return 1" is dangerous, as it pretends that > > > > Manage_add() was successful and actually added a device, which > > > > is > > > > not > > > > the case. In the special case that Li Xiao Keng wants to fix, > > > > it's > > > > true > > > > (sort of) because the asynchronous "mdadm -I" will have added > > > > the > > > > device already. But there could be other races where > > > > Assemble_map() > > > > can't obtain the lock and still the device will not be added > > > > later. > > > > > > > > > > Do I missunstandings > > > "AFAICS it would only help if the code snipped above did not only > > > pr_err() but exit if it can't get an exclusive lock." ? > > > > > > > > Anyway, map_lock is a blocking function. If it can't get the > > > lock, it > > blocks. > > > If map_lock() return error, Assemble() return 1. When -add unlock > > > it, > > > Assemble() will go ahead but not return at map_lock(). > > > > Maybe *I* was misunderstanding. I thought map_lock() returned error > > if > > the lock was held by the other process. What exactly does an error > > return from map_lock() mean? If it does not mean "lock held by > > another > > process", why does your patch solve the race issue? > > > > The -add locks map_lock() before udev(change) event happen, and > unlocks it > until ioctl(ADD_NEW_DISK) finishing to solve the race issue. This > makes > manual -add return success and -incremental (triggered by uevent) > will > fail, which is same as the previous successful execution of the -add > command. OK. Then maybe you should just leave the code for handling the map_lock() error condition as-is. It tends to confuse readers. Martin