On 2023/9/6 3:08, Martin Wilck wrote: > On Wed, 2023-09-06 at 00:17 +0800, Coly Li wrote: >> Hi Xiao Keng, >> >> Thanks for the updated version, I add my comments inline. >> >> On Tue, Sep 05, 2023 at 08:02:06PM +0800, Li Xiao Keng wrote: >>> When we add a new disk to a raid, it may return -EBUSY. >> >> Where is above -EBUSY from? do you mean mdadm command returns >> -EBUSY or it is returned by some specific function in mdadm >> source code. >> Because the new disk is added to the raid by "mdadm --incremental", the "mdadm --add" will return the err. >>> >>> The main process of --add: >>> 1. dev_open >>> 2. store_super1(st, di->fd) in write_init_super1 >>> 3. fsync(di->fd) in write_init_super1 >>> 4. close(di->fd) >>> 5. ioctl(ADD_NEW_DISK) >>> >>> However, there will be some udev(change) event after step4. Then >>> "/usr/sbin/mdadm --incremental ..." will be run, and the new disk >>> will be add to md device. After that, ioctl will return -EBUSY. >>> >> >> Dose returning -EBUSY hurt anything? Or only returns -EBUSY and >> other stuffs all work as expected? > > IIUC, it does not. The manual --add command will fail. Li Xiao Keng has > described the problem in earlier emails. Yes! The disk is add to the raid, but the manual --add command will fail. We will decide the next action based on the return value. > >>> Here we add map_lock before write_init_super in "mdadm --add" >>> to fix this race. >>> >> >> I am not familiar this part of code, but I see ignoring the failure >> from map_lock() in Assemble() is on purpose by Neil. Therefore I >> just guess simply return from Assemble when map_lock() fails might >> not be what you wanted. >> >> >>> Signed-off-by: Li Xiao Keng <lixiaokeng@xxxxxxxxxx> >>> Signed-off-by: Guanqin Miao <miaoguanqin@xxxxxxxxxx> >>> --- >>> Assemble.c | 5 ++++- >>> Manage.c | 25 +++++++++++++++++-------- >>> 2 files changed, 21 insertions(+), 9 deletions(-) >>> >>> diff --git a/Assemble.c b/Assemble.c >>> index 49804941..086890ed 100644 >>> --- a/Assemble.c >>> +++ b/Assemble.c >>> @@ -1479,8 +1479,11 @@ try_again: >>> * to our list. We flag them so that we don't try to re- >>> add, >>> * but can remove if they turn out to not be wanted. >>> */ >>> - if (map_lock(&map)) >>> + if (map_lock(&map)) { >>> pr_err("failed to get exclusive lock on mapfile - >>> continue anyway...\n"); >>> + return 1; >> >> Especially when the error message noticed "continue anyway" but a >> return 1 >> followed, the behavior might be still confusing. > > Now as you're saying it, I recall I had the same comment last time ;-) > I'm very sorry for this stupid mistake. I I find I send v1 patch but not v2. I will send patch v2 to instead of it. - if (map_lock(&map)) - pr_err("failed to get exclusive lock on mapfile - continue anyway...\n"); + if (map_lock(&map)) { + pr_err("failed to get exclusive lock on mapfile when assemble raid.\n"); + return 1; + } > I might add that "return 1" is dangerous, as it pretends that > Manage_add() was successful and actually added a device, which is not > the case. In the special case that Li Xiao Keng wants to fix, it's true > (sort of) because the asynchronous "mdadm -I" will have added the > device already. But there could be other races where Assemble_map() > can't obtain the lock and still the device will not be added later. > Do I missunstandings "AFAICS it would only help if the code snipped above did not only pr_err() but exit if it can't get an exclusive lock." ? Anyway, map_lock is a blocking function. If it can't get the lock, it blocks. If map_lock() return error, Assemble() return 1. When -add unlock it, Assemble() will go ahead but not return at map_lock(). Do you think it should remain as it is? Like: if (map_lock(&map)) pr_err("failed to get exclusive lock on mapfile - continue anyway...\n"); >>> >>> + if (map_lock(&map)) { >>> + pr_err("failed to get exclusive lock on mapfile >>> when add disk\n"); >>> + return -1; >>> + } >>> if (array->not_persistent==0) { >>> int dfd; >>> if (dv->disposition == 'j') >>> @@ -911,9 +916,9 @@ int Manage_add(int fd, int tfd, struct >>> mddev_dev *dv, >>> dfd = dev_open(dv->devname, O_RDWR |