RE: [PATCH] Create.c: Try few more times to stop array after failed creation

"Baldysiak, Pawel" <pawel.baldysiak@xxxxxxxxx> · Tue, 9 Sep 2014 10:04:38 +0000




> On: Monday, September 08, 2014 8:34 AM NeilBrown wrote:
> To: Baldysiak, Pawel
> Cc: linux-raid@xxxxxxxxxxxxxxx; Paszkiewicz, Artur
> Subject: Re: [PATCH] Create.c: Try few more times to stop array after failed
> creation
> 
> On Fri, 05 Sep 2014 16:26:13 +0200 Pawel Baldysiak
> <pawel.baldysiak@xxxxxxxxx> wrote:
> 
> > Sometimes after failure in creation (exp. due to duplicate devices in
> > create command) newly created empty md array will not be stopped due
> > to openers>1 (create_mddev will not manage to drop lock).
> > In this case ioctl() will return error - this needs to be checked and
> > if occurs - sending STOP_ARRAY should be repeat after delay to make
> > sure that mddev is stopped correctly.
> >
> > Signed-off-by: Pawel Baldysiak <pawel.baldysiak@xxxxxxxxx>
> > ---
> >  Create.c |    7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/Create.c b/Create.c
> > index 330c5b4..7c8e53e 100644
> > --- a/Create.c
> > +++ b/Create.c
> > @@ -904,7 +904,12 @@ int Create(struct supertype *st, char *mddev,
> >  				if (st->ss->add_to_super(st, &inf->disk,
> >  							 fd, dv->devname,
> >  							 dv->data_offset)) {
> > -					ioctl(mdfd, STOP_ARRAY, NULL);
> > +					int count = 5;
> > +					while (count &&
> > +					       (ioctl(mdfd, STOP_ARRAY, NULL)
> < 0)) {
> > +						usleep(100000);
> > +						count--;
> > +					}
> >  					goto abort_locked;
> >  				}
> >  				st->ss->getinfo_super(st, inf, NULL);
> 
> I don't like this.  I don't really like any of the other loops like this that are
> already in the code either.  I wonder if we can avoid the need for it.
> 
> Given that the array hasn't been started yet, no other process can actually be
> *using* the array.  And given that we have an O_EXCL open at this point, no
> other process can be trying to stop/start the array.
> So it should be safe to change the kernel to not fail in this situation.
> 
> If you apply this kernel patch:
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c index
> 1294238610df..1bf3fe1ecc79 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5362,7 +5362,7 @@ static int do_md_stop(struct mddev * mddev, int
> mode,
>  	mddev_lock_nointr(mddev);
> 
>  	mutex_lock(&mddev->open_mutex);
> -	if (atomic_read(&mddev->openers) > !!bdev ||
> +	if ((mddev->pers && atomic_read(&mddev->openers) > !!bdev) ||
>  	    mddev->sysfs_active ||
>  	    mddev->sync_thread ||
>  	    (bdev && !test_bit(MD_STILL_CLOSED, &mddev->flags))) {
> 
> 
> does that fir your problem?  Can you see any reason not to allow
> STOP_ARRAY to succeed in this situation?
> 
Hi Neil
Thanks for your answer.
To fix this problem same thing needs to be added in one more place in kernel:

@@ -6454,7 +6454,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 		 * and writes
 		 */
 		mutex_lock(&mddev->open_mutex);
-		if (atomic_read(&mddev->openers) > 1) {
+		if (mddev->pers && (atomic_read(&mddev->openers) > 1)) {
 			mutex_unlock(&mddev->open_mutex);
 			err = -EBUSY;
 			goto abort;

Should I prepare the patch, or you can do it?

Thanks,
Pawel Baldysiak

> Thanks,
> NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html