Re: [PATCH v2] md: don't create mddev in md_open

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 2, 2021 at 1:58 AM heming.zhao@xxxxxxxx
<heming.zhao@xxxxxxxx> wrote:
>
> On 4/2/21 1:58 PM, Song Liu wrote:
> > On Thu, Apr 1, 2021 at 6:03 AM heming.zhao@xxxxxxxx
> > <heming.zhao@xxxxxxxx> wrote:
> >>
> >> On 4/1/21 2:17 PM, Christoph Hellwig wrote:
> >>> On Thu, Apr 01, 2021 at 09:34:56AM +0800, Zhao Heming wrote:
> >>>> commit d3374825ce57 ("md: make devices disappear when they are no longer
> >>>> needed.") introduced protection between mddev creating & removing. The
> >>>> md_open shouldn't create mddev when all_mddevs list doesn't contain
> >>>> mddev. With currently code logic, there will be very easy to trigger
> >>>> soft lockup in non-preempt env.
> >>>
> >>> As mention below, please don't make this even more of a mess than it
> >>> needs to.  We can just pick the two patches doing this from the series
> >>> I sent:
> >>>
> >>
> >> Hi,
> >>
> >> I already got your meaning on previously email.
> >> I sent v2 patch for Song's review comment. My patch is bugfix, it may need
> >> to back port into branch maintenance.
> >>
> >> Your attachment patch files is partly my patch.
> >> The left part is in md_open (response [PATCH 01/15] md: remove the code to flush
> >> an old instance in md_open)
> >> I still think you directly use bdev->bd_disk->private_data as mddev pointer is not safe.
> >>
> >
> > Hi Christoph and Heming,
> >
> > Trying to understand the whole picture here. Please let me know if I
> > misunderstood anything.
> >
> > IIUC, the primary goal of Christoph's set is to get rid of the
> > ERESTARTSYS hack from md,
> > and eventually move bd_mutext. 02/15 to 07/15 of this set cleans up
> > code in md.c, and they
> > are not very important for the rest of the set (is this correct?).
> >
> > Heming, you mentioned "the solution may simply return -EBUSY (instead
> > of -ENODEV) to
> > fail the open path". Could you please show the code? Maybe that would
> > be enough to unblock
> > the second half of Christoph's set (08/15 to 15/15)?
> >
>
> I already sent the whole picture of md_open (email data: 2021-4-1,
> subject: Re: [PATCH] md: don't create mddev in md_open).
> My mail may fail to be delivered (at least, I got the "can't be delivered"
> info on my last mail for linux-raid & guoqing's address). I put the related
> email on attachment, anyone can read it again.
>
> For the convenience of discussion, I also pasted **patched** md_open below.
> (I added more comments than enclosed email)
>
> ```
> static int md_open(struct block_device *bdev, fmode_t mode)
> {
>     /* section 1 */
>     struct mddev *mddev = mddev_find(bdev->bd_dev); //hm: the new style, only do searching job
>     int err;
>
>     if (!mddev) //hm: this will cover freeing path 2.2 (refer enclosed file)
>         return -ENODEV;
>
>     if (mddev->gendisk != bdev->bd_disk) { //hm: for freeing path 2.1 (refer enclosed file)
>         /* we are racing with mddev_put which is discarding this
>          * bd_disk.
>          */
>         mddev_put(mddev);
>         /* Wait until bdev->bd_disk is definitely gone */
>         if (work_pending(&mddev->del_work))
>             flush_workqueue(md_misc_wq);
>         return -EBUSY; //hm: fail this path. it also makes __blkdev_get return fail, userspace can try later.
>                        //
>                        // the legacy code flow:
>                        //   return -ERESTARTSYS here, later __blkdev_get reentry.
>                        //   it will trigger first 'if' in this functioin, then
>                        //   return -ENODEV.
>     }
>
>     /* section 2 */
>     /* hm: below same as Christoph's [PATCH 01/15] */
>     err = mutex_lock_interruptible(&mddev->open_mutex);
>     if (err)
>         return err;
>
>     if (test_bit(MD_CLOSING, &mddev->flags)) {
>         mutex_unlock(&mddev->open_mutex);
>         return -ENODEV;
>     }
>
>     mddev_get(mddev);
>     atomic_inc(&mddev->openers);
>     mutex_unlock(&mddev->open_mutex);
>
>     bdev_check_media_change(bdev);
>     return 0;
> }
> ```
>
> I wrote again:
> > Christoph's patch [01/15] totally dropped <section 1>, and use bdev->bd_disk->private_data
> > to get mddev pointer, it's not safe.
>
> And with above **patched** md_open, Christoph's patches 02-07 can be work happily.
> He only needs to adjust/modify 01 patch.
> The md layer behavior will change from return -ENODEV to -EBUSY in some racing scenario.

Thanks for the explanation.

Could you please send official patch of modified 01/15 and your fix
(either in a set or merge into
one patch)? This patch(set) would go to stable. Then, we can apply
Christoph's 02 - 15 on top of
it.

Thanks,
Song



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux