On Fri, Apr 2, 2021 at 1:58 AM heming.zhao@xxxxxxxx <heming.zhao@xxxxxxxx> wrote: > > On 4/2/21 1:58 PM, Song Liu wrote: > > On Thu, Apr 1, 2021 at 6:03 AM heming.zhao@xxxxxxxx > > <heming.zhao@xxxxxxxx> wrote: > >> > >> On 4/1/21 2:17 PM, Christoph Hellwig wrote: > >>> On Thu, Apr 01, 2021 at 09:34:56AM +0800, Zhao Heming wrote: > >>>> commit d3374825ce57 ("md: make devices disappear when they are no longer > >>>> needed.") introduced protection between mddev creating & removing. The > >>>> md_open shouldn't create mddev when all_mddevs list doesn't contain > >>>> mddev. With currently code logic, there will be very easy to trigger > >>>> soft lockup in non-preempt env. > >>> > >>> As mention below, please don't make this even more of a mess than it > >>> needs to. We can just pick the two patches doing this from the series > >>> I sent: > >>> > >> > >> Hi, > >> > >> I already got your meaning on previously email. > >> I sent v2 patch for Song's review comment. My patch is bugfix, it may need > >> to back port into branch maintenance. > >> > >> Your attachment patch files is partly my patch. > >> The left part is in md_open (response [PATCH 01/15] md: remove the code to flush > >> an old instance in md_open) > >> I still think you directly use bdev->bd_disk->private_data as mddev pointer is not safe. > >> > > > > Hi Christoph and Heming, > > > > Trying to understand the whole picture here. Please let me know if I > > misunderstood anything. > > > > IIUC, the primary goal of Christoph's set is to get rid of the > > ERESTARTSYS hack from md, > > and eventually move bd_mutext. 02/15 to 07/15 of this set cleans up > > code in md.c, and they > > are not very important for the rest of the set (is this correct?). > > > > Heming, you mentioned "the solution may simply return -EBUSY (instead > > of -ENODEV) to > > fail the open path". Could you please show the code? Maybe that would > > be enough to unblock > > the second half of Christoph's set (08/15 to 15/15)? > > > > I already sent the whole picture of md_open (email data: 2021-4-1, > subject: Re: [PATCH] md: don't create mddev in md_open). > My mail may fail to be delivered (at least, I got the "can't be delivered" > info on my last mail for linux-raid & guoqing's address). I put the related > email on attachment, anyone can read it again. > > For the convenience of discussion, I also pasted **patched** md_open below. > (I added more comments than enclosed email) > > ``` > static int md_open(struct block_device *bdev, fmode_t mode) > { > /* section 1 */ > struct mddev *mddev = mddev_find(bdev->bd_dev); //hm: the new style, only do searching job > int err; > > if (!mddev) //hm: this will cover freeing path 2.2 (refer enclosed file) > return -ENODEV; > > if (mddev->gendisk != bdev->bd_disk) { //hm: for freeing path 2.1 (refer enclosed file) > /* we are racing with mddev_put which is discarding this > * bd_disk. > */ > mddev_put(mddev); > /* Wait until bdev->bd_disk is definitely gone */ > if (work_pending(&mddev->del_work)) > flush_workqueue(md_misc_wq); > return -EBUSY; //hm: fail this path. it also makes __blkdev_get return fail, userspace can try later. > // > // the legacy code flow: > // return -ERESTARTSYS here, later __blkdev_get reentry. > // it will trigger first 'if' in this functioin, then > // return -ENODEV. > } > > /* section 2 */ > /* hm: below same as Christoph's [PATCH 01/15] */ > err = mutex_lock_interruptible(&mddev->open_mutex); > if (err) > return err; > > if (test_bit(MD_CLOSING, &mddev->flags)) { > mutex_unlock(&mddev->open_mutex); > return -ENODEV; > } > > mddev_get(mddev); > atomic_inc(&mddev->openers); > mutex_unlock(&mddev->open_mutex); > > bdev_check_media_change(bdev); > return 0; > } > ``` > > I wrote again: > > Christoph's patch [01/15] totally dropped <section 1>, and use bdev->bd_disk->private_data > > to get mddev pointer, it's not safe. > > And with above **patched** md_open, Christoph's patches 02-07 can be work happily. > He only needs to adjust/modify 01 patch. > The md layer behavior will change from return -ENODEV to -EBUSY in some racing scenario. Thanks for the explanation. Could you please send official patch of modified 01/15 and your fix (either in a set or merge into one patch)? This patch(set) would go to stable. Then, we can apply Christoph's 02 - 15 on top of it. Thanks, Song