----- Original Message -----
> From: "Shaohua Li" <shli@xxxxxxxxxx>
> To: "Xiao Ni" <xni@xxxxxxxxxx>
> Cc: linux-raid@xxxxxxxxxxxxxxx, "Shaohua Li" <shli@xxxxxx>
> Sent: Friday, May 18, 2018 2:08:46 AM
> Subject: Re: [PATCH] md: simplify flush request handling
>
> On Thu, May 17, 2018 at 09:19:03AM -0700, Shaohua Li wrote:
> > On Thu, May 17, 2018 at 12:56:48PM +0800, Xiao Ni wrote:
> > > On 05/15/2018 12:49 AM, Shaohua Li wrote:
> > > > On Fri, May 11, 2018 at 10:46:59AM +0800, Xiao Ni wrote:
> > > > > Hi Shaohua
> > > > >
> > > > > On 05/11/2018 06:23 AM, Shaohua Li wrote:
> > > > > > From: Shaohua Li <shli@xxxxxx>
> > > > > >
> > > > > > The recent flush request handling seems unnecessarily complicated.
> > > > > > The main issue is that in rdev_end_flush we can get either the
> > > > > > rdev of the bio or the flush_info, not both, unless we allocate
> > > > > > extra memory for the other. With the extra memory, we need to
> > > > > > reallocate it on disk hot add/remove. The original patch actually
> > > > > > misses the add_new_disk case for the memory allocation, and we
> > > > > > get a kernel crash.
> > > > > add_new_disk just adds a disk to md as a spare disk. Only after the
> > > > > raid disks are reshaped does update_raid_disks reallocate the
> > > > > memory. Why is there a kernel crash? Could you explain more?
> > > > Not always a reshape. It's very easy to reproduce: just create a
> > > > linear array, grow one disk, and run some file IO that triggers
> > > > flush.
> > > Hi Shaohua
> > >
> > > How do I grow one disk after creating a raid0 array? Something like
> > > these steps?
> > > mdadm -CR /dev/md0 -l1 -n2 /dev/sdb1 /dev/sdc1 --size=50M
> > > mdadm -CR /dev/md1 -l0 /dev/md0 /dev/sdd2
> > > mdadm /dev/md0 --grow --size=max
> > >
> > > I read the code again and I still can't catch the point.
> > > md_flush_request only sends flush bios to an md_rdev that is not
> > > faulty and whose raid_disk is not -1. After add_new_disk,
> > > md_rdev->raid_disk is -1, and it only changes to the role number
> > > after reshaping. So in this situation it can't panic.
> >
> > Not raid0, it's a linear array.
> >
> > > > > > The idea is to always increase the reference of every rdev in
> > > > > > md_flush_request and decrease the references after the bio
> > > > > > finishes. This way rdev_end_flush doesn't need to know the rdev,
> > > > > > so we don't need to allocate extra memory.
> > > > > Is there a situation like this: a disk is plugged into the array
> > > > > after the flush bios have been submitted to the underlying disks.
> > > > > After those flush bios come back, there is one more rdev in the
> > > > > mddev->disks list. If we decrement all rdev references at once, we
> > > > > can decrement the reference of an rdev that never submitted a
> > > > > flush bio. That's the reason I try to allocate memory for bios[0]
> > > > > in flush_info.
> > > > I think we wait for all IO to finish before hot adding/removing
> > > > disks.
> > >
> > > I read the related code and did some tests. In fact it doesn't wait
> > > for IO to finish. I called raid1's get_unqueued_pending in
> > > add_new_disk before bind_rdev_to_array
> > > (list_add_rcu(&rdev->same_set, &mddev->disks);) and it shows there is
> > > in-flight I/O. Should add_new_disk call mddev_suspend? If it could
> > > wait for all IO to finish, there would be no need to allocate memory
> > > for bios[].
> >
> > You are right, we don't wait for IO on hot add for all personalities.
> > Need to find a better way to solve this issue.
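(For context, here is a rough sketch of the submit path being discussed.
This is illustrative kernel-style code only, not the actual patch;
alloc_flush_bio() is a made-up placeholder for the real bio allocation
and REQ_PREFLUSH setup. The closing comment marks where a hot-added rdev
breaks a completion path that simply walks mddev->disks again to drop
references.)

/*
 * Illustrative sketch, not the actual patch.  The idea in the patch:
 * take a reference on every active rdev while submitting flush bios,
 * and drop the references when the bios complete, so rdev_end_flush
 * does not need to know which rdev it belongs to.
 */
static void md_submit_flush_sketch(struct mddev *mddev)
{
        struct md_rdev *rdev;

        rcu_read_lock();
        rdev_for_each_rcu(rdev, mddev) {
                if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags))
                        continue;

                /* one reference per submitted flush bio */
                atomic_inc(&rdev->nr_pending);

                /* alloc_flush_bio() is hypothetical: allocate a bio,
                 * point it at rdev->bdev, set REQ_PREFLUSH and the
                 * end_io handler. */
                submit_bio(alloc_flush_bio(mddev, rdev));
        }
        rcu_read_unlock();

        /*
         * The race discussed above: if a disk is hot added to
         * mddev->disks after this loop but before the flush bios
         * complete, a completion path that iterates mddev->disks again
         * and calls rdev_dec_pending() on every entry would drop a
         * reference that the new rdev never took.
         */
}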
> 
> Hmm, it seems there is no better solution. Now I'm in favor of your old
> version (with mempool). Could you please repost that one and I'll merge
> it.

Sure, I'm testing it and will re-send the patch after testing.

> Thanks,
> Shaohua
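For reference, the mempool idea discussed above looks conceptually like
the sketch below. This is illustrative only -- the struct layout, field
names and flush_bio_pool are assumptions, not the exact patch being
reposted. The point is that each submitted flush bio carries a small
record of which rdev it went to, so the completion handler drops exactly
the references that were taken, even if mddev->disks changes in the
meantime.

/*
 * Conceptual sketch of the mempool approach (names are illustrative,
 * not taken from the actual patch).  Memory comes from mempools set up
 * once per array, so nothing needs to be reallocated on hot add/remove.
 */
struct flush_info {
        struct bio              *bio;           /* the original flush request */
        struct mddev            *mddev;
        struct work_struct      flush_work;
        atomic_t                flush_pending;  /* outstanding flush bios */
};

struct flush_bio {
        struct flush_info       *fi;
        struct md_rdev          *rdev;          /* exactly which rdev to put */
};

static void rdev_end_flush_sketch(struct bio *bi)
{
        struct flush_bio *fb = bi->bi_private;
        struct flush_info *fi = fb->fi;
        struct md_rdev *rdev = fb->rdev;
        struct mddev *mddev = fi->mddev;

        /* flush_bio_pool is assumed to be a mempool created once, e.g.
         * when the array is started. */
        mempool_free(fb, mddev->flush_bio_pool);
        bio_put(bi);

        /* drop only the reference this particular flush bio took */
        rdev_dec_pending(rdev, mddev);

        if (atomic_dec_and_test(&fi->flush_pending))
                queue_work(md_wq, &fi->flush_work);
}

Because the rdev is recorded per bio, a disk added to mddev->disks while
flushes are in flight is simply never touched by the completion path.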