On 2022-07-09 02:17, Christoph Hellwig wrote: > On Fri, Jul 08, 2022 at 09:55:55AM -0600, Logan Gunthorpe wrote: >> I agree it's a mess, probably buggy and could use a cleanup with a >> free_disk method. But I'm not sure the all_mdevs lifetime issues are the >> problem here. If the entry in all_mdevs outlasts the disk, then >> md_alloc() will just fail earlier. Many test scripts rely on the fact >> that you can stop an mddev and recreate it immediately after. We need >> some way of ensuring any deleted disks are fully deleted before trying >> to make a new mddev, in case the new one has the same name as one being >> deleted. > > I think those tests are broken. But fortunately that is just an > assumption in the tests, while device name reuse is a real problem. > > I could not reproduce your problem, but on the for-5.20/block > tree I see a hang in 10ddf-geometry when running the tests. Ah, strange. I never saw an issue with that test, though I didn't run that one repeatedly with the latest branch. So maybe it was an intermittent like I saw with 11spare-migration and I just missed it. > The branch here: > > git://git.infradead.org/users/hch/block.git md-lifetime-fixes > http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/md-lifetime-fixes > > fixes that for me and does not introduce new regressions. Can you > check if that helps your problem? Yes, I can confirm those patches fix the bug I was seeing too. I did a fairly quick review of the patches: - In the patch that introduces md_free_disk() it looks like md_free() can still be called from the error path of md_alloc() before mddev->gendisk is set... which seems to make things rather complicated seeing we then can't use free_disk to finish the cleanup if the disk hasn't been created yet. I probably need to take closer look at this but, it might make more sense for the cleanup to remain in md_free() but call kobject_put() in md_free_disk() and del_gendisk() in mdev_delayed_delete(). Then md_alloc() can still use kobject_put() in the error path and it makes a little more sense seeing we'd still be freeing the kobject stuff in it's own release method instead of the disks free method. - In the patch with md_rdevs_overlap, it looks like we remove the rcu_read_lock(), which definitely seems out of place and probably isn't correct. But the comment that was recreated still references the rcu so probably should be changed. - The last patch has a typo in the title (disk is *a* freed). Thanks! Logan