On 8/7/19 2:45 AM, Jan Kara wrote:
> On Mon 05-08-19 09:41:39, Bart Van Assche wrote:
>> On 7/30/19 6:36 AM, Jan Kara wrote:
>>> On Tue 30-07-19 12:16:46, Jan Kara wrote:
>>>> On Tue 30-07-19 10:36:59, John Lenton wrote:
>>>>> On Tue, 30 Jul 2019 at 10:29, Jan Kara <jack@xxxxxxx> wrote:
>>>>>>
>>>>>> Thanks for the notice and the references. What's your version of
>>>>>> util-linux? What your test script does is indeed racy. You have there:
>>>>>>
>>>>>> echo Running:
>>>>>> for i in {a..z}{a..z}; do
>>>>>>   mount $i.squash /mnt/$i &
>>>>>> done
>>>>>>
>>>>>> So all mount(8) commands will run in parallel and race to set up loop
>>>>>> devices with LOOP_SET_FD and mount them. However, util-linux (at least
>>>>>> in the current version) seems to handle EBUSY from LOOP_SET_FD just
>>>>>> fine and retries with a new loop device. So at this point I don't see
>>>>>> why the patch makes a difference... I guess I'll need to reproduce it
>>>>>> and see what's going on in detail.
>>>>>
>>>>> We've observed this on Arch with util-linux 2.34, and on Ubuntu 19.10
>>>>> (eoan ermine) with util-linux 2.33.
>>>>>
>>>>> Just to be clear, the initial reports didn't involve a zany loop of
>>>>> mounts, but were triggered by effectively the same thing as systemd
>>>>> booted a system with a lot of snaps. The reproducer just tries to make
>>>>> things simpler to reproduce :-). FWIW, the systemd versions were 244
>>>>> and 242 on those systems, respectively.
>>>>
>>>> Thanks for the info! So I think I see what's going on. The two mounts
>>>> race like:
>>>>
>>>> MOUNT1                                   MOUNT2
>>>> num = ioctl(LOOP_CTL_GET_FREE)
>>>>                                          num = ioctl(LOOP_CTL_GET_FREE)
>>>> ioctl("/dev/loop$num", LOOP_SET_FD, ..)
>>>>   - returns OK
>>>>                                          ioctl("/dev/loop$num", LOOP_SET_FD, ..)
>>>>                                            - acquires exclusive loop$num
>>>>                                              reference
>>>> mount("/dev/loop$num", ...)
>>>>   - sees exclusive reference from
>>>>     MOUNT2 and fails
>>>>                                            - sees loop device is already
>>>>                                              bound and fails
>>>>
>>>> It is a bug in the scheme I've chosen that a racing LOOP_SET_FD can
>>>> block a perfectly valid mount. I'll think about how to fix this...
>>>
>>> So how about the attached patch? It fixes the regression for me.
>
> Hi Bart,
>
>> A new kernel warning is triggered by blktests block/001 that did not
>> happen without this patch. Reverting commit 89e524c04fa9 ("loop: Fix
>> mount(2) failure due to race with LOOP_SET_FD") makes that kernel
>> warning disappear. Is this reproducible on your setup?
>
> Thanks for the report! Hum, no, the warning doesn't seem to trigger in my
> test VM. But reviewing the mentioned commit with a fresh head, I can see
> where I made a mistake during my conversion of blkdev_get(). Does the
> attached patch fix the warning for you?

I've queued this up, Jan. Thanks for taking a look at it.

-- 
Jens Axboe
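
[Editor's note: for context, below is a minimal userspace sketch, not taken
from this thread, of the LOOP_CTL_GET_FREE + LOOP_SET_FD sequence that the
racing mount(8) instances perform, including the retry on EBUSY that Jan
refers to. The function name attach_backing_file and the error handling are
illustrative assumptions, not util-linux or kernel code.]

/* Sketch: bind a backing file to a free loop device, retrying on EBUSY. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/loop.h>

static int attach_backing_file(int file_fd)
{
	int ctl_fd = open("/dev/loop-control", O_RDWR);
	if (ctl_fd < 0)
		return -1;

	for (;;) {
		/* Ask the kernel for a currently free loop device number. */
		int num = ioctl(ctl_fd, LOOP_CTL_GET_FREE);
		if (num < 0)
			break;

		char path[64];
		snprintf(path, sizeof(path), "/dev/loop%d", num);

		int loop_fd = open(path, O_RDWR);
		if (loop_fd < 0)
			break;

		/*
		 * Bind the backing file. Another process that raced through
		 * LOOP_CTL_GET_FREE may already have bound this device, in
		 * which case LOOP_SET_FD fails with EBUSY and we retry with
		 * a new free device.
		 */
		if (ioctl(loop_fd, LOOP_SET_FD, file_fd) == 0) {
			close(ctl_fd);
			return loop_fd;	/* caller can now mount /dev/loopN */
		}

		close(loop_fd);
		if (errno != EBUSY)
			break;
	}

	close(ctl_fd);
	return -1;
}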