Re: Racy loop device reuse logic

Jan Kara <jack@xxxxxxx> · Thu, 20 Jan 2022 13:13:19 +0100

On Wed 19-01-22 22:34:49, Jan Kara wrote:
> On Wed 19-01-22 12:39:00, Karel Zak wrote:
> > On Wed, Jan 19, 2022 at 09:52:47AM +0100, Jan Kara wrote:
> > > Ping? Any opinion?
> > 
> >  Sorry for the delay.
> > 
> > > On Thu 13-01-22 16:47:35, Jan Kara wrote:
> > > > Hello,
> > > > 
> > > > Tetsuo has been doing some changes to the loop device shutdown in the
> > > > kernel and that broke LTP that is doing essentially the following loop:
> > > > 
> > > > while :; do mount -o loop,ro isofs.iso isofs/; umount isofs/; done
> > > > 
> > > > And this loop is broken because of a subtle interaction with systemd-udev
> > > > that also opens the loop device. The race seems to be in mount(8) handling
> > > > itself and the altered kernel timing makes it happen. It look like:
> > > > 
> > > > bash					systemd-udev
> > > >   mount -o loop,ro isofs.iso isofs/
> > > >     /dev/loop0 is created and bound to isofs.iso, autoclear is set for
> > > >     loop0
> > > >   					opens /dev/loop0
> > > >   umount isofs/
> > > >   loop0 still lives because systemd-udev still has device open
> > > >   mount -o loop,ro isofs.iso isofs/
> > > >     gets to mnt_context_setup_loopdev()
> > > >       loopcxt_find_overlap()
> > > >       sees loop0 is still valid and with proper parameters
> > > >       reuse = true;
> > > > 					close /dev/loop0
> > > > 					  last fd closed => loop0 is
> > > > 					    cleaned up
> > > >       loopcxt_get_fd()
> > > >         opens loop0 but it is no longer the device we wanted!
> > > >     calls mount(2) which fails because we cannot read from the loop device
> > > > 
> > > > It seems to me that mnt_context_setup_loopdev() should actually recheck
> > > > that loop device parameters still match what we need after opening
> > > > /dev/loop0 (if LOOP_GET_STATUS ioctl succeeds on the fd, you are guaranteed
> > > > the loop device is in that state and will not be torn down under your
> > > > hands). What do you think?
> > 
> > Seems like elegant solution. Please, send a patch.
> 
> OK, will do.

Patch sent which fixes the problem for me. Related to this, I have found out
that loop device reuse detection code sometimes fails with EPERM (if I run
the mount loop long enough). I have tracked this down to e.g.
loopcxt_get_sizelimit() calling ul_path_read_u64() which fails (loop device
just got torn down) and ul_path_read_u64() always returns -1 on error which
gets somewhat confusingly translated to EPERM error. I think this should be
handled more gracefully so I'll send patches to fix this as well.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR