Hi Christoph,

> -----Original Message-----
> From: hch@xxxxxx <hch@xxxxxx>
> Sent: Tuesday, July 2, 2024 9:20 PM
> To: Gulam Mohamed <gulam.mohamed@xxxxxxxxxx>
> Cc: hch@xxxxxx; linux-block@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> yukuai1@xxxxxxxxxxxxxxx; axboe@xxxxxxxxx
> Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach
> and loop open
>
> Hi Gulam,
>
> On Sun, Jun 30, 2024 at 10:11:14PM +0000, Gulam Mohamed wrote:
> > With our latest version of the patch V6, the "kernel robot test"
> > failed in the ioctl_loop06 test (LTP tests) as in below mail.
> > the reason for the failure is, the deferring of the "detach" loop
> > device to release function. The test opens the loop device, sends
> > LOOP_SET_BLOCK_SIZE and LOOP_CONFIGURE commands and in between that,
> > it will also detach the loop device. At the end of the test, while
> > cleanup, it will close the loop device. As we deferred the detach to
> > last close, the detach will be at the end only but before that we are
> > setting the lo_state to Lo_rundown. This setting of Lo_rundown we are
> > doing in the beginning because, there was another LTP test case failed
> > earlier due to the same reason.
> >
> > So, when the LOOP_CONFIGURE was sent, the loop device was still in
> > Lo_rundown state (Lo_unbound will be set after detach in
> > __loop_clr_fd()) due to which kernel returned the EBUSY error causing
> > the test to fail.
>
> Before we'd end up in Lo_unbound toward the end of __loop_clr_fd if there
> was a single opener.
>
> > I have noticed that a good number of test cases are having a behaviour
> > that it will send different loop commands and in between the detach
> > command also, with only a single open. And close happens at the end.
> > Due to this, I think a couple of test cases needs to be modified.
> >
> > Now, as per my understanding, we have two options here:
> >
> > 1. Continue with this kernel patch and modify few test cases to
> > accommodate this new kernel behaviour
>
> That would be my preference. Any code that is doing a clear_fd and then tries
> to configure it again is prone to races vs other openers. It also does not seem
> very useful outside of test code.
> But if we end up breaking real code and not test cases we might have to go
> and bring it back.

I have requested the maintainers of the LTP test cases to modify the tests to
accommodate the new kernel behavior.
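
For reference, the sequence that ioctl_loop06 trips over can be modelled in
user space roughly as below. This is only a toy sketch of the state
transitions discussed above, not the kernel code: struct loop_model and the
model_* helpers are hypothetical names made up for illustration, and the
-ENXIO return for a detach on a device that is not bound is an assumption of
the model. Only the Lo_unbound/Lo_bound/Lo_rundown state names and the EBUSY
result come from the loop driver and this thread.

```c
#include <errno.h>

/* Toy model of the loop device states; names mirror the loop driver. */
enum lo_state { Lo_unbound, Lo_bound, Lo_rundown };

/* Hypothetical stand-in for struct loop_device: state plus open count. */
struct loop_model {
	enum lo_state state;
	int openers;
};

/* Models open(2) on the loop device. */
static void model_open(struct loop_model *lo)
{
	lo->openers++;
}

/* Models LOOP_CONFIGURE: only allowed while the device is unbound. */
static int model_configure(struct loop_model *lo)
{
	if (lo->state != Lo_unbound)
		return -EBUSY;
	lo->state = Lo_bound;
	return 0;
}

/*
 * Models LOOP_CLR_FD with the V6 behaviour: the state moves to
 * Lo_rundown immediately, but the actual detach is deferred.
 */
static int model_clr_fd(struct loop_model *lo)
{
	if (lo->state != Lo_bound)
		return -ENXIO; /* assumption of this model */
	lo->state = Lo_rundown;
	return 0;
}

/* Models close(2): the last close finishes the deferred detach. */
static void model_release(struct loop_model *lo)
{
	if (--lo->openers == 0 && lo->state == Lo_rundown)
		lo->state = Lo_unbound;
}
```

With a single opener, configure followed by clr_fd followed by configure
gives -EBUSY in this model, and a configure after the last close succeeds
again, which is the ordering the LTP tests need to follow.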