Re: [PATCH 3/5] blktrace: refcount the request_queue during ioctl

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2020-04-14 23:16, Luis Chamberlain wrote:
> On Tue, Apr 14, 2020 at 08:40:44AM -0700, Christoph Hellwig wrote:
>> Hmm, where exactly does the race come in so that it can only happen
>> after where you take the reference, but not before it?  I'm probably
>> missing something, but that just means it needs to be explained a little
>> better :)
> 
>>From the trace on patch 2/5:
> 
>     BLKTRACE_SETUP(loop0) #2
>     [   13.933961] == blk_trace_ioctl(2, BLKTRACESETUP) start
>     [   13.936758] === do_blk_trace_setup(2) start
>     [   13.938944] === do_blk_trace_setup(2) creating directory
>     [   13.941029] === do_blk_trace_setup(2) using what debugfs_lookup() gave
>     
>     ---> From LOOP_CTL_DEL(loop0) #2
>     [   13.971046] === blk_trace_cleanup(7) end
>     [   13.973175] == __blk_trace_remove(7) end
>     [   13.975352] == blk_trace_shutdown(7) end
>     [   13.977415] = __blk_release_queue(7) calling blk_mq_debugfs_unregister()
>     [   13.980645] ==== blk_mq_debugfs_unregister(7) begin
>     [   13.980696] ==== blk_mq_debugfs_unregister(7) debugfs_remove_recursive(q->debugfs_dir)
>     [   13.983118] ==== blk_mq_debugfs_unregister(7) end q->debugfs_dir is NULL
>     [   13.986945] = __blk_release_queue(7) blk_mq_debugfs_unregister() end
>     [   13.993155] = __blk_release_queue(7) end
>     
>     ---> From BLKTRACE_SETUP(loop0) #2
>     [   13.995928] === do_blk_trace_setup(2) end with ret: 0
>     [   13.997623] == blk_trace_ioctl(2, BLKTRACESETUP) end
> 
> The BLKTRACESETUP above works on request_queue which later
> LOOP_CTL_DEL races on and sweeps the debugfs dir underneath us.
> If you use this commit alone though, this doesn't fix the race issue
> however, and that's because of both still the debugfs_lookup() use
> and that we're still using asynchronous removal at this point.
> 
> refcounting will just ensure we don't take the request_queue underneath
> our noses.

I think the above trace reveals a bug in the loop driver. The loop
driver shouldn't allow the associated request queue to disappear while
the loop device is open. One may want to have a look at sd_open() in the
sd driver. The scsi_disk_get() call in that function not only increases
the reference count of the SCSI disk but also of the underlying SCSI device.

Thanks,

Bart.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux