Re: [PATCH V2 2/2] block: fix "Directory XXXXX with parent 'block' already present!"

On Mon, Apr 25, 2022 at 11:32:15AM +0200, Hannes Reinecke wrote:
> On 4/25/22 11:07, Ming Lei wrote:
> > On Mon, Apr 25, 2022 at 07:10:46AM +0200, Greg Kroah-Hartman wrote:
> > > On Mon, Apr 25, 2022 at 09:28:27AM +0800, Ming Lei wrote:
> > > > On Sun, Apr 24, 2022 at 03:45:59PM +0200, Greg Kroah-Hartman wrote:
> > > > > On Sun, Apr 24, 2022 at 08:04:59PM +0800, Ming Lei wrote:
> > > > > > On Sun, Apr 24, 2022 at 01:51:45PM +0200, Hannes Reinecke wrote:
> > > > > > > On 4/24/22 11:28, Ming Lei wrote:
> > > > > > > > On Sun, Apr 24, 2022 at 10:53:29AM +0200, Hannes Reinecke wrote:
> > > > > > > > > On 4/23/22 16:39, Ming Lei wrote:
> > > > > > > > > > q->debugfs_dir is used by blk-mq debugfs and blktrace. The dentry is
> > > > > > > > > > created when adding disk, and removed when releasing request queue.
> > > > > > > > > > 
> > > > > > > > > > There is small window between releasing disk and releasing request
> > > > > > > > > > queue, and during the period, one disk with same name may be created
> > > > > > > > > > and added, so debugfs_create_dir() may complain with "Directory XXXXX
> > > > > > > > > > with parent 'block' already present!"
> > > > > > > > > > 
> > > > > > > > > > Fixes the issue by moving debugfs_create_dir() into blk_alloc_queue(),
> > > > > > > > > > and the dir name is named with q->id from beginning, and switched to
> > > > > > > > > > disk name when adding disk, and finally changed to q->id in disk_release().
> > > > > > > > > > 
> > > > > > > > > > Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
> > > > > > > > > > Reported-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > > > > > > > > Cc: yukuai (C) <yukuai3@xxxxxxxxxx>
> > > > > > > > > > Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
> > > > > > > > > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > > > > > > > > > ---
> > > > > > > > > >     block/blk-core.c  | 4 ++++
> > > > > > > > > >     block/blk-sysfs.c | 4 ++--
> > > > > > > > > >     block/genhd.c     | 8 ++++++++
> > > > > > > > > >     3 files changed, 14 insertions(+), 2 deletions(-)
> > > > > > > > > > 
> > > > > > > > > Errm.
> > > > > > > > > 
> > > > > > > > > Isn't this superfluous now that Jens merged Yu Kuais patch?
> > > > > > > > 
> > > > > > > > Jens has dropped Yu Kuai's patch which caused kernel panic.
> > > > > > > > 
> > > > > > > Right.
> > > > > > > But still, this patch looks really odd.
> > > > > > > How is userspace supposed to use the directories prior to the renaming?
> > > > > > 
> > > > > > That doesn't make any difference for current uses, but we may extend it
> > > > > > to support debugfs for non-blk request queue in future by exporting q->id
> > > > > > somewhere. Even though now the interested q->id can be figured out
> > > > > > easily by very simple ebpf trace prog.
> > > > > > 
> > > > > > > 
> > > > > > > And as you already have identified the places where we can safely create
> > > > > > > (and remove) the debugfs directories, why can't we move the call to create
> > > > > > > and remove the debugfs directories to those locations and do away with the
> > > > > > > renaming?
> > > > > > 
> > > > > > First it needs more change to fix the kernel panic.
> > > > > > 
> > > > > > Second removing debugfs dir in del_gendisk will break blktests block/002.
> > > > > 
> > > > > Then fix the test?  debugfs interactions that cause kernel bugs should
> > > > > be ok to change the functionality of.  Remember, this is for
> > > > > debugging...
> > > > 
> > > > But what is wrong with the test? Isn't it reasonable to keep debugfs dir
> > > > when blktrace is collecting log?
> > > 
> > > How can you collect something from a device that is gone?
> > 
> > Here the 'gone' may be just in logical/soft viewpoint, such as, one disk
> > is removed by sysfs, and the driver still may send sync cache command
> > to make sure the cache inside drive is flushed, such as scsi's
> > SYNCHRONIZE_CACHE.
> > 
> And that is my argument: what does this buy us?

Isn't the posted patch simple enough for fixing the whole issue?

Not only in lines of code, but also in principle.

So far q->debugfs_dir is used by elevator, rq_qos, blktrace and blk-mq
debugfs.

The first three can share the gendisk's lifetime, but blk-mq debugfs
more naturally shares the request_queue's lifetime.

That is why I made ->debugfs_dir share the request queue's lifetime:
the request queue outlives the gendisk.

This way, we can clean up the mess of delaying the addition of blk-mq
debugfs.

Not to mention that this approach allows us to add debugfs support for
non-disk request queues.
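
To make the naming scheme concrete, here is a minimal user-space sketch
of the idea, not the kernel code itself. It models the debugfs 'block'
directory as a flat set of names. All toy_* names are hypothetical, and
formatting q->id with "%d" is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DIRS 8
#define NAME_LEN 32

/* Toy stand-in for the debugfs 'block' directory: a flat set of names. */
struct toy_debugfs {
	char names[MAX_DIRS][NAME_LEN];
	int used[MAX_DIRS];
};

/* Toy request queue: an id plus the slot of its debugfs directory. */
struct toy_queue {
	int id;
	int dir;
};

/* Return the slot holding 'name', or -1 if no such directory exists. */
static int toy_lookup(const struct toy_debugfs *fs, const char *name)
{
	for (int i = 0; i < MAX_DIRS; i++)
		if (fs->used[i] && strcmp(fs->names[i], name) == 0)
			return i;
	return -1;
}

/* Return the new slot, or -1 on a name clash ("already present!") or no room. */
static int toy_create_dir(struct toy_debugfs *fs, const char *name)
{
	if (toy_lookup(fs, name) >= 0)
		return -1;
	for (int i = 0; i < MAX_DIRS; i++) {
		if (!fs->used[i]) {
			fs->used[i] = 1;
			snprintf(fs->names[i], NAME_LEN, "%s", name);
			return i;
		}
	}
	return -1;
}

/* Rename an existing directory; fails with -1 if the new name is taken. */
static int toy_rename(struct toy_debugfs *fs, int slot, const char *name)
{
	assert(slot >= 0 && slot < MAX_DIRS && fs->used[slot]);
	if (toy_lookup(fs, name) >= 0)
		return -1;
	snprintf(fs->names[slot], NAME_LEN, "%s", name);
	return 0;
}

/* Queue allocation: create the directory, named after the queue id. */
static int toy_alloc_queue(struct toy_debugfs *fs, struct toy_queue *q, int id)
{
	char name[NAME_LEN];

	q->id = id;
	snprintf(name, sizeof(name), "%d", id);
	q->dir = toy_create_dir(fs, name);
	return q->dir >= 0 ? 0 : -1;
}

/* Adding the disk: switch the directory to the disk name. */
static int toy_add_disk(struct toy_debugfs *fs, struct toy_queue *q,
			const char *disk_name)
{
	return toy_rename(fs, q->dir, disk_name);
}

/* Releasing the disk: switch back to the id, freeing the disk name. */
static int toy_disk_release(struct toy_debugfs *fs, struct toy_queue *q)
{
	char name[NAME_LEN];

	snprintf(name, sizeof(name), "%d", q->id);
	return toy_rename(fs, q->dir, name);
}

/* Releasing the queue: only now does the directory go away. */
static void toy_release_queue(struct toy_debugfs *fs, struct toy_queue *q)
{
	fs->used[q->dir] = 0;
}
```

In this model, a second queue can be allocated and added under the same
disk name after toy_disk_release() has run for the first one, even
though toy_release_queue() has not yet been called for it. The old
directory is still there, but under its id rather than the disk name.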

> Is it relevant (for blktrace) to have the SYNCHRONIZE_CACHE present in
> the logs?

SYNCHRONIZE_CACHE is just one example, and there can be more from
/dev/sg or the kernel. As a user of the trace tool, I find it important
to get an intact request trace.

> From my POV, blktrace is there to analyze I/O flow; device shutdown is not
> really relevant for that as the results of that operation depend on other
> factors which won't show up in blktrace at all.
> 
> So we're not losing much by (maybe) missing shutdown commands in blktrace;
> if needs be device shutdown can be traced by other means.
> 
> I'd rather keep the code simple, and not having an operation in the core
> block layer which requires quite some explanation.

Please write a workable patch that follows your idea and compare it
with this one; then you will see which is simpler.



Thanks,
Ming



