On 2021/10/18 12:47 PM, Christoph Hellwig wrote:
On Mon, Oct 18, 2021 at 12:43:38PM +0800, Zqiang wrote:
These are the details of the test; I hope they help.
Call me stupid, but I can only find the trace and links to unreadable
Google sites that want me to log in somehow, and no actual details.
If you have a direct link to the reproducer (an attachment would do
it as well) I'd love to try it myself.
Otherwise this commit in the block-5.15 tree should help to catch what
I suspect is the root cause (final ref drop before unregister) earlier
and with a better backtrace:
https://git.kernel.dk/cgit/linux-block/commit/?h=block-5.15&id=a20417611b98e12a724e5c828c472ea16990b71f
I found the following call trace:
Call Trace:
[ 326.460593][T27634] dump_stack_lvl+0x8d/0xcf
[ 326.461773][T27634] should_fail+0x13c/0x160
[ 326.462921][T27634] should_failslab+0x5/0x10
[ 326.464038][T27634] slab_pre_alloc_hook.constprop.100+0x4e/0xc0
[ 326.466040][T27634] kmem_cache_alloc+0x44/0x2a0
[ 326.466921][T27634] __kernfs_new_node+0x68/0x350
[ 326.469602][T27634] kernfs_new_node+0x5a/0x90
[ 326.470441][T27634] __kernfs_create_file+0x56/0x150
[ 326.471386][T27634] sysfs_add_file_mode_ns+0xe6/0x290
[ 326.472358][T27634] internal_create_group+0x186/0x4e0
[ 326.473331][T27634] internal_create_groups.part.4+0x4d/0xb0
[ 326.474288][T27634] sysfs_create_groups+0x28/0x40
[ 326.474918][T27634] device_add+0x4c3/0xc60
[ 326.476286][T27634] add_partition+0x262/0x450
[ 326.476919][T27634] bdev_disk_changed+0x3ec/0x800
[ 326.477615][T27634] loop_reread_partitions+0x2d/0x70
[ 326.478515][T27634] loop_set_status+0x274/0x320
[ 326.479373][T27634] lo_ioctl+0x392/0x920
[ 326.481271][T27634] blkdev_ioctl+0x2ff/0x370
[ 326.482438][T27634] block_ioctl+0x55/0x70
[ 326.483605][T27634] __x64_sys_ioctl+0xb6/0x100
[ 326.484241][T27634] do_syscall_64+0x34/0xb0
[ 326.484843][T27634] entry_SYSCALL_64_after_hwframe+0x44/0xae
In add_partition(), if device_add() returns an error, we drop the
reference count on the disk object. But both put_device(pdev) (which
will call part_release()) and put_disk(disk) drop a reference on the
disk object, whereas we call get_device(disk_to_dev(disk)) only once.
Or did I miss something in my analysis?
Thanks
Zqiang