Re: [PATCH -next v2 00/28] md: synchronize io with array reconfiguration

Hi Kuai,

Thanks for the patchset!

I hit the following panic when running the mdadm test 23rdev-lifetime.
Could you please look into it?

I pushed the test code to this branch:

https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/log/?h=md-test-28

Thanks,
Song


[  173.143010] ==================================================================
[  173.144256] BUG: KASAN: null-ptr-deref in __mutex_lock+0xc0/0x920
[  173.145232] Read of size 8 at addr 00000000000000a8 by task test/1215
[  173.146138]
[  173.146375] CPU: 26 PID: 1215 Comm: test Not tainted 6.6.0-rc2+ #8
[  173.147254] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[  173.148840] Call Trace:
[  173.149202]  <TASK>
[  173.149531]  dump_stack_lvl+0xb5/0x100
[  173.150093]  ? __pfx_dump_stack_lvl+0x10/0x10
[  173.150724]  ? _printk+0xac/0xf0
[  173.151251]  ? lock_acquired+0xff/0x680
[  173.151852]  print_report+0xe6/0x510
[  173.152372]  ? __might_resched+0x1a1/0x3d0
[  173.152997]  ? __mutex_lock+0xc0/0x920
[  173.153566]  kasan_report+0x119/0x150
[  173.154114]  ? lock_acquire+0x18a/0x390
[  173.154667]  ? __mutex_lock+0xc0/0x920
[  173.155225]  ? mddev_suspend+0xbc/0x260
[  173.155799]  __mutex_lock+0xc0/0x920
[  173.156332]  ? lock_acquire+0x18a/0x390
[  173.156928]  ? kernfs_find_and_get_ns+0x4c/0xb0
[  173.157578]  ? __pfx___mutex_lock+0x10/0x10
[  173.158177]  ? down_read+0x6b2/0x800
[  173.158696]  ? lock_is_held_type+0xdb/0x150
[  173.159300]  mddev_suspend+0xbc/0x260
[  173.159832]  ? __pfx_lock_release+0x10/0x10
[  173.160427]  ? lock_is_held_type+0xdb/0x150
[  173.161074]  ? __pfx_mddev_suspend+0x10/0x10
[  173.161698]  rdev_attr_store+0x5ba/0x600
[  173.162282]  ? __pfx_sysfs_kf_write+0x10/0x10
[  173.162915]  kernfs_fop_write_iter+0x1d1/0x280
[  173.163595]  vfs_write+0x45d/0x5d0
[  173.164113]  ? __pfx_vfs_write+0x10/0x10
[  173.164709]  ? __pfx_lock_release+0x10/0x10
[  173.165352]  ksys_write+0xed/0x1a0
[  173.165912]  ? __pfx_ksys_write+0x10/0x10
[  173.166501]  ? __audit_syscall_entry+0x1cf/0x200
[  173.167191]  ? syscall_enter_from_user_mode+0x181/0x220
[  173.168034]  do_syscall_64+0x43/0x90
[  173.168588]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  173.169355] RIP: 0033:0x7f4e65ced648
[  173.169830] md: could not open device unknown-block(7,0).
[  173.169914] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[  173.173324] RSP: 002b:00007ffe9a2ac128 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  173.174398] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f4e65ced648
[  173.175405] RDX: 0000000000000007 RSI: 0000561ae26e29d0 RDI: 0000000000000001
[  173.176416] RBP: 0000561ae26e29d0 R08: 000000000000000a R09: 00007f4e65d80620
[  173.177417] R10: 000000000000000a R11: 0000000000000246 R12: 00007f4e65fc06e0
[  173.178418] R13: 0000000000000007 R14: 00007f4e65fbb880 R15: 0000000000000007
[  173.179441]  </TASK>
[  173.179775] ==================================================================
[  173.180838] Disabling lock debugging due to kernel taint
[  173.181662] BUG: kernel NULL pointer dereference, address: 00000000000000a8
[  173.182654] #PF: supervisor read access in kernel mode
[  173.183408] #PF: error_code(0x0000) - not-present page
[  173.184152] PGD 0 P4D 0
[  173.184531] Oops: 0000 [#1] PREEMPT SMP KASAN PTI
[  173.185224] CPU: 26 PID: 1215 Comm: test Tainted: G    B  6.6.0-rc2+ #8
[  173.186320] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[  173.187912] RIP: 0010:__mutex_lock+0xc0/0x920
[  173.188557] Code: 00 e8 24 f3 77 fe 2e 2e 2e 31 c0 48 c7 c7 80 c7 c5 89 e8 03 01 bf fe 83 3d ec e0 27 07 00 75 15 49 8d 7c 24 68 e8 30 02 bf fe <4d> 39 64 24 68 0f 85 00 08 00 00 bf 01 00 00 00 e8 5b e7 76 fe 4d
[  173.191203] RSP: 0018:ffff8881b18c7a20 EFLAGS: 00010286
[  173.191958] RAX: ffff8881b0ae4001 RBX: 0000000000000000 RCX: ffffffff810e0df1
[  173.192968] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8900ea40
[  173.193976] RBP: ffff8881b18c7b50 R08: ffffffff8900ea47 R09: 1ffffffff1201d48
[  173.194986] R10: dffffc0000000000 R11: fffffbfff1201d49 R12: 0000000000000040
[  173.196263] R13: ffffffff823e61cc R14: 0000000000000000 R15: 0000000000000000
[  173.197274] FS:  00007f4e66b6e740(0000) GS:ffff888dfd200000(0000) knlGS:0000000000000000
[  173.198466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  173.199316] CR2: 00000000000000a8 CR3: 00000001b191e005 CR4: 0000000000370ee0
[  173.200327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  173.201382] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  173.202430] Call Trace:
[  173.202810]  <TASK>
[  173.203173]  ? __die_body+0x63/0xb0
[  173.203678]  ? page_fault_oops+0x2f3/0x440
[  173.204338]  ? __pfx_page_fault_oops+0x10/0x10
[  173.204981]  ? vprintk_emit+0x455/0x520
[  173.205593]  ? __pfx_vprintk_emit+0x10/0x10
[  173.206276]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[  173.207068]  ? do_user_addr_fault+0x796/0x840
[  173.207694]  ? _printk+0xac/0xf0
[  173.208188]  ? __pfx_do_user_addr_fault+0x10/0x10
[  173.208879]  ? rcu_is_watching+0x30/0x60
[  173.209475]  ? exc_page_fault+0x7d/0x290
[  173.210043]  ? asm_exc_page_fault+0x22/0x30
[  173.210639]  ? mddev_suspend+0xbc/0x260
[  173.211294]  ? add_taint+0x41/0x90
[  173.211798]  ? __mutex_lock+0xc0/0x920
[  173.212352]  ? lock_acquire+0x18a/0x390
[  173.212914]  ? kernfs_find_and_get_ns+0x4c/0xb0
[  173.213623]  ? __pfx___mutex_lock+0x10/0x10
[  173.214243]  ? down_read+0x6b2/0x800
[  173.214773]  ? lock_is_held_type+0xdb/0x150
[  173.215374]  mddev_suspend+0xbc/0x260
[  173.215941]  ? __pfx_lock_release+0x10/0x10
[  173.216541]  ? lock_is_held_type+0xdb/0x150
[  173.217148]  ? __pfx_mddev_suspend+0x10/0x10
[  173.217776]  rdev_attr_store+0x5ba/0x600
[  173.218343]  ? __pfx_sysfs_kf_write+0x10/0x10
[  173.218977]  kernfs_fop_write_iter+0x1d1/0x280
[  173.219618]  vfs_write+0x45d/0x5d0
[  173.220126]  ? __pfx_vfs_write+0x10/0x10
[  173.220689]  ? __pfx_lock_release+0x10/0x10
[  173.221342]  ksys_write+0xed/0x1a0
[  173.221850]  ? __pfx_ksys_write+0x10/0x10
[  173.222421]  ? __audit_syscall_entry+0x1cf/0x200
[  173.223090]  ? syscall_enter_from_user_mode+0x181/0x220
[  173.223845]  do_syscall_64+0x43/0x90
[  173.224362]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  173.225083] RIP: 0033:0x7f4e65ced648
[  173.225599] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[  173.228199] RSP: 002b:00007ffe9a2ac128 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  173.229267] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f4e65ced648
[  173.230273] RDX: 0000000000000007 RSI: 0000561ae26e29d0 RDI: 0000000000000001
[  173.231274] RBP: 0000561ae26e29d0 R08: 000000000000000a R09: 00007f4e65d80620
[  173.232323] R10: 000000000000000a R11: 0000000000000246 R12: 00007f4e65fc06e0
[  173.233323] R13: 0000000000000007 R14: 00007f4e65fbb880 R15: 0000000000000007
[  173.234333]  </TASK>
[  173.234657] Modules linked in:
[  173.235118] CR2: 00000000000000a8
[  173.235601] ---[ end trace 0000000000000000 ]---
[  173.236270] RIP: 0010:__mutex_lock+0xc0/0x920
[  173.236906] Code: 00 e8 24 f3 77 fe 2e 2e 2e 31 c0 48 c7 c7 80 c7 c5 89 e8 03 01 bf fe 83 3d ec e0 27 07 00 75 15 49 8d 7c 24 68 e8 30 02 bf fe <4d> 39 64 24 68 0f 85 00 08 00 00 bf 01 00 00 00 e8 5b e7 76 fe 4d
[  173.239538] RSP: 0018:ffff8881b18c7a20 EFLAGS: 00010286
[  173.240286] RAX: ffff8881b0ae4001 RBX: 0000000000000000 RCX: ffffffff810e0df1
[  173.241293] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8900ea40
[  173.242342] RBP: ffff8881b18c7b50 R08: ffffffff8900ea47 R09: 1ffffffff1201d48
[  173.243343] R10: dffffc0000000000 R11: fffffbfff1201d49 R12: 0000000000000040
[  173.244346] R13: ffffffff823e61cc R14: 0000000000000000 R15: 0000000000000000
[  173.245384] FS:  00007f4e66b6e740(0000) GS:ffff888dfd200000(0000) knlGS:0000000000000000
[  173.246548] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  173.247362] CR2: 00000000000000a8 CR3: 00000001b191e005 CR4: 0000000000370ee0
[  173.248371] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  173.249390] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  173.250395] Kernel panic - not syncing: Fatal exception
[  173.251612] Kernel Offset: disabled
[  173.252133] ---[ end Kernel panic - not syncing: Fatal exception ]---


On Sun, Aug 27, 2023 at 7:04 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> Changes in v2:
>  - rebase with latest md-next
>  - remove some follow-up cleanup patches; they will be sent
>    after this patchset.
>
> After the four previous patchsets of preparatory work, this patchset
> implements a new version of mddev_suspend(). With the new API:
>  - reconfig_mutex is not required;
>  - the weird logic where suspending the array holds 'reconfig_mutex' for
>    mddev_check_recovery() to update the superblock is not needed;
>  - the special handling, 'pers->prepare_suspend', for raid456 is not
>    needed;
>  - it is safe to call at any time once mddev is allocated, and it is
>    designed for the slow path where the array configuration is changed.
>
> This patchset also uses the new API to replace:
>
> mddev_lock
> mddev_suspend or not
> // array reconfiguration
> mddev_resume or not
> mddev_unlock
>
> With:
>
> mddev_suspend
> mddev_lock
> // array reconfiguration
> mddev_unlock
> mddev_resume
>
> However, the above change is not possible for raid5 and raid-cluster in
> some corner cases; there, mddev_suspend/resume() is replaced with the
> quiesce() callback, which suspends the array as well.
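
For readers following the ordering change quoted above, here is a minimal userspace sketch of the new sequence (suspend, lock, reconfigure, unlock, resume). It is only a toy model: `ToyArray` and `reconfigure()` are hypothetical names made up for illustration, not kernel code, and the model ignores what a real suspend has to do, such as waiting for in-flight I/O.

```python
import threading


class ToyArray:
    """Userspace stand-in for an md array; NOT the kernel's struct mddev."""

    def __init__(self, level):
        self.level = level
        self.suspended = 0                      # nesting depth of suspend calls
        self.reconfig_mutex = threading.Lock()  # stand-in for 'reconfig_mutex'
        self.trace = []                         # records the order of the steps


def reconfigure(array, new_level):
    """Apply a configuration change using the new ordering from the cover letter."""
    # Step 1: suspend before taking the lock (the old scheme locked first).
    array.suspended += 1
    array.trace.append("suspend")

    # Step 2: take the reconfiguration lock.
    array.reconfig_mutex.acquire()
    array.trace.append("lock")
    try:
        # Step 3: the reconfiguration itself, done while suspended and locked.
        array.level = new_level
        array.trace.append("reconfigure")
    finally:
        # Steps 4 and 5: undo in reverse order, unlock first and then resume.
        array.reconfig_mutex.release()
        array.trace.append("unlock")
        array.suspended -= 1
        array.trace.append("resume")
```

Calling `reconfigure(ToyArray(level=1), 5)` leaves the lock released and the suspend count back at zero, and the `trace` list shows the five steps in the order given in the cover letter.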
>
> This patchset was tested in my VM with the mdadm test suite on loop devices,
> except for the 10ddf tests (they already failed before this patchset).
>
> More cleanups will follow after this patchset.
>
> Yu Kuai (28):
>   md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
>   md: use 'mddev->suspended' for is_md_suspended()
>   md: add new helpers to suspend/resume array
>   md: add new helpers to suspend/resume and lock/unlock array
>   md: use new apis to suspend array for suspend_lo/hi_store()
>   md: use new apis to suspend array for level_store()
>   md: use new apis to suspend array for serialize_policy_store()
>   md/dm-raid: use new apis to suspend array
>   md/md-bitmap: use new apis to suspend array for location_store()
>   md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
>   md/raid5-cache: use new apis to suspend array for
>     r5c_disable_writeback_async()
>   md/raid5-cache: use new apis to suspend array for
>     r5c_journal_mode_store()
>   md/raid5: use new apis to suspend array for raid5_store_stripe_size()
>   md/raid5: use new apis to suspend array for raid5_store_skip_copy()
>   md/raid5: use new apis to suspend array for
>     raid5_store_group_thread_cnt()
>   md/raid5: use new apis to suspend array for
>     raid5_change_consistency_policy()
>   md/raid5: replace suspend with quiesce() callback
>   md: quiesce before md_kick_rdev_from_array() for md-cluster
>   md: use new apis to suspend array for ioctls involed array
>     reconfiguration
>   md: use new apis to suspend array for adding/removing rdev from
>     state_store()
>   md: use new apis to suspend array for bind_rdev_to_array()
>   md: use new apis to suspend array related to serial pool in
>     state_store()
>   md: use new apis to suspend array in backlog_store()
>   md: suspend array in md_start_sync() if array need reconfiguration
>   md: cleanup mddev_create/destroy_serial_pool()
>   md/md-linear: cleanup linear_add()
>   md: remove old apis to suspend the array
>   md: rename __mddev_suspend/resume() back to mddev_suspend/resume()
>
>  drivers/md/dm-raid.c       |   8 +-
>  drivers/md/md-autodetect.c |   4 +-
>  drivers/md/md-bitmap.c     |  18 ++-
>  drivers/md/md-linear.c     |   2 -
>  drivers/md/md.c            | 250 ++++++++++++++++++++++---------------
>  drivers/md/md.h            |  52 ++++++--
>  drivers/md/raid5-cache.c   |  61 +++++----
>  drivers/md/raid5.c         |  56 ++++-----
>  8 files changed, 253 insertions(+), 198 deletions(-)
>
> --
> 2.39.2
>



