On Thu, 5 Sep 2024 15:42:00 +0800 Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

> Hi,
>
> On 2024/09/05 5:51, Song Liu wrote:
> > On Mon, Sep 2, 2024 at 1:38 AM Kinga Stefaniuk
> > <kinga.stefaniuk@xxxxxxxxx> wrote:
> >>
> >> In mdadm commit 49b69533e8 ("mdmonitor: check if udev has finished
> >> events processing") mdmonitor was taught to wait for udev to finish
> >> processing, and later in commit 9935cf0f64f3 ("Mdmonitor: Improve
> >> udev event handling") polling for MD events on the /proc/mdstat
> >> file was deprecated, because relying on udev events is more
> >> reliable and less bug prone (we are not competing with udev).
> >>
> >> After those changes we are still observing missing mdmonitor events
> >> in some scenarios; in particular, SpareEvent is likely to be
> >> missed. With this patch, MD is able to generate more change uevents
> >> and wake up mdmonitor more frequently, giving it a chance to notice
> >> events. MD already has the md_new_events() functionality to trigger
> >> events, and this patch extends that function to generate udev
> >> CHANGE uevents. This cannot be done directly, because the function
> >> can be called from interrupt context, so an appropriate workqueue
> >> is created. Uevents are less time critical, so it is safe to use a
> >> workqueue. The change is limited to the CHANGE event, as there is
> >> no need to generate other uevents for now. With this change,
> >> mdmonitor events are less likely to be missed; our internal test
> >> suite confirms that mdmonitor reliability is (again) improved.
> >> Also start using the irq-safe methods on all_mddevs_lock, because
> >> it can now be reached from interrupt context.
> >>
> >> Signed-off-by: Mateusz Grzonka <mateusz.grzonka@xxxxxxxxx>
> >> Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@xxxxxxxxx>
> >
> > I am seeing new failures in the mdadm tests, for example in test
> > 01replace. Please run these tests and fix the issues.
>
> I just tested this myself in my VM, and I didn't see 01replace fail;
> however, test 13imsm-r0_r5_3d-grow-r0_r5_4d starts to hang:
>
> [16098.862049] INFO: task systemd-udevd:57927 blocked for more than 368 seconds.
> [16098.863049] Not tainted 6.11.0-rc1-00078-g761e5afb6ddb-dirty #362
> [16098.863802] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [16098.865773]
> [16098.865773] Showing all locks held in the system:
> [16098.866702] 1 lock held by khungtaskd/31:
> [16098.867233]  #0: ffffffff8a789b40 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x46/0x320
> [16098.868589] 1 lock held by systemd-journal/203:
> [16098.869276] 1 lock held by systemd-udevd/57927:
> [16098.869966]  #0: ffff8881a61fa1a8 (mapping.invalidate_lock#2){++++}-{3:3}, at: page_cache_ra_unbounded+0x73/0x2d0
> [16098.871477] 4 locks held by mdadm/58163:
> [16098.872099]  #0: ffff88817d4b4400 (sb_writers#5){.+.+}-{0:0}, at: vfs_write+0x32d/0x470
> [16098.873303]  #1: ffff888193dcd688 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x143/0x280
> [16098.874620]  #2: ffff8881323cb010 (kn->active#98){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x153/0x280
> [16098.876005]  #3: ffff888193d4a0a8 (&mddev->suspend_mutex){+.+.}-{3:3}, at: mddev_suspend+0x59/0x380 [md_mod]
>
> [root@fedora ~]# cat /proc/57927/stack
> [<0>] wait_woken+0xa4/0xd0
> [<0>] raid5_make_request+0x994/0x2080 [raid456]
> [<0>] md_handle_request+0x17a/0x4b0 [md_mod]
> [<0>] md_submit_bio+0x7c/0x130 [md_mod]
> [<0>] __submit_bio+0x12b/0x190
> [<0>] submit_bio_noacct_nocheck+0x22b/0x6a0
> [<0>] submit_bio_noacct+0x259/0xac0
> [<0>] submit_bio+0x58/0x1d0
> [<0>] mpage_readahead+0x195/0x280
> [<0>] blkdev_readahead+0x1d/0x30
> [<0>] read_pages+0x6e/0x550
> [<0>] page_cache_ra_unbounded+0x1c6/0x2d0
> [<0>] do_page_cache_ra+0x4f/0x80
> [<0>] force_page_cache_ra+0x78/0xc0
> [<0>] page_cache_sync_ra+0x60/0x460
> [<0>] filemap_get_pages+0x13f/0xba0
> [<0>] filemap_read+0x122/0x590
> [<0>] blkdev_read_iter+0x7a/0x210
> [<0>] vfs_read+0x27f/0x400
> [<0>] ksys_read+0x85/0x180
> [<0>] __x64_sys_read+0x21/0x30
> [<0>] x64_sys_call+0x45e7/0x4600
> [<0>] do_syscall_64+0xd5/0x230
> [<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Does user space need to change as well?
>
> Thanks,
> Kuai
>
> >
> > Thanks,
> > Song

Hi,

Thanks for your review. I rebased my patch onto the md-6.12 branch and
hit the same symptoms as Kuai. I need to investigate this and will come
back with my findings or a new patch version. There may also be a
problem with the tests themselves, because I can only reproduce the
hang when running the whole suite; running the tests one by one, I
don't see this problem.

Thanks,
Kinga
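
For reference, the deferred-uevent mechanism described in the quoted
commit message could look roughly like the sketch below. This is a
minimal illustration under stated assumptions, not the actual patch:
the uevent_work field, the md_uevent_fn() helper, and the function
signature shown are hypothetical, while md_event_count,
md_event_waiters, and all_mddevs_lock are existing symbols in
drivers/md/md.c.

#include <linux/kobject.h>
#include <linux/workqueue.h>
#include <linux/spinlock.h>
#include <linux/blkdev.h>

/* Runs in process context, where emitting a uevent is allowed. */
static void md_uevent_fn(struct work_struct *work)
{
	struct mddev *mddev = container_of(work, struct mddev, uevent_work);

	kobject_uevent(&disk_to_dev(mddev->gendisk)->kobj, KOBJ_CHANGE);
}

/*
 * md_new_events() can be called from interrupt context, so instead of
 * calling kobject_uevent() directly it only queues a work item
 * (initialized elsewhere with INIT_WORK(&mddev->uevent_work,
 * md_uevent_fn)). Uevents are not time critical, so deferring them to
 * a workqueue is safe.
 */
void md_new_events(struct mddev *mddev)
{
	atomic_inc(&md_event_count);
	wake_up(&md_event_waiters);
	schedule_work(&mddev->uevent_work);
}

Likewise, because all_mddevs_lock can now be taken from interrupt
context, its users would switch to the irq-safe lock variants:

	unsigned long flags;

	spin_lock_irqsave(&all_mddevs_lock, flags);
	/* ... walk the all_mddevs list ... */
	spin_unlock_irqrestore(&all_mddevs_lock, flags);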