On Fri, 2024-07-05 at 07:13 -1000, Tejun Heo wrote:
> Hello,
>
> On Fri, Jul 05, 2024 at 03:55:44PM +0800, boy.wu wrote:
> > From: Boy Wu <boy.wu@xxxxxxxxxxxx>
> >
> > In 32bit SMP systems, if the system is stressed on the sys node
> > by processes, it may cause blkcg_fill_root_iostats to have a concurrent
>
> What is sys node?
>
> > problem on the seqlock in u64_stats_update, which will cause a deadlock
> > on u64_stats_fetch_begin in blkcg_print_one_stat.
>
> I'm not following the scenario. Can you please detail the scenario where
> this leads to deadlocks?
>
> Thanks.
>
> --
> tejun

I am using stress-ng to stress my ARM 32-bit SMP system. Its --sysfs
test case creates processes that read and write the nodes under /sys/.
During that test I hit a deadlock: 3 CPUs are in
do_raw_spin_lock (block/blk-cgroup.c:997) in blkcg_print_stat, 1 CPU is
in u64_stats_fetch_begin (block/blk-cgroup.c:931) in blkcg_print_stat,
and sync.seq.sequence is an odd number instead of an even number.

Accessing /sys/fs/cgroup/io.stat calls blkcg_print_stat. There is a
small chance that a process on each of the four CPU cores accesses
/sys/fs/cgroup/io.stat at the same time, which means four CPUs are
calling blkcg_print_stat. As a result, blkcg_fill_root_iostats is called
simultaneously. However, u64_stats_update_begin_irqsave and
u64_stats_update_end_irqrestore are not protected by spin locks, so
there is a small chance that sync.seq.sequence is still odd after
u64_stats_update_end_irqrestore because of the concurrent CPU access:
incrementing sync.seq.sequence is not an atomic operation, it is a
separate load, add and store.
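To make the interleaving concrete, here is a small user-space model of
it. This is only a sketch under my own assumptions, not kernel code: it
replays the load/add/store steps of two writers in a fixed order, and
the names struct writer, writer_step and run_schedule are made up for
the illustration.

```c
#include <assert.h>

/* One simulated writer: a private register (r3 in the disassembly) and
 * a step counter over the six steps of begin (load/add/store) followed
 * by end (load/add/store). */
struct writer {
	unsigned int reg;
	int step;	/* 0..5 = next step to run, 6 = finished */
};

/* Execute the writer's next step against the shared sequence word. */
static void writer_step(struct writer *w, unsigned int *seq)
{
	assert(w->step < 6);	/* each writer has exactly six steps */
	switch (w->step % 3) {
	case 0:			/* ldr r3, [seq] */
		w->reg = *seq;
		break;
	case 1:			/* add r3, r3, #1 */
		w->reg += 1;
		break;
	case 2:			/* str r3, [seq] */
		*seq = w->reg;
		break;
	}
	w->step++;
}

/* Run two writers in the order given by schedule, one character ('A' or
 * 'B') per step, and return the final sequence value. */
static unsigned int run_schedule(const char *schedule)
{
	struct writer a = { 0, 0 }, b = { 0, 0 };
	unsigned int seq = 0;
	const char *p;

	for (p = schedule; *p; p++)
		writer_step(*p == 'A' ? &a : &b, &seq);
	return seq;
}
```

With the writers serialized, run_schedule("AAAAAABBBBBB") returns 4,
which is even. With the order "AAABAAABBBBB", writer B loads the
sequence while A is in the middle of its update and later stores over
A's end increment, so run_schedule returns 3. The sequence stays odd
although both writers have finished, which matches the odd
sync.seq.sequence seen by the CPU waiting in u64_stats_fetch_begin.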
do_raw_write_seqcount_begin():
/usr/src/kernel/common/include/linux/seqlock.h:469
c05e5cfc:	e5963030 	ldr	r3, [r6, #48]	; 0x30
c05e5d00:	e2833001 	add	r3, r3, #1
c05e5d04:	e5863030 	str	r3, [r6, #48]	; 0x30
/usr/src/kernel/common/include/linux/seqlock.h:470
c05e5d08:	f57ff05a 	dmb	ishst

do_raw_write_seqcount_end():
/usr/src/kernel/common/include/linux/seqlock.h:489
c05e5d30:	f57ff05a 	dmb	ishst
/usr/src/kernel/common/include/linux/seqlock.h:490
c05e5d34:	e5963030 	ldr	r3, [r6, #48]	; 0x30
c05e5d38:	e2833001 	add	r3, r3, #1
c05e5d3c:	e5863030 	str	r3, [r6, #48]	; 0x30

To prevent this problem, I added spin locks in blkcg_fill_root_iostats,
and this solution works fine for me with the stress-ng --sysfs test.

--
boy.wu