On Thu 15-07-21 09:42:06, Boyang Xue wrote:
> On Thu, Jul 15, 2021 at 7:46 AM Roman Gushchin <guro@xxxxxx> wrote:
> >
> > On Thu, Jul 15, 2021 at 12:22:28AM +0800, Boyang Xue wrote:
> > > Hi Jan,
> > >
> > > On Wed, Jul 14, 2021 at 5:26 PM Jan Kara <jack@xxxxxxx> wrote:
> > > >
> > > > On Wed 14-07-21 16:44:33, Boyang Xue wrote:
> > > > > Hi Roman,
> > > > >
> > > > > On Wed, Jul 14, 2021 at 12:12 PM Roman Gushchin <guro@xxxxxx> wrote:
> > > > > >
> > > > > > On Wed, Jul 14, 2021 at 11:21:12AM +0800, Boyang Xue wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > I'm not sure if this is the right place to report this bug, please
> > > > > > > correct me if I'm wrong.
> > > > > > >
> > > > > > > I found kernel-5.14.0-rc1 (built from the Linus tree) crash when it's
> > > > > > > running xfstests generic/256 on ext4 [1]. Looking at the call trace,
> > > > > > > it looks like the bug had been introduced by the commit
> > > > > > >
> > > > > > > c22d70a162d3 writeback, cgroup: release dying cgwbs by switching attached inodes
> > > > > > >
> > > > > > > It only happens on aarch64, not on x86_64, ppc64le and s390x. Testing
> > > > > > > was performed with the latest xfstests, and the bug can be reproduced
> > > > > > > on ext{2, 3, 4} with {1k, 2k, 4k} block sizes.
> > > > > >
> > > > > > Hello Boyang,
> > > > > >
> > > > > > thank you for the report!
> > > > > >
> > > > > > Do you know on which line the oops happens?
> > > > >
> > > > > I was trying to inspect the vmcore with crash utility, but
> > > > > unfortunately it doesn't work.
> > > >
> > > > Thanks for report! Have you tried addr2line utility? Looking at the oops I
> > > > can see:
> > >
> > > Thanks for the tips!
> > >
> > > It's unclear to me that where to find the required address in the
> > > addr2line command line, i.e.
> > >
> > > addr2line -e /usr/lib/debug/lib/modules/5.14.0-0.rc1.15.bx.el9.aarch64/vmlinux
> > > <what address here?>
> >
> > You can use $nm <vmlinux> to get an address of cleanup_offline_cgwbs_workfn()
> > and then add 0x320.
>
> Thanks! Hope the following helps:

Thanks for the data!

> static void cleanup_offline_cgwbs_workfn(struct work_struct *work)
> {
> 	struct bdi_writeback *wb;
> 	LIST_HEAD(processed);
>
> 	spin_lock_irq(&cgwb_lock);
>
> 	while (!list_empty(&offline_cgwbs)) {
> 		wb = list_first_entry(&offline_cgwbs, struct bdi_writeback,
> 				      offline_node);
> 		list_move(&wb->offline_node, &processed);
>
> 		/*
> 		 * If wb is dirty, cleaning up the writeback by switching
> 		 * attached inodes will result in an effective removal of any
> 		 * bandwidth restrictions, which isn't the goal. Instead,
> 		 * it can be postponed until the next time, when all io
> 		 * will be likely completed. If in the meantime some inodes
> 		 * will get re-dirtied, they should be eventually switched to
> 		 * a new cgwb.
> 		 */
> 		if (wb_has_dirty_io(wb))
> 			continue;
>
> 		if (!wb_tryget(wb))          <=== line#679
> 			continue;

Aha, interesting. So it seems we crashed trying to dereference
wb->refcnt->data. So it looks like cgwb_release_workfn() raced with
cleanup_offline_cgwbs_workfn(): percpu_ref_exit() got called from
cgwb_release_workfn() and then cleanup_offline_cgwbs_workfn() called
wb_tryget(). I think the proper fix is to move:

	spin_lock_irq(&cgwb_lock);
	list_del(&wb->offline_node);
	spin_unlock_irq(&cgwb_lock);

in cgwb_release_workfn() to the beginning of that function so that we are
sure even cleanup_offline_cgwbs_workfn() cannot be working with the wb
when it is being released. Roman?

								Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
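
A rough sketch of what that reordering could look like in
cgwb_release_workfn() (mm/backing-dev.c). The rest of the function body is
abbreviated and only approximates the 5.14-rc1 code, so treat this as an
illustration of the proposed ordering rather than a literal patch:

	static void cgwb_release_workfn(struct work_struct *work)
	{
		struct bdi_writeback *wb = container_of(work, struct bdi_writeback,
							release_work);

		/*
		 * Take wb off the offline list first, under cgwb_lock, so that
		 * cleanup_offline_cgwbs_workfn() can no longer pick it up and
		 * call wb_tryget() on a refcnt that percpu_ref_exit() below has
		 * already torn down.
		 */
		spin_lock_irq(&cgwb_lock);
		list_del(&wb->offline_node);
		spin_unlock_irq(&cgwb_lock);

		wb_shutdown(wb);
		/* ... css puts, fprop_local_destroy_percpu(), etc. ... */

		percpu_ref_exit(&wb->refcnt);
		wb_exit(wb);
		kfree_rcu(wb, rcu);
	}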