Re: [PATCH] xfs: use current->journal_info for detecting transaction recursion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 23, 2021 at 02:28:37PM +1100, Dave Chinner wrote:
> On Mon, Feb 22, 2021 at 06:15:57PM -0800, Darrick J. Wong wrote:
> > On Tue, Feb 23, 2021 at 10:31:07AM +1100, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > > 
> > > Because the iomap code using PF_MEMALLOC_NOFS to detect transaction
> > > recursion in XFS is just wrong. Remove it from the iomap code and
> > > replace it with XFS specific internal checks using
> > > current->journal_info instead.
> > 
> > It might be worth mentioning that this changes the PF_MEMALLOC_NOFS
> > behavior very slightly -- it's now bound to the allocation and freeing
> > of the transaction, instead of the strange way we used to do this, where
> > we'd set it at reservation time but we don't /clear/ it at unreserve time.
> 
> They are effectively the same thing, so I think you are splitting
> hairs here. The rule is "transaction context is NOFS" so whether it
> is set when the transaction context is entered or a few instructions
> later when we start the reservation is not significant.
> 
> > This doesn't strictly look like a fix patch, but as it is a Dumb
> > Developer Detector(tm) I could try to push it for 5.12 ... or just make
> > it one of the first 5.13 patches.  Any preference?
> 
> Nope. You're going to need to fix the transaction nesting the new gc
> code does before applying this, though, because that is detected as
> transaction recursion by this patch....

Well yes, I was trying to see if I could throw in the fix patch and the
idiot detector, both at the same time... :)

That said, it crashes in xfs/229:

  2822            args->result = __xfs_btree_split(args->cur, args->level, args->ptrp,
  2823                                             args->key, args->curp, args->stat);
  2824            complete(args->done);
  2825
> 2826            xfs_trans_clear_context(args->cur->bc_tp);
  2827            current_restore_flags_nested(&pflags, new_pflags);

It's possible for the original wait_for_completion() in
xfs_btree_split() to wake up immediately after complete() drops the
lock.  If it returns (and blows away the stack variable @args) before
the worker resumes, then the worker will be dereferencing freed stack
memory and blows up:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0 
 Oops: 0000 [#1] PREEMPT SMP
 CPU: 0 PID: 375393 Comm: kworker/0:16 Tainted: G        W         5.11.0-rc4-djw #rc4
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
 Workqueue: xfsalloc xfs_btree_split_worker [xfs]
 RIP: 0010:xfs_btree_split_worker+0xaa/0x120 [xfs]
 Code: 4b d8 48 8b 53 d0 8b 73 c8 48 8b 7b c0 4c 8b 4b e8 4c 8b 43 e0 e8 46 f5 ff ff 48 8b 7b f8 89 43 f0 e8 0a 84 aa e0 48 8b 43 c0 <48> 8b 00 49 3b 84 24 90 08 00 00 74 20 65 48 8b 14 25 80 6d 01 00
 RSP: 0018:ffffc90003123e60 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffffc9000311b8a8 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: 0000000000000286 RDI: 00000000ffffffff
 RBP: 0000000000000000 R08: ffffc9000311b858 R09: ffff88803ec29f80
 R10: ffff8880054a80d0 R11: ffff88803ec2a030 R12: ffff888005235d00
 R13: 0000000004248060 R14: ffff88800cbe5f00 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000000200c002 CR4: 00000000001706b0
 Call Trace:
  process_one_work+0x1dd/0x3b0
  worker_thread+0x57/0x3c0
  ? rescuer_thread+0x3b0/0x3b0
  kthread+0x14f/0x170
  ? __kthread_bind_mask+0x60/0x60
  ret_from_fork+0x1f/0x30
 Modules linked in: xfs dm_thin_pool dm_persistent_data dm_bio_prison dm_snapshot dm_bufio dm_flakey libcrc32c ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip_set_hash_ip xt_REDIRECT xt_set xt_tcpudp ip_set_hash_net ip_set_hash_mac ip_set nfnetlink iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bfq sch_fq_codel ip_tables x_tables overlay nfsv4 af_packet [last unloaded: xfs]
 Dumping ftrace buffer:
    (ftrace buffer empty)
 CR2: 0000000000000000
 ---[ end trace 1a6aae037bf68618 ]---

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux