On Fri, Jan 07, 2022 at 08:10:33PM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 05, 2022 at 07:45:21PM +0800, Hillf Danton wrote:
> > On Mon, 3 Jan 2022 20:47:02 +0200 Leon Romanovsky wrote:
> > > On Mon, Jan 03, 2022 at 09:05:16AM -0800, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    a8ad9a2434dc Merge tag 'efi-urgent-for-v5.16-2' of git://g..
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10cf5253b00000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=1a86c22260afac2f
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=e3f96c43d19782dd14a7
> > > > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > > >
> > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+e3f96c43d19782dd14a7@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > >
> > > > ==================================================================
> > > > BUG: KASAN: use-after-free in ucma_cleanup_multicast drivers/infiniband/core/ucma.c:491 [inline]
> > > > BUG: KASAN: use-after-free in ucma_destroy_private_ctx+0x914/0xb70 drivers/infiniband/core/ucma.c:579
> > > > Read of size 8 at addr ffff88801bb74b00 by task syz-executor.1/25529
> > >
> > > Jason,
> > >
> > > Can it be race between ucma_process_join() and "if (refcount_read(&ctx->ref))"
> > > check in ucma_destroy_private_ctx()?
> >
> > Given cmpxchg in both ucma_close() and ucma_destroy_id(),
> > ucma_destroy_private_ctx() can not run more than once, in addition to what
> > is more weird is that the ucma_fops.release either is running in parallel
> > to a writer or completes with a writer left behind. Light on if that weirdness
> > is down to anything other than syzbot is highly appreciated.
>
> It is a stupid mistake, it is because
>
>     xa_for_each(&multicast_table, index, mc) {
>         if (mc->ctx != ctx)
>             ^^^^^^^
>
> Nothing is locking mc here, this needed to hold the xa_lock to be
> correct.
>
> It is caused by this:
>
> commit 95fe51096b7adf1d1e7315c49c75e2f75f162584
> Author: Jason Gunthorpe <jgg@xxxxxxxx>
> Date:   Tue Aug 18 15:05:17 2020 +0300
>
>     RDMA/ucma: Remove mc_list and rely on xarray
>
>     It is not really necessary to keep a linked list of mcs associated with
>     each context when we can just scan the xarray to find the right things.
>
>     The removes another overloading of file->mut by relying on the xarray
>     locking for mc instead.
>
>     Link: https://lore.kernel.org/r/20200818120526.702120-6-leon@xxxxxxxxxx
>     Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
>     Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
>
>
> So it is solved either by putting the linked list back, but locking it
> using the xa_lock, or by holding the xa_lock when doing the
> xa_for_each()
>
> A list is probably the better choice.

I'll submit the fix after internal review is complete.

Thanks

>
> Jason
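
For reference, the second option Jason mentions (hold the lock across the
walk so that mc cannot be freed while mc->ctx is being read) can be sketched
as a small userspace analogy. This is not the kernel patch and not ucma code:
struct mc_table, table_add(), table_cleanup_ctx() and run_demo() are made-up
names, a fixed array stands in for the xarray, and a pthread mutex stands in
for xa_lock. Matching entries are unlinked under the lock and freed only
after it is dropped, on the assumption that the real teardown work should not
run with the lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SLOTS 8

struct ctx { int id; };

struct mc {
	struct ctx *ctx;  /* owner; only dereferenced with the table lock held */
	struct mc *next;  /* links entries on a private "to free" list */
};

/* Hypothetical stand-in for multicast_table and its xa_lock. */
struct mc_table {
	pthread_mutex_t lock;
	struct mc *slot[TABLE_SLOTS];
};

static void table_init(struct mc_table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	for (size_t i = 0; i < TABLE_SLOTS; i++)
		t->slot[i] = NULL;
}

/* Add one entry owned by ctx. Returns 0 on success, -1 if full or OOM. */
static int table_add(struct mc_table *t, struct ctx *ctx)
{
	struct mc *mc = malloc(sizeof(*mc));
	int ret = -1;

	if (!mc)
		return -1;
	mc->ctx = ctx;
	mc->next = NULL;

	pthread_mutex_lock(&t->lock);
	for (size_t i = 0; i < TABLE_SLOTS; i++) {
		if (!t->slot[i]) {
			t->slot[i] = mc;
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);

	if (ret)
		free(mc);
	return ret;
}

/*
 * Remove every entry owned by ctx and return how many were released.
 * The walk and the mc->ctx comparison happen with the lock held, so no
 * other thread can free an entry between the lookup and the read.
 */
static int table_cleanup_ctx(struct mc_table *t, struct ctx *ctx)
{
	struct mc *doomed = NULL;
	int released = 0;

	assert(t && ctx);

	pthread_mutex_lock(&t->lock);
	for (size_t i = 0; i < TABLE_SLOTS; i++) {
		struct mc *mc = t->slot[i];

		if (mc && mc->ctx == ctx) {
			t->slot[i] = NULL;   /* unlink while still locked */
			mc->next = doomed;
			doomed = mc;
		}
	}
	pthread_mutex_unlock(&t->lock);

	/* Entries are now private to this thread; free them unlocked. */
	while (doomed) {
		struct mc *next = doomed->next;

		free(doomed);
		doomed = next;
		released++;
	}
	return released;
}

/* Usage: two entries for context a, one for context b. */
static int run_demo(void)
{
	struct mc_table t;
	struct ctx a = { .id = 1 }, b = { .id = 2 };
	int released;

	table_init(&t);
	table_add(&t, &a);
	table_add(&t, &b);
	table_add(&t, &a);

	released = table_cleanup_ctx(&t, &a);  /* releases both entries of a */
	table_cleanup_ctx(&t, &b);             /* release the remaining entry */
	pthread_mutex_destroy(&t.lock);
	return released;
}
```

The per-context list that Jason prefers would avoid scanning unrelated
entries at all; the sketch only illustrates the locking rule, not which of
the two options the eventual fix takes.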