Re: scheduling while atomic in z3fold

On Wed, 2020-12-02 at 23:08 +0100, Sebastian Andrzej Siewior wrote:
> On 2020-12-02 03:30:27 [+0100], Mike Galbraith wrote:
>
> > What I'm seeing is the below.  rt_mutex_has_waiters() says yup we have
> > a waiter, rt_mutex_top_waiter() emits the missing cached leftmost, and
> > rt_mutex_dequeue_pi() chokes on it.  Lock is buggered.
>
> correct. So this:
>
> diff --git a/mm/z3fold.c b/mm/z3fold.c
> --- a/mm/z3fold.c
> +++ b/mm/z3fold.c
> @@ -342,7 +342,7 @@ static inline void free_handle(unsigned long handle)
>
>  	if (is_free) {
>  		struct z3fold_pool *pool = slots_to_pool(slots);
> -
> +		memset(slots, 0xee, sizeof(struct z3fold_buddy_slots));
>  		kmem_cache_free(pool->c_handle, slots);
>  	}
>  }
> @@ -548,8 +549,10 @@ static void __release_z3fold_page(struct z3fold_header *zhdr, bool locked)
>  		set_bit(HANDLES_ORPHANED, &zhdr->slots->pool);
>  	read_unlock(&zhdr->slots->lock);
>
> -	if (is_free)
> +	if (is_free) {
> +		memset(zhdr->slots, 0xdd, sizeof(struct z3fold_buddy_slots));
>  		kmem_cache_free(pool->c_handle, zhdr->slots);
> +	}
>
>  	if (locked)
>  		z3fold_page_unlock(zhdr);
>
> resulted in:
>
> |[  377.200696] Out of memory: Killed process 284358 (oom01) total-vm:15780488kB, anon-rss:150624kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:16760kB oom_score_adj:0
> |[  377.205438] ------------[ cut here ]------------
> |[  377.205441] pvqspinlock: lock 0xffff8880105c6828 has corrupted value 0xdddddddd!
> |[  377.205448] WARNING: CPU: 6 PID: 72 at kernel/locking/qspinlock_paravirt.h:498 __pv_queued_spin_unlock_slowpath+0xb3/0xc0
> |[  377.205455] Modules linked in:
> |[  377.205456] CPU: 6 PID: 72 Comm: oom_reaper Not tainted 5.10.0-rc6-rt13-rt+ #103
> |[  377.205458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1 04/01/2014
> |[  377.205460] RIP: 0010:__pv_queued_spin_unlock_slowpath+0xb3/0xc0
> …
> |[  377.205475] Call Trace:
> |[  377.205477]  __raw_callee_save___pv_queued_spin_unlock_slowpath+0x11/0x20
> |[  377.205481]  .slowpath+0x9/0xe
> |[  377.205483]  _raw_spin_unlock_irqrestore+0x5/0x50
> |[  377.205486]  rt_mutex_futex_unlock+0x9e/0xb0
> |[  377.205488]  z3fold_free+0x2b0/0x470
> |[  377.205491]  zswap_free_entry+0x7d/0xc0
> |[  377.205493]  zswap_frontswap_invalidate_page+0x87/0x90
> |[  377.205495]  __frontswap_invalidate_page+0x58/0x90
> |[  377.205496]  swap_range_free.constprop.0+0x99/0xb0
> |[  377.205499]  swapcache_free_entries+0x131/0x390
> |[  377.205501]  free_swap_slot+0x99/0xc0
> |[  377.205502]  __swap_entry_free+0x8a/0xa0
> |[  377.205504]  free_swap_and_cache+0x36/0xd0
> |[  377.205506]  zap_pte_range+0x16a/0x940
> |[  377.205509]  unmap_page_range+0x1d8/0x310
> |[  377.205514]  __oom_reap_task_mm+0xe7/0x190
> |[  377.205520]  oom_reap_task_mm+0x5a/0x280
> |[  377.205521]  oom_reaper+0x98/0x1c0
> |[  377.205525]  kthread+0x18c/0x1b0
> |[  377.205528]  ret_from_fork+0x22/0x30
> |[  377.205531] ---[ end trace 0000000000000002 ]---
>
> Then I reverted commit
>    4a3ac9311dac3 ("mm/z3fold.c: add inter-page compaction")
>
> and it seems to work now. Any suggestions? It looks like use-after-free.

Looks like...

d8f117abb380 z3fold: fix use-after-free when freeing handles

...wasn't completely effective.  write_unlock() in free_handle() is
where I see explosions.  Only trouble is that 4a3ac9311dac3 arrived in
5.5, and my 5.[5-7]-rt trees refuse to reproduce (d8f117abb380 applied),
whereas 5.9 and 5.10 do so quite reliably.  There is a heisenbug aspect
though: one trace_printk() in free_handle() made the bug go hide, so the
heisen-fairies in earlier trees were probably just messing with me..
because they can, being the twisted little freaks they are ;-)

	-Mike
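
For readers following along, the poison-before-free trick in the quoted
debug patch can be sketched in userspace.  This is only an illustration
under stated assumptions, not the z3fold code: struct demo_slots and the
demo_*() helpers are hypothetical names, the "cache" is a single static
object so that looking at it after the free is well defined, and the lock
is a plain integer rather than a real rwlock or rt_mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define POISON_BYTE 0xdd		/* same byte the debug patch writes */
#define POISON_WORD 0xddddddddu		/* what a 32-bit lock word then reads as */

/* Hypothetical stand-in for z3fold_buddy_slots: a lock word plus payload. */
struct demo_slots {
	uint32_t lock;			/* 0 = unlocked, 1 = locked */
	unsigned long slot[4];
};

/*
 * One-object "cache" backed by static storage (an assumption of this demo),
 * so a stale pointer can be inspected after demo_free() without undefined
 * behaviour.
 */
static struct demo_slots demo_storage;
static bool demo_in_use;

static struct demo_slots *demo_alloc(void)
{
	assert(!demo_in_use);
	memset(&demo_storage, 0, sizeof(demo_storage));
	demo_in_use = true;
	return &demo_storage;
}

/* Poison the object before handing it back, as the debug patch does. */
static void demo_free(struct demo_slots *s)
{
	assert(s == &demo_storage && demo_in_use);
	memset(s, POISON_BYTE, sizeof(*s));
	demo_in_use = false;
}

/* True if the lock word carries the poison pattern written by demo_free(). */
static bool demo_is_poisoned(const struct demo_slots *s)
{
	return s->lock == POISON_WORD;
}

/*
 * Unlock with a sanity check on the lock word.  Returns false when the
 * value is not a held lock, e.g. because the object was already freed.
 */
static bool demo_unlock(struct demo_slots *s)
{
	if (s->lock != 1)
		return false;
	s->lock = 0;
	return true;
}
```

An unlock through a stale pointer after demo_free() sees 0xdddddddd in the
lock word and fails the check, which is the same kind of signal as the
"corrupted value 0xdddddddd" warning in the quoted log.  Distinct poison
bytes per free site (0xee and 0xdd in the patch) tell you which path freed
the object.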





