Re: [RFC PATCH] mm: silence soft lockups from unlock_page

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 22, 2020 at 2:29 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> -#define PAGE_WAIT_TABLE_BITS 8
> +#define PAGE_WAIT_TABLE_BITS 10

Well, that seems harmless even on small machines.

> +       bool first_time = true;
>         bool thrashing = false;
>         bool delayacct = false;
>         unsigned long pflags;
> @@ -1134,7 +1135,12 @@ static inline int wait_on_page_bit_commo
>                 spin_lock_irq(&q->lock);
>
>                 if (likely(list_empty(&wait->entry))) {
> -                       __add_wait_queue_entry_tail(q, wait);
> +                       if (first_time) {
> +                               __add_wait_queue_entry_tail(q, wait);
> +                               first_time = false;
> +                       } else {
> +                               __add_wait_queue(q, wait);
> +                       }
>                         SetPageWaiters(page);
>                 }

This seems very hacky.

And in fact, looking closer, I'd say that there are more serious problems here.

Look at that WQ_FLAG_EXCLUSIVE thing: non-exclusive waits should
always go at the head (because they're not going to steal the bit,
they just want to know when it got cleared), and exclusive waits
should always go at the tail (because of fairness).

But that's not at all what we do.

Your patch adds even more confusion to this nasty area.

And your third one:

> +               if (ret)
> +                       woken++;
>
> -               if (bookmark && (++cnt > WAITQUEUE_WALK_BREAK_CNT) &&
> +               if (bookmark && (++cnt > WAITQUEUE_WALK_BREAK_CNT) && woken &&

I've got two reactions to this

 (a) we should not need a new "woken" variable, we should just set a
high bit of "cnt" and make WAITQUEUE_WALK_BREAK_CNT contain that high
bit

     (Tune "high bit" to whatever you want: it could be either the
_real_ high bit of the variable, or it could be something like "128",
which would mean that you'd break out after 128 non-waking entries).

 (b) Ugh, what hackery and magic behavior regardless

I'm really starting to hate that wait_on_page_bit_common() function.

See a few weeks ago how the function looks buggy to begin with

  https://lore.kernel.org/lkml/CAHk-=wjJA2Z3kUFb-5s=6+n0qbTs8ELqKFt9B3pH85a8fGD73w@xxxxxxxxxxxxxx/

and that never got resolved either (but probably never happens in practice).

              Linus




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux