On Tue, 2012-06-26 at 23:41 -0700, Andrew Morton wrote:
> On Wed, 27 Jun 2012 15:33:09 +0900 Minchan Kim <minchan@xxxxxxxxxx> wrote:
>
> > Anyway, let's wait for further answers, especially from the RT folks.
>
> rt folks said "it isn't changing", and I agree with them. It isn't
> worth breaking the rt-prio quality of service because a few odd parts
> of the kernel did something inappropriate. Especially when those
> few sites have alternatives.

I'm not exactly sure it's a 'few' sites... but yeah, there are a few
obvious sites we should look at.

AFAICT all lru_add_drain_all() callers do this optimistically, especially
since there's no hard synchronization against adding new entries to the
per-cpu pagevecs. So there's no hard requirement to wait for completion.
Not waiting has obvious problems as well, of course, but we could cheat
and time out after a few jiffies or so.

That would avoid the DoS scenario; it won't improve the overall quality
of the kernel though, since an unflushed pagevec can still result in
compaction etc. failing.

The problem with stuffing all this into hardirq context (using
on_each_cpu() and friends) is that the people who spin in FIFO threads
generally don't like interrupt latencies forced on them either. And I
presume the drain is currently done from scheduled work because it's
potentially quite expensive to flush all these pages.

The only alternative I can come up with is scheduling the work like we do
now, waiting for it for a few jiffies, tracking which CPUs completed,
cancelling the others, and remote-flushing their pagevecs from the
calling CPU (a rough sketch of the bounded-wait part is at the end of
this mail). But I can't say I like that option either...

As it stands, I've always said that doing while(1) from FIFO/RR tasks is
broken and you get to keep the pieces. If we can find good solutions for
this I'm all ears, but I don't think it's something we should bend over
backwards for.
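
For illustration only, here is a minimal sketch of what the "schedule the
drain work, but only wait a bounded time" idea could look like. This is
not a proposed patch: lru_add_drain_timeout(), drain_work and drain_works
are made-up names, and a real version would still have to deal with the
stragglers (cancelling the pending work and/or draining those CPUs
remotely). It only uses existing primitives (schedule_work_on(),
wait_for_completion_timeout(), lru_add_drain()).

#include <linux/cpu.h>
#include <linux/completion.h>
#include <linux/jiffies.h>
#include <linux/percpu.h>
#include <linux/swap.h>
#include <linux/workqueue.h>

struct drain_work {
	struct work_struct work;
	struct completion done;
};

static DEFINE_PER_CPU(struct drain_work, drain_works);

static void drain_local_pagevecs(struct work_struct *w)
{
	struct drain_work *dw = container_of(w, struct drain_work, work);

	lru_add_drain();		/* drain this CPU's pagevecs */
	complete(&dw->done);
}

/* Returns the number of CPUs that did not finish within 'timeout'. */
static int lru_add_drain_timeout(unsigned long timeout)
{
	int cpu, missed = 0;

	get_online_cpus();

	for_each_online_cpu(cpu) {
		struct drain_work *dw = &per_cpu(drain_works, cpu);

		/*
		 * Note: re-initialising a work item that may still be
		 * pending from a previous (timed-out) call is broken; a
		 * real implementation would need to cancel or reuse it.
		 */
		INIT_WORK(&dw->work, drain_local_pagevecs);
		init_completion(&dw->done);
		schedule_work_on(cpu, &dw->work);
	}

	for_each_online_cpu(cpu) {
		struct drain_work *dw = &per_cpu(drain_works, cpu);

		/* Bounded wait; give up on CPUs hogged by FIFO spinners. */
		if (!wait_for_completion_timeout(&dw->done, timeout))
			missed++;
	}

	put_online_cpus();

	return missed;
}

The caller would then get to decide what to do about the missed CPUs:
give up, cancel the work, or flush those pagevecs remotely as described
above.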