On Mon, Mar 27, 2017 at 10:55:14AM +0200, Jesper Dangaard Brouer wrote:
> On Mon, 27 Mar 2017 03:32:47 -0400 (EDT)
> Pankaj Gupta <pagupta@xxxxxxxxxx> wrote:
>
> > Hello,
> >
> > It looks like a race with softirq and normal process context.
> >
> > Just thinking if we really want allocations from 'softirqs' to be
> > done using per cpu list?
>
> Yes, softirq need fast page allocs. The softirq use-case is refilling
> the DMA RX rings, which is time critical, especially for NIC drivers.
> For this reason most drivers implement different page recycling tricks.
>
> > Or we can have some check in 'free_hot_cold_page' for softirqs
> > to check if we are on a path of returning from hard interrupt don't
> > allocate from per cpu list.
>
> A possible solution, would be use the local_bh_{disable,enable} instead
> of the {preempt_disable,enable} calls. But it is slower, using numbers
> from [1] (19 vs 11 cycles), thus the expected cycles saving is 38-19=19.
>
> The problematic part of using local_bh_enable is that this adds a
> softirq/bottom-halves rescheduling point (as it checks for pending
> BHs). Thus, this might affects real workloads.
>
>
> I'm unsure what the best option is. I'm leaning towards partly
> reverting[1] and go back to doing the slower local_irq_save +
> local_irq_restore as before.
>
> Afterwards we can add a bulk page alloc+free call, that can amortize
> this 38 cycles cost (of local_irq_{save,restore}). Or add a function
> call that MUST only be called from contexts with IRQs enabled, which
> allow using the unconditionally local_irq_{disable,enable} as it only
> costs 7 cycles.
>

It's possible to have a separate list for hard/soft IRQ that are protected
although great care is needed to drain properly. I have a partial
prototype lying around marked as "interesting if we ever need it" but it
needs more work. It's sufficiently complex that I couldn't rush it as a
fix with the time I currently have available.

For 4.11, it's safer to revert and try again later bearing in mind that
softirqs are in the critical allocation path for some drivers. I'll
prepare a patch.

--
Mel Gorman
SUSE Labs
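
As a rough illustration of the cycle arithmetic discussed in this thread,
the sketch below tabulates the quoted costs and shows how a bulk page
alloc+free call could amortize the 38-cycle local_irq_{save,restore} cost
over a batch of pages. The helper names (cycles_saved_vs_irq_save,
amortized_centicycles_per_page) are hypothetical and are not kernel APIs.
The cycle figures are the ones quoted in the thread and will likely differ
on other hardware. The sketch also assumes that a bulk call needs only one
protected section per batch, which the thread does not spell out.

```c
#include <assert.h>

/* Cycle costs as quoted in the thread; illustrative, not universal. */
enum {
    CYCLES_IRQ_SAVE_RESTORE   = 38, /* local_irq_save + local_irq_restore */
    CYCLES_BH_DISABLE_ENABLE  = 19, /* local_bh_disable + local_bh_enable */
    CYCLES_PREEMPT            = 11, /* preempt_disable + preempt_enable */
    CYCLES_IRQ_DISABLE_ENABLE = 7   /* local_irq_disable + local_irq_enable */
};

/*
 * Hypothetical helper: cycles saved per protected section when an
 * alternative primitive replaces local_irq_{save,restore}.
 */
int cycles_saved_vs_irq_save(int alternative_cycles)
{
    return CYCLES_IRQ_SAVE_RESTORE - alternative_cycles;
}

/*
 * Hypothetical helper: protection overhead per page when a single
 * protected section covers a whole batch of pages (assumption).
 * The result is in hundredths of a cycle, rounded down, so no
 * floating point is needed.
 */
int amortized_centicycles_per_page(int section_cycles, int batch)
{
    assert(batch > 0);
    return section_cycles * 100 / batch;
}
```

For example, cycles_saved_vs_irq_save(CYCLES_BH_DISABLE_ENABLE) reproduces
the 38-19=19 figure from the thread, and
amortized_centicycles_per_page(CYCLES_IRQ_SAVE_RESTORE, 16) gives 237, that
is about 2.4 cycles per page for a batch of 16.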