On Wed, Dec 12, 2018 at 09:36:36AM -0700, Jens Axboe wrote:
> On 12/12/18 9:28 AM, Keith Busch wrote:
> > On Wed, Dec 12, 2018 at 09:18:11AM -0700, Jens Axboe wrote:
> >> When boxes are run near (or to) OOM, we have a problem with the discard
> >> page allocation in nvme. If we fail allocating the special page, we
> >> return busy, and it'll get retried. But since ordering is honored for
> >> dispatch requests, we can keep retrying this same IO and failing. Behind
> >> that IO could be requests that want to free memory, but they never get
> >> the chance.
> >>
> >> Allocate a fixed discard page per controller for a safe fallback, and use
> >> that if the initial allocation fails.
> >
> > Do we need to allocate this per controller? One page for the whole driver
> > may be sufficient to make forward progress, right?
>
> It should be, but that might create a shit storm if we're OOM and have
> tons of drives. I think one per controller is saner, and it's dwarfed
> by memory we consume anyway in static allocations.

Okay fair enough.

Reviewed-by: Keith Busch <keith.busch@xxxxxxxxx>
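
For illustration, the fallback scheme described in the quoted patch text could look roughly like the userspace C sketch below. This is not the actual nvme patch: struct ctrl, try_alloc_page(), discard_page_get(), discard_page_put() and the simulate_oom hook are hypothetical names made up for this example, and malloc() stands in for the kernel page allocator.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ 4096

/* Hypothetical stand-in for a controller; not the real struct nvme_ctrl. */
struct ctrl {
    void *discard_page;            /* reserved once at init time */
    atomic_bool discard_page_busy; /* true while the reserved page is in use */
};

/* Test hook that simulates memory pressure (illustration only). */
static bool simulate_oom;

static void *try_alloc_page(void)
{
    return simulate_oom ? NULL : malloc(PAGE_SZ);
}

/* Reserve the fallback page up front, while memory is still available. */
static int ctrl_init(struct ctrl *c)
{
    c->discard_page = malloc(PAGE_SZ);
    if (!c->discard_page)
        return -1;
    atomic_init(&c->discard_page_busy, false);
    return 0;
}

/*
 * Get a page for a discard request. Returns NULL only if the normal
 * allocation failed AND the reserved page is already in use; the caller
 * would then report "busy" and the request would be retried later.
 */
static void *discard_page_get(struct ctrl *c)
{
    void *page = try_alloc_page();

    if (page)
        return page;
    /* atomic_exchange returns the old value: true means already claimed. */
    if (atomic_exchange(&c->discard_page_busy, true))
        return NULL;
    return c->discard_page;
}

/* Release a page; the reserved page is marked free rather than freed. */
static void discard_page_put(struct ctrl *c, void *page)
{
    assert(page != NULL);
    if (page == c->discard_page)
        atomic_store(&c->discard_page_busy, false);
    else
        free(page);
}

static void ctrl_exit(struct ctrl *c)
{
    free(c->discard_page);
}
```

In this sketch each controller owns its own reserve, so a discard that falls back on one controller never waits on a discard in flight on another. A single driver-wide page would still allow forward progress, but every controller would contend for it under OOM, which seems to be the concern raised in the reply above.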