On 4/24/19 9:32 AM, Bart Van Assche wrote:
> On Wed, 2019-04-24 at 08:24 -0700, James Bottomley wrote:
>> On Wed, 2019-04-24 at 15:52 +0800, Ming Lei wrote:
>>> On Tue, Apr 23, 2019 at 08:37:15AM -0700, Bart Van Assche wrote:
>>>> On Tue, 2019-04-23 at 18:32 +0800, Ming Lei wrote:
>>>>> #define SCSI_INLINE_PROT_SG_CNT 1
>>>>>
>>>>> +#define SCSI_INLINE_SG_CNT 2
>>>>
>>>> So this patch inserts one kmalloc() and one kfree() call in the hot
>>>> path for every SCSI request with more than two elements in its
>>>> scatterlist? Isn't
>>>
>>> Slab or its variants are designed for the fast path, and NVMe PCI uses
>>> slab for allocating the sg list in the fast path too.
>>
>> Actually, that's not really true: base kmalloc can do all sorts of
>> things, including kicking off reclaim, so it's not really something we
>> like using in the fast path. The only fast and safe kmalloc you can
>> rely on in the fast path is GFP_ATOMIC, which will fail quickly if no
>> memory can easily be found. *However*, the sg_table allocation
>> functions are all pool backed (lib/sg_pool.c), so they use the
>> lightweight GFP_ATOMIC mechanism for kmalloc initially, coupled with a
>> backing pool in case of failure, to ensure forward progress.
>>
>> So I think you're both right: you shouldn't simply use kmalloc, but
>> this implementation doesn't; it uses the sg_table allocation functions,
>> which correctly control kmalloc so that it is lightweight, efficient,
>> and able to make forward progress.
>
> Another concern is whether this change can cause a livelock. If the
> system is running out of memory and the page cache submits a write
> request with a scatterlist with more than two elements, and the
> kmalloc() for the scatterlist fails, will that prevent the page cache
> from making any progress with writeback?

The mempool prevents that - as long as we have at least one reserved
unit of the given size, we know we'll always make progress (eventually).
The finite lifetimes of the sg pool allocations guarantee that.

-- 
Jens Axboe
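
For reference, a minimal user-space sketch of the forward-progress
pattern described above: try a cheap allocation that is allowed to fail
quickly (the analogue of kmalloc with GFP_ATOMIC), and fall back to a
small pre-allocated reserve on failure. This is a toy model, not the
kernel's mempool or lib/sg_pool.c implementation; every name in it
(toy_pool, toy_pool_alloc, RESERVE_SLOTS, ...) is invented for
illustration.

/*
 * Toy model of the mempool forward-progress argument.  Not kernel code;
 * all names are made up for illustration.
 *
 * Fast path: a normal allocation that may fail quickly.  On failure,
 * fall back to a small pre-reserved pool.  Because every allocation has
 * a finite lifetime and is eventually returned, at least one request can
 * always complete, so the system does not livelock.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RESERVE_SLOTS 2          /* pre-allocated emergency elements  */
#define SLOT_SIZE     4096       /* size of one scatterlist chunk     */

struct toy_pool {
	void *reserve[RESERVE_SLOTS];  /* filled at init time           */
	int   free_count;              /* reserve slots still available */
};

static int toy_pool_init(struct toy_pool *p)
{
	p->free_count = 0;
	for (int i = 0; i < RESERVE_SLOTS; i++) {
		p->reserve[i] = malloc(SLOT_SIZE);
		if (!p->reserve[i])
			return -1;     /* init may fail; the fast path must not */
		p->free_count++;
	}
	return 0;
}

/* Fast path: cheap, failure-tolerant attempt first, reserve second. */
static void *toy_pool_alloc(struct toy_pool *p)
{
	void *buf = malloc(SLOT_SIZE); /* stands in for the GFP_ATOMIC try */

	if (buf)
		return buf;
	if (p->free_count > 0)         /* reserve guarantees progress */
		return p->reserve[--p->free_count];
	return NULL;                   /* caller must wait for a free */
}

static void toy_pool_free(struct toy_pool *p, void *buf)
{
	/* Refill the reserve first so forward progress stays possible. */
	if (p->free_count < RESERVE_SLOTS)
		p->reserve[p->free_count++] = buf;
	else
		free(buf);
}

int main(void)
{
	struct toy_pool pool;

	if (toy_pool_init(&pool))
		return 1;

	void *sg = toy_pool_alloc(&pool);   /* would hold a chained sg list  */
	if (sg) {
		memset(sg, 0, SLOT_SIZE);   /* "issue the request"           */
		toy_pool_free(&pool, sg);   /* finite lifetime: always freed */
	}
	printf("reserve slots free: %d\n", pool.free_count);
	return 0;
}

The key property is the one Jens points to: because users of the pool
always free what they allocate within a bounded time, the reserve
refills, so under memory pressure requests are serialized through the
reserve rather than failing or spinning forever.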