On Wed, 2019-04-24 at 08:24 -0700, James Bottomley wrote:
> On Wed, 2019-04-24 at 15:52 +0800, Ming Lei wrote:
> > On Tue, Apr 23, 2019 at 08:37:15AM -0700, Bart Van Assche wrote:
> > > On Tue, 2019-04-23 at 18:32 +0800, Ming Lei wrote:
> > > > #define SCSI_INLINE_PROT_SG_CNT 1
> > > >
> > > > +#define SCSI_INLINE_SG_CNT 2
> > >
> > > So this patch inserts one kmalloc() and one kfree() call in the hot
> > > path for every SCSI request with more than two elements in its
> > > scatterlist? Isn't
> >
> > Slab or its variants are designed for fast path, and NVMe PCI uses
> > slab for allocating sg list in fast path too.
>
> Actually, that's not really true base kmalloc can do all sorts of
> things including kick off reclaim so it's not really something we like
> using in the fast path. The only fast and safe kmalloc you can rely on
> in the fast path is GFP_ATOMIC which will fail quickly if no memory
> can easily be found. *However* the sg_table allocation functions are
> all pool backed (lib/sg_pool.c), so they use the lightweight GFP_ATOMIC
> mechanism for kmalloc initially coupled with a backing pool in case of
> failure to ensure forward progress.
>
> So, I think you're both right: you shouldn't simply use kmalloc, but
> this implementation doesn't, it uses the sg_table allocation functions
> which correctly control kmalloc to be lightweight and efficient and
> able to make forward progress.

Another concern is whether this change can cause a livelock. Suppose the
system is running out of memory and the page cache submits a write request
whose scatterlist has more than two elements. If the kmalloc() for that
scatterlist fails, will that prevent the page cache from making any
progress with writeback?

Bart.