On Mon, Sep 07, 2020 at 09:29:26AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 03, 2020 at 03:18:53PM +0300, Leon Romanovsky wrote:
> > From: Maor Gottlieb <maorg@xxxxxxxxxx>
> >
> > Remove the implementation of ib_umem_add_sg_table and instead
> > call to sg_alloc_table_append which already has the logic to
> > merge contiguous pages.
> >
> > Besides that it removes duplicated functionality, it reduces the
> > memory consumption of the SG table significantly. Prior to this
> > patch, the SG table was allocated in advance regardless consideration
> > of contiguous pages.
> >
> > In huge pages system of 2MB page size, without this change, the SG table
> > would contain x512 SG entries.
> > E.g. for 100GB memory registration:
> >
> >          Number of entries      Size
> > Before        26214400         600.0MB
> > After            51200           1.2MB
> >
> > Signed-off-by: Maor Gottlieb <maorg@xxxxxxxxxx>
> > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
>
> Looks sensible for now, but the real fix is of course to avoid
> the scatterlist here entirely, and provide a bvec based
> pin_user_pages_fast.  I'll need to finally get that done..

I'm working on cleaning all the DMA RDMA drivers using ib_umem to the
point where doing something like this would become fairly simple.

pin_user_pages_fast_bvec/whatever would be a huge improvement here,
calling in a loop like this just to get a partial page list to copy to
a SGL is horrifically slow due to all the extra overheads. Going
directly to the bvec/sgl/etc inside all the locks will be a lot faster.

Jason
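
For reference, the before/after numbers in the quoted commit message can be
reproduced with a small userspace calculation. This is only a sketch, not
kernel code: the 24 bytes per SG entry is an assumption inferred from the
table itself (600.0MB / 26214400 entries), and the helper names
(sg_entries, sg_table_bytes, count_merged_entries) are made up for
illustration. The merge rule is also simplified; it only checks that page
frame numbers are consecutive and ignores any maximum segment size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed entry size: implied by the table (600 MB / 26214400 entries),
 * not taken from any kernel header. */
#define SG_ENTRY_BYTES 24ULL

/* Entries needed when each SG entry covers one chunk of chunk_bytes. */
static uint64_t sg_entries(uint64_t region_bytes, uint64_t chunk_bytes)
{
	assert(chunk_bytes != 0);
	return (region_bytes + chunk_bytes - 1) / chunk_bytes;
}

/* Memory used by the SG table under the assumed entry size. */
static uint64_t sg_table_bytes(uint64_t entries)
{
	return entries * SG_ENTRY_BYTES;
}

/* Count entries after merging runs of contiguous pages: a new entry
 * starts only when a page frame number does not follow the previous one. */
static size_t count_merged_entries(const uint64_t *pfns, size_t n)
{
	size_t entries = 0;

	for (size_t i = 0; i < n; i++) {
		if (i == 0 || pfns[i] != pfns[i - 1] + 1)
			entries++;
	}
	return entries;
}
```

With a 100GB region, sg_entries() gives 26214400 for 4KB chunks and 51200
for 2MB chunks, which is the x512 ratio from the commit message.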