On Tue, Mar 16, 2021 at 01:09:01PM +0000, Praveen Kumar Kannoju wrote:
> To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to
> 1 MB (order-8) memory, depending on the size of the MR. This costly
> allocation can sometimes take very long to return (a few seconds),
> especially if the system is fragmented and does not have any free chunks
> for orders >= 3. This causes the calling application to hang for a long
> time. To avoid these long latency spikes, limit the max order of
> allocation to order 3, and reuse that buffer in populate_xlt() for that
> MR. This will increase the latency slightly (on the order of microseconds)
> for each mlx5_ib_update_xlt() call, especially for larger MRs (since we're
> making multiple calls to populate_xlt()), but it's a small price to pay to
> avoid the large latency spikes with higher-order allocations. The flag
> __GFP_NORETRY is used while fetching the free pages to ensure that there
> are no long compaction stalls when the system's memory is in a fragmented
> condition.
>
> Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@xxxxxxxxxx>
> ---
>  drivers/infiniband/hw/mlx5/mr.c | 22 +++-------------------
>  1 file changed, 3 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index db05b0e..dac19f0 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1004,9 +1004,7 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd,
>  	return mr;
>  }
>  
> -#define MLX5_MAX_UMR_CHUNK ((1 << (MLX5_MAX_UMR_SHIFT + 4)) - \
> -			     MLX5_UMR_MTT_ALIGNMENT)
> -#define MLX5_SPARE_UMR_CHUNK 0x10000
> +#define MLX5_SPARE_UMR_CHUNK 0x8000
>  
>  /*
>   * Allocate a temporary buffer to hold the per-page information to transfer to
> @@ -1028,30 +1026,16 @@ static void *mlx5_ib_alloc_xlt(size_t *nents, size_t ent_size, gfp_t gfp_mask)
>  	 */
>  	might_sleep();
>  
> -	gfp_mask |= __GFP_ZERO;
> +	gfp_mask |= __GFP_ZERO | __GFP_NORETRY;
>  
> -	/*
> -	 * If the system already has a suitable high order page then just use
> -	 * that, but don't try hard to create one. This max is about 1M, so a
> -	 * free x86 huge page will satisfy it.
> -	 */
>  	size = min_t(size_t, ent_size * ALIGN(*nents, xlt_chunk_align),
> -		     MLX5_MAX_UMR_CHUNK);
> +		     MLX5_SPARE_UMR_CHUNK);
>  	*nents = size / ent_size;
>  	res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
>  				       get_order(size));
>  	if (res)
>  		return res;
>  
> -	if (size > MLX5_SPARE_UMR_CHUNK) {
> -		size = MLX5_SPARE_UMR_CHUNK;
> -		*nents = get_order(size) / ent_size;
> -		res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
> -					       get_order(size));
> -		if (res)
> -			return res;
> -	}

Why did you delete this and make the size smaller? Isn't GFP_NORETRY
enough?

Jason
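
For reference, below is a minimal sketch of the alternative Jason's question
points at: keep the original large-chunk attempt and the smaller spare-chunk
fallback, and only add __GFP_NORETRY so the high-order attempt fails fast
under fragmentation instead of stalling in compaction. This is not the
driver's actual code: sketch_alloc_xlt(), SKETCH_MAX_CHUNK and
SKETCH_SPARE_CHUNK are hypothetical stand-ins, and the real
mlx5_ib_alloc_xlt() also aligns *nents to a chunk boundary and has further
fallbacks that are omitted here.

#include <linux/gfp.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/sizes.h>

/* Hypothetical bounds for illustration only (not the driver's defines). */
#define SKETCH_MAX_CHUNK	SZ_1M	/* ~order-8 upper bound */
#define SKETCH_SPARE_CHUNK	SZ_64K	/* smaller fallback chunk */

static void *sketch_alloc_xlt(size_t *nents, size_t ent_size, gfp_t gfp_mask)
{
	size_t size;
	void *res;

	/* Fail the high-order attempt fast; never stall in compaction. */
	gfp_mask |= __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY;

	/* Opportunistic large allocation, capped at the big chunk size. */
	size = min_t(size_t, ent_size * *nents, SKETCH_MAX_CHUNK);
	*nents = size / ent_size;
	res = (void *)__get_free_pages(gfp_mask, get_order(size));
	if (res)
		return res;

	/* Fragmented system: retry once with the smaller spare chunk. */
	if (size > SKETCH_SPARE_CHUNK) {
		size = SKETCH_SPARE_CHUNK;
		*nents = size / ent_size;
		res = (void *)__get_free_pages(gfp_mask, get_order(size));
	}

	return res;
}

The design point being probed: with __GFP_NORETRY the large request only
succeeds when a suitable free high-order block already exists, so keeping the
bigger cap costs little on fragmented systems while still allowing fewer
populate_xlt() round-trips when memory is not fragmented.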