On 1/3/2019 7:23 PM, Stephen Warren wrote: > From: Stephen Warren <swarren@xxxxxxxxxx> > > This patch solves a crash at the time of mlx4 driver unload or system > shutdown. The crash occurs because dma_alloc_coherent() returns one > value in mlx4_alloc_icm_coherent(), but a different value is passed to > dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because > when allocated, that pointer is passed to sg_set_buf() to record it, > then when freed it is re-calculated by calling > lowmem_page_address(sg_page()) which returns a different value. Solve > this by recording the value that dma_alloc_coherent() returns, and > passing this to dma_free_coherent(). > > This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get > rid of page operation after dma_alloc_coherent"). > > Based-on-code-from: Christoph Hellwig <hch@xxxxxx> > Signed-off-by: Stephen Warren <swarren@xxxxxxxxxx> > --- > v4 (Jan 3): > - Shortened commit description. > - Use bool not int in struct mlx4_icm_chunk. > - Tariq said "Thanks for your patch. It looks good to me." for v3. > v3 (Dec 19): > - Rework chunk data structure to store all data for coherent allocations > separately from the sg list. Code from Christoph Hellwig with fixes by > me. Notes: > - chunk->coherent is an int not a bool since checkpatch complains about > using bool in structs; see https://lkml.org/lkml/2017/11/21/384. > - chunk->coherent is used rather than chunk->table->coherent since the > table pointer isn't available when creating chunks. This duplicates > data, but simplifies the patch. > v2: > - Rework mlx4_table_find() to explicitly calculate the returned address > differently depending on wheter the table was allocated using > dma_alloc_coherent() or alloc_pages(), which in turn allows the > changes to mlx4_alloc_icm_pages() to be dropped. > - Drop changes to mlx4_alloc/free_icm_pages. This path uses > pci_map_sg() which can re-write the sg list which in turn would cause > chunk->mem[] (the sg list) and chunk->buf[] to become inconsistent. > - Enhance commit description. > > Note: I've tested this patch in a downstream 4.14 based kernel (using > ibping, ib_read_bw, and ib_write_bw), but can't test it in mainline > since my system isn't supported there yet. I have compile-tested it in > mainline at least, for ARM64. > --- > drivers/net/ethernet/mellanox/mlx4/icm.c | 92 ++++++++++++++---------- > drivers/net/ethernet/mellanox/mlx4/icm.h | 22 +++++- > 2 files changed, 75 insertions(+), 39 deletions(-) > Reviewed-by: Tariq Toukan <tariqt@xxxxxxxxxxxx> Thanks.