Re: [PATCH V3] net/mlx4: Get rid of page operation after dma_alloc_coherent

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/20/18 10:43 AM, Jason Gunthorpe wrote:
On Wed, Dec 19, 2018 at 11:20:31AM -0700, Stephen Warren wrote:
From: Stephen Warren <swarren@xxxxxxxxxx>

This patch solves a crash at the time of mlx4 driver unload or system
shutdown. The crash occurs because dma_alloc_coherent() returns one
value in mlx4_alloc_icm_coherent(), but a different value is passed to
dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because
when allocated, that pointer is passed to sg_set_buf() to record it,
then when freed it is re-calculated by calling
lowmem_page_address(sg_page()) which returns a different value. Solve
this by recording the value that dma_alloc_coherent() returns, and
passing this to dma_free_coherent().

This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get
rid of page operation after dma_alloc_coherent"). That patch was
described as:

In general, dma_alloc_coherent() returns a CPU virtual address and
a DMA address, and we have no guarantee that the underlying memory
even has an associated struct page at all.

This patch gets rid of the page operation after dma_alloc_coherent,
and records the VA returned form dma_alloc_coherent in the struct
of hem in hns RoCE driver.

However, this patch reworks the code to store all information about
coherent chunks separately from the sg list, since using sg lists for
them doesn't make sense. Hence, the structure of this patch is quite
different compared to the hns patch.

Based-on-code-from: Christoph Hellwig <hch@xxxxxx>
Signed-off-by: Stephen Warren <swarren@xxxxxxxxxx>
---
v3:
- Rework chunk data structure to store all data for coherent allocations
   separately from the sg list. Code from Christoph Hellwig with fixes by
   me. Notes:
   - chunk->coherent is an int not a bool since checkpatch complains about
     using bool in structs; see https://lkml.org/lkml/2017/11/21/384.

:( bool is much more readable and there is no performance concern in
this struct. I think checkpatch is overzealous here.

   - chunk->coherent is used rather than chunk->table->coherent since the
     table pointer isn't available when creating chunks. This duplicates
     data, but simplifies the patch.
v2:
- Rework mlx4_table_find() to explicitly calculate the returned address
   differently depending on wheter the table was allocated using
   dma_alloc_coherent() or alloc_pages(), which in turn allows the
   changes to mlx4_alloc_icm_pages() to be dropped.
- Drop changes to mlx4_alloc/free_icm_pages. This path uses
   pci_map_sg() which can re-write the sg list which in turn would cause
   chunk->mem[] (the sg list) and chunk->buf[] to become inconsistent.
- Enhance commit description.

Note: I've tested this patch in a downstream 4.14 based kernel (using
ibping, ib_read_bw, and ib_write_bw), but can't test it in mainline
since my system isn't supported there yet. I have compile-tested it in
mainline at least, for ARM64.
---
  drivers/net/ethernet/mellanox/mlx4/icm.c | 92 ++++++++++++++----------
  drivers/net/ethernet/mellanox/mlx4/icm.h | 22 +++++-
  2 files changed, 75 insertions(+), 39 deletions(-)

I think this needs an ack from the driver maintainers, and I gather
they are all on break for the next two weeks or so.

Lets revisit in January?

But at first glance it looks OK to me, though I would tidy the commit
message somewhat, your new leading paragraph fully explains the
problem, no need for the hns quote - and we can now see that the hns
should't have used a sgl either...

Tariq, I assume you're back from vacation now/soon. What do you think of this patch, aside from the bool-vs-int issue that I plan to fix up soon. Thanks.



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux