Re: [PATCH net 1/1] s390/ism: fix receive message buffer allocation

Gerd Bayer <gbayer@xxxxxxxxxxxxx> · Thu, 04 Apr 2024 13:10:20 +0200

On Thu, 2024-04-04 at 10:13 +0200, Paolo Abeni wrote:
> On Thu, 2024-03-28 at 16:41 +0100, Gerd Bayer wrote:
> > Since [1], dma_alloc_coherent() does not accept requests for
> > GFP_COMP anymore, even on archs that may be able to fulfill this.
> > Functionality that relied on the receive buffer being a compound
> > page broke at that point:
> > The SMC-D protocol, that utilizes the ism device driver, passes
> > receive buffers to the splice processor in a struct
> > splice_pipe_desc with a single entry list of struct pages. As the
> > buffer is no longer a compound page, the splice processor now
> > rejects requests to handle more than a page worth of data.
> > 
> > Replace dma_alloc_coherent() and allocate a buffer with kmalloc()
> > then create a DMA map for it with dma_map_page(). Since only 
> > receive buffers on ISM devices use DMA, qualify the mapping as
> > FROM_DEVICE.
> > Since ISM devices are available on arch s390 only, and on that arch
> > all DMA is coherent, there is no need to introduce and export some
> > kind of dma_sync_to_cpu() method to be called by the SMC-D protocol
> > layer.
> > 
> > Analogously, replace dma_free_coherent by a two step
> > dma_unmap_page, then kfree to free the receive buffer.
> > 
> > [1] https://lore.kernel.org/all/20221113163535.884299-1-hch@xxxxxx/
> > 
> > Fixes: c08004eede4b ("s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherent")
> > 

[...]

> > @@ -315,14 +319,27 @@ static int ism_alloc_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
> >  	    test_and_set_bit(dmb->sba_idx, ism->sba_bitmap))
> >  		return -EINVAL;
> >  
> > -	dmb->cpu_addr = dma_alloc_coherent(&ism->pdev->dev, dmb->dmb_len,
> > -					   &dmb->dma_addr,
> > -					   GFP_KERNEL | __GFP_NOWARN |
> > -					   __GFP_NOMEMALLOC | __GFP_NORETRY);
> > -	if (!dmb->cpu_addr)
> > -		clear_bit(dmb->sba_idx, ism->sba_bitmap);
> > +	dmb->cpu_addr = kmalloc(dmb->dmb_len, GFP_KERNEL | __GFP_NOWARN |
> > +				__GFP_COMP | __GFP_NOMEMALLOC | __GFP_NORETRY);
> 
> Out of sheer ignorance on my side, the __GFP_COMP flag looks
> suspicious here. I *think* that is relevant only for the page
> allocator. 
> 
> Why can't you use get_free_pages() (or similar) here? (possibly
> rounding up to the relevant page_aligned size). 

Thanks Paolo for your suggestion. However, I wanted to stay as close as
possible to the implementation pre [1], which used __GFP_COMP, too. I'd
rather avoid changing interfaces from "cpu_addr" to "struct page *" at
this point. In the long run, I'd like to drop the requirement for
compound pages entirely, since that *appears* to exist primarily to
simplify the interface to splice_to_pipe() in net/smc/smc_rx.c. And of
course there might be performance implications...
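
For reference, the rounding that a switch to the page allocator would
imply can be sketched in plain userspace C. This is only an illustration
and not part of the patch: SKETCH_PAGE_SIZE (4 KiB, as on s390) and both
helper names are assumptions made up for this example. In the kernel,
PAGE_ALIGN() and get_order() do the equivalent work.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages, as on s390. The kernel supplies PAGE_SIZE itself. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical helper: round len up to a whole number of pages. */
static size_t sketch_page_align(size_t len)
{
	return (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Hypothetical helper: smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= len, for len > 0.
 * The page allocator hands out 2^order contiguous pages.
 */
static unsigned int sketch_get_order(size_t len)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < len) {
		span <<= 1;
		order++;
	}
	return order;
}
```

An order-N request always consumes 2^N pages, so a buffer length just
above a power-of-two multiple of the page size would waste close to half
of the allocation.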

At this point, I'm more concerned about my usage of the DMA API with
this patch.

Thanks again,
Gerd