On Thu, 2024-04-04 at 10:13 +0200, Paolo Abeni wrote:
> On Thu, 2024-03-28 at 16:41 +0100, Gerd Bayer wrote:
> > Since [1], dma_alloc_coherent() does not accept requests for
> > GFP_COMP anymore, even on archs that may be able to fulfill this.
> > Functionality that relied on the receive buffer being a compound
> > page broke at that point:
> > The SMC-D protocol, that utilizes the ism device driver, passes
> > receive buffers to the splice processor in a struct
> > splice_pipe_desc with a single entry list of struct pages. As the
> > buffer is no longer a compound page, the splice processor now
> > rejects requests to handle more than a page worth of data.
> >
> > Replace dma_alloc_coherent() and allocate a buffer with kmalloc()
> > then create a DMA map for it with dma_map_page(). Since only
> > receive buffers on ISM devices use DMA, qualify the mapping as
> > FROM_DEVICE.
> > Since ISM devices are available on arch s390, only and on that arch
> > all DMA is coherent, there is no need to introduce and export some
> > kind of dma_sync_to_cpu() method to be called by the SMC-D protocol
> > layer.
> >
> > Analogously, replace dma_free_coherent by a two step
> > dma_unmap_page, then kfree to free the receive buffer.
> >
> > [1] https://lore.kernel.org/all/20221113163535.884299-1-hch@xxxxxx/
> >
> > Fixes: c08004eede4b ("s390/ism: don't pass bogus GFP_ flags to
> > dma_alloc_coherent")
> >
> > [...]
> > @@ -315,14 +319,27 @@ static int ism_alloc_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
> >              test_and_set_bit(dmb->sba_idx, ism->sba_bitmap))
> >                  return -EINVAL;
> >
> > -        dmb->cpu_addr = dma_alloc_coherent(&ism->pdev->dev, dmb->dmb_len,
> > -                                           &dmb->dma_addr,
> > -                                           GFP_KERNEL | __GFP_NOWARN |
> > -                                           __GFP_NOMEMALLOC | __GFP_NORETRY);
> > -        if (!dmb->cpu_addr)
> > -                clear_bit(dmb->sba_idx, ism->sba_bitmap);
> > +        dmb->cpu_addr = kmalloc(dmb->dmb_len, GFP_KERNEL | __GFP_NOWARN |
> > +                                __GFP_COMP | __GFP_NOMEMALLOC | __GFP_NORETRY);
>
> Out of sheer ignorance on my side, the __GFP_COMP flag looks
> suspicious here. I *think* that is relevant only for the page
> allocator.
>
> Why can't you use get_free_pages() (or similar) here? (possibly
> rounding up to the relevant page_aligned size).

Thanks, Paolo, for your suggestion. However, I wanted to stay as close
as possible to the implementation before [1], which used __GFP_COMP,
too. I'd rather avoid changing interfaces from "cpu_addr" to
"struct page *" at this point.

In the long run, I'd like to drop the requirement for compound pages
entirely, since that *appears* to exist primarily to simplify the
handling of the interface to splice_to_pipe() in net/smc/smc_rx.c.
And of course there might be performance implications...

At this point, I'm more concerned about my usage of the DMA API with
this patch.

Thanks again,
Gerd
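
For reference, the "rounding up to the relevant page_aligned size" that
Paolo mentions can be sketched in plain userspace C. This is only an
illustration of the arithmetic, not kernel code: SKETCH_PAGE_SIZE (4 KiB
assumed), sketch_page_align() and sketch_get_order() are hypothetical
names that mirror what the kernel's PAGE_ALIGN() and get_order() helpers
compute for a buffer length such as dmb->dmb_len.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. Kernel code would use PAGE_SIZE instead. */
#define SKETCH_PAGE_SIZE 4096UL

/* Round len up to the next multiple of the page size (cf. PAGE_ALIGN()). */
static size_t sketch_page_align(size_t len)
{
	return (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Smallest order such that (page size << order) >= len (cf. get_order()).
 * The page allocator hands out blocks of 2^order pages, so a length that
 * is not a power-of-two number of pages is rounded up to the next one.
 * Assumes len is small enough that the shift below cannot overflow.
 */
static unsigned int sketch_get_order(size_t len)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < len) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under these assumptions, a buffer of three pages would be served from an
order-2 (four page) block, which is the kind of over-allocation a
page-allocator based approach would have to accept.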