On Wed, 4 Dec 2024 at 15:25, Jyothi Kumar Seerapu <quic_jseerapu@xxxxxxxxxxx> wrote: > > > > On 12/4/2024 6:15 PM, Dmitry Baryshkov wrote: > > On Wed, Dec 04, 2024 at 05:50:59PM +0530, Jyothi Kumar Seerapu wrote: > >> The DMA TRE(Transfer ring element) buffer contains the DMA > >> buffer address. Accessing data from this address can cause > >> significant delays in SPI transfers, which can be mitigated to > >> some extent by utilizing immediate DMA support. > >> > >> QCOM GPI DMA hardware supports an immediate DMA feature for data > >> up to 8 bytes, storing the data directly in the DMA TRE buffer > >> instead of the DMA buffer address. This enhancement enables faster > >> SPI data transfers. > >> > >> This optimization reduces the average transfer time from 25 us to > >> 16 us for a single SPI transfer of 8 bytes length, with a clock > >> frequency of 50 MHz. > >> > >> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@xxxxxxxxxxx> > >> --- > >> > >> v2-> v3: > >> - When to enable Immediate DMA support, control is moved to GPI driver > >> from SPI driver. > >> - Optimizations are done in GPI driver related to immediate dma changes. > >> - Removed the immediate dma supported changes in qcom-gpi-dma.h file > >> and handled in GPI driver. > >> > >> Link to v2: > >> https://lore.kernel.org/all/20241128133351.24593-2-quic_jseerapu@xxxxxxxxxxx/ > >> https://lore.kernel.org/all/20241128133351.24593-3-quic_jseerapu@xxxxxxxxxxx/ > >> > >> v1 -> v2: > >> - Separated the patches to dmaengine and spi subsystems > >> - Removed the changes which are not required for this feature from > >> qcom-gpi-dma.h file. > >> - Removed the type conversions used in gpi_create_spi_tre. > >> > >> Link to v1: > >> https://lore.kernel.org/lkml/20241121115201.2191-2-quic_jseerapu@xxxxxxxxxxx/ > >> > >> drivers/dma/qcom/gpi.c | 32 +++++++++++++++++++++++++++----- > >> 1 file changed, 27 insertions(+), 5 deletions(-) > >> > >> diff --git a/drivers/dma/qcom/gpi.c b/drivers/dma/qcom/gpi.c > >> index 52a7c8f2498f..35451d5a81f7 100644 > >> --- a/drivers/dma/qcom/gpi.c > >> +++ b/drivers/dma/qcom/gpi.c > >> @@ -27,6 +27,7 @@ > >> #define TRE_FLAGS_IEOT BIT(9) > >> #define TRE_FLAGS_BEI BIT(10) > >> #define TRE_FLAGS_LINK BIT(11) > >> +#define TRE_FLAGS_IMMEDIATE_DMA BIT(16) > >> #define TRE_FLAGS_TYPE GENMASK(23, 16) > >> > >> /* SPI CONFIG0 WD0 */ > >> @@ -64,6 +65,7 @@ > >> > >> /* DMA TRE */ > >> #define TRE_DMA_LEN GENMASK(23, 0) > >> +#define TRE_DMA_IMMEDIATE_LEN GENMASK(3, 0) > >> > >> /* Register offsets from gpi-top */ > >> #define GPII_n_CH_k_CNTXT_0_OFFS(n, k) (0x20000 + (0x4000 * (n)) + (0x80 * (k))) > >> @@ -1711,6 +1713,8 @@ static int gpi_create_spi_tre(struct gchan *chan, struct gpi_desc *desc, > >> dma_addr_t address; > >> struct gpi_tre *tre; > >> unsigned int i; > >> + int len; > >> + u8 immediate_dma; > >> > >> /* first create config tre if applicable */ > >> if (direction == DMA_MEM_TO_DEV && spi->set_config) { > >> @@ -1763,14 +1767,32 @@ static int gpi_create_spi_tre(struct gchan *chan, struct gpi_desc *desc, > >> tre_idx++; > >> > >> address = sg_dma_address(sgl); > >> - tre->dword[0] = lower_32_bits(address); > >> - tre->dword[1] = upper_32_bits(address); > >> + len = sg_dma_len(sgl); > >> > >> - tre->dword[2] = u32_encode_bits(sg_dma_len(sgl), TRE_DMA_LEN); > >> + immediate_dma = (direction == DMA_MEM_TO_DEV) && len <= 2 * sizeof(tre->dword[0]); > > > > inline this condition, remove extra brackets and split the line after &&. > Hi Dmitry Baryshkov, thanks for the review. > Sure, i will make the changes mentioned below. Please let me know otherwise. > > immediate_dma = direction == DMA_MEM_TO_DEV && > len <= 2 * sizeof(tre->dword[0]);> I was suggesting to _inline_ this condition rather than having a separate variable for it. > >> + > >> + /* Support Immediate dma for write transfers for data length up to 8 bytes */ > >> + if (immediate_dma) { > >> + /* > >> + * For Immediate dma, data length may not always be length of 8 bytes, > >> + * it can be length less than 8, hence initialize both dword's with 0 > >> + */ > >> + tre->dword[0] = 0; > >> + tre->dword[1] = 0; > >> + memcpy(&tre->dword[0], sg_virt(sgl), len); > >> + > >> + tre->dword[2] = u32_encode_bits(len, TRE_DMA_IMMEDIATE_LEN); > >> + } else { > >> + tre->dword[0] = lower_32_bits(address); > >> + tre->dword[1] = upper_32_bits(address); > >> + > >> + tre->dword[2] = u32_encode_bits(len, TRE_DMA_LEN); > >> + } > >> > >> tre->dword[3] = u32_encode_bits(TRE_TYPE_DMA, TRE_FLAGS_TYPE); > >> - if (direction == DMA_MEM_TO_DEV) > >> - tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_IEOT); > >> + tre->dword[3] |= u32_encode_bits(!!immediate_dma, TRE_FLAGS_IMMEDIATE_DMA); > >> + tre->dword[3] |= u32_encode_bits(!!(direction == DMA_MEM_TO_DEV), > >> + TRE_FLAGS_IEOT); > >> > >> for (i = 0; i < tre_idx; i++) > >> dev_dbg(dev, "TRE:%d %x:%x:%x:%x\n", i, desc->tre[i].dword[0], > >> -- > >> 2.17.1 > >> > > -- With best wishes Dmitry