Re: [RFC 5/5] dmaengine: ti: k3-udma: Add glue layer for non DMAengine users

Vinod,

On 2018-10-10 12:44, Peter Ujfalusi wrote:
> 
> 
> On 2018-10-10 07:05, Vinod wrote:
>> On 24-09-18, 16:00, Peter Ujfalusi wrote:
>>> From: Grygorii Strashko <grygorii.strashko@xxxxxx>
>>>
>>> Certain users cannot currently use the DMAengine API due to missing
>>> features in the core. A prime example is networking.
>>
>> What are the features you have in mind?
> 
> The two big ones are:
> 1. Support for out of order completion of descriptors (cookie management)
> 
> This mostly concerns DEV_TO_MEM transfers.
> Currently we have dma_cookie_t (an s32) to track which descriptor is
> completed, which one is issued and which one is pending.
> It is mandatory that descriptors complete in the same order as they
> were issued.
> 
> We need support for out of order completion of the issued descriptors,
> and the s32 way of tracking cannot express that.
> 
> We basically have three use cases that need to be supported:
> 1a: in order completion (currently supported by DMAengine)
> 1b: out of order completion in a channel; descriptors can complete in
> a different order than they were issued
> 1c: out of order completion in a channel, but the completion is ordered
> within a 'classification type' of descriptors on the channel
> If we issue descriptors 1, 2, 3, 4, 5, 6, 7
> 
> We could receive them back as:
> 1a: 1, 2, 3, 4, 5, 6, 7
> 1b: 4, 7, 2, 1, 3, 5, 6
> 
> in case of 1c we can classify descriptors when issuing:
> 1(c0), 2(c0), 3(c0), 4(c1), 5(c1), 6(c1), 7(c1)
> 
> and we could receive them back like:
> 4(c1), 5(c1), 1(c0), 2(c0), 6(c1), 3(c0), 7(c1)
> 
> The descriptors come back in 'random' order, but they stay in order
> within their classification.
> 
> The one that must be supported at the moment is 1b; 1c should be
> reasonably simple to add support for on top of that.
> 
> 2. NAPI support
> 
> Via NAPI, RX looks something like this:
> - give the hardware a bunch of descriptors for receiving packets
> - when the first packet is received, get a notification
>  - disable the interrupts
>  - schedule NAPI
> - in the NAPI poll function read out the completed descriptors one by
> one, process them, then give the descriptors (with new buffers) back
> to the hardware
> - after the poll finishes (no more packets, or we have processed the
> number of packets NAPI told us to) re-enable the interrupts and wait
> for packets to start flowing in again

I have spent some time reading http://www.ti.com/lit/pdf/spruid7
regarding networking (cpsw, Chapter 12.2.1) and while 1 and 2 are
still valid missing features, there is one more issue I have discovered
which might prevent us from using the generic (DMAengine) API.

cpsw supports up to 9 ports (net_devices) and has one RX DMA channel.
By default it uses 8 priority levels (flows), so depending on the
packet priority the packets are received into different rings (defined
by the rflow configuration).

This would all be fine if we could handle the out of order completion,
but packets from _all_ 9 ports are received into these rings, and to
make things a bit more problematic the port ID is carried within the
CPPI5 packet descriptor itself, in the src_dst_tag field:

struct cppi5_desc_hdr_t {
	u32 pkt_info0;	/* Packet info word 0 (n/a in Buffer desc) */
	u32 pkt_info1;	/* Packet info word 1 (n/a in Buffer desc) */
	u32 pkt_info2;	/* Packet info word 2 Buffer reclamation info */
	u32 src_dst_tag; /* Packet info word 3 (n/a in Buffer desc) */
} __packed;

/**
 * Host-mode packet and buffer descriptor definition
 */
struct cppi5_host_desc_t {
	struct cppi5_desc_hdr_t hdr;
	u64 next_desc;	/* w4/5: Linking word */
	u64 buf_ptr;	/* w6/7: Buffer pointer */
	u32 buf_info1;	/* w8: Buffer valid data length */
	u32 org_buf_len; /* w9: Original buffer length */
	u64 org_buf_ptr; /* w10/11: Original buffer pointer */
	u32 epib[0]; /* Extended Packet Info Data (optional, 4 words) */
	/*
	 * Protocol Specific Data (optional, 0-128 bytes in multiples
	 * of 4), and/or Other Software Data (0-N bytes, optional)
	 */
} __packed;

Note that src_dst_tag is not part of the metadata section, which starts
at epib[]; it lives in the fixed descriptor header. In order to know
which port received the packet, the cpsw driver needs to read
src_dst_tag and, based on its value, select the correct net_device.
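
For illustration, pulling the port out could look like this (the bit
layout, source tag in the upper 16 bits of word 3, follows the CPPI5
definition; the helper itself is made up):

/*
 * Hypothetical helper: extract the CPPI5 source tag (bits 31:16 of the
 * src_dst_tag word); cpsw would map this to the ingress port.
 */
static inline u32 cppi5_desc_get_src_tag(const struct cppi5_desc_hdr_t *hdr)
{
	return (hdr->src_dst_tag >> 16) & 0xffff;
}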

So far I can only think of two ways of handling this:
a. support for direct hw descriptor submission, so the client driver
does all the descriptor setup and the DMA driver just provides an
interface to submit raw receive descriptors in a kind of passthrough
mode.

b. Do not allow metadata pointer mode; instead create a CPPI5-specific
struct which would be attached to the transfer and which we would copy
to/from with the CPU:

struct cppi5_metadata {
	u32 src_dst_tag;	/* word 3 of the descriptor header */
	bool has_epib;		/* Extended Packet Info Block present */
	u32 psdata_len;		/* Protocol Specific Data length, bytes */
	u32 swdata_len;		/* Other Software Data length, bytes */
	u32 data[0];		/* epib + psdata + swdata, CPU copied */
};
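
As a usage sketch for option b (struct cpsw_port and the function here
are invented just to show the data flow on the RX completion path):

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Simplified port bookkeeping, assumed for this sketch */
struct cpsw_port {
	struct net_device *ndev;
};

static void cpsw_rx_handle_meta(struct cpsw_port *ports,
				struct cppi5_metadata *md,
				struct sk_buff *skb)
{
	/* the source tag (upper 16 bits of word 3) identifies the port */
	u32 port = (md->src_dst_tag >> 16) & 0xffff;

	skb->dev = ports[port].ndev;
	netif_receive_skb(skb);
}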

This is certainly going to add some overhead at gigabit speeds.

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


