RE: [PATCH v2] crypto/caam: add backlogging support

Hi Herbert,

> -----Original Message-----
> From: Herbert Xu [mailto:herbert@xxxxxxxxxxxxxxxxxxx]
> Sent: Wednesday, September 23, 2015 3:02 PM
> To: Porosanu Alexandru-B06830 <alexandru.porosanu@xxxxxxxxxxxxx>
> Cc: linux-crypto@xxxxxxxxxxxxxxx; Geanta Neag Horia Ioan-B05471
> <Horia.Geanta@xxxxxxxxxxxxx>; Pop Mircea-R19439
> <mircea.pop@xxxxxxxxxxxxx>
> Subject: Re: [PATCH v2] crypto/caam: add backlogging support
> 
> On Fri, Sep 18, 2015 at 02:27:12PM +0000, Porosanu Alexandru wrote:
> >
> > Well, the HW has less than the whole RAM available for backlogging
> > requests; it has a fixed # of backlog request slots.
> > Then it will start dropping, just like in the out-of-mem case.
> 
> OK I think that's where our misunderstanding is.  For a backlogged request
> you do not give it to the hardware immediately.  In fact a request should only
> be backlogged when the hardware queue is completely full.  It should stay in
> a software queue until the hardware has space for it.  When that happens
> you move it onto the hardware queue and invoke the completion function
> with err set to -EINPROGRESS.  This tells the caller that it may
> enqueue more requests.
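
If I read that correctly, the flow boils down to roughly the following sketch (illustrative only; submit(), job_done(), jr_full() and jr_hw_enqueue() are made-up stand-ins for the real job-ring helpers, and sw_queue is a plain struct crypto_queue):

static int submit(struct crypto_async_request *req)
{
	if (jr_full()) {
		if (!(req->flags & CRYPTO_TFM_REQ_MAY_BACKLOG))
			return -EBUSY;	/* rejected; caller must throttle */
		crypto_enqueue_request(&sw_queue, req);
		return -EBUSY;		/* accepted, but backlogged */
	}
	jr_hw_enqueue(req);
	return -EINPROGRESS;
}

/* When a job completes and frees a slot, promote backlogged requests: */
static void job_done(void)
{
	struct crypto_async_request *req;

	while (!jr_full() && (req = crypto_dequeue_request(&sw_queue))) {
		jr_hw_enqueue(req);
		/* tell the submitter it may enqueue again */
		req->complete(req, -EINPROGRESS);
	}
}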

Yes, you are absolutely right. That said, there are a few reasons why I would rather not use a crypto_queue-based approach here:

1) we've prototyped a crypto_queue-based implementation, and it did not meet our performance expectations due to the CPU overhead;

2) adding crypto_queue support to the driver brings extra complexity and increases the code size;

3) unlike e.g. Talitos, there's already a queue that is long enough to be split in half, reserving slots for any backlogging tfm; that's what I've proposed in v4 of this patch.

To elaborate a bit: in v4 of this patch, I've introduced a limit on the # of tfms that can be affined to a JR, equal to half the JR size minus 1. This means that in the worst-case scenario, where all 255 tfms (for a JR length of 512) are backlog-enabled, there are still at least 2 slots available for each tfm.
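
In pseudo-code, the limit works out to something like this (a sketch with made-up names such as caam_jr_state and jr_affine_tfm(), not the actual v4 code):

#define JR_DEPTH	512			/* current JR length */
#define MAX_TFMS	(JR_DEPTH / 2 - 1)	/* 255 for a 512-deep JR */

/*
 * Refuse to affine more tfms than the reservation allows, so that even
 * with all MAX_TFMS tfms backlogged, 2 slots remain per tfm
 * (255 * 2 = 510 <= 512).
 */
static int jr_affine_tfm(struct caam_jr_state *jrp)
{
	if (jrp->tfm_count >= MAX_TFMS)
		return -EBUSY;
	jrp->tfm_count++;
	return 0;
}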

An observation about the queue length: having a deep queue (be it HW or SW) doesn't help by itself; the queue only needs to be long enough to dampen spikes, and the longer it gets, the worse the latency becomes. TBH, our current setting of 512 entries in the JR is way too big; somewhere around 16-64 should be enough. Take the following example: for 9600B (jumbo) TCP packets on a 10Gbps line, the interval between the moment the last request is enqueued into our 512-entry JR and the moment it leaves the JR and returns to the net subsystem is roughly 3.9 ms.
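
That figure is straightforward arithmetic; here is a standalone check (not driver code):

#include <stdio.h>

int main(void)
{
	const double bits_per_pkt = 9600.0 * 8;		/* jumbo TCP packet */
	const double line_rate = 10e9;			/* 10 Gbps */
	const int jr_depth = 512;

	double per_pkt = bits_per_pkt / line_rate;	/* ~7.68 us/packet */
	double drain = per_pkt * jr_depth;		/* ~3.93 ms */

	printf("per-packet: %.2f us, full-JR drain: %.2f ms\n",
	       per_pkt * 1e6, drain * 1e3);
	return 0;
}

With a 16-64 entry JR, the same math gives ~0.12-0.49 ms of drain latency, which is why a much shallower ring should be sufficient.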


> 
> Cheers,
> --
> Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

BR,

Alex P.