On 3/5/2019 9:00 AM, Vakul Garg wrote:
> Instead of reading job ring's occupancy registers for every req/rsp
> enqueued/dequeued respectively, we read these registers once and store
> them in memory. After completing a job enqueue/dequeue, we decrement
> these values. When these values become zero, we refresh the snapshot of
> job ring's occupancy registers. This eliminates need of expensive device
> register read operations for every job enqueued and dequeued and hence
> makes caam_jr_enqueue() and caam_jr_dequeue() faster.
>
How expensive?
Please share the case you benchmarked and performance improvement you noticed.

Somewhat related: it seems that after commit a0ca6ca022ac
("crypto: caam - one tasklet per job ring") the "outlock" spinlock could be
removed, this being a good candidate for further improvement.

> Signed-off-by: Vakul Garg <vakul.garg@xxxxxxx>
> ---
>  drivers/crypto/caam/intern.h |  1 +
>  drivers/crypto/caam/jr.c     | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h
> index 5869ad58d497..b6d96e2ecf4c 100644
> --- a/drivers/crypto/caam/intern.h
> +++ b/drivers/crypto/caam/intern.h
> @@ -59,6 +59,7 @@ struct caam_drv_private_jr {
>  	int out_ring_read_index;	/* Output index "tail" */
>  	int tail;			/* entinfo (s/w ring) tail index */
>  	struct jr_outentry *outring;	/* Base of output ring, DMA-safe */
> +	u32 inpring_avail;	/* Number of free entries in i/p ring*/
Locality: this should be near the other enqueue-related structure members.
Nitpick: use "input" instead of "i/p".

Thanks,
Horia
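
For readers following the thread, the snapshot-and-decrement scheme described in the quoted commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: struct sim_jr, sim_read_free_slots(), sim_enqueue() and sim_hw_complete() are hypothetical names, and the device register is simulated by a plain counter so that the number of "expensive" reads can be observed. Only the field name inpring_avail is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a job ring; NOT the real caam_drv_private_jr. */
struct sim_jr {
	uint32_t hw_free_slots;	/* simulates the device occupancy register */
	uint32_t inpring_avail;	/* cached snapshot of free input ring slots */
	unsigned int reg_reads;	/* counts simulated (expensive) register reads */
};

/* Simulated register read; a real driver would do an MMIO access here. */
static uint32_t sim_read_free_slots(struct sim_jr *jr)
{
	jr->reg_reads++;
	return jr->hw_free_slots;
}

/*
 * Enqueue one job. The register is read only when the cached count has
 * dropped to zero. Returns false if the ring is still full after a refresh.
 */
static bool sim_enqueue(struct sim_jr *jr)
{
	if (jr->inpring_avail == 0) {
		jr->inpring_avail = sim_read_free_slots(jr);
		if (jr->inpring_avail == 0)
			return false;
	}

	assert(jr->inpring_avail > 0);
	jr->inpring_avail--;	/* consume one slot from the cached snapshot */
	jr->hw_free_slots--;	/* the simulated hardware slot is now occupied */
	return true;
}

/* Simulated hardware finishing a job and freeing one input ring slot. */
static void sim_hw_complete(struct sim_jr *jr)
{
	jr->hw_free_slots++;
}
```

In this sketch the cached count can only underestimate the number of free slots (the simulated hardware may free entries while the snapshot is still non-zero), so a stale snapshot costs at most an early refresh and never an overrun. With four free slots, four enqueues cost a single simulated register read instead of four.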