On 09/01/2017 04:34 AM, Michael Ellerman wrote:
> Haren Myneni <haren@xxxxxxxxxxxxxxxxxx> writes:
>>> On Mon, Aug 28, 2017 at 7:25 PM, Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote:
>>>> Hi Haren,
>>>>
>>>> Some comments inline ...
>>>>
>>>> Haren Myneni <haren@xxxxxxxxxxxxxxxxxx> writes:
>>>>
>>>>> diff --git a/drivers/crypto/nx/nx-842-powernv.c b/drivers/crypto/nx/nx-842-powernv.c
>>>>> index c0dd4c7e17d3..13089a0b9dfa 100644
>>>>> --- a/drivers/crypto/nx/nx-842-powernv.c
>>>>> +++ b/drivers/crypto/nx/nx-842-powernv.c
>>>>> @@ -32,6 +33,9 @@ MODULE_ALIAS_CRYPTO("842-nx");
>>>>>
>>>>>  #define WORKMEM_ALIGN (CRB_ALIGN)
>>>>>  #define CSB_WAIT_MAX (5000) /* ms */
>>>>> +#define VAS_RETRIES (10)
>>>>
>>>> Where does that number come from?
>>
>> Sometimes HW returns copy/paste failures. So we should retry the
>> request again. With 10 retries, Test running 12 hours was successful
>> for repeated compression/decompression requests with 1024 threads.
>
> But why 10. Why not 5, or 100, or 1, or 10,000?

The VAS spec says to use a small number of retries. During my 12-hour
test with 1024 threads doing continuous compression/decompression
requests, I noticed that around 6 or 7 retries were needed. Hence I
used 10 retries.

>
> Presumably when we have to retry it means the NX is too busy to service the
> request?

That is one possible case. We can also see failures when receive/send
credits are exhausted or the cached windows limit is reached.

>
> Do we have any way to find out how long it might be busy for?

It is hard to know the reason for a failure. VAS simply returns success
or failure without providing the actual reason.

>
> Should we try an NX on another chip?

It is hard to decide whether to fall back to another NX engine, since
there is no way to know the actual failure reason on the current NX, or
whether another NX is free.

>
> We should also take into account the size of our request, ie. are we
> asking the NX to compress one page, or 1GB ?

The 842 compression/decompression request size for NX is always fixed.
So large buffers are divided into smaller requests. The NX gzip engine
is different: its request size is configurable. We can look at this
optimization when gzip support is added.

>
> If it's just one page maybe we should fall back to software immediately.

Right now it falls back to SW decompression after 10 retries, whereas
the user can use SW 842 compression upon failures. We are planning to
look into performance analysis as part of the VAS/NX optimization and
make the necessary changes.

Thanks
Haren

>
> cheers
>
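
For reference, the retry-then-fall-back behaviour discussed in this thread can be sketched roughly as below. This is only an illustration, not the driver code: struct fake_nx, fake_nx_submit() and submit_with_retries() are hypothetical names, the "hardware" is simulated, and the sketch assumes VAS_RETRIES bounds the total number of attempts. As with VAS, the submit call reports only success or failure, so the caller cannot tell why a request failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VAS_RETRIES 10	/* value from the quoted patch */

/* Hypothetical stand-in for an NX engine: fails a fixed number of times. */
struct fake_nx {
	int failures_before_success;	/* simulated busy/credit failures */
	int attempts;			/* submissions made so far */
};

enum req_path {
	PATH_HW,	/* request was accepted by the (simulated) hardware */
	PATH_SW,	/* retries exhausted, caller should use software 842 */
};

/* Hypothetical submit: returns only success or failure, with no reason. */
static bool fake_nx_submit(struct fake_nx *nx)
{
	nx->attempts++;
	return nx->attempts > nx->failures_before_success;
}

/*
 * Try the hardware up to VAS_RETRIES times (assumption: the limit counts
 * total attempts), then tell the caller to fall back to software.
 */
static enum req_path submit_with_retries(struct fake_nx *nx)
{
	assert(nx != NULL);

	for (int i = 0; i < VAS_RETRIES; i++) {
		if (fake_nx_submit(nx))
			return PATH_HW;
	}

	return PATH_SW;
}
```

With a simulated engine that fails 6 times, the request succeeds on the 7th attempt, which is in line with the 6 or 7 retries seen in the 12-hour test. An engine that keeps failing is abandoned after VAS_RETRIES attempts.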