On Tue, 2021-03-02 at 19:33 +0200, Horia Geantă wrote: > On 12/14/2020 9:00 PM, Lucas Stach wrote: > > Hi all, > > > > I've been looking into a CAAM RNG issue for a while, where I could need > > some input from people knowing the CAAM hardware better than I do. > > Basically the issue is that on some i.MX6 units the RNG functionality > > sometimes fails with this error: > > caam_jr 2101000.jr0: 20003c5b: CCB: desc idx 60: RNG: Hardware error. > > > > I can tell that it is related to the entropy delay. On all failing > > units the RNG4 gets instantiated with the default entropy delay of > > 3200. If I dial up the delay to 3600 or 4000 the RNG works reliably. As > > a negative test I changed the initial delay to 400. With this change > > all units are able to successfully instantiate the RNG handles at an > > entropy delay of 2000 or 2400, but then reliably fail at getting random > > data with the error shown above. I guess the issue is related to > > prediction resistance on the handles, which causes the PRNG to be re- > > seeded from the TRNG fairly often. > > > > Now I don't have a good idea on how to arrive at a reliably working > > entropy delay setting, as apparently the simple "are we able to > > instantiate the handle" check is not enough to actually guarantee a > > working RNG setup. Any suggestions? > > > The successful instantiation of the RNG state handle(s) means that > the HW self-tests passed, but this doesn't mean RNG will work flawlessly. > > A properly configured RNG should have a certain (very low) failure rate. > The logic in the caam rng driver is not checking this rate, since it's > running > only once with a given configuration. > OTOH properly checking the RNG configuration would take some time, so it > would > be better to run it offline. The "characterization" should also account for > temperature, voltage and process (fixed for a given SoC). > > From this perspective, the caam rng driver should be updated to statically > configure the RNG with these offline-determined parameters. > Ideally we'd be able to use a single set of parameters to cover all SoCs > that have the same IP (RNG4 TRNG). > Unfortunately we're not there yet. > > The situation became more visible after changing the caam rng driver to > reseed > the PRNG before every request (practically making the PRNG function like a > TRNG, > a hwrng framework requirement), since the HW self-tests are now running more > often then before. > > Some questions that would give me more details about the exact issue you > and Robert are facing: > > 1. What SoC exactly are you running on? > > 2. How fast and how often is the RNG hardware error occurring? > Does this happen at boot time, only when stressing /dev/hwrng etc.? We are using an iMX6D. In our case, it seems this is occurring relatively rarely - I have only seen this occur on a few boots. When it has happened, it started reporting errors at boot and regularly thereafter - probably as a result of accesses being made by the rngd daemon. > > 3. Try dumping some of the RNG registers using below patch: > > -- >8 -- > > Subject: [PATCH] crypto: caam - rng debugging > > Dump RNG registers at hwrng.init time and in case descriptor returns > RNG HW error. > > Signed-off-by: Horia Geantă <horia.geanta@xxxxxxx> > --- > drivers/crypto/caam/caamrng.c | 9 ++++++++- > drivers/crypto/caam/ctrl.c | 29 +++++++++++++++++++++++++++++ > drivers/crypto/caam/ctrl.h | 2 ++ > drivers/crypto/caam/regs.h | 5 ++++- > 4 files changed, 43 insertions(+), 2 deletions(-) > > diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c > index 77d048dfe5d0..fc2192183696 100644 > --- a/drivers/crypto/caam/caamrng.c > +++ b/drivers/crypto/caam/caamrng.c > @@ -16,6 +16,7 @@ > > #include "compat.h" > > +#include "ctrl.h" > #include "regs.h" > #include "intern.h" > #include "desc_constr.h" > @@ -57,9 +58,12 @@ static void caam_rng_done(struct device *jrdev, u32 *desc, > u32 err, > { > struct caam_rng_job_ctx *jctx = context; > > - if (err) > + if (err) { > *jctx->err = caam_jr_strstatus(jrdev, err); > > + caam_dump_rng_regs(jrdev); > + } > + > complete(jctx->done); > } > > @@ -199,6 +203,9 @@ static int caam_init(struct hwrng *rng) > return err; > } > > + dev_dbg(ctx->jrdev, "CAAM RNG - register status at hwrng.init time\n"); > + caam_dump_rng_regs(ctx->jrdev); > + > /* > * Fill async buffer to have early randomness data for > * hw_random > diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c > index ca0361b2dbb0..52db32b599aa 100644 > --- a/drivers/crypto/caam/ctrl.c > +++ b/drivers/crypto/caam/ctrl.c > @@ -27,6 +27,35 @@ EXPORT_SYMBOL(caam_dpaa2); > #include "qi.h" > #endif > > +void caam_dump_rng_regs(struct device *jrdev) > +{ > + struct device *ctrldev = jrdev->parent; > + struct caam_drv_private *ctrlpriv = dev_get_drvdata(ctrldev); > + struct caam_ctrl __iomem *ctrl; > + struct rng4tst __iomem *r4tst; > + u32 rtmctl; > + > + dev_dbg(jrdev, "RNG register dump:\n"); > + > + ctrl = (struct caam_ctrl __iomem *)ctrlpriv->ctrl; > + r4tst = &ctrl->r4tst[0]; > + > + dev_dbg(jrdev, "\trdsta = 0x%08x\n", rd_reg32(&r4tst->rdsta)); > + > + rtmctl = rd_reg32(&r4tst->rtmctl); > + dev_dbg(jrdev, "\trtmctl = 0x%08x\n", rtmctl); > + dev_dbg(jrdev, "\trtstatus = 0x%08x\n", rd_reg32(&r4tst->rtstatus)); > + > + /* Group of registers that can be read only when RTMCTL[PRGM]=1 */ > + clrsetbits_32(&r4tst->rtmctl, 0, RTMCTL_PRGM | RTMCTL_ACC); > + dev_dbg(jrdev, "\trtscmisc = 0x%08x\n", rd_reg32(&r4tst->rtscmisc)); > + dev_dbg(jrdev, "\trtfrqmin = 0x%08x\n", rd_reg32(&r4tst->rtfrqmin)); > + dev_dbg(jrdev, "\trtfrqmax = 0x%08x\n", rd_reg32(&r4tst->rtfrqmax)); > + clrsetbits_32(&r4tst->rtmctl, RTMCTL_PRGM | RTMCTL_ACC, RTMCTL_ERR); > + > +} > +EXPORT_SYMBOL(caam_dump_rng_regs); > + > /* > * Descriptor to instantiate RNG State Handle 0 in normal mode and > * load the JDKEK, TDKEK and TDSK registers > diff --git a/drivers/crypto/caam/ctrl.h b/drivers/crypto/caam/ctrl.h > index f3ecd67922a7..806f4563990c 100644 > --- a/drivers/crypto/caam/ctrl.h > +++ b/drivers/crypto/caam/ctrl.h > @@ -11,4 +11,6 @@ > /* Prototypes for backend-level services exposed to APIs */ > extern bool caam_dpaa2; > > +void caam_dump_rng_regs(struct device *ctrldev); > + > #endif /* CTRL_H */ > diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h > index af61f3a2c0d4..dfc25a458a55 100644 > --- a/drivers/crypto/caam/regs.h > +++ b/drivers/crypto/caam/regs.h > @@ -493,6 +493,7 @@ struct rngtst { > /* RNG4 TRNG test registers */ > struct rng4tst { > #define RTMCTL_ACC BIT(5) /* TRNG access mode */ > +#define RTMCTL_ERR BIT(12) /* TRNG error */ > #define RTMCTL_PRGM BIT(16) /* 1 -> program mode, 0 -> run mode */ > #define RTMCTL_SAMP_MODE_VON_NEUMANN_ES_SC 0 /* use von Neumann data in > both entropy shifter and > @@ -526,7 +527,9 @@ struct rng4tst { > u32 rtfrqmax; /* PRGM=1: freq. count max. limit register */ > u32 rtfrqcnt; /* PRGM=0: freq. count register */ > }; > - u32 rsvd1[40]; > + u32 rsvd[7]; > + u32 rtstatus; /* TRNG status register */ > + u32 rsvd1[32]; > #define RDSTA_SKVT 0x80000000 > #define RDSTA_SKVN 0x40000000 > #define RDSTA_PR0 BIT(4) -- Robert Hancock Senior Hardware Designer, Calian Advanced Technologies www.calian.com