On Wed, Sep 11, 2019 at 2:35 AM Horia Geanta <horia.geanta@xxxxxxx> wrote: > > On 9/4/2019 5:35 AM, Andrey Smirnov wrote: > > In order to allow caam_jr_enqueue() to lock underlying JR's > > device (via device_lock(), see commit that follows) we need to make > > sure that no code calls caam_jr_enqueue() as a part of caam_jr_probe() > > to avoid a deadlock. Unfortunately, current implementation of caamrng > > code does exactly that in caam_init_buf(). > > I don't think your patch addresses the above, so it can probably be cut out of the message. > > Another big problem with original caamrng initialization is a > > circular reference in the form of: > > > > 1. caam_rng_init() aquires JR via caam_jr_alloc(). Freed only by > > caam_rng_exit() > > > > 2. caam_rng_exit() is only called by unregister_algs() once last JR > > is shut down > > > > 3. caam_jr_remove() won't call unregister_algs() for last JR > > until tfm_count reaches zero, which can only happen via > > unregister_algs() -> caam_rng_exit() call chain. > > > > To avoid all of that, convert caamrng code to a platform device driver > > and extend caam_probe() to create corresponding platform device. > > > > Additionally, this change also allows us to remove any access to > > struct caam_drv_private in caamrng.c as well as simplify resource > > ownership/deallocation via devres. > > > I would avoid adding platform devices that don't have > corresponding DT nodes. > > There's some generic advice here: > https://www.kernel.org/doc/html/latest/driver-api/driver-model/platform.html#legacy-drivers-device-probing > > and there's also previous experience in caam driver: > 6b175685b4a1 crypto: caam/qi - don't allocate an extra platform device > Hmm, I am not sure how that experience relates to the case we have with hwrng, but OK, I'm going to assume that platform driver approach is a no-go. > The issue in caamrng is actually that caam/jr driver (jr.c) tries to call > caam_rng_exit() on the last available jr device. > Instead, caam_rng_exit() must be called on the same jr device that > was used during caam_rng_init(). > > Otherwise, by the time: > > void caam_rng_exit(void) > { > if (!init_done) > return; > > caam_jr_free(rng_ctx->jrdev); > hwrng_unregister(&caam_rng); > [...] > > is executed, rng_ctx->jrdev might have been removed. > > This will cause an oops in caam_jr_free(). > caam_cleanup() - .cleanup hwrng callback that is called when doing > hwrng_unregister() - also needs to be executed on that jr device. > > The problem can be easily reproduced as follows. > > If caamrng was initialized on jr0: > [...] > caam_jr 2101000.jr0: registering rng-caam > [...] > > then by manually unbinding jr0 first, then jr1 (using i.MX6SX): > # echo -n "2101000.jr0" > /sys/bus/platform/drivers/caam_jr/ > # echo -n "2102000.jr1" > /sys/bus/platform/drivers/caam_jr/ > > Unable to handle kernel NULL pointer dereference at virtual address 00000040 > pgd = 572e14e7 > [00000040] *pgd=be2e8831 > Internal error: Oops: 17 [#1] SMP ARM > Modules linked in: > CPU: 0 PID: 629 Comm: sh Not tainted 5.3.0-rc1-00299-g8e2a2738e5d3-dirty #8 > Hardware name: Freescale i.MX6 SoloX (Device Tree) > PC is at caam_jr_free+0xc/0x24 > LR is at caam_rng_exit+0x20/0x3c > pc : [<c08aef20>] lr : [<c08bab1c>] psr: 200f0013 > sp : e9669e68 ip : 00001488 fp : 00000000 > r10: 00000000 r9 : e9669f80 r8 : e9284010 > r7 : e872d410 r6 : e872d410 r5 : e872d400 r4 : c1aa7cd8 > r3 : 00000000 r2 : 00000040 r1 : 00000000 r0 : e872d010 > Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none > Control: 10c5387d Table: a969004a DAC: 00000051 > Process sh (pid: 629, stack limit = 0xfc1b6e94) > Stack: (0xe9669e68 to 0xe966a000) > 9e60: e865c940 c08af7dc e872d410 e872d410 c13a9cec c06b223c > 9e80: c06b2218 e872d410 e81a9410 c06b08dc c13806f0 0000000b c13a9cec c06aeaf8 > 9ea0: 0000000b 00000000 0000000b e9284000 e91189c0 c0318c3c 00000000 00000000 > 9ec0: e95ddbd0 e8843500 c1308b08 c6614c9f e8843500 00438340 e9668000 00000004 > 9ee0: 00000000 c0285e00 00000001 00000000 00000000 c0288a44 00000000 00000000 > 9f00: c1308b28 00000000 00000001 c130911c 00000001 c13e81d1 c0288a44 00000000 > 9f20: e8ed9800 c019df00 e8ed9a7c c028ac08 00000001 00000000 c0288a44 c6614c9f > 9f40: c1308b08 0000000b 00438340 e9669f80 e8843500 00438340 e9668000 c028899c > 9f60: e95ddbc0 c6614c9f e8843500 e8843500 c1308b08 0000000b 00438340 c0288bfc > 9f80: 00000000 00000000 00000000 c6614c9f 0000000b 00438340 b6ef1da0 00000004 > 9fa0: c01011c4 c0101000 0000000b 00438340 00000001 00438340 0000000b 00000000 > 9fc0: 0000000b 00438340 b6ef1da0 00000004 00438340 0000000b 00000000 00000000 > 9fe0: 0000006c bea7f908 b6e19e58 b6e7325c 600f0010 00000001 00000000 00000000 > [<c08aef20>] (caam_jr_free) from [<c08bab1c>] (caam_rng_exit+0x20/0x3c) > [<c08bab1c>] (caam_rng_exit) from [<c08af7dc>] (caam_jr_remove+0x38/0xc0) > [<c08af7dc>] (caam_jr_remove) from [<c06b223c>] (platform_drv_remove+0x24/0x3c) > [<c06b223c>] (platform_drv_remove) from [<c06b08dc>] (device_release_driver_internal+0xdc/0x1a0) > [<c06b08dc>] (device_release_driver_internal) from [<c06aeaf8>] (unbind_store+0x5c/0xcc) > [<c06aeaf8>] (unbind_store) from [<c0318c3c>] (kernfs_fop_write+0xfc/0x1e0) > [<c0318c3c>] (kernfs_fop_write) from [<c0285e00>] (__vfs_write+0x2c/0x1d0) > [<c0285e00>] (__vfs_write) from [<c028899c>] (vfs_write+0xa0/0x180) > [<c028899c>] (vfs_write) from [<c0288bfc>] (ksys_write+0x5c/0xdc) > [<c0288bfc>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28) > Exception stack(0xe9669fa8 to 0xe9669ff0) > 9fa0: 0000000b 00438340 00000001 00438340 0000000b 00000000 > 9fc0: 0000000b 00438340 b6ef1da0 00000004 00438340 0000000b 00000000 00000000 > 9fe0: 0000006c bea7f908 b6e19e58 b6e7325c > Code: eaffff49 e5903040 e2832040 f5d2f000 (e1921f9f) > > Thus, I'd say the following fix is needed: > > -- >8 -- > > When caam_rng_exit() executes, the jr device that was used > during caam_rng_init() must be available. > > This means that current scheme - where caam_rng_exit() is called > when last jr device is removed - is incorrect. > Instead, caam_rng_exit() has to run when the jr acquired > during caam_rng_init() is removed. > > Fixes: 1b46c90c8e00 ("crypto: caam - convert top level drivers to libraries") > Signed-off-by: Horia Geantă <horia.geanta@xxxxxxx> > > diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c > index e8baacaabe07..ec40178fa688 100644 > --- a/drivers/crypto/caam/caamrng.c > +++ b/drivers/crypto/caam/caamrng.c > @@ -300,9 +300,9 @@ static struct hwrng caam_rng = { > .read = caam_read, > }; > > -void caam_rng_exit(void) > +void caam_rng_exit(struct device *jrdev) > { > - if (!init_done) > + if (!init_done || jrdev != rng_ctx->jrdev) > return; > > caam_jr_free(rng_ctx->jrdev); > diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h > index 731b06becd9c..4795530203ad 100644 > --- a/drivers/crypto/caam/intern.h > +++ b/drivers/crypto/caam/intern.h > @@ -165,7 +165,7 @@ static inline void caam_pkc_exit(void) > #ifdef CONFIG_CRYPTO_DEV_FSL_CAAM_RNG_API > > int caam_rng_init(struct device *dev); > -void caam_rng_exit(void); > +void caam_rng_exit(struct device *jrdev); > > #else > > @@ -174,7 +174,7 @@ static inline int caam_rng_init(struct device *dev) > return 0; > } > > -static inline void caam_rng_exit(void) > +static inline void caam_rng_exit(struct device *jrdev) > { > } > > diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c > index d11956bc358f..61aea11773a6 100644 > --- a/drivers/crypto/caam/jr.c > +++ b/drivers/crypto/caam/jr.c > @@ -53,7 +53,6 @@ static void unregister_algs(void) > > caam_qi_algapi_exit(); > > - caam_rng_exit(); > caam_pkc_exit(); > caam_algapi_hash_exit(); > caam_algapi_exit(); > @@ -126,6 +125,8 @@ static int caam_jr_remove(struct platform_device *pdev) > jrdev = &pdev->dev; > jrpriv = dev_get_drvdata(jrdev); > > + caam_rng_exit(jrdev); > + > /* > * Return EBUSY if job ring already allocated. > */