On 9/4/2019 5:35 AM, Andrey Smirnov wrote: > In order to allow caam_jr_enqueue() to lock underlying JR's > device (via device_lock(), see commit that follows) we need to make > sure that no code calls caam_jr_enqueue() as a part of caam_jr_probe() > to avoid a deadlock. Unfortunately, current implementation of caamrng > code does exactly that in caam_init_buf(). > > Another big problem with original caamrng initialization is a > circular reference in the form of: > > 1. caam_rng_init() aquires JR via caam_jr_alloc(). Freed only by > caam_rng_exit() > > 2. caam_rng_exit() is only called by unregister_algs() once last JR > is shut down > > 3. caam_jr_remove() won't call unregister_algs() for last JR > until tfm_count reaches zero, which can only happen via > unregister_algs() -> caam_rng_exit() call chain. > > To avoid all of that, convert caamrng code to a platform device driver > and extend caam_probe() to create corresponding platform device. > > Additionally, this change also allows us to remove any access to > struct caam_drv_private in caamrng.c as well as simplify resource > ownership/deallocation via devres. > I would avoid adding platform devices that don't have corresponding DT nodes. There's some generic advice here: https://www.kernel.org/doc/html/latest/driver-api/driver-model/platform.html#legacy-drivers-device-probing and there's also previous experience in caam driver: 6b175685b4a1 crypto: caam/qi - don't allocate an extra platform device The issue in caamrng is actually that caam/jr driver (jr.c) tries to call caam_rng_exit() on the last available jr device. Instead, caam_rng_exit() must be called on the same jr device that was used during caam_rng_init(). Otherwise, by the time: void caam_rng_exit(void) { if (!init_done) return; caam_jr_free(rng_ctx->jrdev); hwrng_unregister(&caam_rng); [...] is executed, rng_ctx->jrdev might have been removed. This will cause an oops in caam_jr_free(). caam_cleanup() - .cleanup hwrng callback that is called when doing hwrng_unregister() - also needs to be executed on that jr device. The problem can be easily reproduced as follows. If caamrng was initialized on jr0: [...] caam_jr 2101000.jr0: registering rng-caam [...] then by manually unbinding jr0 first, then jr1 (using i.MX6SX): # echo -n "2101000.jr0" > /sys/bus/platform/drivers/caam_jr/ # echo -n "2102000.jr1" > /sys/bus/platform/drivers/caam_jr/ Unable to handle kernel NULL pointer dereference at virtual address 00000040 pgd = 572e14e7 [00000040] *pgd=be2e8831 Internal error: Oops: 17 [#1] SMP ARM Modules linked in: CPU: 0 PID: 629 Comm: sh Not tainted 5.3.0-rc1-00299-g8e2a2738e5d3-dirty #8 Hardware name: Freescale i.MX6 SoloX (Device Tree) PC is at caam_jr_free+0xc/0x24 LR is at caam_rng_exit+0x20/0x3c pc : [<c08aef20>] lr : [<c08bab1c>] psr: 200f0013 sp : e9669e68 ip : 00001488 fp : 00000000 r10: 00000000 r9 : e9669f80 r8 : e9284010 r7 : e872d410 r6 : e872d410 r5 : e872d400 r4 : c1aa7cd8 r3 : 00000000 r2 : 00000040 r1 : 00000000 r0 : e872d010 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: a969004a DAC: 00000051 Process sh (pid: 629, stack limit = 0xfc1b6e94) Stack: (0xe9669e68 to 0xe966a000) 9e60: e865c940 c08af7dc e872d410 e872d410 c13a9cec c06b223c 9e80: c06b2218 e872d410 e81a9410 c06b08dc c13806f0 0000000b c13a9cec c06aeaf8 9ea0: 0000000b 00000000 0000000b e9284000 e91189c0 c0318c3c 00000000 00000000 9ec0: e95ddbd0 e8843500 c1308b08 c6614c9f e8843500 00438340 e9668000 00000004 9ee0: 00000000 c0285e00 00000001 00000000 00000000 c0288a44 00000000 00000000 9f00: c1308b28 00000000 00000001 c130911c 00000001 c13e81d1 c0288a44 00000000 9f20: e8ed9800 c019df00 e8ed9a7c c028ac08 00000001 00000000 c0288a44 c6614c9f 9f40: c1308b08 0000000b 00438340 e9669f80 e8843500 00438340 e9668000 c028899c 9f60: e95ddbc0 c6614c9f e8843500 e8843500 c1308b08 0000000b 00438340 c0288bfc 9f80: 00000000 00000000 00000000 c6614c9f 0000000b 00438340 b6ef1da0 00000004 9fa0: c01011c4 c0101000 0000000b 00438340 00000001 00438340 0000000b 00000000 9fc0: 0000000b 00438340 b6ef1da0 00000004 00438340 0000000b 00000000 00000000 9fe0: 0000006c bea7f908 b6e19e58 b6e7325c 600f0010 00000001 00000000 00000000 [<c08aef20>] (caam_jr_free) from [<c08bab1c>] (caam_rng_exit+0x20/0x3c) [<c08bab1c>] (caam_rng_exit) from [<c08af7dc>] (caam_jr_remove+0x38/0xc0) [<c08af7dc>] (caam_jr_remove) from [<c06b223c>] (platform_drv_remove+0x24/0x3c) [<c06b223c>] (platform_drv_remove) from [<c06b08dc>] (device_release_driver_internal+0xdc/0x1a0) [<c06b08dc>] (device_release_driver_internal) from [<c06aeaf8>] (unbind_store+0x5c/0xcc) [<c06aeaf8>] (unbind_store) from [<c0318c3c>] (kernfs_fop_write+0xfc/0x1e0) [<c0318c3c>] (kernfs_fop_write) from [<c0285e00>] (__vfs_write+0x2c/0x1d0) [<c0285e00>] (__vfs_write) from [<c028899c>] (vfs_write+0xa0/0x180) [<c028899c>] (vfs_write) from [<c0288bfc>] (ksys_write+0x5c/0xdc) [<c0288bfc>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28) Exception stack(0xe9669fa8 to 0xe9669ff0) 9fa0: 0000000b 00438340 00000001 00438340 0000000b 00000000 9fc0: 0000000b 00438340 b6ef1da0 00000004 00438340 0000000b 00000000 00000000 9fe0: 0000006c bea7f908 b6e19e58 b6e7325c Code: eaffff49 e5903040 e2832040 f5d2f000 (e1921f9f) Thus, I'd say the following fix is needed: -- >8 -- When caam_rng_exit() executes, the jr device that was used during caam_rng_init() must be available. This means that current scheme - where caam_rng_exit() is called when last jr device is removed - is incorrect. Instead, caam_rng_exit() has to run when the jr acquired during caam_rng_init() is removed. Fixes: 1b46c90c8e00 ("crypto: caam - convert top level drivers to libraries") Signed-off-by: Horia Geantă <horia.geanta@xxxxxxx> diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c index e8baacaabe07..ec40178fa688 100644 --- a/drivers/crypto/caam/caamrng.c +++ b/drivers/crypto/caam/caamrng.c @@ -300,9 +300,9 @@ static struct hwrng caam_rng = { .read = caam_read, }; -void caam_rng_exit(void) +void caam_rng_exit(struct device *jrdev) { - if (!init_done) + if (!init_done || jrdev != rng_ctx->jrdev) return; caam_jr_free(rng_ctx->jrdev); diff --git a/drivers/crypto/caam/intern.h b/drivers/crypto/caam/intern.h index 731b06becd9c..4795530203ad 100644 --- a/drivers/crypto/caam/intern.h +++ b/drivers/crypto/caam/intern.h @@ -165,7 +165,7 @@ static inline void caam_pkc_exit(void) #ifdef CONFIG_CRYPTO_DEV_FSL_CAAM_RNG_API int caam_rng_init(struct device *dev); -void caam_rng_exit(void); +void caam_rng_exit(struct device *jrdev); #else @@ -174,7 +174,7 @@ static inline int caam_rng_init(struct device *dev) return 0; } -static inline void caam_rng_exit(void) +static inline void caam_rng_exit(struct device *jrdev) { } diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c index d11956bc358f..61aea11773a6 100644 --- a/drivers/crypto/caam/jr.c +++ b/drivers/crypto/caam/jr.c @@ -53,7 +53,6 @@ static void unregister_algs(void) caam_qi_algapi_exit(); - caam_rng_exit(); caam_pkc_exit(); caam_algapi_hash_exit(); caam_algapi_exit(); @@ -126,6 +125,8 @@ static int caam_jr_remove(struct platform_device *pdev) jrdev = &pdev->dev; jrpriv = dev_get_drvdata(jrdev); + caam_rng_exit(jrdev); + /* * Return EBUSY if job ring already allocated. */