On Thu, Aug 16, 2018 at 11:08 AM, 정혜연 <hyeon.chung@xxxxxxxxxxx> wrote: > Dear Gilad, > > > > I'm sorry I'm confused call sequence. > > calling function may be not crypto_ahash_walk_first but > crypto_hash_walk_first. > > I don't know how code reaches there (verity_hash_update --> ??? -> > shash_async_update)but below call stack showed like this. > Basically through an inline function that calls shash_async_update via function pointer. > > > int crypto_hash_walk_first(struct ahash_request *req, > struct crypto_hash_walk *walk) > { > walk->total = req->nbytes; > > if (!walk->total) { > walk->entrylen = 0; > return 0; > } > > walk->alignmask = crypto_ahash_alignmask(crypto_ahash_reqtfm(req)); > walk->sg = req->src; > walk->flags = req->base.flags & CRYPTO_TFM_REQ_MASK; > > return hash_walk_new_entry(walk); > } > > So, is it right that walk->flag doesn't have CRYPTO_ALG_ASYNC ? > Yes, I believe any shash implementation (which this is, since I'm seeing *shash* function in the call stack) would not have CRYPTO_ALG_ASYNC in walk->flag. I'm guessing you are using the SHA1 Arm CE implementation, right? this is a synchronous implementation and would not have that flag set. It would be nice if you can verify this assumption, say by planting a printk to print the flag but I'm pretty sure of that. This means two things: 1. There should never be a schedule call in between the kmap_atomic() and kunmap_atomic() since kmap_atomic() disables preemption. The fact that you're seeing one points to something highly irregular. 2. If there is a schedule between the two, there is no wonder the mapping is lost, kmap_atomic is designed to be, well... atomic. Could it be you have some other, possibly unrelated, code or driver that has a bug of enabling preemption one two many? are you using any out of tree drivers or patches with this kernel? Can you please compile the kernel with CONFIG_DEBUG_PREEMPT set (it's in Kernel Hacking section of menuconfig)? that should catch any such code. If this doesn't bring anything to light, please provide a the full kernel panic log (not just the snippet here), your kernel config and what architecture/board you are running this on, as well as details about DM-verity partition setup, size and underlying storage so I may attempt to recreate this. Cheers, Gilad > > > --------- Original Message --------- > > Sender : 정혜연 <hyeon.chung@xxxxxxxxxxx> Principal > Engineer/Platform개발팀(S.LSI)/삼성전자 > > Date : 2018-08-16 14:04 (GMT+9) > > Title : RE: FW: (2) FW: Use-after-free while dm-verity > > > > Thanks for your kind understanding and help :) > > > > Glance at the ahash, > > it seemed that ahash does not use kmap/unmap_atomic becuase of > CRYPTO_ALG_ASYNC flag. > > As we analyzed abnormal situation, memcopy in verity_work was preempted by > 20ms. > > > > < Task > > > 1. time = 889711041874, sp = 18446743524145741088, task = 0xFFFFFFC8DC229B00 > = end+0x48D19DBB00, task_comm = "kworker/u16:3", pid = 15332), > 2. time = 889732425104, sp = 18446743524145740192, task = 0xFFFFFFC8DC229B00 > = end+0x48D19DBB00, task_comm = "kworker/u16:3", pid = 15332), > > < Work > > 3. time = 889711100643, sp = 18446743524145741072, worker = > 0xFFFFFFC03A09F380 = end+0x402F851380, fn = 0xFFFFFF80087DE720 = > verity_work, task_comm = "kworker/u16:3", en = 1), > > < IRQ > > 4. time = 889711657451, sp = 18446743524088036864, irq = 9, fn = > 0xFFFFFF800880D210 = exynos4_mct_tick_isr, action = 0xFFFFFFC8F5E83500 = > end+0x48EB635500, en = 1), > > > > > > int crypto_ahash_walk_first(struct ahash_request *req, > struct crypto_hash_walk *walk) > { > walk->total = req->nbytes; > > > > if (!walk->total) { > walk->entrylen = 0; > return 0; > } > > walk->alignmask = crypto_ahash_alignmask(crypto_ahash_reqtfm(req)); > walk->sg = req->src; > walk->flags = req->base.flags & CRYPTO_TFM_REQ_MASK; > walk->flags |= CRYPTO_ALG_ASYNC; > > BUILD_BUG_ON(CRYPTO_TFM_REQ_MASK & CRYPTO_ALG_ASYNC); > > return hash_walk_new_entry(walk); > } > > > > static int hash_walk_next(struct crypto_hash_walk *walk) > { > unsigned int alignmask = walk->alignmask; > unsigned int offset = walk->offset; > unsigned int nbytes = min(walk->entrylen, > ((unsigned int)(PAGE_SIZE)) - offset); > > if (walk->flags & CRYPTO_ALG_ASYNC) > walk->data = kmap(walk->pg); > else > walk->data = kmap_atomic(walk->pg); > walk->data += offset; > > if (offset & alignmask) { > unsigned int unaligned = alignmask + 1 - (offset & alignmask); > > if (nbytes > unaligned) > nbytes = unaligned; > } > > walk->entrylen -= nbytes; > return nbytes; > } > > > > int crypto_hash_walk_done(struct crypto_hash_walk *walk, int err) > { > unsigned int alignmask = walk->alignmask; > unsigned int nbytes = walk->entrylen; > > walk->data -= walk->offset; > > if (nbytes && walk->offset & alignmask && !err) { > walk->offset = ALIGN(walk->offset, alignmask + 1); > nbytes = min(nbytes, > ((unsigned int)(PAGE_SIZE)) - walk->offset); > walk->entrylen -= nbytes; > > if (nbytes) { > walk->data += walk->offset; > return nbytes; > } > } > > if (walk->flags & CRYPTO_ALG_ASYNC) > kunmap(walk->pg); > else { > kunmap_atomic(walk->data); > /* > * The may sleep test only makes sense for sync users. > * Async users don't need to sleep here anyway. > */ > crypto_yield(walk->flags); > } > > if (err) > return err; > > if (nbytes) { > walk->offset = 0; > walk->pg++; > return hash_walk_next(walk); > } > > if (!walk->total) > return 0; > > walk->sg = sg_next(walk->sg); > > return hash_walk_new_entry(walk); > } > > > > --------- Original Message --------- > > Sender : Gilad Ben-Yossef <gilad@xxxxxxxxxxxxx> > > Date : 2018-08-16 11:53 (GMT+9) > > Title : Fwd: (2) FW: Use-after-free while dm-verity > > > > I missed sending my reply to the list by myself mistake so forwarding. > ---------- Forwarded message ---------- > From: Gilad Ben-Yossef <gilad@xxxxxxxxxxxxx> > Date: Thu, 16 Aug 2018 09:48:19 +0700 > Subject: Re: (2) FW: Use-after-free while dm-verity > To: hyeon.chung@xxxxxxxxxxx > > On 8/16/18, 정혜연 <hyeon.chung@xxxxxxxxxxx> wrote: >> Thank you for reply and sorry for inturrupting your vacation. >> > > Eh, don't worry about it. I enjoy kernel development work. Just don't > tell my wife... :-) >> >> >> I guess kunmap_atomic / kmap_atomic could be protected from scheduling but >> the operation using the resource like memcopy cannot be protected, right? >> >> Please let me know if I was wrong. > > kmap_atomic() disables preemption until the kunmap operation and so > scheduling should not be possible while the memory is mapped. > > >> >> >> >> I am testing after reverting and resolving confict now. But as I said >> before >> I'm not sure it is ok just revert only this one. >> >> Is it ok? > > The patch stands on its own so that should be ok if that patch is the > root cause of the problem bit if the problem is some where else it > could also be hiding the problem so it would be better to understand > what is going on first or it could bite you elsewhere. > > Gilad >> >> >> >> Best Regards, >> >> Hyeyeon Chung >> >> >> >> --------- Original Message --------- >> >> Sender : Gilad Ben-Yossef <gilad@xxxxxxxxxxxxx> >> >> Date : 2018-08-16 09:57 (GMT+9) >> >> Title : Re: FW: Use-after-free while dm-verity >> >> >> >> Hi Hyeon, >> >> On Thu, Aug 16, 2018 at 3:35 AM, 정혜연 <hyeon.chung@xxxxxxxxxxx> wrote: >>> >>> Hi, >>> >>> >>> >>> We found some async commit and are wondering if this could be make this >>> issue or not. >>> >>> I want to test after reverting this but I'm not sure it is ok to revert >>> only this commit and when I revert it there's a conflict. >>> >>> >>> >>> Could you please help me? >> >> >> I am on vacation, so please excuse any delay in handling this. Luckily I >> have my laptop with me and will try to do me best to help. >> >> The description below relates to "use-after-free". I just want to make >> sure >> I understand: does this relate to the memcpy after unmap, right? >> >> Normally, I would say that since we use kmap_atomic to map the data, >> preemption should be disabled and scheduling should not happen, I believe. >> >> I will only be able to take a look at the code in the evening, but in the >> mean time, can you verify the area the memcpy is applied to is mapped with >> kmap_atomic? >> >> thanks! >> Gilad >> >>> >>> >>> commit d1ac3ff008fb9a48f91fc15920b4c8db24c0f03e >>> Author: Gilad Ben-Yossef <gilad@xxxxxxxxxxxxx> >>> Date: Sun Feb 19 14:46:07 2017 +0200 >>> >>> dm verity: switch to using asynchronous hash crypto API >>> >>> Use of the synchronous digest API limits dm-verity to using pure >>> CPU based algorithm providers and rules out the use of off CPU >>> algorithm providers which are normally asynchronous by nature, >>> potentially freeing CPU cycles. >>> >>> This can reduce performance per Watt in situations such as during >>> boot time when a lot of concurrent file accesses are made to the >>> protected volume. >>> >>> Signed-off-by: Gilad Ben-Yossef <gilad@xxxxxxxxxxxxx> >>> CC: Eric Biggers <ebiggers3@xxxxxxxxx> >>> CC: Ondrej Mosn찼훾ek <omosnacek+linux-crypto@xxxxxxxxx> >>> Tested-by: Milan Broz <gmazyland@xxxxxxxxx> >>> Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx> >>> >>> >>> >>> >>> >>> 감사합니다. >>> >>> 정혜연 드림 >>> >>> >>> >>> >>> >>> --------- Original Message --------- >>> >>> Sender : 정혜연 <hyeon.chung@xxxxxxxxxxx> Principal >>> Engineer/Platform개발팀(S.LSI)/삼성전자 >>> >>> Date : 2018-08-14 17:34 (GMT+9) >>> >>> Title : Use-after-free while dm-verity >>> >>> >>> >>> Hi, >>> >>> I'm a samsung engineer who is responsible for dm-verity. >>> >>> >>> >>> We just enabled dm-verity for android verfied boot 2.0 (using linux >>> 4.14.56). >>> >>> After enabling, we met kernel panic issue caused by dm-verity frequently >>> and it always showed same call stack like below. >>> >>> >>> >>> Could you please let me know If you know any similar issue or have any >>> opinion? >>> >>> If this mail list is not right, please forward proper person or let me >>> know. >>> >>> >>> >>> Thanks in advance. >>> >>> >>> >>> < 4>[ 4686.384679] [5: kworker/u16:1: 58] Workqueue: kverityd >>> verity_work >>> < 4>[ 4686.384691] [5: kworker/u16:1: 58] task: ffffffc8742a0080 >>> task.stack: ffffff800b1d8000 >>> < 4>[ 4686.384705] [5: kworker/u16:1: 58] PC is at >>> __memcpy+0x70/0x180 >>> < 4>[ 4686.384718] [5: kworker/u16:1: 58] LR is at >>> crypto_sha1_update+0x94/0xe4 >>> >>> ... >>> >>> <4>[ 4686.387488] [5: kworker/u16:1: 58] [<ffffff8008b07370>] >>> __memcpy+0x70/0x180 >>> < 4>[ 4686.387503] [5: kworker/u16:1: 58] [<ffffff80083a0f6c>] >>> crypto_shash_update+0x2c/0x34 >>> < 4>[ 4686.387514] [5: kworker/u16:1: 58] [<ffffff80083a12c0>] >>> shash_ahash_update+0x3c/0x80 >>> < 4>[ 4686.387525] [5: kworker/u16:1: 58] [<ffffff80083a1638>] >>> shash_async_update+0x10/0x18 >>> < 4>[ 4686.387538] [5: kworker/u16:1: 58] [<ffffff8008824974>] >>> verity_hash_update+0x50/0xdc >>> < 4>[ 4686.387549] [5: kworker/u16:1: 58] [<ffffff80088247c0>] >>> verity_hash+0x58/0xa0 >>> < 4>[ 4686.387560] [5: kworker/u16:1: 58] [<ffffff8008824c98>] >>> verity_verify_level+0xc4/0x190 >>> < 4>[ 4686.387571] [5: kworker/u16:1: 58] [<ffffff8008824bc4>] >>> verity_hash_for_block+0xf0/0x100 >>> < 4>[ 4686.387581] [5: kworker/u16:1: 58] [<ffffff80088244c8>] >>> fec_read_bufs+0x144/0x30c >>> < 4>[ 4686.387592] [5: kworker/u16:1: 58] [<ffffff8008823730>] >>> fec_decode_rsb+0x170/0x4bc >>> < 4>[ 4686.387603] [5: kworker/u16:1: 58] [<ffffff8008823524>] >>> verity_fec_decode+0xe8/0x184 >>> < 4>[ 4686.387613] [5: kworker/u16:1: 58] [<ffffff8008824d38>] >>> verity_verify_level+0x164/0x190 >>> < 4>[ 4686.387623] [5: kworker/u16:1: 58] [<ffffff8008824bc4>] >>> verity_hash_for_block+0xf0/0x100 >>> < 4>[ 4686.387634] [5: kworker/u16:1: 58] [<ffffff80088264cc>] >>> verity_work+0xf8/0x308 >>> < 4>[ 4686.387646] [5: kworker/u16:1: 58] [<ffffff80080cb5ec>] >>> process_one_work+0x2d4/0x608 >>> < 4>[ 4686.387656] [5: kworker/u16:1: 58] [<ffffff80080cbbf4>] >>> worker_thread+0x220/0x340 >>> < 4>[ 4686.387667] [5: kworker/u16:1: 58] [<ffffff80080d0498>] >>> kthread+0x11c/0x12c >>> < 4>[ 4686.387678] [5: kworker/u16:1: 58] [<ffffff800808557c>] >>> ret_from_fork+0x10/0x18 >>> < 0>[ 4686.387690] [5: kworker/u16:1: 58] Code: 54000080 540000ab >>> a8c12027 a88120c7 (a8c12027) >>> < 4>[ 4686.387839] [5: kworker/u16:1: 58] ---[ end trace >>> 54664349b0f9422c ]--- >>> >>> >>> >>> >>> >>> In memcopy function, >>> >>> load x1 register successfully then context switched by scheduler. >>> >>> After returning to same memcopy function, failed another x1 load becuase >>> this is unmapped.(walk->data) >>> >>> <1>[ 889.732488] [2: kworker/u16:3:15332] Unable to handle kernel >>> paging request at virtual address ffffffc00b6ad000 >>> >>> >>> >>> I suspect that this is not protected propely becuase callstack and >>> problem >>> (walk->data: use-after-free) is alwasys same. >>> >>> 1. shrink_node is executed on onther core at that time. Is it >>> possible this make this problem? >>> >>> 2. I found that walk->data is unmapped by ahahs.c but I don't know >>> whether >>> this cause this unmapped situation or not. >>> >>> if (walk->flags & CRYPTO_ALG_ASYNC) >>> kunmap(walk->pg); >>> else >>> kunmap_atomic(walk->data); >>> >>> >>> >>> >>> >>> Best Regards, >>> >>> HyeYeon Chung >>> >>> >>> >>> >>> >>> [image] >> >> >> >> >> -- >> Gilad Ben-Yossef >> Chief Coffee Drinker >> >> values of β will give rise to dom! >> >> >> >> [image] >> >> [image] > > > -- > Gilad Ben-Yossef > Chief Coffee Drinker > > values of β will give rise to dom! > > > > -- > Gilad Ben-Yossef > Chief Coffee Drinker > > values of β will give rise to dom! > > > > -- Gilad Ben-Yossef Chief Coffee Drinker values of β will give rise to dom! -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel