On Mon, Apr 16, 2018 at 1:44 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote: > On Thu, Apr 12, 2018 at 8:02 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote: >> On Thu, Apr 12, 2018 at 3:47 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote: >>> After fixing up some build issues in the middle of the 4.16 cycle, I >>> get an unhelpful bisect result of commit 0a4b6e2f80aa ("Merge branch >>> 'for-4.16/block'"). Instead of letting the test run longer, I'm going >>> to switch to doing several shorter test boots per kernel and see if >>> that helps. One more bisect coming... >> >> Okay, so I can confirm the bisect points at the _merge_ itself, not a >> specific patch. I'm not sure how to proceed here. It looks like some >> kind of interaction between separate trees? Jens, do you have >> suggestions on how to track this down? > > Turning off HARDENED_USERCOPY and turning on KASAN, I see the same report: > > [ 38.274106] BUG: KASAN: slab-out-of-bounds in _copy_to_user+0x42/0x60 > [ 38.274841] Read of size 22 at addr ffff8800122b8c4b by task smartctl/1064 > [ 38.275630] > [ 38.275818] CPU: 2 PID: 1064 Comm: smartctl Not tainted 4.17.0-rc1-ARCH+ #266 > [ 38.276631] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), > BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 > [ 38.277690] Call Trace: > [ 38.277988] dump_stack+0x71/0xab > [ 38.278397] ? _copy_to_user+0x42/0x60 > [ 38.278833] print_address_description+0x6a/0x270 > [ 38.279368] ? _copy_to_user+0x42/0x60 > [ 38.279800] kasan_report+0x243/0x360 > [ 38.280221] _copy_to_user+0x42/0x60 > [ 38.280635] sg_io+0x459/0x660 > ... > > Though we get slightly more details (some we already knew): > > [ 38.301330] Allocated by task 329: > [ 38.301734] kmem_cache_alloc_node+0xca/0x220 > [ 38.302239] scsi_mq_init_request+0x64/0x130 [scsi_mod] > [ 38.302821] blk_mq_alloc_rqs+0x2cf/0x370 > [ 38.303265] blk_mq_sched_alloc_tags.isra.4+0x7d/0xb0 > [ 38.303820] blk_mq_init_sched+0xf0/0x220 > [ 38.304268] elevator_switch+0x17a/0x2c0 > [ 38.304705] elv_iosched_store+0x173/0x220 > [ 38.305171] queue_attr_store+0x72/0xb0 > [ 38.305602] kernfs_fop_write+0x188/0x220 > [ 38.306049] __vfs_write+0xb6/0x330 > [ 38.306436] vfs_write+0xe9/0x240 > [ 38.306804] ksys_write+0x98/0x110 > [ 38.307181] do_syscall_64+0x6d/0x1d0 > [ 38.307590] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 38.308142] > [ 38.308316] Freed by task 0: > [ 38.308652] (stack is not available) > [ 38.309060] > [ 38.309243] The buggy address belongs to the object at ffff8800122b8c00 > [ 38.309243] which belongs to the cache scsi_sense_cache of size 96 > [ 38.310625] The buggy address is located 75 bytes inside of > [ 38.310625] 96-byte region [ffff8800122b8c00, ffff8800122b8c60) With a hardware watchpoint, I've isolated the corruption to here: bfq_dispatch_request+0x2be/0x1610: __bfq_dispatch_request at block/bfq-iosched.c:3902 3900 if (rq) { 3901 inc_in_driver_start_rq: 3902 bfqd->rq_in_driver++; 3903 start_rq: 3904 rq->rq_flags |= RQF_STARTED; 3905 } Through some race condition(?), rq_in_driver is also sense_buffer, and it can get incremented. Specifically, I am doing: diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 25c14c58385c..f50d5053d732 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -6,6 +6,8 @@ #include <linux/kernel.h> #include <linux/module.h> #include <linux/blk-mq.h> +#include <scsi/scsi_request.h> +#include <linux/hw_breakpoint.h> #include <trace/events/block.h> @@ -428,6 +430,18 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *const hctx) } } +static void sample_hbp_handler(struct perf_event *bp, + struct perf_sample_data *data, + struct pt_regs *regs) +{ + printk(KERN_INFO "sense_buffer value is changed\n"); + dump_stack(); + printk(KERN_INFO "Dump stack from sample_hbp_handler\n"); +} + +struct perf_event * __percpu *sample_hbp; +int ok_to_go; + void blk_mq_sched_insert_request(struct request *rq, bool at_head, bool run_queue, bool async) { @@ -435,6 +449,16 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = rq->mq_ctx; struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu); + struct perf_event_attr attr; + struct scsi_request *req = scsi_req(rq); + + if (ok_to_go) { + hw_breakpoint_init(&attr); + attr.bp_addr = (uintptr_t)&(req->sense); + attr.bp_len = HW_BREAKPOINT_LEN_8; + attr.bp_type = HW_BREAKPOINT_W; + sample_hbp = register_wide_hw_breakpoint(&attr, sample_hbp_handler, NULL); + } /* flush rq in flush machinery need to be dispatched directly */ if (!(rq->rq_flags & RQF_FLUSH_SEQ) && op_is_flush(rq->cmd_flags)) { @@ -461,6 +485,9 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, run: if (run_queue) blk_mq_run_hw_queue(hctx, async); + + if (ok_to_go) + unregister_wide_hw_breakpoint(sample_hbp); } void blk_mq_sched_insert_requests(struct request_queue *q, Where "ok_to_go" is wired up to a sysctl so I don't trip other things (?) at boot-time. I still haven't figured this out, though... any have a moment to look at this? -Kees -- Kees Cook Pixel Security