On Mon, Jan 06, 2020 at 03:35:10PM +0800, Yufen Yu wrote:
> When partition deletion runs concurrently with I/O submission,
> it may cause a use-after-free on part in disk_map_sector_rcu(),
> as follows:
>
> blk_account_io_start(req1)  delete_partition  blk_account_io_start(req2)
>
> rcu_read_lock()
> disk_map_sector_rcu
> part = rcu_dereference(ptbl->part[4])
>                            rcu_assign_pointer(ptbl->part[4], NULL);
>                            rcu_assign_pointer(ptbl->last_lookup, NULL);
> rcu_assign_pointer(ptbl->last_lookup, part);
>
>                            hd_struct_kill(part)
> !hd_struct_try_get
>   part = &rq->rq_disk->part0;
> rcu_read_unlock()
>                            __delete_partition
>                            call_rcu
>                                             rcu_read_lock
>                                             disk_map_sector_rcu
>                                             part = rcu_dereference(ptbl->last_lookup);
>
>                            delete_partition_work_fn
>                            free(part)
>                                             hd_struct_try_get(part)
>                                             BUG_ON use-after-free
>
> req1 tries to get 'ptbl->part[4]' while the part is being
> deleted. Although delete_partition() sets last_lookup
> to NULL, req1 can overwrite it with 'part[4]' again.
>
> After call_rcu() and free() have been called for the part, req2
> can still reach it through last_lookup, resulting in a
> use-after-free.
>
> In fact, this bug has been reported by syzbot:
> https://lkml.org/lkml/2019/1/4/357
>
> To fix the bug, we cache the index into part[] instead of
> part[i] itself in last_lookup. Even if the index is
> reassigned, others either see part[i] as NULL or get the
> newly allocated part[i] after call_rcu. Both cases
> are okay.
>
> Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>

Looks good,

Reviewed-by: Christoph Hellwig <hch@xxxxxx>
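
A rough userspace sketch of the index-caching idea from the commit message,
not the actual patch. Everything here is a hypothetical stand-in: struct part,
struct part_tbl, map_sector() and delete_part() are illustrative names, the
code is single-threaded, and it leaves out RCU and reference counting entirely
(in the kernel the slot reads would go through rcu_dereference()). Treating
index 0 as "nothing cached" is also an assumption made for this sketch. The
point is only that a stale cached index leads to a NULL slot or a new entry,
never to a freed object.

```c
#include <stddef.h>

#define NR_PARTS 8

/* Hypothetical, simplified stand-in for a partition. */
struct part {
	unsigned long start;	/* first sector */
	unsigned long nr;	/* number of sectors */
};

/* Hypothetical, simplified stand-in for the partition table. */
struct part_tbl {
	int last_lookup;		/* cached index into part[], not a pointer */
	struct part *part[NR_PARTS];	/* slot is NULL once the partition is gone */
};

static int sector_in_part(const struct part *p, unsigned long sector)
{
	return p->start <= sector && sector < p->start + p->nr;
}

/*
 * Return the partition containing sector, or part[0] (the whole disk)
 * if no partition matches. Index 0 means "nothing cached" here.
 */
static struct part *map_sector(struct part_tbl *t, unsigned long sector)
{
	int idx = t->last_lookup;
	struct part *p = idx > 0 ? t->part[idx] : NULL;
	int i;

	/* A stale index is harmless: the slot is NULL or holds a new entry. */
	if (p && sector_in_part(p, sector))
		return p;

	for (i = 1; i < NR_PARTS; i++) {
		p = t->part[i];
		if (p && sector_in_part(p, sector)) {
			t->last_lookup = i;	/* cache the index only */
			return p;
		}
	}
	return t->part[0];
}

/*
 * Clear the slot. last_lookup is deliberately left alone to show that
 * a leftover (or concurrently re-stored) index cannot reach the old entry.
 */
static void delete_part(struct part_tbl *t, int idx)
{
	t->part[idx] = NULL;
}
```

With a partition in slot 4, a first map_sector() call caches index 4. After
delete_part(&t, 4) the cached index still says 4, but the next lookup sees a
NULL slot and falls back to part[0] instead of touching the deleted entry.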