On Mon, Aug 19, 2019 at 11:06 AM Martijn Coenen <maco@xxxxxxxxxxx> wrote:
> One idea to fix is to call blk_queue_logical_block_size() as part of
> LOOP_SET_FD, to match the block size of the backing fs in case the
> backing file is opened with O_DIRECT; you could argue that if the
> backing file is opened with O_DIRECT, this is what the user wanted
> anyway. This would allow us to get rid of the latter two ioctl's and
> already save quite some time.

Basically:

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ab7ca5989097a..ad3db72fbd729 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -994,6 +994,12 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	if (!(lo_flags & LO_FLAGS_READ_ONLY) && file->f_op->fsync)
 		blk_queue_write_cache(lo->lo_queue, true, false);

+	if (io_is_direct(lo->lo_backing_file) && inode->i_sb->s_bdev) {
+		/* In case of direct I/O, match underlying block size */
+		blk_queue_logical_block_size(lo->lo_queue,
+				bdev_logical_block_size(inode->i_sb->s_bdev));
+	}
+
 	loop_update_rotational(lo);
 	loop_update_dio(lo);

>
> Thanks,
> Martijn
>
> >
> > Something like the following patch:
> >
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index a7461f482467..8791f9242583 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -1015,6 +1015,9 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> >  	 */
> >  	bdgrab(bdev);
> >  	mutex_unlock(&loop_ctl_mutex);
> > +
> > +	percpu_ref_switch_to_percpu(&lo->lo_queue->q_usage_counter);
> > +
> >  	if (partscan)
> >  		loop_reread_partitions(lo, bdev);
> >  	if (claimed_bdev)
> > @@ -1171,6 +1174,8 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
> >  	lo->lo_state = Lo_unbound;
> >  	mutex_unlock(&loop_ctl_mutex);
> >
> > +	percpu_ref_switch_to_atomic(&lo->lo_queue->q_usage_counter, NULL);
> > +
> >  	/*
> >  	 * Need not hold loop_ctl_mutex to fput backing file.
> >  	 * Calling fput holding loop_ctl_mutex triggers a circular
> > @@ -2003,6 +2008,12 @@ static int loop_add(struct loop_device **l, int i)
> >  	}
> >  	lo->lo_queue->queuedata = lo;
> >
> > +	/*
> > +	 * cheat block layer for not switching to q_usage_counter's
> > +	 * percpu mode before loop becomes Lo_bound
> > +	 */
> > +	blk_queue_flag_set(QUEUE_FLAG_INIT_DONE, lo->lo_queue);
> > +
> >  	blk_queue_max_hw_sectors(lo->lo_queue, BLK_DEF_MAX_SECTORS);
> >
> >  	/*
> >
> >
> > thanks,
> > Ming
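
For reference, the userspace setup sequence that the LOOP_SET_FD idea would shorten might look roughly like the sketch below. It is only an illustration, not code from this thread: the helper name setup_loop_direct() is hypothetical, and it assumes "the latter two ioctl's" are LOOP_SET_BLOCK_SIZE and LOOP_SET_DIRECT_IO, which the quoted text does not spell out.

```c
#define _GNU_SOURCE		/* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/loop.h>

/*
 * Hypothetical helper: attach backing_path to a free loop device using
 * direct I/O. Returns an open fd for the loop device, or -1 on error.
 */
int setup_loop_direct(const char *backing_path, unsigned long block_size)
{
	char dev[32];
	int backing_fd, ctl_fd, loop_fd, nr;

	/* Open the backing file with O_DIRECT, as discussed above. */
	backing_fd = open(backing_path, O_RDWR | O_DIRECT | O_CLOEXEC);
	if (backing_fd < 0)
		return -1;

	/* Ask the loop control device for a free loop device number. */
	ctl_fd = open("/dev/loop-control", O_RDWR | O_CLOEXEC);
	if (ctl_fd < 0)
		goto err_backing;
	nr = ioctl(ctl_fd, LOOP_CTL_GET_FREE);
	close(ctl_fd);
	if (nr < 0)
		goto err_backing;

	snprintf(dev, sizeof(dev), "/dev/loop%d", nr);
	loop_fd = open(dev, O_RDWR | O_CLOEXEC);
	if (loop_fd < 0)
		goto err_backing;

	/* Bind the backing file to the loop device. */
	if (ioctl(loop_fd, LOOP_SET_FD, backing_fd) < 0)
		goto err_loop;

	/*
	 * Assumption: these are the two extra calls the proposal would make
	 * unnecessary when the backing file is opened with O_DIRECT.
	 */
	if (ioctl(loop_fd, LOOP_SET_BLOCK_SIZE, block_size) < 0 ||
	    ioctl(loop_fd, LOOP_SET_DIRECT_IO, 1UL) < 0) {
		ioctl(loop_fd, LOOP_CLR_FD, 0);	/* undo the binding */
		goto err_loop;
	}

	/* The loop device holds its own reference to the backing file. */
	close(backing_fd);
	return loop_fd;

err_loop:
	close(loop_fd);
err_backing:
	close(backing_fd);
	return -1;
}
```

A caller would pass the backing image path and the desired logical block size, e.g. setup_loop_direct("/path/to/image", 4096), and needs sufficient privileges to open /dev/loop-control. With the change sketched in the first diff, the expectation is that LOOP_SET_FD alone would leave the device with a matching block size and direct I/O enabled.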