On 1/25/21 12:29 PM, Dongsheng Yang wrote: > commit ad0d9e76(bcache: use bio op accessors) makes the bi_opf > modified by bio_set_op_attrs(). But there is a logical > problem in this commit: > > trace_bcache_cache_insert(k); > bch_keylist_push(&op->insert_keys); > > - n->bi_rw |= REQ_WRITE; > + bio_set_op_attrs(n, REQ_OP_WRITE, 0); > bch_submit_bbio(n, op->c, k, 0); > } while (n != bio); > > The old code add REQ_WRITE into bio n and keep other flags; the > new code set REQ_OP_WRITE to bi_opf, but reset all other flags. > > This problem is discoverd in our performance testing: > (1) start a fio with 1M x 128depth for read in /dev/nvme0n1p1 > (2) start a fio with 1M x 128depth for write in /dev/escache0 (cache > device is /dev/nvme0n1p2) > > We found the BW of reading is 2000+M/s, but the BW of writing is > 0-100M/s. After some debugging, we found the problem is io submit in > writting is very slow. > > bch_data_insert_start() insert a bio to /dev/nvme0n1p1, but as > cached_dev submit stack bio will be added into current->bio_list, and > return.Then __submit_bio_noacct() will submit the new bio in bio_list > into /dev/nvme0n1p1. This operation would be slow in > blk_mq_submit_bio() -> rq_qos_throttle(q, bio); > > The rq_qos_throttle() will call wbt_should_throttle(), > static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio) > { > switch (bio_op(bio)) { > case REQ_OP_WRITE: > /* > * Don't throttle WRITE_ODIRECT > */ > if ((bio->bi_opf & (REQ_SYNC | REQ_IDLE)) == > (REQ_SYNC | REQ_IDLE)) > return false; > ... ... > } > > As the bio_set_op_attrs() reset the (REQ_SYNC | REQ_IDLE), so this write > bio will be considered as non-direct write. > > After this fix, bio to nvme will flaged as (REQ_SYNC | REQ_IDLE), > then fio for writing will get about 1000M/s bandwidth. > > Fixes: ad0d9e76a4124708dddd00c04fc4b56fc86c02d6 It should be, Fixes: ad0d9e76a412 ("bcache: use bio op accessors") > Signed-off-by: Dongsheng Yang <dongsheng.yang@xxxxxxxxxxxx> Please CC the fixed patch author Mike Christie <mchristi@xxxxxxxxxx>. > --- > drivers/md/bcache/request.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c > index c7cadaafa947..eb734f7ddaac 100644 > --- a/drivers/md/bcache/request.c > +++ b/drivers/md/bcache/request.c > @@ -244,7 +244,7 @@ static void bch_data_insert_start(struct closure *cl) > trace_bcache_cache_insert(k); > bch_keylist_push(&op->insert_keys); > > - bio_set_op_attrs(n, REQ_OP_WRITE, 0); > + n->bi_opf |= REQ_OP_WRITE; > bch_submit_bbio(n, op->c, k, 0); > } while (n != bio); The fix is OK to me, I'd like to see opinion from Mike Christie too. Thanks for the catch. Coly Li