On Tue, 3 Dec 2019, Coly Li wrote:
> On 2019/12/3 3:34 AM, Eric Wheeler wrote:
> > On Mon, 2 Dec 2019, Coly Li wrote:
> >> On 2019/12/2 6:24 PM, kungf wrote:
> >>> Data may be lost in the following scenario in writeback mode:
> >>> 1. client writes data1 to bcache
> >>> 2. client calls fdatasync
> >>> 3. bcache flushes the cache set and the backing device
> >>>    If data1 has not yet been written back to the backing device,
> >>>    it is only guaranteed safe in the cache.
> >>> 4. the cache then writes data1 back to the backing device with
> >>>    only REQ_OP_WRITE
> >>> So data1 is not guaranteed to be in non-volatile storage; it may
> >>> be lost on power interruption.
> >>>
> >>
> >> Hi,
> >>
> >> Do you encounter such a problem in a real workload? With the bcache
> >> journal, I don't see the possibility of data loss from your description.
> >>
> >> Correct me if I am wrong.
> >>
> >> Coly Li
> >
> > If this does become necessary, then we should have a sysfs or superblock
> > flag to disable FUA for those with RAID BBUs.
>
> Hi Eric,
>
> I doubt it is necessary to add the FUA tag to all writeback bios.
> If a power failure happens after dirty data is written to the
> backing device and the bkey turns clean, a following read request
> will go to the cache device, because the LBA can be indexed whether
> it is dirty or clean. Unless the bkey is invalidated from the B+tree,
> a read will always go to the cache device first in writeback mode.
> If a power failure happens before the cached bkey turns from dirty
> to clean, the only cost is an extra writeback bio flushed from the
> cache device to the backing device with identical data. Compared
> with adding the FUA tag to all writeback bios (which will be really
> slow), the extra writeback IOs after a power failure are more
> acceptable to me.

I agree. FWIW, I just learned about /sys/block/sdX/queue/write_cache from
Nikos Tsironis <ntsironis@xxxxxxxxxxx>.

Thus, my flag request for a FUA bypass isn't necessary anyway, even if you
did want an FUA there, because FUAs are stripped when a blockdev is set to
"write through" (QUEUE_FLAG_WC not set).

----------------------------------------------------------------------
This happens in generic_make_request_checks():

	/*
	 * Filter flush bio's early so that make_request based
	 * drivers without flush support don't have to worry
	 * about them.
	 */
	if (op_is_flush(bio->bi_opf) &&
	    !test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
		bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
		if (!nr_sectors) {
			status = BLK_STS_OK;
			goto end_io;
		}
	}
----------------------------------------------------------------------

-Eric

>
> Coly Li
>
> >
> >>> Signed-off-by: kungf <wings.wyang@xxxxxxxxx>
> >>> ---
> >>>  drivers/md/bcache/writeback.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> >>> index 4a40f9eadeaf..e5cecb60569e 100644
> >>> --- a/drivers/md/bcache/writeback.c
> >>> +++ b/drivers/md/bcache/writeback.c
> >>> @@ -357,7 +357,7 @@ static void write_dirty(struct closure *cl)
> >>>  	 */
> >>>  	if (KEY_DIRTY(&w->key)) {
> >>>  		dirty_init(w);
> >>> -		bio_set_op_attrs(&io->bio, REQ_OP_WRITE, 0);
> >>> +		bio_set_op_attrs(&io->bio, REQ_OP_WRITE | REQ_FUA, 0);
> >>>  		io->bio.bi_iter.bi_sector = KEY_START(&w->key);
> >>>  		bio_set_dev(&io->bio, io->dc->bdev);
> >>>  		io->bio.bi_end_io = dirty_endio;
> >>>
> >>
>
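
For anyone following along, here is a rough user-space model of the filter
quoted from generic_make_request_checks(). It is only a sketch, not kernel
code: the MODEL_* flag values and the model_* function names are made up
for illustration, and the zero-sector early-completion path is left out.
It shows that a FUA bit set by writeback only reaches the driver when the
queue has QUEUE_FLAG_WC set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's REQ_* bits; values are illustrative. */
#define MODEL_REQ_FUA      (1u << 0)
#define MODEL_REQ_PREFLUSH (1u << 1)
#define MODEL_FLUSH_MASK   (MODEL_REQ_FUA | MODEL_REQ_PREFLUSH)

/*
 * Model of the quoted check: if the bio carries flush/FUA bits and the
 * queue does not have the write-cache flag set (queue_wc == false),
 * both bits are cleared before the driver sees the bio.
 */
static unsigned int model_filter_flush(unsigned int opf, bool queue_wc)
{
	if ((opf & MODEL_FLUSH_MASK) && !queue_wc)
		opf &= ~MODEL_FLUSH_MASK;
	return opf;
}

/* Small self-check covering both write_cache settings. */
static void model_self_check(void)
{
	/* QUEUE_FLAG_WC clear ("write through"): FUA is stripped. */
	assert((model_filter_flush(MODEL_REQ_FUA, false) & MODEL_REQ_FUA) == 0);
	/* QUEUE_FLAG_WC set ("write back"): FUA is passed through. */
	assert((model_filter_flush(MODEL_REQ_FUA, true) & MODEL_REQ_FUA) != 0);
}
```

Calling model_self_check() from a test main() exercises both cases; on a
real system the corresponding state is whatever
/sys/block/sdX/queue/write_cache reports for the backing device.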