On Thu, 5 Dec 2019, Nikos Tsironis wrote:
> On 12/4/19 10:17 PM, Mike Snitzer wrote:
> > On Wed, Dec 04 2019 at 2:58pm -0500,
> > Eric Wheeler <dm-devel@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Wed, 4 Dec 2019, Nikos Tsironis wrote:
> > >
> > > > The thin provisioning target maintains per thin device mappings that map
> > > > virtual blocks to data blocks in the data device.
> > > >
> > > > When we write to a shared block, in case of internal snapshots, or
> > > > provision a new block, in case of external snapshots, we copy the shared
> > > > block to a new data block (COW), update the mapping for the relevant
> > > > virtual block and then issue the write to the new data block.
> > > >
> > > > Suppose the data device has a volatile write-back cache and the
> > > > following sequence of events occur:
> > >
> > > For those with NV caches, can the data disk flush be optional (maybe as a
> > > table flag)?
> >
> > IIRC block core should avoid issuing the flush if not needed.  I'll have
> > a closer look to verify as much.
> >
>
> For devices without a volatile write-back cache block core strips off
> the REQ_PREFLUSH and REQ_FUA bits from requests with a payload and
> completes empty REQ_PREFLUSH requests before entering the driver.
>
> This happens in generic_make_request_checks():
>
>         /*
>          * Filter flush bio's early so that make_request based
>          * drivers without flush support don't have to worry
>          * about them.
>          */
>         if (op_is_flush(bio->bi_opf) &&
>             !test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
>                 bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
>                 if (!nr_sectors) {
>                         status = BLK_STS_OK;
>                         goto end_io;
>                 }
>         }
>
> If I am not mistaken, it all depends on whether the underlying device
> reports the existence of a write back cache or not.
>
> You could check this by looking at /sys/block/<device>/queue/write_cache
> If it says "write back" then flushes will be issued.
>
> In case the sysfs entry reports a "write back" cache for a device with a
> non-volatile write cache, I think you can change the kernel's view of
> the device by writing to this entry (you could also create a udev rule
> for this).
>
> This way you can set the write cache as write through. This will
> eliminate the cache flushes issued by the kernel, without altering the
> device state (Documentation/block/queue-sysfs.rst).

Interesting, I'll remember that. I think this is a documentation bug;
isn't this backwards?

        'This means that it might not be safe to toggle the setting from
        "write back" to "write through", since that will also eliminate
        cache flushes issued by the kernel.'
        [https://www.kernel.org/doc/Documentation/block/queue-sysfs.rst]

How does this work with stacking blockdevs?  Does it inherit from the
lower-level dev?  If an upper-level is misconfigured, would a writeback at
higher levels clear the flush for lower levels?

--
Eric Wheeler

> Nikos
>
> > Mike
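
For reference, the sysfs check and toggle discussed above could be scripted roughly as follows. This is only a sketch in Python, not anything from the kernel tree: the helper names (read_write_cache, flushes_issued, set_write_through) and the sysfs_root parameter are made up for illustration. The only things taken from the thread and from queue-sysfs.rst are the /sys/block/<device>/queue/write_cache path and the "write back" / "write through" strings. Writing the entry on a real system needs root, and whether it is safe depends on the device actually having a non-volatile cache.

```python
from pathlib import Path


def _entry(device: str, sysfs_root: str) -> Path:
    # Path of the write_cache attribute for a block device (hypothetical helper).
    return Path(sysfs_root) / device / "queue" / "write_cache"


def read_write_cache(device: str, sysfs_root: str = "/sys/block") -> str:
    """Return the kernel's current view of the device write cache."""
    return _entry(device, sysfs_root).read_text().strip()


def flushes_issued(mode: str) -> bool:
    """Per the thread: flushes are passed down only if the queue
    reports a write-back cache; otherwise block core filters them."""
    return mode == "write back"


def set_write_through(device: str, sysfs_root: str = "/sys/block") -> None:
    """Tell the kernel to treat the cache as write through.
    This changes the kernel's view only, not the device state."""
    _entry(device, sysfs_root).write_text("write through\n")
```

For example, flushes_issued(read_write_cache("sda")) reports whether the kernel would send flushes to a device named sda (the device name is just a placeholder).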