Re: Questions on block drivers, REQ_FLUSH and REQ_FUA

Tejun,

--On 25 May 2011 10:59:50 +0200 Tejun Heo <tj@xxxxxxxxxx> wrote:

> Yeap, that's correct.  Ordering between flush and other writes is now
> completely the responsibility of filesystems.  Block layer just
> doesn't care.
> ...
> A FLUSH command means "flush out all data from writes up to this
> point".  If a driver has indicated completion of a write and then
> received a FLUSH, the data from the write should be written to disk.

So to be clear:

a) If I do not complete a write command, I may avoid writing it to disk
  indefinitely (despite completing subsequently received FLUSH
  commands). The only writes that I am obliged to flush to disk
  are those whose completion I have actually reported to the block layer.

b) If I receive a flush command and, prior to completing that flush
  command, I receive subsequent write commands, I may execute
  (and, if I like, write to disk) write commands received AFTER that
  flush command. I presume that if the subsequent write commands write to
  blocks that I am meant to be flushing, I can just forget about
  the old contents of those blocks (because they would be
  overwritten), provided *something* has overwritten what was there before.
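
As a rough illustration of (a) and (b), here is a userspace toy model in Python. It is not kernel code, and every name in it (ToyDriver, receive_write, complete_flush and so on) is hypothetical. It assumes the driver never writes to disk except when completing a flush, and it conservatively writes the snapshot taken at flush reception, so it does not model the "forget the old contents" shortcut presumed in (b).

```python
class ToyDriver:
    """Userspace toy model of a block driver with a volatile write cache."""

    def __init__(self):
        self.pending = {}      # tag -> (block, data): received, not yet completed
        self.cache = {}        # block -> data: completed, held in volatile cache
        self.disk = {}         # block -> data: non-volatile storage
        self.flush_set = None  # data the in-flight flush must make durable

    def receive_write(self, tag, block, data):
        self.pending[tag] = (block, data)

    def complete_write(self, tag):
        # Completion only promises the data is in the (volatile) cache.
        block, data = self.pending.pop(tag)
        self.cache[block] = data

    def receive_flush(self):
        # Snapshot only the writes already completed at reception time.
        self.flush_set = dict(self.cache)

    def complete_flush(self):
        for block, data in self.flush_set.items():
            self.disk[block] = data
            # Drop the cache entry unless a later write replaced it.
            if self.cache.get(block) == data:
                del self.cache[block]
        self.flush_set = None


d = ToyDriver()
d.receive_write("w1", 1, "A")
d.complete_write("w1")          # completed before the flush arrives
d.receive_write("w2", 2, "B")   # received but never completed: case (a)
d.receive_flush()
d.receive_write("w3", 3, "C")   # arrives while the flush is in flight: case (b)
d.complete_write("w3")          # allowed to complete before the flush does
d.complete_flush()

print(d.disk)     # only the write completed before the flush had to reach disk
print(d.pending)  # the uncompleted write is still pending
print(d.cache)    # the later write is completed but not necessarily durable
```

In this sketch the flush only has to cover w1; w2 can stay pending indefinitely, and w3 is free to complete ahead of the flush without becoming durable.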


If my understanding is correct, then for future readers of the archive
(perhaps I should put this list in Documentation/ ?) the semantics are
something like:

1. Block drivers may handle requests received in any order, and may
  issue completions in any order, subject only to the rules below.

2. If a read covering a given block X is received after one or more writes
  for that block, then irrespective of the order in which the read
  and write(s) are handled/completed, the read shall return the
  value written by the immediately preceding write to that block.

  Therefore whilst the following is legal...

       Driver receives                     Driver replies

       WRITE BLOCK 1 = X
                                           WRITE BLOCK 1 COMPLETED
       .... time passes ...
       READ BLOCK 1
       WRITE BLOCK 1 = Y
                                           WRITE BLOCK 1 COMPLETED
                                           READ BLOCK 1 COMPLETED

  ...the read from block 1 should return X and not Y, even if it was
  handled by the driver after the write.

3. If a flush request is received, then before completing it (and,
  in the case of a make_request_function driver, before initiating
  any attached write) the driver MUST have written to non-volatile
  storage any writes which were COMPLETED prior to the reception
  of the flush. This does not affect any writes received, but
  not completed, prior to the flush, nor does it prevent a block driver
  from completing subsequently issued writes before completion of the
  flush. I.e. the flush does not act as a barrier; it merely ensures that,
  on completion of the flush, for every block written by a write
  completed prior to the flush, non-volatile storage contains either
  that data or data written to the same block by a command issued
  subsequent to the flush but completed prior to the flush's completion.

4. Requests marked FUA should be written to non-volatile storage prior
  to completion, but impose no restrictions on ordering.
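
Rules 2 and 4 can be sketched with a small userspace model in Python. This is only an illustration under stated assumptions, not kernel code: ToyDevice and its methods are hypothetical names, requests are stamped with a sequence number on reception, and the flush is modelled as a single synchronous step for brevity.

```python
class ToyDevice:
    """Toy model: reception order is tracked, handling order is free."""

    def __init__(self):
        self.clock = 0    # reception counter
        self.writes = {}  # seq -> (block, data, fua) for every received write
        self.cache = {}   # block -> (seq, data): completed, volatile
        self.disk = {}    # block -> (seq, data): non-volatile

    def receive_write(self, block, data, fua=False):
        self.clock += 1
        self.writes[self.clock] = (block, data, fua)
        return self.clock

    def receive_read(self, block):
        self.clock += 1
        return (self.clock, block)

    def handle_read(self, read):
        # Rule 2: return the data of the write that immediately preceded
        # the read in reception order, whenever the read is handled.
        read_seq, block = read
        earlier = [s for s, (b, _, _) in self.writes.items()
                   if b == block and s < read_seq]
        if not earlier:
            stored = self.disk.get(block)
            return stored[1] if stored else None
        return self.writes[max(earlier)][1]

    @staticmethod
    def _store(where, block, seq, data):
        # Never let an older write clobber a newer one.
        if block not in where or where[block][0] < seq:
            where[block] = (seq, data)

    def complete_write(self, seq):
        block, data, fua = self.writes[seq]
        self._store(self.cache, block, seq, data)
        if fua:
            # Rule 4: FUA data is on disk before completion; nothing else is.
            self._store(self.disk, block, seq, data)

    def flush(self):
        for block, (seq, data) in self.cache.items():
            self._store(self.disk, block, seq, data)

    def disk_contents(self):
        return {block: data for block, (_, data) in self.disk.items()}


dev = ToyDevice()
w_x = dev.receive_write(1, "X")
dev.complete_write(w_x)
r = dev.receive_read(1)
w_y = dev.receive_write(1, "Y")
dev.complete_write(w_y)          # handled before the read
value = dev.handle_read(r)       # still sees X, per rule 2

w_a = dev.receive_write(2, "A")
dev.complete_write(w_a)          # completed, cache only
w_b = dev.receive_write(3, "B", fua=True)
dev.complete_write(w_b)          # FUA: durable at completion
disk_after_fua = dev.disk_contents()
dev.flush()
disk_after_flush = dev.disk_contents()
```

The first half replays the timeline under rule 2: the read is handled after the write of Y yet returns X. The second half shows that the FUA write reaches disk on its own, without dragging the earlier cached writes along; those become durable only at the flush.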

--
Alex Bligh