Re: Questions on block drivers, REQ_FLUSH and REQ_FUA

On Tue, May 24, 2011 at 10:29:09PM +0100, Alex Bligh wrote:

[..]
> Q3: Apparently there are no longer concepts of barriers, just REQ_FLUSH
> and REQ_FUA. REQ_FLUSH guarantees all "completed" I/O requests are written
> to disk prior to that BIO starting. However, what about non-completed I/O
> requests? For instance, is the following legitimate:
> 
>        Receive        Send to disk         Reply
>        =======        ============         =====
>        WRITE1
>        WRITE2
>                                            WRITE2 (cached)
>        FLUSH+WRITE3
>                       WRITE2
>                       WRITE3
>                                            WRITE3
>        WRITE4
>                       WRITE4
>                                            WRITE4
>                       WRITE1
>                                            WRITE1
> 
> Here WRITE1 was not 'completed', and thus by the text of
> Documentation/writeback_cache_control.txt, need not be written to disk
> before starting WRITE3 (which had REQ_FLUSH attached).
> 
> >The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from
> >the filesystem and will make sure the volatile cache of the storage device
> >has been flushed before the actual I/O operation is started.  This
> >explicitly guarantees that previously completed write requests are on
> >non-volatile storage before the flagged bio starts.
> 
> I presume this is illegal and is a documentation issue.

I know very little about flush semantics, but I will still try to answer
two of your questions.

CCing Tejun.

Tejun, please correct me if I got this wrong.

I think the documentation is fine. It specifically talks about completed
requests, i.e. requests which have been sent to the drive (and may still
be in the controller's cache).

So in the above example, if the driver holds back WRITE1 and never signals
completion of that request, then I think it is fine to complete
FLUSH+WRITE3 ahead of WRITE1.

I think an issue will arise only if you signaled that WRITE1 had completed
while it was merely cached in the driver (as you seem to be indicating) and
not yet sent to the drive, and you then received the FLUSH+WRITE3 request.
In that case you have to make sure that by the time completion of
FLUSH+WRITE3 is signaled, WRITE1 is on the disk.
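
To make the rule concrete, here is a minimal sketch as a toy Python model.
It is not kernel code, and the names (ToyDriver, receive, complete_cached,
flush_write) are made up for illustration. It assumes the flush only owes
durability to writes whose completion was already signaled, and that the
flagged write itself may stay in the volatile cache.

```python
class ToyDriver:
    """Toy model of a driver with a volatile write-back cache (hypothetical)."""

    def __init__(self):
        self.held = {}    # received, completion NOT yet signaled
        self.cached = {}  # completion signaled, data only in volatile cache
        self.disk = {}    # data on non-volatile storage

    def receive(self, name, data):
        # Accepted but not acknowledged: a later flush owes it nothing.
        self.held[name] = data

    def complete_cached(self, name):
        # Signal completion while the data sits only in the volatile cache.
        self.cached[name] = self.held.pop(name)

    def flush_write(self, name, data):
        # Flush first: every already-completed write must reach the disk
        # before the flagged write is started.
        self.disk.update(self.cached)
        self.cached.clear()
        # The flagged write is then an ordinary write in this model; it is
        # completed, but may itself still live in the volatile cache.
        self.cached[name] = data


# Replay of the Q3 situation: WRITE1 is held back, WRITE2 is completed
# from cache, then FLUSH+WRITE3 arrives.
drv = ToyDriver()
drv.receive("WRITE1", b"one")
drv.receive("WRITE2", b"two")
drv.complete_cached("WRITE2")
drv.flush_write("WRITE3", b"three")
print(sorted(drv.disk))  # WRITE2 had to be flushed; WRITE1 did not
print(sorted(drv.held))  # WRITE1 is still pending and unacknowledged
```

If complete_cached("WRITE1") had been called before flush_write(), the
same model would push WRITE1 to disk as part of the flush, which is the
case described above.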
 
> 
> Q4. Can I reorder forwards write requests across flushes? IE, can I do
> this:
> 
>        Receive        Send to disk         Reply
>        =======        ============         =====
>        WRITE1
>                                            WRITE2 (cached)
>        WRITE2
>                                            WRITE2 (cached)
>        FLUSH+WRITE3
>        WRITE4
>                       WRITE4
>                                            WRITE4
>                       WRITE2
>                       WRITE3
>                                            WRITE3
> 
> Again this does not appear to be illegal, as the FLUSH operation is
> not defined as a barrier, meaning it should in theory be possible
> to handle (and write to disk) requests received after the
> FLUSH request before the FLUSH request finishes, provided that the
> commands received before the FLUSH request itself complete before
> the FLUSH request is replied to. I really don't know what the answer
> is to this one. It makes a big difference to me as I can write multiple
> blocks in parallel, and would really rather not slow up future write
> requests until everything is flushed unless I need to.

IIUC, you are right. You can finish WRITE4 before completing FLUSH+WRITE3
here.

We just need to make sure that any request completed by the driver
is on disk by the time FLUSH+WRITE3 completes.
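
As a rough illustration, the check below replays a reduced trace modeled
on Q4 (WRITE1 is left out) in Python. The function name and the event
tuples are hypothetical. It assumes the documentation's wording: only
writes that were replied to before the flagged request was received must
be on disk by the time that request is replied to.

```python
def flush_rule_holds(trace, flush_requests):
    """Return True if every write replied to before a flush-flagged request
    was received has been sent to disk by the time that request is replied to.

    trace: list of (kind, name), kind in {"receive", "disk", "reply"}.
    flush_requests: names of the requests carrying the flush flag.
    """
    on_disk = set()
    replied = set()
    owed = {}  # flush request -> writes completed before it was received
    for kind, name in trace:
        if kind == "receive" and name in flush_requests:
            owed[name] = set(replied)
        elif kind == "disk":
            on_disk.add(name)
        elif kind == "reply":
            if name in owed and not owed[name] <= on_disk:
                return False  # flush completed with owed data not on disk
            replied.add(name)
    return True


# WRITE4 overtakes FLUSH+WRITE3, but WRITE2 is on disk before WRITE3's reply.
reordered = [
    ("receive", "WRITE2"), ("reply", "WRITE2"),   # completed from cache
    ("receive", "WRITE3"),                        # FLUSH+WRITE3
    ("receive", "WRITE4"),
    ("disk", "WRITE4"), ("reply", "WRITE4"),
    ("disk", "WRITE2"), ("disk", "WRITE3"),
    ("reply", "WRITE3"),
]
print(flush_rule_holds(reordered, {"WRITE3"}))
```

Moving ("disk", "WRITE2") after ("reply", "WRITE3") makes the check fail,
since a completed write would then not be on disk when the flush
completes.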

Thanks
Vivek