Re: O_DIRECT and barriers

On Sat, Aug 22, 2009 at 01:50:06AM +0100, Jamie Lokier wrote:
> Oh, I agree with that.  That comes from observing that quasi-portable
> code using O_DIRECT needs to use O_DSYNC too because several OSes and
> filesystems on those OSes revert to buffered writes under some
> circumstances, in which case you want O_DSYNC too.  That has nothing
> to do with hardware caches, but it's a lucky coincidence that
> fdatasync() would form a nice barrier function, and O_DIRECT|O_DSYNC
> would then make sense as an FUA equivalent.

I agree.  I am, however, worried about everything that uses O_DIRECT
today.  Less so about databases and HPC workloads on expensive
hardware, because they usually run on vendor-approved SCSI disks that
have the write-back cache disabled, and more about things like
virtualization software and other applications that run on commodity
hardware.

Then again, they already don't get what they expect and never did,
so if we clearly document and communicate the O_SYNC (that is, Linux
O_SYNC) requirement we might be able to go with this.
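
For illustration, here is a minimal sketch of what that requirement would
look like from the application side: open with O_DIRECT | O_SYNC and issue
an aligned write.  The helper name direct_sync_write and the 4096-byte
alignment are assumptions made for the example; the real alignment
requirement depends on the device and filesystem, and whether the write
actually gets past a volatile drive cache depends on the kernel,
filesystem and hardware, which is exactly what this thread is about.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumed alignment for this sketch; the real requirement depends on the
 * device and filesystem (often the logical block size). */
#define DIO_ALIGN 4096

/*
 * Hypothetical helper: write len bytes of the value fill to path at
 * offset 0 using O_DIRECT | O_SYNC.
 * Returns 0 on success or a negative errno value on failure.
 */
static int direct_sync_write(const char *path, int fill, size_t len)
{
    void *buf = NULL;
    ssize_t n;
    int fd, rc;

    assert(path != NULL);

    /* O_DIRECT generally needs an aligned buffer, length and offset. */
    if (len == 0 || len % DIO_ALIGN != 0)
        return -EINVAL;

    rc = posix_memalign(&buf, DIO_ALIGN, len);
    if (rc != 0)
        return -rc;
    memset(buf, fill, len);

    /* Some filesystems reject O_DIRECT; open() then typically fails
     * with EINVAL, which is passed back to the caller. */
    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT | O_SYNC, 0600);
    if (fd < 0) {
        rc = -errno;
        free(buf);
        return rc;
    }

    n = pwrite(fd, buf, len, 0);
    if (n < 0)
        rc = -errno;
    else if ((size_t)n != len)
        rc = -EIO;             /* short write treated as an error here */
    else
        rc = 0;

    if (close(fd) < 0 && rc == 0)
        rc = -errno;
    free(buf);
    return rc;
}
```

A caller would use something like direct_sync_write("data.bin", 0, 4096)
and treat any negative return value as "not written".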

> Perhaps in the same way that fsync/fdatasync aren't clear on disk
> cache behaviour either.  On Linux and some other OSes.

The disk write cache really is an implementation detail; it has no
business in POSIX.

POSIX seems to define the semantics for fdatasync and co. relatively
well (that is, if you like the specification-speak in there):

"The fdatasync() function forces all currently queued I/O operations
 associated with the file indicated by file descriptor fildes to the
 synchronised I/O completion state."

"synchronised I/O data integrity completion

 o For read, when the operation has been completed or diagnosed if
   unsuccessful. The read is complete only when an image of the data has
   been successfully transferred to the requesting process. If there were
   any pending write requests affecting the data to be read at the time
   that the synchronised read operation was requested, these write
   requests shall be successfully transferred prior to reading the
   data.
 o For write, when the operation has been completed or diagnosed if
   unsuccessful. The write is complete only when the data specified in the
   write request is successfully transferred and all file system
   information required to retrieve the data is successfully transferred."

Given that it requires the data to be retrievable, a volatile cache does
not seem to meet the above criteria.  But yeah, the language is horrible.
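
To make the buffered-write case concrete, here is a minimal sketch of the
pattern that definition is meant to cover: write the data, then call
fdatasync() and only treat the data as retrievable once it returns
successfully.  The helper name append_durably is made up for the example,
and whether fdatasync() also flushes a volatile drive cache on a given
kernel and filesystem is an assumption you have to verify, not something
this code can guarantee.

```c
#define _POSIX_C_SOURCE 200809L   /* for fdatasync() under strict modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: append len bytes to path, then request
 * synchronized I/O data integrity completion with fdatasync().
 * Returns 0 on success or a negative errno value on failure.
 */
static int append_durably(const char *path, const void *data, size_t len)
{
    const char *p = data;
    int fd, rc = 0;

    assert(path != NULL && data != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -errno;

    /* write() may be short or interrupted, so loop until done. */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -errno;
            break;
        }
        p += n;
        len -= (size_t)n;
    }

    /* The data only counts as written once fdatasync() succeeds. */
    if (rc == 0 && fdatasync(fd) < 0)
        rc = -errno;

    if (close(fd) < 0 && rc == 0)
        rc = -errno;
    return rc;
}
```

The important design point is that an error from fdatasync() is reported
to the caller like a failed write, not ignored.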

> What does IRIX do?  Does O_DIRECT on IRIX write through the drive's
> cache?  What about Solaris?

IRIX only came pre-packaged with SGI MIPS systems, which, like most of
the more expensive hardware, were not configured with write-back
caches.  That, by the way, is still the case for all the more expensive
hardware I have.  Volatile write-back caches are simply too much of a
data integrity nightmare to enable where your customers actually care
about their data.

