Re: Questions about filesystems from SQLite author presentation

On Mon, Jan 06, 2020 at 05:40:20PM +0200, Amir Goldstein wrote:
> On Mon, Jan 6, 2020 at 9:26 AM Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
> > If a write occurs on one or two bytes of a file at about the same time as a power
> > loss, are other bytes of the file guaranteed to be unchanged after reboot?
> > Or might some other bytes within the same sector have been modified as well?
> 
> I don't see how other bytes could change in this scenario, but I don't
> know if the hardware provides this guarantee. Maybe someone else knows
> the answer.

The question is nonsense because there is no way to write less than one
sector to a hardware device, by definition.  So, treating this question
as being a read-modify-write of a single sector (assuming the "two bytes"
don't cross a sector boundary):

Hardware vendors are reluctant to provide this guarantee, but it's
essential to constructing a reliable storage system.  We wrote the NVMe
spec in such a way that vendors must provide single-sector-atomicity
guarantees, and I hope they haven't managed to wiggle some nonsense
into the spec that allows them to not make that guarantee.  Below is
a quote from the NVMe 1.4 spec.  For those not versed in NVMe spec-ese,
"0's based value" means that putting a zero in this field means the
value of AWUPF is 1.

  Atomic Write Unit Power Fail (AWUPF): This field indicates the size of
  the write operation guaranteed to be written atomically to the NVM across
  all namespaces with any supported namespace format during a power fail
  or error condition.

  If a specific namespace guarantees a larger size than is reported in
  this field, then this namespace specific size is reported in the NAWUPF
  field in the Identify Namespace data structure. Refer to section 6.4.

  This field is specified in logical blocks and is a 0’s based value. The
  AWUPF value shall be less than or equal to the AWUN value.

  If a write command is submitted with size less than or equal to the
  AWUPF value, the host is guaranteed that the write is atomic to the
  NVM with respect to other read or write commands. If a write command
  is submitted that is greater than this size, there is no guarantee of
  command atomicity. If the write size is less than or equal to the AWUPF
  value and the write command fails, then subsequent read commands for the
  associated logical blocks shall return data from the previous successful
  write command. If a write command is submitted with size greater than
  the AWUPF value, then there is no guarantee of data returned on
  subsequent reads of the associated logical blocks.
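
To make the arithmetic concrete, here is a rough sketch (not taken from the
spec or from any driver) of how a host might decode the 0's based AWUPF
field and check whether a byte-range update falls within the guarantee.
The function names and the 512-byte block size are assumptions for
illustration only, and the sketch ignores namespace-specific values such
as NAWUPF.

```python
def awupf_blocks(raw_awupf: int) -> int:
    """Decode the 0's based AWUPF field: a raw 0 means 1 logical block."""
    if raw_awupf < 0:
        raise ValueError("raw field value cannot be negative")
    return raw_awupf + 1


def blocks_touched(offset: int, length: int, block_size: int) -> int:
    """Count the logical blocks that a byte range [offset, offset+length) spans."""
    if length <= 0 or block_size <= 0 or offset < 0:
        raise ValueError("offset must be >= 0; length and block_size must be > 0")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


def covered_by_awupf(offset: int, length: int, block_size: int, raw_awupf: int) -> bool:
    """True if the whole-block write backing this byte range fits in AWUPF.

    The device only ever sees whole logical blocks, so a 2-byte update is
    really a read-modify-write of every block the range touches.
    """
    return blocks_touched(offset, length, block_size) <= awupf_blocks(raw_awupf)


if __name__ == "__main__":
    # Hypothetical 512-byte logical blocks, AWUPF raw value 0 (one block).
    print(covered_by_awupf(510, 2, 512, 0))  # both bytes sit in block 0
    print(covered_by_awupf(511, 2, 512, 0))  # the bytes straddle blocks 0 and 1
```

With a raw AWUPF of 0, the first case is a single-block write and is covered;
the second touches two blocks, so the quoted text gives no guarantee for it.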

I take neither blame nor credit for what other storage standards may
implement; this is the only one I had a hand in, and I had to fight
hard to get it.


