Re: [libfdisk]: gpt_write_disklabel function robustness to sudden power off

On Tue, Mar 24, 2015 at 03:05:36PM +0100, Ronan CHAUVIN wrote:
>
> On 03/23/2015 07:31 PM, Peter Cordes wrote:
>> On Fri, Mar 20, 2015 at 12:18:12PM +0100, Karel Zak wrote:
>>> Conclusion: be pessimistic and verify all you read from disk and be
>>> optimistic when you write to the disk, and when someone is talking
>>> about a write guarantee, run far away. That's the whole story.
>> The whole GPT is what, 16kiB or so?  On most storage, you could
>> force data to persistent storage with a granularity of 4kiB, with
>> fdatasync(2) (assuming that works on block devices, not just files).

> The whole GPT is 16kiB (MBR+GPT header+partition array). There are two
> GPT systems, one at the beginning and another one at the end. The
> bootloader verifies the integrity of the header and the partition array
> with a CRC32.

>>   So I'd agree with Karel that the current method is probably
>> ideal.  write() everything, then fsync() so it all hits the disk in
>> one multi-sector write op.  Not necessarily atomic, but probably.
> As the blocks will not be consecutive (primary and backup), the
> operation cannot be done in one write operation....

So at least one of the four 4kiB sectors doesn't get written at all?
Because if all the sectors are getting written, regardless of order,
Linux will merge the IOs into one write request to send over the SATA
(or whatever) wire.  Write request merging is useful even on SSDs, so
Linux does it.

Even if there is a sector that doesn't get written, it's probably
still academic.  Sending a request in a single write op doesn't make
it atomic.  On a magnetic disk, the data will still probably all
hit the platter on the same rotation, with the drive just powering
down the write head as it flies over the sector you aren't writing,
so the window for a power failure to cause a problem is quite small.
I'm sure SSDs are far more complicated.
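
For concreteness, here is a minimal sketch of the "write() everything,
then flush once" pattern discussed above.  It is not the libfdisk code:
write_all_at() and write_region_and_flush() are hypothetical helper
names, and the sketch assumes fdatasync(2) behaves on the target block
device the way it does on a regular file.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not libfdisk): write the whole buffer at the
 * given offset, retrying on short writes and on EINTR. */
static int write_all_at(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0) {           /* no progress: give up, don't spin */
            errno = EIO;
            return -1;
        }
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Queue the whole region first, then ask for a single flush, so the
 * kernel is free to merge adjacent sectors into one request.  This
 * narrows the window for a torn update; it does not make it atomic. */
static int write_region_and_flush(int fd, const void *buf, size_t len,
                                  off_t off)
{
    if (write_all_at(fd, buf, len, off) != 0)
        return -1;
    return fdatasync(fd);
}
```

A caller would pass the 16kiB image and its byte offset, and treat a
non-zero return as "the on-disk state is unknown, re-read and verify".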

> I agree that we should wait for confirmation from a storage expert but
> the fsync() and sleep() combination should guarantee the operation
> order on most hardware.

Probably 1/10th of a second is long enough, but still short enough to
not be annoying.  Someone editing the partition table of a disk that
isn't idle (in which case even 1 sec might not be long enough for the
write to hit disk after fdatasync()), on a system without a UPS, is a
fairly hypothetical user.  I don't think we need to waste 0.9 seconds
of everyone's time just for them.
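
A rough sketch of the flush-then-pause ordering being discussed,
assuming two copies of the table at different offsets on the same
device.  struct copy, flush_then_pause() and write_copies_in_order()
are made-up names for illustration, which copy goes first is left to
the caller, and the pause length is a parameter (100 ms would match the
1/10th of a second above).  The pause is a heuristic for hardware that
acknowledges a flush early; it guarantees nothing.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical description of one on-disk copy of the table. */
struct copy {
    const void *buf;
    size_t len;
    off_t off;
};

/* Flush what has been written so far, then wait pause_ms milliseconds
 * before the caller touches the other copy. */
static int flush_then_pause(int fd, long pause_ms)
{
    struct timespec ts = {
        .tv_sec = pause_ms / 1000,
        .tv_nsec = (pause_ms % 1000) * 1000000L,
    };

    if (fdatasync(fd) != 0)
        return -1;
    nanosleep(&ts, NULL);   /* an interrupted sleep only shortens the pause */
    return 0;
}

/* Write and flush the first copy before starting on the second, so a
 * power cut should leave at most one of the two copies damaged.
 * A short write is treated as an error here to keep the sketch small. */
static int write_copies_in_order(int fd, const struct copy *first,
                                 const struct copy *second, long pause_ms)
{
    if (pwrite(fd, first->buf, first->len, first->off)
            != (ssize_t)first->len)
        return -1;
    if (flush_then_pause(fd, pause_ms) != 0)
        return -1;
    if (pwrite(fd, second->buf, second->len, second->off)
            != (ssize_t)second->len)
        return -1;
    return fdatasync(fd);
}
```

Since readers already have to verify the CRC32 of whatever they find,
the ordering only needs to keep one valid copy around at any moment.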


-- 
#define X(x,y) x##y
Peter Cordes ;  e-mail: X(peter@cor , des.ca)

"The gods confound the man who first found out how to distinguish the hours!
 Confound him, too, who in this place set up a sundial, to cut and hack
 my day so wretchedly into small pieces!" -- Plautus, 200 BC



