Re: [PATCH 1/2] block: Add support for atomic writes

Quoting Jeff Moyer (2013-11-07 11:14:02)
> Chris Mason <chris.mason@xxxxxxxxxxxx> writes:
> 
> >> Well, we have control over dm and md, so I'm not worried about that.
> >> For the storage vendors, we'll have to see about influencing the
> >> standards bodies.
> >> 
> >> The way I see it, there are 3 pieces of information that are required:
> >> 1) minimum size that is atomic (likely the physical block size, but
> >>    maybe the logical block size?)
> >> 2) maximum size that is atomic (multiple of minimum size)
> >> 3) whether or not discontiguous ranges are supported
> >> 
> >> Did I miss anything?
> >
> > It'll vary from vendor to vendor.  A discontig range of two 512KB areas
> > is different from 256 discontig 4KB areas.
> 
> Sure.
> 
> > And it's completely dependent on filesystem fragmentation.  So, a given
> > IO might pass for one file and fail for the next.
> 
> Worse, it could pass for one region of a file and fail for a different
> region of the same file.
> 
> I guess you could export the most conservative estimate, based on
> completely non-contiguous smallest sized segments.  Things larger may
> work, but they may not.  Perhaps this would be too limiting, I don't
> know.

Depends on the workload.  For mysql, they really only need ~16KB in the
default config.  I'd rather not restrict things to that one case, but
it's pretty easy to satisfy.
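
To make the three limits above concrete, here is a rough sketch of how a
caller might check a request against them, and how the "most conservative
estimate" could be derived.  The struct, its field names, and the idea of
expressing discontig support as a segment count are assumptions for
illustration only; none of this is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device limits; names are illustrative only. */
struct atomic_limits {
	size_t min_bytes;	/* smallest unit that is written atomically */
	size_t max_bytes;	/* largest atomic write, a multiple of min_bytes */
	size_t max_segments;	/* 1 means discontiguous ranges are unsupported */
};

/* Would a write of total_bytes spread over nr_segments ranges fit? */
static bool atomic_write_fits(const struct atomic_limits *lim,
			      size_t total_bytes, size_t nr_segments)
{
	if (lim->min_bytes == 0)
		return false;	/* device advertises no atomic support */
	if (nr_segments == 0 || nr_segments > lim->max_segments)
		return false;
	if (total_bytes < lim->min_bytes || total_bytes > lim->max_bytes)
		return false;
	/* Only whole multiples of the minimum unit can be atomic. */
	return total_bytes % lim->min_bytes == 0;
}

/*
 * Conservative size: assume every segment is a single minimum-sized
 * unit, i.e. the file is as fragmented as it can possibly be.
 */
static size_t atomic_conservative_bytes(const struct atomic_limits *lim)
{
	size_t worst = lim->min_bytes * lim->max_segments;

	return worst < lim->max_bytes ? worst : lim->max_bytes;
}
```

With assumed limits of a 4KB minimum, a 1MB maximum and 4 segments, the
conservative size works out to 16KB, so the mysql case would fit even on
a fully fragmented file, while anything larger may or may not succeed.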

> 
> > In a DM/MD configuration, an atomic IO inside a single stripe on raid0
> > could succeed while it will fail if it spans two stripes to two
> > different devices.
> 
> I'd say that if you are spanning multiple devices, you don't support
> O_ATOMIC.  You could write a specific dm target that allows it, but I
> don't think it's a priority to support it in the way your example does.
> 
> Given that there are applications using your implementation, what did
> they determine was a sane way to do things?  Only access the block
> device?  Preallocate files?  Fallback to non-atomic writes + fsync?
> Something else?

Admin opt-in on single drives only.  mysql exits with errors if the
atomics aren't supported.

-chris
