Re: [PATCH v2 0/7] large atomic writes for xfs

Yeah, at the low end, it may make sense to do the 512B write via DIO. But
OTOH sync'ing many redo log FS blocks at once at the high end can be more
efficient.

From what I have heard, this was attempted before (using DIO) by some
vendor, but did not come to much.

So it seems that we are stuck with this redo log limitation.

Let me know if you have any other ideas to avoid large atomic writes...

From the description it sounds like the redo log consists of 512B blocks
that describe small changes to the 16k table file pages.  If they're
issuing 16k atomic writes to get each of those 512B redo log records to
disk it's no wonder that cranks up the overhead substantially.

They are not issuing the redo log atomically. They do 512B buffered writes and then periodically fsync.
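
To make that access pattern concrete, here is a minimal sketch of 512B buffered appends with a periodic fsync. It is only an illustration of the pattern described above, not MySQL's actual code: the function name append_redo_records and the sync_every batching interval are hypothetical.

```python
import os

RECORD_SIZE = 512  # redo log record size mentioned in this thread


def append_redo_records(path, records, sync_every=8):
    """Append fixed-size records with buffered writes and fsync periodically.

    sync_every is a hypothetical batching interval, not a MySQL setting.
    Returns the number of fsync calls issued.
    """
    syncs = 0
    pending = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for rec in records:
            if len(rec) != RECORD_SIZE:
                raise ValueError("record must be exactly 512 bytes")
            # Buffered write: the data lands in the page cache, not on disk.
            if os.write(fd, rec) != RECORD_SIZE:
                raise OSError("short write")
            pending += 1
            if pending == sync_every:
                os.fsync(fd)  # one sync makes the whole batch durable
                syncs += 1
                pending = 0
        if pending:
            os.fsync(fd)  # flush the final partial batch
            syncs += 1
    finally:
        os.close(fd)
    return syncs
```

Note that no write in this sketch is larger than a single 512B record; durability comes from the fsync, which covers however many FS blocks were dirtied since the previous one.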

Also, replaying those tiny updates through the pagecache beats issuing a
bunch of tiny nonlocalized writes.

For the first case I don't know why they need atomic writes -- 512B redo
log records can't be torn because they're single-sector writes.  The
second case might be better done with exchange-range.


As for exchange-range, that would very much pre-date any MySQL port. Furthermore, I can't imagine that exchange-range support is portable to other FSes, which is probably quite important. Anyway, they are not issuing the redo log atomically, so I don't know if mentioning exchange-range is relevant.

Regardless of what MySQL is specifically doing here, there are going to be other users/applications which want to keep a 4K FS blocksize and do larger atomic writes.

Thanks,
John



