On Wed, Jul 30, 2014 at 1:41 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Tue, Jul 29, 2014 at 08:38:16AM -0400, Brian Foster wrote:
>> On Tue, Jul 29, 2014 at 10:53:09AM +0200, Frank . wrote:
>> > Hello.
>> >
>> > I just wanted to have more information about the delaylog feature. From what I understood, it seems to be a common feature of different filesystems. It's supposed to retain information such as metadata for a time (how long?). Unfortunately, I could not find further information about the journaling log section in the official XFS documentation.
>> > I just figured out that the delaylog feature is now included and there is no way to disable it (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/xfs.txt?id=HEAD).
>> >
>>
>> There is a design document for XFS delayed logging co-located with the
>> xfs doc:
>>
>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/xfs-delayed-logging-design.txt?id=HEAD
>
> Or, indeed, here:
>
> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs-documentation.git;a=blob;f=design/xfs-delayed-logging-design.asciidoc
>
>> I'm not an expert on the delayed logging infrastructure so I can't give
>> details, but it's basically a change to aggregate logged items into a
>> list (committed item list - CIL) and "local" areas of memory (log
>> vectors) at transaction commit time rather than logging directly into
>> the log buffers. The benefits and tradeoffs of this are described in
>> the link above. One tradeoff is that more items can be aggregated
>> before a checkpoint occurs, so that naturally means more items are
>> batched in memory and written to the log at a time.
>>
>> This in turn means that in the event of a crash, more logged items are
>> lost than with the older, less efficient implementation. This doesn't
>> affect the consistency of the fs, which is the purpose of the log.
>
> In a nutshell.
>
> Basically, logging in XFS is asynchronous unless directed by the
> user application, specific operational constraints or mount options
> to be synchronous.
>
>> > Whatever that information might be, I understood that it is temporarily held in RAM.
>> > Recently, I had a crash on a server and I had to execute the repair procedure, which worked fine.
>> >
>>
>> A crash should typically only require a log replay, and that happens
>> automatically on the next mount. If you experience otherwise, it's a
>> good idea to report that to the list with the data listed here:
>>
>> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>>
>> > But I would like to disable this feature to avoid any temporary data not being written to disk. (Write cache is already disabled on both the hard drive and the RAID controller.)
>> >
>> > Perhaps disabling it is a bad idea. If so, I would like to have your opinion about where memory corruption could happen.
>> >
>>
>> Delayed logging is not configurable these days. The original
>> implementation was optional via a mount option, but my understanding
>> is that might have been more of a precaution for a new feature than a
>> real tuning option.
>>
>> If you want to ensure consistency of certain operations, those
>> applications should issue fsync() calls as appropriate. You could also
>> look into the 'wsync' mount option (and probably expect a significant
>> performance hit).
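(If I understand Brian's "issue fsync() calls as appropriate" correctly, he means something along the lines of the rough sketch below. This is only my own illustration - the path and data are made up - of writing a file, fsync()ing it, and then fsync()ing the parent directory so the new name is durable as well. The directory sync is just the conservative, filesystem-agnostic way of doing it, nothing XFS specific.)

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
        const char buf[] = "important data\n";

        /* path is made up for the example */
        int fd = open("/data/example.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
                perror("open");
                return EXIT_FAILURE;
        }
        if (write(fd, buf, sizeof(buf) - 1) != (ssize_t)(sizeof(buf) - 1)) {
                perror("write");
                return EXIT_FAILURE;
        }
        /* flush this file's data and metadata to stable storage */
        if (fsync(fd) < 0) {
                perror("fsync file");
                return EXIT_FAILURE;
        }
        close(fd);

        /* also sync the parent directory so the directory entry is durable */
        int dirfd = open("/data", O_RDONLY);
        if (dirfd < 0 || fsync(dirfd) < 0) {
                perror("fsync dir");
                return EXIT_FAILURE;
        }
        close(dirfd);
        return 0;
}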
>
> Using the 'wsync' or 'dirsync' mount options effectively causes the
> majority of transactions to be synchronous - it always has, even
> before delayed logging was implemented - so that once a user visible
> namespace operation completes, it is guaranteed to be on stable
> storage. This is necessary for HA environments so that failover from
> one server to another doesn't result in files appearing or
> disappearing on failover...
>
> Note that this does not change file data behaviour. In this case you
> need to add the "sync" mount option, which forces all buffered IO to
> be synchronous and so will be *very slow*. But if you've already
> turned off the BBWC on the RAID controller then your storage is
> already terribly slow and so you probably won't care about making
> performance even worse...

Dave, excuse my ignorant questions.

I know the Linux kernel keeps dirty data in cache for up to 30 seconds before a kernel daemon flushes it to disk, unless the configured dirty ratio (which is 40% of RAM, IIRC) is reached before those 30 seconds, in which case the flush happens earlier.

What I did was lower those 30 seconds to 5 seconds, so dirty data is flushed to disk every 5 seconds (I've set dirty_expire_centisecs to 500).

So, are there any drawbacks to doing this? I mean, I don't care *that* much about performance, but I do want my dirty data to reach storage in a reasonable amount of time. I looked at the various sync mount options, but they are all synchronous, so my impression is they'll be slower than letting the kernel keep data for 5 seconds and then flush it.

From an XFS perspective, I'd like to know whether this is recommended or not. I know that setting the above to 500 centisecs means there will be more writes to disk, which may result in more wear and tear, thus shortening the lifetime of the storage.

This is a regular desktop system with a single Seagate Constellation SATA disk, so no RAID, LVM, thin provisioning or anything else.

What do you think? :)

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx

--
Yours truly
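P.S. In case it helps anyone reading the archive later: the knobs I mentioned live under /proc/sys/vm/. The rough sketch below is only my own illustration (normally one would just use sysctl or echo a value into /proc); it prints the current writeback settings and then applies the dirty_expire_centisecs=500 change. Writing the knob needs root and does not survive a reboot unless it also goes into sysctl.conf.

#include <stdio.h>
#include <stdlib.h>

static long read_knob(const char *path)
{
        long val = -1;
        FILE *f = fopen(path, "r");

        if (!f || fscanf(f, "%ld", &val) != 1)
                perror(path);
        if (f)
                fclose(f);
        return val;
}

int main(void)
{
        /* how old dirty data may get before the flusher writes it back */
        printf("dirty_expire_centisecs    = %ld\n",
               read_knob("/proc/sys/vm/dirty_expire_centisecs"));
        /* how often the flusher threads wake up */
        printf("dirty_writeback_centisecs = %ld\n",
               read_knob("/proc/sys/vm/dirty_writeback_centisecs"));
        /* percentage-of-memory threshold that forces writeback earlier */
        printf("dirty_ratio               = %ld\n",
               read_knob("/proc/sys/vm/dirty_ratio"));

        /* lower the expiry from the 3000 (30 s) default to 500 (5 s) */
        FILE *f = fopen("/proc/sys/vm/dirty_expire_centisecs", "w");
        if (!f || fprintf(f, "500\n") < 0) {
                perror("dirty_expire_centisecs");
                return EXIT_FAILURE;
        }
        fclose(f);
        return 0;
}

(And if I read the documentation right, expired data is only written out when the flusher wakes up, so with dirty_writeback_centisecs at its default of 500 the worst case is more like 5 to 10 seconds than exactly 5.)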