On Tue 24-04-18 14:09:53, Robert Dorr wrote:
> Hope you are feeling better.

Yes :) Thanks.

> 1. How can we effect similar changes in ext4?

So ext4 currently does not use the iomap infrastructure for direct IO and
thus cannot directly use the optimization of using FUA. We'd like to switch
ext4 direct IO to use iomap instead of the blkdev_direct_IO() helper. It
should be pretty straightforward but not completely trivial. I'll look into
it but it will take a while.

> 2. How do we get these changes pushed into all common release builds for
> both xfs or ext4 optimizations?

I'm not sure what you are speaking about here. Do you mean how you get it
into distributions like SLES, RHEL, etc.? Generally that happens when they
update to a new enough kernel version. That is rather fast (weeks after
Linus releases a kernel with this improvement) for community distros such as
Fedora or openSUSE Tumbleweed; it will take much longer for enterprise
distros (generally only the next major release will pick up a new kernel, so
a year or two).

                                                                Honza

> -----Original Message-----
> From: Robert Dorr
> Sent: Thursday, March 22, 2018 7:38 AM
> To: Jan Kara <jack@xxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>; Dave Chinner <david@xxxxxxxxxxxxx>; Dan Williams <dan.j.williams@xxxxxxxxx>; linux-xfs@xxxxxxxxxxxxxxx; linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>; Theodore Ts'o <tytso@xxxxxxx>; Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>; Scott Konersmann <scottkon@xxxxxxxxxxxxx>; Slava Oks <slavao@xxxxxxxxxxxxx>; Jasraj Dange <jasrajd@xxxxxxxxxxxxx>; Michael Nelson <micn@xxxxxxxxxxxxx>
> Subject: RE: [PATCH] [RFC] iomap: Use FUA for pure data O_DSYNC DIO writes
>
> Sorry to hear that. Get well soon.
>
> Thanks for the response.
>
>
> -----Original Message-----
> From: Jan Kara <jack@xxxxxxx>
> Sent: Thursday, March 22, 2018 9:36 AM
> To: Robert Dorr <rdorr@xxxxxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>; Christoph Hellwig <hch@xxxxxx>; Dave Chinner <david@xxxxxxxxxxxxx>; Dan Williams <dan.j.williams@xxxxxxxxx>; linux-xfs@xxxxxxxxxxxxxxx; linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>; Theodore Ts'o <tytso@xxxxxxx>; Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>; Scott Konersmann <scottkon@xxxxxxxxxxxxx>; Slava Oks <slavao@xxxxxxxxxxxxx>; Jasraj Dange <jasrajd@xxxxxxxxxxxxx>; Michael Nelson <micn@xxxxxxxxxxxxx>
> Subject: Re: [PATCH] [RFC] iomap: Use FUA for pure data O_DSYNC DIO writes
>
> On Mon 19-03-18 16:14:16, Robert Dorr wrote:
> > Awesome news on the spin.
> >
> > Are you going to be able to make changes to EXT4 to accommodate
> > REQ_FUA without generic_write_sync to improve performance like was
> > done for xfs?
>
> I'm currently on sick leave so I'm slow on replies. Sorry. I don't see a
> reason why ext4 could not have the same optimization as XFS for that case
> of direct IO. However, I'm not sure when I'll get to implementing that.
>
>                                                               Honza
>
> > -----Original Message-----
> > From: Jan Kara <jack@xxxxxxx>
> > Sent: Monday, March 19, 2018 11:07 AM
> > To: Robert Dorr <rdorr@xxxxxxxxxxxxx>
> > Cc: Christoph Hellwig <hch@xxxxxx>; Dave Chinner
> > <david@xxxxxxxxxxxxx>; Dan Williams <dan.j.williams@xxxxxxxxx>;
> > linux-xfs@xxxxxxxxxxxxxxx; linux-fsdevel
> > <linux-fsdevel@xxxxxxxxxxxxxxx>; Jan Kara <jack@xxxxxxx>; Theodore
> > Ts'o <tytso@xxxxxxx>; Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>; Scott
> > Konersmann <scottkon@xxxxxxxxxxxxx>; Slava Oks <slavao@xxxxxxxxxxxxx>;
> > Jasraj Dange <jasrajd@xxxxxxxxxxxxx>; Michael Nelson
> > <micn@xxxxxxxxxxxxx>
> > Subject: Re: [PATCH] [RFC] iomap: Use FUA for pure data O_DSYNC DIO
> > writes
> >
> > On Tue 13-03-18 18:52:57, Robert Dorr wrote:
> > > I tried RWF_DSYNC and xfs seems to honor it properly, but it causes
> > > ext4 to go spin out a single CPU and hang the system. 😊
> >
> > Yeah, that's a bug in generic direct IO code that I've just recently
> > fixed
> > (d9c10e5b8863 "direct-io: Fix sleep in atomic due to sync AIO" in 4.16-rc4).
> >
> >                                                             Honza
> >
> >
> > > -----Original Message-----
> > > From: Christoph Hellwig <hch@xxxxxx>
> > > Sent: Tuesday, March 13, 2018 11:13 AM
> > > To: Robert Dorr <rdorr@xxxxxxxxxxxxx>
> > > Cc: Dave Chinner <david@xxxxxxxxxxxxx>; Dan Williams
> > > <dan.j.williams@xxxxxxxxx>; linux-xfs@xxxxxxxxxxxxxxx; Christoph
> > > Hellwig <hch@xxxxxx>; linux-fsdevel <linux-fsdevel@xxxxxxxxxxxxxxx>;
> > > Jan Kara <jack@xxxxxxx>; Theodore Ts'o <tytso@xxxxxxx>; Matthew
> > > Wilcox <mawilcox@xxxxxxxxxxxxx>; Scott Konersmann
> > > <scottkon@xxxxxxxxxxxxx>; Slava Oks <slavao@xxxxxxxxxxxxx>; Jasraj
> > > Dange <jasrajd@xxxxxxxxxxxxx>; Michael Nelson <micn@xxxxxxxxxxxxx>
> > > Subject: Re: [PATCH] [RFC] iomap: Use FUA for pure data O_DSYNC DIO
> > > writes
> > >
> > > > I think the answer is the iomap* logic makes sure to issue
> > > > generic_write_sync for O_DSYNC W1 after W1 is completed by hardware
> > > > and then waits for completion of the flush request (REQ_PREFLUSH)
> > > > before W1 is returned to the AIO completion ring, preventing
> > > > io_getevents from processing W1 before the flush occurs and
> > > > completes. I just need proper confirmation from the experts on this
> > > > code that this is the expected behavior.
> > >
> > > Yes, that is the case.
> > >
> > > > For SQL Server using O_DIRECT | O_DSYNC on current kernels is very
> > > > performance impacting. Instead we enable a mode for SQL that opens
> > > > O_DIRECT only and issues fsync/fdatasync when we are hardening log
> > > > files or checkpointing data files. This reduces the write, flush,
> > > > write, flush pattern allowing for write, write, write, ... then
> > > > flush as we only issue flush requests when required to maintain the
> > > > data integrity boundaries of SQL Server. The performance is
> > > > significantly better than the device flush for each write as you
> > > > can imagine.
> > > >
> > > > Testing shows the FUA enhancement is better than the write, flush
> > > > pattern. For SQL Server we want to dynamically open with
> > > > O_DIRECT | O_DSYNC when REQ_FUA can be properly used and open with
> > > > O_DIRECT and leverage SQL Server's alternate flush scheme when
> > > > running on an older kernel or a system that does not support FUA
> > > > (SATA, IDE, ...)
> > >
> > > There is no really good way to figure this out except for
> > > benchmarking, given that the FUA use is an implementation detail.
> > >
> > > Btw, another feature that might be interesting to you is the
> > > RWF_DSYNC flag to the pwritev2 syscall, which allows applying O_DSYNC
> > > semantics on a per-I/O basis instead of using it at open time.
> > --
> > Jan Kara <jack@xxxxxxxx>
> > SUSE Labs, CR
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
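
For readers who want to try the benchmarking approach suggested in the quoted
discussion, here is a minimal sketch that times the two write strategies:
per-write RWF_DSYNC via pwritev2 versus plain writes followed by a single
fdatasync. It is an illustration only, not code from the patch. It uses
Python's os.pwritev wrapper and leaves out O_DIRECT, which would need aligned
buffers. The block size, write count, file name and function names are all
assumptions made for the example. It cannot tell you whether FUA was actually
used, only how the two patterns compare on a given system.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 4096   # payload for each write (size is an arbitrary choice)
COUNT = 64             # number of writes per run (arbitrary)


def write_dsync_each(fd, block=BLOCK, count=COUNT):
    """Make every write durable on its own, like O_DSYNC but per I/O."""
    flag = getattr(os, "RWF_DSYNC", 0)  # 0 if this platform lacks the flag
    for i in range(count):
        if flag:
            os.pwritev(fd, [block], i * len(block), flag)
        else:
            # Fallback with the same durability guarantee: write, then flush.
            os.pwrite(fd, block, i * len(block))
            os.fdatasync(fd)


def write_then_flush(fd, block=BLOCK, count=COUNT):
    """Issue all writes first and flush once at the integrity boundary."""
    for i in range(count):
        os.pwrite(fd, block, i * len(block))
    os.fdatasync(fd)


def timed(strategy, path):
    """Run one strategy against a fresh file and return elapsed seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        strategy(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe.dat")  # hypothetical file name
        for strategy in (write_dsync_each, write_then_flush):
            print(f"{strategy.__name__}: {timed(strategy, path):.4f}s")
```

To get meaningful numbers, point the path at a file on the filesystem and
device under test rather than a temporary directory, and repeat the runs
several times, since a single pass is noisy.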